Tag: nlp

Research Papers

LLM Token: A Comprehensive Analysis of Large Language Model Tokenization

Research Team · May 27, 2024

This paper provides a comprehensive analysis of tokenization in large language models, exploring the fundamental mechanisms that enable LLMs to process and understand text. The research examines various tokenization strategies, their impact on model performance, and the implications for natural language processing tasks.
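To make the idea of tokenization concrete, here is a minimal sketch, not taken from the paper, of greedy longest-match subword segmentation in the style of WordPiece. The vocabulary is a tiny hypothetical one chosen only for illustration; real LLM tokenizers learn vocabularies of tens of thousands of subwords with algorithms such as byte-pair encoding.

```python
# Toy greedy longest-match subword tokenizer (WordPiece-style).
# TOY_VOCAB is hypothetical and exists only for this example.
TOY_VOCAB = {"token", "tok", "##en", "##ization", "##iz", "##ation", "[UNK]"}

def tokenize_word(word: str, vocab: set[str]) -> list[str]:
    """Split a single word into the longest matching subwords."""
    pieces, start = [], 0
    while start < len(word):
        end, match = len(word), None
        while end > start:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece  # continuation marker, WordPiece-style
            if piece in vocab:
                match = piece
                break
            end -= 1
        if match is None:
            return ["[UNK]"]  # no subword in the vocabulary covers this span
        pieces.append(match)
        start = end
    return pieces

print(tokenize_word("tokenization", TOY_VOCAB))
# ['token', '##ization'] -- one word becomes several model tokens
```

The example shows why tokenization choices matter: the same word can become one token or several depending on the learned vocabulary, which in turn affects sequence length, cost, and model behavior.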

Attention Is All You Need

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, et al. · December 6, 2017

The dominant sequence transduction models are based on complex recurrent or convolutional neural networks that include an encoder and a decoder. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely.
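The core building block of the Transformer is scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ / √d_k) V. The NumPy sketch below implements that formula directly; the matrix shapes and random inputs are illustrative only, and a full Transformer layer additionally uses multiple heads, projections, and feed-forward sublayers.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                    # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    return weights @ V                                 # weighted sum of values

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))   # 4 query positions, d_k = 8
K = rng.normal(size=(6, 8))   # 6 key positions
V = rng.normal(size=(6, 8))   # one value vector per key
print(scaled_dot_product_attention(Q, K, V).shape)  # (4, 8)
```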

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova · October 11, 2018

We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers.
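BERT's bidirectional conditioning comes from its masked language modelling objective: roughly 15% of input tokens are hidden and the model must recover them from context on both sides. The sketch below shows only that masking step, with hypothetical tokens and a simplified selection rule (the paper additionally replaces some selected tokens with random or unchanged tokens rather than always using [MASK]).

```python
import random

def mask_tokens(tokens, mask_rate=0.15, seed=0):
    """Hide a fraction of tokens and keep the originals as prediction targets."""
    rng = random.Random(seed)
    n_mask = max(1, round(mask_rate * len(tokens)))
    masked_positions = set(rng.sample(range(len(tokens)), n_mask))
    inputs, labels = [], []
    for i, tok in enumerate(tokens):
        if i in masked_positions:
            inputs.append("[MASK]")
            labels.append(tok)      # the model is trained to predict this token
        else:
            inputs.append(tok)
            labels.append(None)     # position not scored in the loss
    return inputs, labels

sentence = "the quick brown fox jumps over the lazy dog".split()
masked, targets = mask_tokens(sentence)
print(masked)    # the masked positions depend on the seed
print(targets)
```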

Blog Posts