  1.
    Intro to Large Language Models

    The Busy Person’s Guide to Large Language Models: From Inner Workings to Future Possibilities (and Security Concerns) …

  2.
    Understanding Tokenization in Large Language Models: A Deep Dive – Part 1

    Tokenization is a fundamental yet often misunderstood process in the realm of large language models (LLMs). Despite its …

  3.
    Tokenization

    Natural Language Processing (NLP) has revolutionized the way machines understand human language. But before models can …

  4.
    Byte Pair Encoding (BPE): the tokenizer that made GPTs practical

    Byte Pair Encoding (BPE) is a subword tokenization scheme that gives us the best of both worlds: compact …

  5.
    Token Embeddings — what they are, why they matter, and how to build them (with working code)

    Token embeddings (aka vector embeddings) turn tokens, whether words, subwords, or characters, into numeric …

  6.
    Q K V : Query (Q), Key (K), and Value (V) Vectors in the Attention Mechanism

    In the attention mechanism used by transformer-based Large Language Models (LLMs) such as GPT, the core …

  7.
    Attention Mechanisms Explained: Self-Attention, Cross-Attention, Sparse Attention, MQA, GQA, and DeepSeek MLA

    A detailed guide to attention mechanisms in modern AI, from Bahdanau attention and Transformers to local attention, sparse attention, linear attention, multi-query attention, grouped-query attention, and DeepSeek's multi-head latent attention.

  8.
    Understanding LLM Architecture: Layers, Transformer Blocks, and Attention Heads

    A practical guide to the internal architecture of large language models, including embeddings, transformer blocks, self-attention, attention heads, MLP layers, residual connections, and execution parallelism.

  9.
    RoPE Explained: The Positional Encoding Trick Behind Modern Language Models

    A practical guide to Rotary Positional Embedding (RoPE), including why transformers need positional information, how RoPE rotates queries and keys, and why it became a standard choice in modern LLMs.