References: LLMs, Tokens, and Generation Basics

  1. Large language model - Wikipedia - Comprehensive overview of LLM architecture, training, capabilities, and limitations. Establishes the foundational vocabulary used throughout the chapter for tokens, context windows, and autoregressive generation.

  2. Lexical analysis - Wikipedia - Detailed explanation of how text is broken into tokens, including the formal distinction between characters, lexemes, and tokens. Provides the broader computer-science context for the BPE tokenization covered in Chapter 2.

  3. Autoregressive model - Wikipedia - Covers the statistical structure that makes output-token generation inherently sequential, which explains why output tokens typically cost 3-5x as much as input tokens.

  4. Hands-On Large Language Models (1st Edition) - Jay Alammar and Maarten Grootendorst - O'Reilly - Chapters 1-3 cover the LLM mental model, tokenization, and embeddings with diagrams that align directly with this chapter's vocabulary.

  5. Speech and Language Processing (3rd Edition Draft) - Daniel Jurafsky and James H. Martin - Stanford - The canonical academic reference for NLP fundamentals; chapters on neural language models give the rigorous foundation for the autoregressive generation discussion.

  6. Anthropic Models Overview - Anthropic - Authoritative reference for Claude model identifiers, context window sizes, and capabilities used throughout the textbook's vendor-specific examples.

  7. OpenAI Models Documentation - OpenAI - Reference list of GPT and o-series models with context window sizes and tokenizer notes; useful for verifying the per-vendor differences this chapter introduces.

  8. Let's build GPT: from scratch, in code, spelled out - Andrej Karpathy - Two-hour video walkthrough of building a GPT-style model that grounds the abstract terms in working code; ideal for engineers who learn by watching implementation.

  9. 3Blue1Brown: But what is a GPT? - Grant Sanderson - Visual, animation-driven explanation of how transformer models generate tokens one at a time; complements the chapter's text with intuition-building diagrams.

  10. Hugging Face NLP Course - Hugging Face - Free interactive course with chapters on tokenization, models, and the transformer architecture; the practical exercises reinforce the concepts introduced here.
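
The autoregressive loop referenced in entries 3, 8, and 9 can be sketched in a few lines. Each output token requires a full forward pass conditioned on everything generated so far, which is the structural reason output tokens are costlier than input tokens. The `next_token` function below is a hypothetical stand-in for a real transformer, not an actual model:

```python
def next_token(context: list[str]) -> str:
    # Hypothetical lookup table playing the role of a trained model:
    # it predicts the next token from the last token in the context.
    table = {
        "<start>": "The",
        "The": "model",
        "model": "generates",
        "generates": "tokens",
        "tokens": "<end>",
    }
    return table.get(context[-1], "<end>")

def generate(prompt: list[str], max_new_tokens: int = 10) -> list[str]:
    context = list(prompt)
    for _ in range(max_new_tokens):
        tok = next_token(context)   # one "forward pass" per output token
        if tok == "<end>":          # stop when the model emits an end marker
            break
        context.append(tok)         # the new token joins the context window
    return context[len(prompt):]

print(generate(["<start>"]))        # ['The', 'model', 'generates', 'tokens']
```

The key point of the sketch is the loop structure: input tokens can be processed in one batch, but each output token must wait for the previous one, so generation cost scales with output length.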