Sampling Parameter Explorer
About This MicroSim
For the prompt "The red panda climbed the ___", the sim shows a fixed base distribution over 10 candidate next tokens. Move the temperature and top-p sliders to see the distribution reshape. Click "Sample 1 token" to draw once and highlight the result, or "Sample 100 times" to overlay a histogram of empirical frequencies on the theoretical bars.
How to Use
- Default state. T=1.0, top-p=1.0. The distribution matches the base distribution; "tree" is most likely.
- Drop temperature to 0.2. The distribution sharpens — "tree" gets even more likely.
- Raise temperature to 2.0. Distribution flattens — every token has roughly equal probability.
- Drop top-p to 0.6. Watch lower-probability tokens gray out — they're outside the nucleus and impossible to sample.
- Click "Sample 100 times". Orange outlines show empirical frequencies; over 100 draws they should roughly track the blue theoretical bars, with some sampling noise.
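The reshaping the sliders perform can be sketched in a few lines of NumPy. The base distribution below is an assumed, illustrative one (the sim's actual token probabilities aren't listed on this page); the temperature and top-p transforms are the standard ones the sim models.

```python
import numpy as np

# Hypothetical base distribution over 10 candidate tokens -- illustrative
# values only; "tree" leads with ~45% as in the sim's description.
TOKENS = ["tree", "bamboo", "branch", "fence", "wall", "ladder",
          "rock", "hill", "pole", "roof"]
BASE_P = np.array([0.45, 0.25, 0.10, 0.05, 0.04, 0.03, 0.03, 0.02, 0.02, 0.01])

def reshape(probs, temperature, top_p):
    """Apply temperature, then top-p (nucleus) filtering; return renormalized probs."""
    # Temperature: rescale logits by 1/T, i.e. p_i^(1/T) up to normalization.
    logits = np.log(probs) / temperature
    p = np.exp(logits - logits.max())
    p /= p.sum()
    # Top-p: keep the smallest set of highest-probability tokens whose
    # cumulative mass reaches top_p; zero out the rest ("grayed out" tokens).
    order = np.argsort(p)[::-1]
    cum = np.cumsum(p[order])
    cutoff = np.searchsorted(cum, top_p) + 1  # include the token that crosses the threshold
    mask = np.zeros_like(p)
    mask[order[:cutoff]] = p[order[:cutoff]]
    return mask / mask.sum()

rng = np.random.default_rng(0)
p = reshape(BASE_P, temperature=1.0, top_p=0.6)
print(dict(zip(TOKENS, p.round(3))))          # tokens outside the nucleus show 0.0
print(TOKENS[rng.choice(len(TOKENS), p=p)])   # one draw, like "Sample 1 token"
```

At top-p=0.6 only "tree" and "bamboo" survive under these assumed values, matching the gray-out behavior described above.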
Bloom Level
Apply (L3) — demonstrate how temperature and top-p modify a fixed token probability distribution and predict the resulting selection behavior.
Iframe Embed Code
Lesson Plan
Audience
Engineers debugging LLM output quality issues or building features sensitive to determinism.
Duration
10–15 minutes inside Chapter 2.
Prerequisites
Chapter 2 sections on Temperature and Top-P Sampling.
Activities
- Predict-then-verify temperature (5 min). Predict the shape at T=0.2, then confirm.
- Find the top-p threshold (5 min). Find the smallest top-p that still includes "branch."
- Sample 100 (5 min). Compare empirical vs. theoretical frequencies at T=1.0. Note the sampling variance: 100 samples is barely enough to verify the distribution's shape.
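The spread to expect in the Sample-100 activity follows from the binomial standard error. A quick check, assuming "tree" has true probability ~0.45 at T=1.0 (an illustrative value, not published by the sim):

```python
import math

# Standard error of a sample proportion over n independent draws.
# Assumes "tree" has true probability 0.45 at T=1.0 (illustrative value).
p, n = 0.45, 100
stderr = math.sqrt(p * (1 - p) / n)
low, high = p - 2 * stderr, p + 2 * stderr
print(f"std err = {stderr:.3f}; ~95% of runs land in [{low:.2f}, {high:.2f}]")
# -> std err = 0.050; ~95% of runs land in [0.35, 0.55]
```

So a single 100-draw run can legitimately put "tree" anywhere from ~35% to ~55%, which is why the empirical bars only roughly track the theoretical ones.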
Practice Scenarios
| # | T | top-p | Predicted top-1 frequency |
|---|---|---|---|
| 1 | 1.0 | 1.0 | ~45% (base) |
| 2 | 0.2 | 1.0 | ~95% (sharpened) |
| 3 | 2.0 | 1.0 | ~15% (flat-ish) |
| 4 | 1.0 | 0.5 | ~70% ("tree" + "bamboo" only) |
| 5 | 0.0 | 1.0 | 100% (greedy — but our slider min is 0.05) |
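The scenario predictions above can be sanity-checked numerically. This sketch assumes an illustrative base distribution (the sim's true values and its exact temperature mapping aren't given here, so exact percentages will differ; the qualitative trend of sharpening, flattening, and nucleus truncation is what the table predicts):

```python
import numpy as np

# Assumed base distribution: "tree" first (~45%), "bamboo" second (~25%).
base = np.array([0.45, 0.25, 0.10, 0.05, 0.04, 0.03, 0.03, 0.02, 0.02, 0.01])

def top1_frequency(p, T, top_p):
    """Long-run frequency of the most likely token after temperature + top-p."""
    q = p ** (1.0 / T)                 # temperature rescaling of probabilities
    q /= q.sum()
    order = np.argsort(q)[::-1]
    cum = np.cumsum(q[order])
    q[order[np.searchsorted(cum, top_p) + 1:]] = 0.0  # drop tokens outside the nucleus
    return q.max() / q.sum()

for T, tp in [(1.0, 1.0), (0.2, 1.0), (2.0, 1.0), (1.0, 0.5)]:
    print(f"T={T}, top-p={tp}: top-1 ~ {top1_frequency(base, T, tp):.0%}")
```

Under these assumed values, scenarios 1 and 2 land near the table (~45% and ~95%); scenario 4 comes out around 64% (0.45/0.70 after renormalizing over "tree" + "bamboo"), and scenario 3 depends strongly on how flat the assumed tail is.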
Assessment
Learner can predict, given a temperature and top-p, which tokens have non-zero probability and roughly what the top-1 frequency will be.
References
- Chapter 2 — Temperature and Top-P Sampling.
- The Curious Case of Neural Text Degeneration (Holtzman et al., 2019) — foundational nucleus sampling paper.
Senior Instructional Designer Quality Review
Reviewer perspective: 15+ years designing engineering and ML curricula for adult professional learners.
Overall verdict
Approve as-is for Chapter 2. Score: 89/100 (B+). Predict → modify → empirical-verify is the canonical L3 "demonstrate" interaction, and this sim implements it cleanly.
What works
- Bloom alignment. L3 "demonstrate" by manipulating a parameter and observing the result.
- Empirical sampling overlay. Closes the loop between theoretical probability and actual draws.
- Top-p nucleus visualization. Graying tokens outside the nucleus is the right way to teach the cutoff.
- Static base distribution. Allows direct comparison across parameter changes.
Gaps
- Sample variance not surfaced. With 100 samples, "tree" can range 35-55%. A small "expected variance ±X%" annotation would teach statistics. Score impact: −2.
- No cumulative-distribution view. Top-p selects based on cumulative probability; an optional CDF overlay would teach the mechanism. Score impact: −2.
- Cannot edit base distribution. Loading user-provided distributions would generalize. Score impact: −1.
Accessibility
Native sliders are keyboard-accessible. The blue/gray/orange/green palette is color-blind safe.
Cognitive load
2 sliders + 3 buttons + 10-bar chart. Tractable.
Recommendation
Approve. Open follow-up for sample-variance annotation (gap 1).