Session Token Accumulation
About This MicroSim
A stacked area chart of cumulative tokens consumed per harness iteration, broken down by source (system prompt, tool definitions, conversation history, tool results, outputs). Two cost overlay lines show no-cache vs with-cache cumulative dollars. The conversation-history band visibly grows quadratically with iteration count — every turn re-reads all previous turns.
How to Use
- Default state. 50 iterations. Look at the orange (conversation history) band — it dominates by iteration ~25.
- Disable caching. Cost overlay lines diverge — caching is essential to keeping long sessions affordable.
- Bump average tool result. The orange band grows even faster (more bytes per turn × triangular sum).
- Reduce iterations to 5. Conversation history is barely visible — sessions are cheap when short.
Bloom Level
Analyze (L4) — examine how harness sessions accumulate tokens turn by turn and identify which components grow linearly vs. quadratically.
Iframe Embed Code
1 | |
Lesson Plan
Audience
Engineers operating coding harnesses (Claude Code, Codex, Antigravity).
Duration
15–20 minutes inside Chapter 7.
Prerequisites
Chapter 7 sections on Session Token Accumulation, Per-Session Token Cost, Conversation Compaction.
Activities
- Identify the dominant band (3 min). At what iteration does conversation history overtake system prompt?
- Caching savings (5 min). Compare cost lines with caching on vs off.
- Compaction motivation (5 min). Notice that the orange band grows without limit — this is the structural reason auto-compaction exists.
Practice Scenarios
| # | Iterations | Sys prompt | Tool result | Caching | Implication |
|---|---|---|---|---|---|
| 1 | 10 | 5K | 200 | on | cheap session |
| 2 | 50 | 8K | 1K | on | conv history dominates |
| 3 | 100 | 8K | 1K | on | needs compaction |
| 4 | 50 | 8K | 1K | off | cost catastrophe |
| 5 | 50 | 20K | 5K | on | even with caching, expensive |
Assessment
Learner can identify why long sessions get expensive (quadratic conv history), why caching helps (sys prompt amortization), and when to insert compaction.
References
- Anthropic Documentation — Conversation compaction.
- Chapter 7 — Conversation History, Conversation Compaction.
Senior Instructional Designer Quality Review
Reviewer perspective: 15+ years designing engineering curricula for adult professional learners.
Overall verdict
Approve as-is for Chapter 7. Score: 88/100 (B+). Stacked area is the right primitive for L4 "examine which components grow at what rate." The quadratic growth of conversation history is THE critical insight, and the chart makes it visceral.
What works
- Bloom alignment. Decomposing accumulation by source is exactly L4 "examine."
- The triangular conversation-history sum. Visualization shows it; learners feel it without doing the math.
- Cost overlay on second axis. Translates token shape to dollars.
Gaps
- No "compaction event" annotation. A button to insert a compaction at iteration N — collapsing the conversation band — would teach the remediation. Score impact: −3.
- Quadratic vs linear isn't explicitly labeled. A small "(quadratic)" tag on the conversation series helps. Score impact: −1.
- Output growth is shown linear; in reality output sometimes shrinks as agent converges. Modeling that would teach realism. Score impact: −1.
Accessibility
Color-blind safe with five distinct hues; legend shows series labels.
Cognitive load
5 stacked series + 2 overlay lines = 7 series. At the upper edge.
Recommendation
Approve. Open follow-up for compaction-event button (gap 1).