Parallel Execution of Tasks
Claude Code now allows you to executes tasks in parallel. The good news is that you can now generate ALL the content of all the chapters in your entire 12 chapter book in 1/12th the time it used to take you. The bad news is that if you choose the parallel option your tokens will run out 38% faster.
Why Parallelism?
Pros of Parallel Task Execution
When you are not having problems with rate limits the tasks finish much faster. The exact speedup depends on how many tasks run in parallel. I have seen speedups of 6x for some projects.
Cons of Parallel Task Execution
There several big negatives:
- Much less token efficient (38% lower in my tests)
- Lost work when tasks fail due to rate limits
- Reliability
- Lack of visibility
When you hit a rate limit, much of your work will be lost.
- 3 of 6 agents completed before the rate limit hit, 1 was partial (missing JS), 2 failed completely
- 50% of deliverables required full manual rework by the parent agent
- ~780 lines of JavaScript had to be written manually vs. ~1,350 written by agents
- The most complex sim (dog-class-playground) was among the failures — rate limits disproportionately affect token-heavy tasks
- The partial failure (dog-class-uml-diagram) was arguably worse than total failure — it left wrong metadata (Geometry instead of CS) with TODO placeholders but no actual JS file
Each parallel agent is an independent subprocess with its own context. That means the fixed overhead gets multiplied by N:
Per-agent overhead (paid 6 times in parallel, once in serial):
- System prompt + safety rules (~5k tokens)
- CLAUDE.md + project CLAUDE.md (~3k tokens)
- Skill tool invocation → microsim-generator prompt expansion (~2-4k tokens)
- Reading reference files (existing sims for style matching, chapter content) — each agent independently reads the same files (~5-10k
tokens)
Rough math: - Fixed overhead per agent: ~15-20k tokens - Actual generation work: ~30-40k tokens - Parallel total: 6 × 50-60k = 300-360k tokens - Serial total: 20k (one-time overhead) + 6 × 35k (generation with shared context) = 230k tokens
The other factor: parallel agents can't learn from each other. If agent 1 reads class-vs-object-diagram.js to understand the project's MicroSim pattern, agents 2-6 independently read the same file and spend tokens on the same discovery. In serial execution, that knowledge stays in context and is reused for free.
So parallel is faster in wall-clock time but ~30-50% more expensive in tokens — which matters a lot when you're near a rate limit. In this session, the extra token consumption from 6 simultaneous agents is likely what pushed us over the limit in the first place.
