Taxonomy Distribution Report
Overview
- Total Concepts: 475
- Number of Taxonomies: 15
- Average Concepts per Taxonomy: 31.7
Distribution Summary
| Category | TaxonomyID | Count | Percentage | Status |
|---|---|---|---|---|
| OBS | OBS | 65 | 13.7% | ✅ |
| Foundation Concepts - Prerequisites | FOUND | 50 | 10.5% | ✅ |
| OPT | OPT | 45 | 9.5% | ✅ |
| RAG | RAG | 40 | 8.4% | ✅ |
| BUDG | BUDG | 35 | 7.4% | ✅ |
| ROUT | ROUT | 30 | 6.3% | ✅ |
| HARN | HARN | 28 | 5.9% | ✅ |
| ECON | ECON | 27 | 5.7% | ✅ |
| ANTH | ANTH | 25 | 5.3% | ✅ |
| OAI | OAI | 25 | 5.3% | ✅ |
| SKIL | SKIL | 25 | 5.3% | ✅ |
| AB | AB | 25 | 5.3% | ✅ |
| GOOG | GOOG | 20 | 4.2% | ✅ |
| CAP | CAP | 20 | 4.2% | ✅ |
| PRIV | PRIV | 15 | 3.2% | ✅ |
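The Percentage column and the average in the Overview follow directly from the counts. Here is a minimal sketch of that arithmetic, assuming the counts are held in a plain dictionary. The names `counts` and `percentages` are illustrative and are not taken from taxonomy_distribution.py.

```python
# Counts copied from the Distribution Summary table.
counts = {
    "OBS": 65, "FOUND": 50, "OPT": 45, "RAG": 40, "BUDG": 35,
    "ROUT": 30, "HARN": 28, "ECON": 27, "ANTH": 25, "OAI": 25,
    "SKIL": 25, "AB": 25, "GOOG": 20, "CAP": 20, "PRIV": 15,
}


def percentages(counts):
    """Return each taxonomy's share of all concepts, rounded to one decimal."""
    total = sum(counts.values())
    return {tid: round(100 * n / total, 1) for tid, n in counts.items()}


total = sum(counts.values())             # total concepts
average = round(total / len(counts), 1)  # average concepts per taxonomy
shares = percentages(counts)

print(total, len(counts), average)
print(shares["OBS"], shares["PRIV"])
```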
Balance Analysis
✅ No Over-Represented Categories
All 15 categories are well under the 30% threshold; the largest, OBS, holds 13.7% of concepts. Good balance!
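The balance check can be sketched as a comparison of each category's share against the 30% threshold. This assumes a strict "greater than" comparison; the report does not say whether a category at exactly 30% would be flagged. The function name `over_represented` is hypothetical.

```python
OVER_REPRESENTED_THRESHOLD = 30.0  # percent, as stated in the report


def over_represented(counts, threshold=OVER_REPRESENTED_THRESHOLD):
    """Return the taxonomy IDs whose share of all concepts exceeds the threshold."""
    total = sum(counts.values())
    # Assumption: strictly greater than the threshold counts as over-represented.
    return [tid for tid, n in counts.items() if 100 * n / total > threshold]


# Counts copied from the Distribution Summary table.
counts = {
    "OBS": 65, "FOUND": 50, "OPT": 45, "RAG": 40, "BUDG": 35,
    "ROUT": 30, "HARN": 28, "ECON": 27, "ANTH": 25, "OAI": 25,
    "SKIL": 25, "AB": 25, "GOOG": 20, "CAP": 20, "PRIV": 15,
}

flagged = over_represented(counts)
print(flagged or "No over-represented categories")
```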
Category Details
OBS (OBS)
Count: 65 concepts (13.7%)
Concepts:
- Structured Logging
- Log Schema Design
- Log Line
- JSON Log Format
- Log Field
- Required Log Field
- Optional Log Field
- Model Field
- Prompt Hash
- Input Token Field
- Output Token Field
- Cached Token Field
- Latency Field
- Cost Field
- Feature Tag
- ...and 50 more
Foundation Concepts - Prerequisites (FOUND)
Count: 50 concepts (10.5%)
Concepts:
- Generative AI
- Large Language Model
- Foundation Model
- Transformer Architecture
- Autoregressive Generation
- Token
- Input Token
- Output Token
- Cached Token
- Reasoning Token
- Token Count
- Tokenizer
- Byte Pair Encoding
- SentencePiece
- Vocabulary Size
- ...and 35 more
OPT (OPT)
Count: 45 concepts (9.5%)
Concepts:
- Prompt Engineering
- System Prompt Hygiene
- Instruction Compression
- Few-Shot Example
- Few-Shot Pruning
- Zero-Shot Prompting
- Chain Of Thought
- Dead Context
- Redundant Instruction
- Verbose Boilerplate
- Prompt Template
- Template Versioning
- Prompt Variable
- Variable Interpolation
- Prompt Compression Tool
- ...and 30 more
RAG (RAG)
Count: 40 concepts (8.4%)
Concepts:
- Retrieval Augmented Generation
- Embedding
- Vector Database
- Chunking
- Chunk Size
- Chunk Overlap
- Top-K Retrieval
- Reranker
- Cross-Encoder Reranker
- Retrieval Score
- Context Injection
- Retrieved Context Bloat
- Context Pruning
- Hybrid Retrieval
- BM25 Retrieval
- ...and 25 more
BUDG (BUDG)
Count: 35 concepts (7.4%)
Concepts:
- Agent Budget Policy
- Per-Session Token Budget
- Per-Session Tool Call Budget
- Loop Iteration Limit
- Wall Clock Limit
- Cost Cap
- Graceful Degradation
- Budget Exhaustion Handling
- Runaway Detection
- Circuit Breaker Pattern
- Tool Call Throttling
- Subtask Budget Allocation
- Budget Audit Log
- Budget Reporting
- Per-Engineer Budget
- ...and 20 more
ROUT (ROUT)
Count: 30 concepts (6.3%)
Concepts:
- Model Routing
- Cheap-First Cascade
- Escalation Trigger
- Confidence Threshold
- Quality Gate
- Fallback Model
- Cross-Vendor Routing
- Task Classifier
- Difficulty Estimation
- Routing Policy
- Routing Cost Savings
- Routing Quality Risk
- Per-Task Model Selection
- Vendor Lock-In Risk
- Vendor-Neutral Abstraction
- ...and 15 more
HARN (HARN)
Count: 28 concepts (5.9%)
Concepts:
- AI Coding Harness
- Agentic Loop
- Tool Use Loop
- Claude Code
- Claude Code Session
- Claude Code Hooks
- OpenAI Codex CLI
- Codex Session
- Google Antigravity
- Antigravity Workspace
- Harness System Prompt
- Harness Token Overhead
- Session Token Accumulation
- Per-Session Token Cost
- Conversation Compaction
- ...and 13 more
ECON (ECON)
Count: 27 concepts (5.7%)
Concepts:
- Per-Million-Token Price
- Input Token Price
- Output Token Price
- Cached Input Price
- Output Premium
- Unit Economics
- Cost Per Request
- Cost Per Feature
- Cost Per User
- Cost Per Outcome
- Cost Attribution
- Token Budget
- Monthly Token Spend
- Forecasting Token Cost
- Cost-Quality Tradeoff
- ...and 12 more
ANTH (ANTH)
Count: 25 concepts (5.3%)
Concepts:
- Anthropic API
- Claude Messages API
- Claude Model Family
- Claude Opus
- Claude Sonnet
- Claude Haiku
- Anthropic SDK
- API Key Management
- Anthropic Prompt Caching
- Cache Control Parameter
- Cache Breakpoint
- Cache TTL
- Cache Read Tokens
- Cache Write Tokens
- Extended Thinking
- ...and 10 more
OAI (OAI)
Count: 25 concepts (5.3%)
Concepts:
- OpenAI API
- Chat Completions API
- OpenAI Responses API
- OpenAI Model Family
- GPT Model Series
- OpenAI O Series
- Reasoning Model
- OpenAI SDK
- Function Calling
- Tool Choice Parameter
- JSON Mode
- Structured Outputs
- Response Format
- OpenAI Streaming
- OpenAI Batch API
- ...and 10 more
SKIL (SKIL)
Count: 25 concepts (5.3%)
Concepts:
- Skill
- Skill Description
- Skill Body
- Skill Trigger
- Skill Invocation
- Skill Frontmatter
- Skill Bundle
- Bundled Script
- Skill Asset File
- Lazy Skill Loading
- Eager Skill Listing
- Task Decomposition
- Task-Skill Binding
- Skill Selection
- Skill Misfire
- ...and 10 more
AB (AB)
Count: 25 concepts (5.3%)
Concepts:
- A/B Testing
- Hypothesis
- Null Hypothesis
- Control Group
- Treatment Group
- Traffic Split
- Random Assignment
- Stratified Assignment
- Primary Metric
- Guardrail Metric
- Quality Metric
- Cost Metric
- Latency Metric
- Satisfaction Metric
- Sample Size Calculation
- ...and 10 more
GOOG (GOOG)
Count: 20 concepts (4.2%)
Concepts:
- Google Gemini API
- Gemini Model Family
- Gemini Pro
- Gemini Flash
- Gemini Ultra
- Gemini SDK
- Long Context Window
- One Million Context
- Gemini Function Calling
- Gemini Tool Config
- Gemini Streaming
- Gemini Batch Mode
- Gemini Caching
- Vertex AI
- Google AI Studio
- ...and 5 more
CAP (CAP)
Count: 20 concepts (4.2%)
Concepts:
- Baseline Cost Measurement
- Optimization Hypothesis
- Quality Regression Detection
- Before-After Report
- Optimization Backlog
- Cost Reduction Target
- Pilot Rollout
- Canary Deployment
- Token Dashboard Project
- Vendor-Neutral Logging Project
- Skill Refactor Project
- Budget Policy Document
- Engineering Manager Review
- Cost Reduction Postmortem
- Reproducible Benchmark
- ...and 5 more
PRIV (PRIV)
Count: 15 concepts (3.2%)
Concepts:
- Data Privacy
- PII Detection
- Sensitive Field Redaction
- Compliance Risk
- GDPR
- HIPAA
- SOC2 Audit
- Data Residency
- Vendor Data Retention
- Opt-Out Of Training
- Logging Privacy Risk
- Hashing Sensitive Strings
- Tokenized Identifier
- Audit Trail
- Anonymization Strategy
Recommendations
- ✅ Excellent balance: category shares range from 3.2% (PRIV) to 13.7% (OBS), a spread of 10.5 percentage points
- ✅ No MISC category: all 475 concepts are assigned to a specific taxonomy, which indicates good categorization specificity
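The 10.5% spread is consistent with the gap between the largest and smallest category shares. The sketch below assumes that definition (maximum share minus minimum share, in percentage points); the report itself does not spell out how the spread is computed, and `spread` is a hypothetical helper name.

```python
def spread(counts):
    """Return the gap between the largest and smallest category share,
    in percentage points, rounded to one decimal."""
    total = sum(counts.values())
    shares = [100 * n / total for n in counts.values()]
    # Assumption: spread = max share - min share.
    return round(max(shares) - min(shares), 1)


# Counts copied from the Distribution Summary table.
counts = {
    "OBS": 65, "FOUND": 50, "OPT": 45, "RAG": 40, "BUDG": 35,
    "ROUT": 30, "HARN": 28, "ECON": 27, "ANTH": 25, "OAI": 25,
    "SKIL": 25, "AB": 25, "GOOG": 20, "CAP": 20, "PRIV": 15,
}

print(spread(counts))
```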
Educational Use Recommendations
- Use taxonomy categories for color-coding in graph visualizations
- Design curriculum modules based on taxonomy groupings
- Create filtered views for focused learning paths
- Use categories for assessment organization
- Enable navigation by topic area in interactive tools
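Two of the uses above, color-coding and filtered views, can be sketched with the taxonomy IDs alone. This is only an illustration: the palette, the `(label, taxonomy_id)` tuple layout, and the helper names `assign_colors` and `filter_by_taxonomy` are assumptions, not part of the report generator or any visualization tool.

```python
# Hypothetical palette; any list of color strings would work.
PALETTE = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd"]


def assign_colors(taxonomy_ids, palette=PALETTE):
    """Map each taxonomy ID to a color, cycling through the palette if needed."""
    return {tid: palette[i % len(palette)] for i, tid in enumerate(taxonomy_ids)}


def filter_by_taxonomy(concepts, taxonomy_id):
    """Return the labels of concepts that belong to one taxonomy."""
    return [label for label, tid in concepts if tid == taxonomy_id]


# A few concepts from this report, as (label, taxonomy_id) pairs.
concepts = [
    ("Structured Logging", "OBS"),
    ("Chunking", "RAG"),
    ("GDPR", "PRIV"),
    ("Embedding", "RAG"),
]

colors = assign_colors(["OBS", "RAG", "PRIV"])
rag_view = filter_by_taxonomy(concepts, "RAG")
print(colors["RAG"], rag_view)
```

With 15 taxonomies and a shorter palette, colors repeat; a palette with at least 15 distinct entries avoids that.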
Report generated by learning-graph-reports/taxonomy_distribution.py