Concept List: Token Optimization
This is the enumerated concept list for the Token Optimization course. Each concept has a unique ConceptID (1-470) used throughout the learning graph.
- Generative AI
- Large Language Model
- Foundation Model
- Transformer Architecture
- Autoregressive Generation
- Token
- Input Token
- Output Token
- Cached Token
- Reasoning Token
- Token Count
- Tokenizer
- Byte Pair Encoding
- SentencePiece
- Vocabulary Size
- Context Window
- Context Length
- Prompt
- System Prompt
- User Message
- Assistant Message
- Conversation Turn
- Multi-Turn Dialogue
- Streaming Response
- Stop Sequence
- Max Tokens Parameter
- Temperature
- Top P Sampling
- Logprobs
- Model Family
- Subword Tokenization
- BPE Merge Rules
- Tiktoken Library
- Claude Tokenizer
- Gemini Tokenizer
- Token-To-Char Ratio
- Whitespace Handling
- Unicode Normalization
- Special Tokens
- End-Of-Sequence Token
- Beginning-Of-Sequence Token
- Padding Token
- Token Boundary
- Pre-Tokenization
- Cross-Vendor Tokenizer Drift
- Token Counting API
- Local Token Estimation
- Token Count Caching
- Code Tokenization
- Multilingual Tokenization
- Per-Million-Token Price
- Input Token Price
- Output Token Price
- Cached Input Price
- Output Premium
- Unit Economics
- Cost Per Request
- Cost Per Feature
- Cost Per User
- Cost Per Outcome
- Cost Attribution
- Token Budget
- Monthly Token Spend
- Forecasting Token Cost
- Cost-Quality Tradeoff
- Cost-Latency Tradeoff
- Pareto Frontier
- Pricing Tier
- Volume Discount
- Batch Discount
- Enterprise Pricing
- Free Tier Limit
- Rate Limit
- Quota Management
- Burn Rate
- Anthropic API
- Claude Messages API
- Claude Model Family
- Claude Opus
- Claude Sonnet
- Claude Haiku
- Anthropic SDK
- API Key Management
- Anthropic Prompt Caching
- Cache Control Parameter
- Cache Breakpoint
- Cache TTL
- Cache Read Tokens
- Cache Write Tokens
- Extended Thinking
- Thinking Token Budget
- Claude Tool Use
- Tool Definition Schema
- Tool Result Block
- Message Content Block
- Anthropic System Prompt
- Stop Reason
- Anthropic Streaming
- Anthropic Batch API
- Claude Vision Input
- OpenAI API
- Chat Completions API
- OpenAI Responses API
- OpenAI Model Family
- GPT Model Series
- OpenAI O Series
- Reasoning Model
- OpenAI SDK
- Function Calling
- Tool Choice Parameter
- JSON Mode
- Structured Outputs
- Response Format
- OpenAI Streaming
- OpenAI Batch API
- OpenAI Embeddings
- OpenAI Fine Tuning
- OpenAI Vision
- Logit Bias
- Seed Parameter
- Token Usage Object
- Prompt Tokens Field
- Completion Tokens Field
- Total Tokens Field
- OpenAI Rate Limits
- Google Gemini API
- Gemini Model Family
- Gemini Pro
- Gemini Flash
- Gemini Ultra
- Gemini SDK
- Long Context Window
- One Million Context
- Gemini Function Calling
- Gemini Tool Config
- Gemini Streaming
- Gemini Batch Mode
- Gemini Caching
- Vertex AI
- Google AI Studio
- Gemini Safety Settings
- Gemini Multimodal Input
- Gemini Code Execution
- Gemini Grounding
- Gemini Token Counting
- AI Coding Harness
- Agentic Loop
- Tool Use Loop
- Claude Code
- Claude Code Session
- Claude Code Hooks
- OpenAI Codex CLI
- Codex Session
- Google Antigravity
- Antigravity Workspace
- Harness System Prompt
- Harness Token Overhead
- Session Token Accumulation
- Per-Session Token Cost
- Conversation Compaction
- Auto Compaction
- Manual Compaction
- Tool Call Iteration
- Multi-Step Reasoning
- Subagent Pattern
- Agent Memory
- Persistent Memory File
- Working Directory Context
- File Read Tool
- File Edit Tool
- Skill
- Skill Description
- Skill Body
- Skill Trigger
- Skill Invocation
- Skill Frontmatter
- Skill Bundle
- Bundled Script
- Skill Asset File
- Lazy Skill Loading
- Eager Skill Listing
- Task Decomposition
- Task-Skill Binding
- Skill Selection
- Skill Misfire
- False Positive Trigger
- False Negative Trigger
- Trigger Precision
- Skill Library
- Anthropic Skill Format
- Script Delegation
- Shell Script Skill
- Python Script Skill
- Skill Refactoring
- Token Reduction Ratio
- Structured Logging
- Log Schema Design
- Log Line
- JSON Log Format
- Log Field
- Required Log Field
- Optional Log Field
- Model Field
- Prompt Hash
- Input Token Field
- Output Token Field
- Cached Token Field
- Latency Field
- Cost Field
- Feature Tag
- User Identifier
- Outcome Field
- Trace Identifier
- Span Identifier
- Request Identifier
- Session Identifier
- PII Redaction
- Prompt Truncation In Logs
- Log Sampling
- Log Retention Policy
- Observability
- OpenTelemetry
- OTel LLM Conventions
- Metric
- Counter Metric
- Histogram Metric
- Dashboard
- Cost Dashboard
- Hit Rate Dashboard
- Latency Dashboard
- Token Volume Chart
- Time Series Aggregation
- Anomaly Detection
- Alerting Rule
- Token Spike Alert
- Cost Threshold Alert
- Cardinality Concern
- Aggregation Period
- Drill-Down Analysis
- Cross-Service Tracing
- Log File Analysis
- Log Aggregation
- Top-N Cost Drivers
- Pareto Analysis
- Outlier Detection
- Runaway Prompt
- Pathological Agent Loop
- Cost Hotspot
- Per-Feature Cost Roll-Up
- Per-User Cost Roll-Up
- Per-Model Cost Roll-Up
- Prompt Template Grouping
- Cohort Analysis
- Funnel Analysis
- Histogram Of Token Counts
- P50 Token Usage
- P95 Token Usage
- P99 Token Usage
- Long-Tail Cost
- Analysis Notebook
- A/B Testing
- Hypothesis
- Null Hypothesis
- Control Group
- Treatment Group
- Traffic Split
- Random Assignment
- Stratified Assignment
- Primary Metric
- Guardrail Metric
- Quality Metric
- Cost Metric
- Latency Metric
- Satisfaction Metric
- Sample Size Calculation
- Statistical Power
- Statistical Significance
- P-Value
- Confidence Interval
- Effect Size
- Stopping Rule
- Sequential Testing
- Multi-Armed Bandit
- CUPED Adjustment
- Novelty Effect
- Prompt Engineering
- System Prompt Hygiene
- Instruction Compression
- Few-Shot Example
- Few-Shot Pruning
- Zero-Shot Prompting
- Chain Of Thought
- Dead Context
- Redundant Instruction
- Verbose Boilerplate
- Prompt Template
- Template Versioning
- Prompt Variable
- Variable Interpolation
- Prompt Compression Tool
- Selective Compression
- Prompt Length Budget
- Output Length Budget
- Concise Output Instruction
- Token-Aware Rewriting
- Whitespace Stripping
- Comment Removal
- Schema Minimization
- Symbol Substitution
- Reusable Prompt Block
- Prompt Caching
- Cache Key
- Cache Hit
- Cache Miss
- Cache Hit Rate
- Cache Warming
- Cache Invalidation
- Cache Invariant
- Stable Prefix
- Volatile Suffix
- Cache Boundary
- Cross-Vendor Caching
- Cache Cost Savings
- Cache Monitoring
- Cache Hit Rate Metric
- Cache Eviction
- Implicit Caching
- Explicit Caching
- Cache Aware Routing
- Cache Stampede
- Retrieval Augmented Generation
- Embedding
- Vector Database
- Chunking
- Chunk Size
- Chunk Overlap
- Top-K Retrieval
- Reranker
- Cross-Encoder Reranker
- Retrieval Score
- Context Injection
- Retrieved Context Bloat
- Context Pruning
- Hybrid Retrieval
- BM25 Retrieval
- Dense Retrieval
- Query Rewriting
- HyDE
- Document Compression
- Summarization-Based RAG
- Citation Of Sources
- RAG Cost Analysis
- Context Quality Metric
- Retrieval Precision
- Retrieval Recall
- Context Window Budget
- Sliding Window
- Conversation Summarization
- Compaction Strategy
- Hierarchical Summary
- Memory File
- Long-Term Memory
- Short-Term Memory
- Context Truncation
- Pre-Send Token Counting
- Context Quality Decay
- Lost-In-The-Middle
- Context Reordering
- Selective Context Inclusion
- Context Eviction Policy
- Model Routing
- Cheap-First Cascade
- Escalation Trigger
- Confidence Threshold
- Quality Gate
- Fallback Model
- Cross-Vendor Routing
- Task Classifier
- Difficulty Estimation
- Routing Policy
- Routing Cost Savings
- Routing Quality Risk
- Per-Task Model Selection
- Vendor Lock-In Risk
- Vendor-Neutral Abstraction
- Max Tokens Setting
- Stop Sequence Setting
- Length Penalty
- JSON Schema Output
- Concise Mode
- Verbosity Parameter
- Reasoning Budget
- Thinking Token Limit
- Truncation Detection
- Streaming Cancellation
- Early Stopping
- Output Postprocessing
- Output Validation
- Schema Enforcement
- Reasoning Effort Setting
- Agent Budget Policy
- Per-Session Token Budget
- Per-Session Tool Call Budget
- Loop Iteration Limit
- Wall Clock Limit
- Cost Cap
- Graceful Degradation
- Budget Exhaustion Handling
- Runaway Detection
- Circuit Breaker Pattern
- Tool Call Throttling
- Subtask Budget Allocation
- Budget Audit Log
- Budget Reporting
- Per-Engineer Budget
- Per-Repository Budget
- Per-PR Budget
- Budget Notification
- Manager Weekly Report
- Budget Versus Outcome
- Batch API
- Asynchronous API
- Batch Job Submission
- Batch Job Status
- Batch Window
- Batch Discount Rate
- Batch Versus Synchronous
- Throughput Optimization
- Latency Tolerance
- Job Queue
- Result Polling
- Webhook Notification
- Idempotency Key
- Retry Policy
- Backoff Strategy
- Data Privacy
- PII Detection
- Sensitive Field Redaction
- Compliance Risk
- GDPR
- HIPAA
- SOC2 Audit
- Data Residency
- Vendor Data Retention
- Opt-Out Of Training
- Logging Privacy Risk
- Hashing Sensitive Strings
- Tokenized Identifier
- Audit Trail
- Anonymization Strategy
- Baseline Cost Measurement
- Optimization Hypothesis
- Quality Regression Detection
- Before-After Report
- Optimization Backlog
- Cost Reduction Target
- Pilot Rollout
- Canary Deployment
- Token Dashboard Project
- Vendor-Neutral Logging Project
- Skill Refactor Project
- Budget Policy Document
- Engineering Manager Review
- Cost Reduction Postmortem
- Reproducible Benchmark
- Eval Suite
- Golden Test Set
- Regression Test Loop
- Continuous Cost Monitoring
- Token Efficiency Roadmap
- 5-Hour Limit
- Weekly Limit
- Serial Execution
- Parallel Execution
- Parallel Token Penalty