References: Model Routing and Output Control

Routing - Wikipedia - The networking concept of selecting a path for traffic according to criteria such as cost or load; directly analogous to the LLM model-selection problem this chapter formalizes.
Mixture of experts - Wikipedia - The ML pattern in which a learned gating function routes each input to specialized expert sub-models; the conceptual ancestor of the cross-vendor LLM routing covered here.
Cascading classifiers - Wikipedia - The classic cheap-first-then-expensive evaluation pattern that this chapter applies to model selection across cost tiers.
AI Engineering - Chip Huyen - O'Reilly - The chapters on inference optimization and model serving cover routing, output controls, and the cost-quality framing this chapter applies to LLMs.
Designing Machine Learning Systems - Chip Huyen - O'Reilly - The production-ML chapters establish the routing-and-fallback patterns that translate naturally to multi-vendor LLM systems.
RouteLLM: An Open-Source Framework for Cost-Effective LLM Routing - lm-sys (GitHub) - Reference implementation of an LLM router that learns to dispatch easy queries to cheap models; pairs well with this chapter's routing-policy material.
Martian Blog: LLM Routing - Martian - Working notes from a commercial LLM-routing platform on confidence-threshold tuning and escalation patterns that complement this chapter's recommendations.
OpenAI Structured Outputs - OpenAI - Reference for the response_format parameter and JSON Schema enforcement covered in the output-control half of this chapter.
Anthropic Tool Use and JSON - Anthropic - Reference for forcing structured output via tool definitions, the canonical Anthropic pattern that pairs with OpenAI's structured outputs.
LangChain Output Parsers - LangChain - Reference for the output-validation and retry patterns that put this chapter's schema-enforcement recommendations into practice.