References: Model Routing and Output Control
- Routing - Wikipedia - The networking concept of selecting paths based on criteria; directly analogous to the LLM model-selection problem this chapter formalizes.
- Mixture of experts - Wikipedia - The ML pattern of routing inputs to specialized expert subnetworks through a learned gating function; a conceptual ancestor of the cross-vendor LLM routing covered here.
- Cascading classifiers - Wikipedia - The classic cheap-first-then-expensive evaluation pattern that this chapter applies to model selection across cost tiers.
- AI Engineering - Chip Huyen - O'Reilly - The chapters on inference optimization and model serving cover routing, output controls, and the cost-quality framing this chapter applies to LLMs.
- Designing Machine Learning Systems - Chip Huyen - O'Reilly - The production-ML chapters establish the routing-and-fallback patterns that translate naturally to multi-vendor LLM systems.
- RouteLLM: An Open-Source Framework for Cost-Effective LLM Routing - LM-SYS GitHub - Reference implementation of an LLM router that learns to dispatch easy queries to cheap models; pairs well with this chapter's routing-policy material.
- Martian Blog: LLM Routing - Martian - Working notes from a commercial LLM-routing platform on confidence-threshold tuning and escalation patterns that complement this chapter's recommendations.
- OpenAI Structured Outputs - OpenAI - Reference for the response_format parameter and JSON Schema enforcement covered in the output-control half of this chapter.
- Anthropic Tool Use and JSON - Anthropic - Reference for forcing structured output via tool definitions, the canonical Anthropic pattern that pairs with OpenAI's structured outputs.
- LangChain Output Parsers - LangChain - Reference for the output-validation and retry patterns that put this chapter's schema-enforcement recommendations into practice.