References: The Google Gemini Ecosystem
- Gemini (language model) - Wikipedia - Coverage of the Gemini model family, including the Pro, Flash, and Ultra tiers and the long-context capabilities that distinguish Gemini from competitors.
- Google DeepMind - Wikipedia - Background on the research lab behind Gemini and its multimodal-first architecture decisions.
- Vertex AI - Wikipedia - Overview of Google Cloud's managed ML platform and how the Gemini API integrates with the broader GCP ecosystem.
- AI Engineering - Chip Huyen - O'Reilly - Cross-vendor chapters on long context and grounding that contextualize Gemini's one-million-token window covered here.
- Hands-On Large Language Models - Jay Alammar and Maarten Grootendorst - O'Reilly - Chapters on multimodal input and long-context retrieval that map directly to Gemini's distinctive capabilities.
- Google Gemini API Documentation - Google - Authoritative reference for the Gemini API, including the model identifiers, request format, and response schema used throughout this chapter.
- Vertex AI Generative AI Documentation - Google Cloud - Reference for the enterprise deployment path, including IAM, regional endpoints, and quota management.
- Google AI Studio - Google - Browser-based prompt-experimentation environment for Gemini models; useful for prompt-engineering iteration before committing to production code.
- Gemini Long Context Guide - Google - Reference for techniques and limitations of the one-million-token context window, including the lost-in-the-middle effect and recommended chunk-ordering patterns.
- Gemini Context Caching Documentation - Google - Reference for Gemini's caching API, including minimum cache size, TTL configuration, and pricing; Google's analogue to Anthropic's prompt caching covered in Chapter 14.