Chapter 2 References — AI Capability Curve

Curated resources for deeper exploration of the topics in this chapter.

Books

Kurzweil, Ray. (2005). The Singularity Is Near: When Humans Transcend Biology. Viking. Introduces the concept of exponential technological growth using Moore's Law as an analogy, foundational for understanding why AI capability curves appear surprising.
Brynjolfsson, Erik, and Andrew McAfee. (2014). The Second Machine Age: Work, Progress, and Prosperity in a Time of Brilliant Technologies. W. W. Norton. Uses the "second half of the chessboard" metaphor to explain why exponential AI progress catches institutions off guard, which is directly relevant to how educators should interpret benchmark data.

METR. (2024). "Measuring the Ability of Language Models to Complete Long-Horizon Tasks." METR Technical Report. https://metr.org/blog/2024-08-06-update-on-evaluations/ The primary source for the METR task-horizon doubling finding referenced throughout this chapter; describes how agent autonomy is measured over time.
Hendrycks, Dan, et al. (2020). "Measuring Massive Multitask Language Understanding." arXiv. https://arxiv.org/abs/2009.03300 Introduces the MMLU benchmark that has become a standard measure of AI capability across academic subjects, helping readers understand how AI progress is tracked.
Epoch AI. (2024). "Machine Learning Model Performance." https://epochai.org/trends Tracks trends in AI compute, training data, and benchmark scores over time; essential context for understanding the capability curve this chapter describes.
McKinsey Global Institute. (2023). "The Economic Potential of Generative AI." McKinsey & Company. https://www.mckinsey.com/capabilities/mckinsey-digital/our-insights/the-economic-potential-of-generative-ai Quantifies AI cost trends and productivity implications, supporting the chapter's discussion of AI costs falling rapidly alongside rising capabilities.
Sevilla, Jaime, et al. (2022). "Compute Trends Across Three Eras of Machine Learning." arXiv. https://arxiv.org/abs/2202.05924 Documents the doubling timescales for AI training compute, giving educators data to evaluate the Moore's Law analogy.

Our World in Data. (2024). AI Progress. https://ourworldindata.org/artificial-intelligence Provides free, openly licensed charts showing AI benchmark progress over time, ideal for school board presentations.
Stanford HAI. (2024). AI Index Report 2024. https://aiindex.stanford.edu/report/ The most comprehensive annual report tracking AI capabilities, costs, and adoption across sectors including education.
EleutherAI. (2024). Language Model Evaluation Harness. https://github.com/EleutherAI/lm-evaluation-harness The open-source framework behind many public benchmark comparisons; helps readers understand how benchmark saturation is measured.

Two Minute Papers. (2023). "GPT-4 Is Impressive, But What Comes Next?" YouTube. https://www.youtube.com/c/K%C3%A1rolyZsolnai-Feh%C3%A9r Accessible summaries of recent AI capability jumps that illustrate the pace of change described in this chapter.