Source: Compiled from multiple public benchmarks and research papers, May 2025.
Note: MMLU (Massive Multitask Language Understanding) measures model accuracy on multiple-choice questions across 57 subjects, ranging from STEM to the humanities.