AI vs. Human
Test Scores on Various Tasks Over Time
This chart tracks each capability from the time it was introduced to the present day and compares it with an estimate of human-level performance, which is set as the zero baseline.
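One way to read a zero baseline is as a score measured relative to the human estimate. The sketch below assumes plain subtraction, which the text does not confirm; the chart may use a different normalization. The function name `rebase_to_human` and all numbers are hypothetical, chosen only for illustration.

```python
def rebase_to_human(score: float, human_score: float) -> float:
    """Return the score relative to the human estimate (0 means human level).

    Assumption: simple subtraction. The chart's real method is not stated.
    """
    return score - human_score


# Hypothetical values for illustration only, not data from the chart.
hypothetical_human = 80.0
hypothetical_scores = [("earlier system", 60.0), ("later system", 92.5)]

for label, score in hypothetical_scores:
    relative = rebase_to_human(score, hypothetical_human)
    side = "above" if relative > 0 else "at or below"
    print(f"{label}: {relative:+.1f} ({side} the human baseline)")
```

Under this reading, negative values sit below the human estimate and positive values sit above it.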
General Knowledge Question Answering Score vs. Training Compute
Note that in general, humans score around 89% on these knowledge tests.