AI Task Completion Time Horizons
How long can AI models work on tasks before failing? (from metr.org)
[Interactive chart. Series: Frontier Models, Non-Frontier Models. Scale options: Linear, Log. Success probability options: 50%, 80%.]