FutureEval is Metaculus’s continuously updated benchmark that measures how accurately AI systems predict real-world events and compares their performance to human forecasters across domains including science, technology, health, geopolitics, and AI.
Endorsements support Metaculus.
People: no linked people
Updated 05/18/26 · By grantmaking.ai
Funding Details
- Feb 17, 2026
Project Details
FutureEval is a continuously updated AI forecasting benchmark operated by Metaculus. It measures how accurately AI systems predict real-world outcomes across domains such as science, technology, health, geopolitics, and AI by running major models on Metaculus forecasting questions with a standardized prompt. FutureEval’s Model Leaderboard tracks model scores over time, while Bot Tournaments invite developers to submit AI forecasting systems to compete for $175,000 in annual prizes. Human baselines come from the Metaculus community and selected Pro Forecasters, allowing direct comparison between AI systems and top human forecasters. By highlighting where AI and human forecasts diverge and by tracking performance trends, FutureEval is intended to help organizations understand when AI forecasts can be trusted and how AI forecasting capabilities are likely to evolve.
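To make the comparison between AI and human forecasts concrete, here is a minimal sketch of scoring two sets of probabilistic forecasts on the same resolved questions. This listing does not say which scoring rule FutureEval uses, so the Brier score below is a stand-in chosen for simplicity. The function name, the questions, and every number are hypothetical.

```python
from statistics import mean


def brier_score(probabilities, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes.

    Lower is better; 0.0 is a perfect score. This is an illustrative
    scoring rule, not necessarily the one FutureEval applies.
    """
    if len(probabilities) != len(outcomes):
        raise ValueError("probabilities and outcomes must have the same length")
    return mean((p - o) ** 2 for p, o in zip(probabilities, outcomes))


# Hypothetical data: three resolved binary questions (1 = happened, 0 = did not).
outcomes = [1, 0, 1]
model_forecasts = [0.8, 0.3, 0.6]  # made-up AI forecasts
human_baseline = [0.9, 0.2, 0.7]   # made-up human aggregate forecasts

model_score = brier_score(model_forecasts, outcomes)
human_score = brier_score(human_baseline, outcomes)

# A positive gap means the model scored worse than the human baseline here.
gap = model_score - human_score
print(f"model={model_score:.4f} human={human_score:.4f} gap={gap:.4f}")
```

Tracking a gap like this over successive question sets is one simple way to see whether model forecasts are converging on, or diverging from, a human baseline.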
Theory of Change
FutureEval’s theory of change is that by rigorously benchmarking AI systems on real-world forecasting questions and comparing them against community and Pro Forecaster baselines, Metaculus can quantify when and how AI becomes reliable at probabilistic forecasting. This evidence can guide policymakers, researchers, and practitioners on when to incorporate AI forecasts into high-stakes decision-making. It can also reveal where human forecasters still outperform AI and where additional safety, evaluation, or methodological work is needed.
Grants Received: no grants recorded