ForecastBench is a dynamic benchmark that measures the forecasting accuracy of AI systems on a continuously updated set of questions. We evaluate AI systems by regularly asking them to make forecasts about future events. Because the outcomes are not yet known when the forecasts are made, the answers cannot appear in any system's training data, so the benchmark remains contamination-free over time. By surveying both the general public and superforecasters, the benchmark also provides measures of human accuracy against which the performance of AI systems can be assessed.
The ForecastBench paper was published at ICLR 2025, and in October 2025 we began accepting external submissions from independent researchers, forecasting teams, and major AI labs. Since then, we have received submissions from xAI and Google DeepMind, several startups, and additional anonymous participants, and external participation has continued to grow. New forecasting rounds take place every two weeks, and we welcome further participation.
The benchmark is actively maintained and developed by a core team of FRI researchers, and we are working on an updated, more challenging version that we aim to release by the end of 2026. We believe that forecasting serves as a valuable proxy for general AI capabilities and that ForecastBench offers a rigorous, evolving testbed for tracking how those capabilities develop, both at frontier AI companies and at startups focused on forecasting as a product.
For more about ForecastBench, and to view the baseline and tournament leaderboards:




