{"id":605,"date":"2024-11-05T12:00:42","date_gmt":"2024-11-05T12:00:42","guid":{"rendered":"https:\/\/forecastingresearch.org\/?post_type=research&#038;p=605"},"modified":"2026-04-17T17:33:13","modified_gmt":"2026-04-17T17:33:13","slug":"forecastbench","status":"publish","type":"research","link":"https:\/\/forecastingresearch.org\/research\/forecastbench","title":{"rendered":"ForecastBench"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">ForecastBench is a dynamic, continuously updated benchmark that measures the forecasting accuracy of AI systems on a constantly changing set of questions. We evaluate AI systems by regularly asking them to make forecasts about future events, thereby creating a benchmark that remains contamination-free over time. By surveying both the general public and superforecasters, the benchmark also provides measures of human accuracy against which one can assess the performance of AI systems.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">After publication of the ForecastBench <a href=\"https:\/\/arxiv.org\/abs\/2409.19839\">paper<\/a> at ICLR 2025, we began accepting external submissions from independent researchers, forecasting teams, and major AI labs in October 2025. Since then, we have received submissions from xAI and Google DeepMind, several startups, and additional anonymous participants. External participation has continued to grow. New forecasting rounds take place every two weeks, and we welcome further participation.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The benchmark is actively maintained and developed by a core team of FRI researchers, and we are moving toward an updated, more challenging version that we aim to release by the end of 2026. We believe that forecasting serves as a valuable proxy for general AI capabilities and that ForecastBench offers a rigorous, evolving testbed for tracking how those capabilities develop across both frontier AI companies and startups focused on forecasting as a product.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">For more about ForecastBench, and to view the baseline and tournament leaderboards:<\/p>\n\n\n\n<div class=\"wp-block-buttons is-layout-flex wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button\"><a class=\"btn orange\" href=\"https:\/\/www.forecastbench.org\/\">Visit the ForecastBench website <svg width=\"7\" height=\"9\" viewBox=\"0 0 7 9\" fill=\"none\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\">\n  <path d=\"M0.000156283 8.60806L4.22416 4.33606V4.24006L0.000156283 6.10352e-05H1.80816L6.06416 4.28806L1.80816 8.60806H0.000156283Z\" fill=\"#102B23\"\/>\n<\/svg>\n<svg width=\"8\" height=\"10\" viewBox=\"0 0 8 10\" fill=\"none\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\">\n  <path d=\"M0.601719 8.85794L4.82572 4.58594V4.48994L0.601719 0.249939H2.40972L6.66572 4.53794L2.40972 8.85794H0.601719Z\" fill=\"#102B23\"\/>\n<\/svg><\/a><\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"A dynamic, continuously updated benchmark that measures the forecasting accuracy of AI systems on a constantly changing set of questions.","protected":false},"featured_media":1617,"template":"","meta":{"footnotes":"[]"},"research_type":[3],"class_list":["post-605","research","type-research","status-publish","has-post-thumbnail","hentry","research_type-project"],"acf":[],"yoast_head":"<title>ForecastBench &#8211; Forecasting Research Institute<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/forecastingresearch.org\/research\/forecastbench\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"ForecastBench &#8211; Forecasting Research Institute\" \/>\n<meta property=\"og:description\" content=\"A dynamic, continuously updated benchmark that measures the forecasting accuracy of AI systems on a constantly changing set of questions.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/forecastingresearch.org\/research\/forecastbench\" \/>\n<meta property=\"og:site_name\" content=\"Forecasting Research Institute\" \/>\n<meta property=\"article:modified_time\" content=\"2026-04-17T17:33:13+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/forecastingresearch.org\/wp-content\/uploads\/2024\/11\/FRI-illustration-library-8-1.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"2752\" \/>\n\t<meta property=\"og:image:height\" content=\"1728\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/forecastingresearch.org\\\/research\\\/forecastbench\",\"url\":\"https:\\\/\\\/forecastingresearch.org\\\/research\\\/forecastbench\",\"name\":\"ForecastBench &#8211; Forecasting Research Institute\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/forecastingresearch.org\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/forecastingresearch.org\\\/research\\\/forecastbench#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/forecastingresearch.org\\\/research\\\/forecastbench#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/forecastingresearch.org\\\/wp-content\\\/uploads\\\/2024\\\/11\\\/FRI-illustration-library-8-1.jpg\",\"datePublished\":\"2024-11-05T12:00:42+00:00\",\"dateModified\":\"2026-04-17T17:33:13+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/forecastingresearch.org\\\/research\\\/forecastbench#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/forecastingresearch.org\\\/research\\\/forecastbench\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/forecastingresearch.org\\\/research\\\/forecastbench#primaryimage\",\"url\":\"https:\\\/\\\/forecastingresearch.org\\\/wp-content\\\/uploads\\\/2024\\\/11\\\/FRI-illustration-library-8-1.jpg\",\"contentUrl\":\"https:\\\/\\\/forecastingresearch.org\\\/wp-content\\\/uploads\\\/2024\\\/11\\\/FRI-illustration-library-8-1.jpg\",\"width\":2752,\"height\":1728},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/forecastingresearch.org\\\/research\\\/forecastbench#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/forecastingresearch.org\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"ForecastBench\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/forecastingresearch.org\\\/#website\",\"url\":\"https:\\\/\\\/forecastingresearch.org\\\/\",\"name\":\"Forecasting Research Institute\",\"description\":\"\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/forecastingresearch.org\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"}]}<\/script>","yoast_head_json":{"title":"ForecastBench &#8211; Forecasting Research Institute","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/forecastingresearch.org\/research\/forecastbench","og_locale":"en_US","og_type":"article","og_title":"ForecastBench &#8211; Forecasting Research Institute","og_description":"A dynamic, continuously updated benchmark that measures the forecasting accuracy of AI systems on a constantly changing set of questions.","og_url":"https:\/\/forecastingresearch.org\/research\/forecastbench","og_site_name":"Forecasting Research Institute","article_modified_time":"2026-04-17T17:33:13+00:00","og_image":[{"width":2752,"height":1728,"url":"https:\/\/forecastingresearch.org\/wp-content\/uploads\/2024\/11\/FRI-illustration-library-8-1.jpg","type":"image\/jpeg"}],"twitter_card":"summary_large_image","schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/forecastingresearch.org\/research\/forecastbench","url":"https:\/\/forecastingresearch.org\/research\/forecastbench","name":"ForecastBench &#8211; Forecasting Research Institute","isPartOf":{"@id":"https:\/\/forecastingresearch.org\/#website"},"primaryImageOfPage":{"@id":"https:\/\/forecastingresearch.org\/research\/forecastbench#primaryimage"},"image":{"@id":"https:\/\/forecastingresearch.org\/research\/forecastbench#primaryimage"},"thumbnailUrl":"https:\/\/forecastingresearch.org\/wp-content\/uploads\/2024\/11\/FRI-illustration-library-8-1.jpg","datePublished":"2024-11-05T12:00:42+00:00","dateModified":"2026-04-17T17:33:13+00:00","breadcrumb":{"@id":"https:\/\/forecastingresearch.org\/research\/forecastbench#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/forecastingresearch.org\/research\/forecastbench"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/forecastingresearch.org\/research\/forecastbench#primaryimage","url":"https:\/\/forecastingresearch.org\/wp-content\/uploads\/2024\/11\/FRI-illustration-library-8-1.jpg","contentUrl":"https:\/\/forecastingresearch.org\/wp-content\/uploads\/2024\/11\/FRI-illustration-library-8-1.jpg","width":2752,"height":1728},{"@type":"BreadcrumbList","@id":"https:\/\/forecastingresearch.org\/research\/forecastbench#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/forecastingresearch.org\/"},{"@type":"ListItem","position":2,"name":"ForecastBench"}]},{"@type":"WebSite","@id":"https:\/\/forecastingresearch.org\/#website","url":"https:\/\/forecastingresearch.org\/","name":"Forecasting Research Institute","description":"","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/forecastingresearch.org\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"}]}},"_links":{"self":[{"href":"https:\/\/forecastingresearch.org\/api\/wp\/v2\/research\/605","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/forecastingresearch.org\/api\/wp\/v2\/research"}],"about":[{"href":"https:\/\/forecastingresearch.org\/api\/wp\/v2\/types\/research"}],"version-history":[{"count":10,"href":"https:\/\/forecastingresearch.org\/api\/wp\/v2\/research\/605\/revisions"}],"predecessor-version":[{"id":1791,"href":"https:\/\/forecastingresearch.org\/api\/wp\/v2\/research\/605\/revisions\/1791"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/forecastingresearch.org\/api\/wp\/v2\/media\/1617"}],"wp:attachment":[{"href":"https:\/\/forecastingresearch.org\/api\/wp\/v2\/media?parent=605"}],"wp:term":[{"taxonomy":"research_type","embeddable":true,"href":"https:\/\/forecastingresearch.org\/api\/wp\/v2\/research_type?post=605"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}