Abstract
Forecasting tournaments are misaligned with the goal of producing actionable forecasts of existential risk, an extreme-stakes domain with slow accuracy feedback and elusive proxies for long-run outcomes. We show how to improve alignment by measuring facets of human judgment that play central roles in policy debates but have long been dismissed as unmeasurable. The key is supplementing traditional objective accuracy metrics with intersubjective metrics that test forecasters’ skill at predicting other forecasters’ judgments on topics that resist objective scoring, such as long-range scenarios, probativeness of questions, insightfulness of explanations, and impactfulness of risk-mitigation options. We focus on the value of Reciprocal Scoring, an intersubjective method grounded in microeconomic research that challenges top forecasters to predict each other’s judgments. Even if cumulative information gains prove modest and are confined to a 1-to-5-year planning horizon, the expected value of lives saved would be massive.
Acknowledgments
Our work on this document was funded by members of Founders Pledge. The authors thank Yunzi (Louise) Lu and Josh Rosenberg for extensive support with this research.
Disclaimer
Any views expressed in this paper do not necessarily reflect those of the Federal Reserve Bank of Chicago or the Federal Reserve System.



