The Crystal Ball Fallacy: What Perfect Predictive Models Really Mean
The young quant stormed into my office, his face flushed with anger. “Your model is broken!” he exclaimed, slamming a stack of trading statements onto my desk. “We lost everything!”
I looked up, calm despite his outburst. “The model is perfect,” I said evenly. “Exactly as advertised.”
He scoffed. “Perfect? We predicted Google’s stock wouldn’t rise, sold naked calls to collect premium, and got wiped out when it shot up! We lost millions! How can you call that perfect?”
I leaned forward. “What do you think a perfect predictive model is?”
“Isn’t it obvious?” he snapped. “A model that tells you exactly where the stock price will be tomorrow.”
I smiled. “There’s your mistake. What you’re describing isn’t a model, it’s a crystal ball. And those don’t exist.”
Although Hypothetical, a Perfect Model Is Not a Crystal Ball
Imagine your weather app predicts a 20% chance of rain tomorrow. If it rains, people often say, “The forecast was wrong.” But a 20% chance means rain was possible—just not likely.
Consider tomorrow’s weather forecast. A crystal ball would tell you exactly what will happen: “At 2 PM, there will be heavy rain.” A perfect model, if it existed, would instead tell you the true probabilities: “15% chance of sun, 30% chance of clouds, 40% chance of light rain, 15% chance of heavy rain.” Even with these exact probabilities, we still would not know which condition will actually occur—and that is not a flaw; it is the nature of a model, even a perfect one.
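To make this concrete, here is a minimal Python sketch using the illustrative probabilities above (purely hypothetical figures): even when the true distribution is known exactly, any single tomorrow is just one draw from it, and only across many repetitions do observed frequencies approach the stated probabilities.

```python
import random

# True probabilities a hypothetical perfect model would report for tomorrow
# (the illustrative figures from the text, not real forecast data).
true_distribution = {
    "sun": 0.15,
    "clouds": 0.30,
    "light rain": 0.40,
    "heavy rain": 0.15,
}

random.seed(42)
conditions = list(true_distribution)
weights = list(true_distribution.values())

# Even knowing the exact probabilities, tomorrow is a single draw.
tomorrow = random.choices(conditions, weights=weights, k=1)[0]
print(f"What actually happens tomorrow: {tomorrow}")

# Only over many hypothetical tomorrows do frequencies approach the
# probabilities the perfect model reported.
draws = random.choices(conditions, weights=weights, k=100_000)
for condition in conditions:
    print(f"{condition}: {draws.count(condition) / len(draws):.3f}")
```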
Consider the 2016 U.S. Presidential Election. FiveThirtyEight’s final model gave Donald Trump a 28% chance of winning. When Trump won, many declared the model “wrong.” However, as many have explained, this is similar to rolling a die and getting a 1 or 2—a roughly one-in-three chance: not likely, but entirely possible.
Even if we assume the model was perfect and the true probability was 28%, it still would not tell you which outcome would occur. This is where people confuse a perfect model with a crystal ball, expecting deterministic certainty instead of probabilities.
Why the Hypothesis of a Perfect Model Is Useful
When taking on a new “data project,” it is often informative to ask stakeholders how they would use a perfect model. Their answers often reveal hidden assumptions and limitations that even perfection could not solve.
Take the example of a travel destination recommender. For simplicity, let us assume a traveller either loves or hates a destination, ignoring the full range of possible satisfaction levels. A perfect model might tell us there is a 95% probability that a prospective traveller would love experiencing the Republic of Kiribati. But what if the prospective traveller has never heard of this destination? Even with near-certain odds of satisfaction, convincing someone to book a journey to an unknown place remains challenging. And there is still that 5% chance we would guide this prospective traveller toward a costly, awful experience.
This reveals a crucial insight: even perfect models cannot eliminate all business challenges. They can tell us probabilities, but cannot solve issues of reach, trust, communication, or risk tolerance. In fact, even if we had a crystal ball guaranteeing our prospective travellers would love Kiribati, the challenge of reaching and convincing them to trust that recommendation would remain.
The fictional young quant in our opening story fell into this trap, believing a “perfect” model would guarantee profits rather than merely provide true probabilities. In contrast, real-world pioneers in quantitative trading understood the true nature of mathematical models. Ed Thorp—author of “Beat the Dealer” (1966) and independent discoverer of what would later be known as the Black-Scholes model—and Jim Simons—whose Renaissance Technologies purportedly achieved returns of 66% annually before fees from 1988 to 2018 (Zuckerman, 2019, “The Man Who Solved the Market”)—were seasoned mathematicians who knew that their models, far from perfect, provided not certainties, nor approximations of certainty, but statistical edges they could leverage. They understood that even a slight advantage (like a coin biased 50.01% vs 49.99%) could lead to long-term success—provided you can survive long enough to let the law of large numbers work in your favor.
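A small simulation sketch illustrates that last point (the 50.01% edge comes from the analogy above; the bankroll, stake, and number of bets are assumptions chosen for illustration): a flat-stake bettor with a tiny edge has a positive expected profit, yet some bettors go bust before the law of large numbers has a chance to work in their favor.

```python
import random

def simulate_bettor(p_win=0.5001, n_bets=1_000_000, bankroll=1_000.0,
                    stake=1.0, seed=0):
    """Flat-stake, even-money bets on a coin with a tiny edge.

    The 50.01% edge comes from the analogy in the text; the bankroll,
    stake, and number of bets are assumptions made for this sketch.
    Returns the final bankroll, or None if the bettor is ruined first.
    """
    rng = random.Random(seed)
    for _ in range(n_bets):
        bankroll += stake if rng.random() < p_win else -stake
        if bankroll <= 0:
            return None  # ruined before the edge could play out
    return bankroll

# Expected profit per bet is (2 * 0.5001 - 1) * stake = 0.0002 units,
# i.e. about 200 units over a million bets -- but only for those who
# survive the intermediate drawdowns.
results = [simulate_bettor(seed=s) for s in range(20)]
ruined = sum(r is None for r in results)
survivors = [r for r in results if r is not None]
print(f"ruined bettors: {ruined}/20")
if survivors:
    print(f"mean final bankroll of survivors: {sum(survivors) / len(survivors):.1f}")
```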
Many tried, few succeeded, because developing a mathematical model that gives an edge is not sufficient, as our destination recommender example illustrates. Nassim Taleb, in his foreword to Thorp’s “A Man for All Markets,” perhaps exaggeratedly, claims that the model part “is easy, very easy. It is capturing the edge, converting it into dollars in the bank, restaurant meals, interesting cruises, and Christmas gifts to friends and family; that’s the hard part.” Even with a hypothetical perfect model that gives us the exact probabilities of all outcomes, success is not guaranteed.
Before starting to build a model, it is therefore wise to first ensure stakeholders understand what perfect predictions mean—true probabilities of outcomes, not knowledge of what will actually occur. Then, assume perfection as a working hypothesis and ask three key questions: First, how would stakeholders use these perfect probability predictions? Second, what organizational and operational challenges would remain even with true probability knowledge? And third, if we relax the perfection hypothesis, does a realistic path to value creation still exist with imperfect predictions? If the challenges appear insurmountable even under perfect conditions, or if value creation seems unlikely with realistic model performance, it may be better to move on to another data project with more attainable upside.
The Takeaway: A Perfect Predictive Model Gives Uncertainties, Not Certainties
A perfect model, even if it could exist, tells you what could happen and how likely each outcome is—not what will happen. In other words, a perfect model gives you the true distribution of a random variable. Misunderstanding this distinction leads to misplaced certainty and, too often, unnecessary blame when “unlikely” events occur.
In practice, however, most predictive models used in the corporate world only approximate point estimates—that is, predictions of a single summary statistic, typically the mean, rather than full probability distributions. While this may be sufficient in some cases, it often falls short when uncertainty or rare events play a significant role. By focusing only on the mean, these models miss critical information about the shape and tails of the distribution—exactly where the most consequential events often lie. Our quant’s disaster with naked calls illustrates this perfectly: knowing the expected stock price movement (the mean) tells you nothing about the probability of extreme moves that could trigger catastrophic losses.
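A toy simulation makes the point; all numbers are made up and merely caricature a premium-collecting strategy: two P&L streams can share the same mean while one hides rare, catastrophic losses in its tail.

```python
import random
import statistics

random.seed(7)
n_months = 100_000

# Two hypothetical monthly P&L streams with the same expected value of 1.0.
# Strategy A: modest, roughly symmetric gains and losses.
# Strategy B: a cartoon of a premium-collecting (e.g. naked-call) position:
# a small gain in most months, a rare large loss.  All numbers are made up.
strategy_a = [random.gauss(1.0, 5.0) for _ in range(n_months)]
strategy_b = [1.5 if random.random() < 0.99 else -48.5 for _ in range(n_months)]
# E[B] = 0.99 * 1.5 + 0.01 * (-48.5) = 1.0, the same mean as A.

for name, pnl in [("A", strategy_a), ("B", strategy_b)]:
    big_losses = sum(x < -20 for x in pnl) / n_months
    print(f"{name}: mean={statistics.mean(pnl):.2f}  "
          f"worst month={min(pnl):.1f}  P(loss worse than 20)={big_losses:.3f}")
```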
The real challenge is not building a perfect model—it is understanding and communicating what our imperfect models can and cannot tell us. As statistician George Box famously observed, “All models are wrong, but some are useful.” The key is determining how useful they can be within their limitations. This often means wrestling with layers of uncertainty: not just the probabilities our models predict, but the uncertainty in those probabilities themselves. When a weather forecast predicts a 20% chance of rain tomorrow, how confident are we in that 20%? A model’s true value lies not just in its predictions, but in helping us understand these cascading levels of uncertainty.
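As a small illustration of that second layer of uncertainty, consider the following sketch (the day counts and the uniform prior are assumptions chosen for illustration): two forecasters can both report roughly a 20% chance of rain while holding very different amounts of evidence, and hence very different uncertainty about that 20% itself.

```python
import random
import statistics

random.seed(3)

# Two forecasters both report roughly a 20% chance of rain, but one has seen
# 10 comparable past days and the other 1,000.  With a uniform prior, the
# posterior over the true rain probability is Beta(rainy + 1, dry + 1).
# The day counts and the prior are assumptions chosen for illustration.
for n_days, n_rainy in [(10, 2), (1000, 200)]:
    draws = sorted(
        random.betavariate(n_rainy + 1, n_days - n_rainy + 1)
        for _ in range(100_000)
    )
    lo, hi = draws[5_000], draws[95_000]
    print(f"{n_rainy}/{n_days} rainy days -> mean={statistics.mean(draws):.2f}, "
          f"90% credible interval=({lo:.2f}, {hi:.2f})")
```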
The examples discussed here—weather forecasts, election outcomes, and travel recommendations—are, in their simplest form, cases where a single probability can describe the full distribution of possible outcomes. Will it rain or not? Will someone love a destination or not? But many real-world problems involve complex, unknown distributions that cannot be captured by simple formulas. While various techniques—e.g., Bayesian methods, Gaussian processes, and other probabilistic approaches—can help us grapple with these uncertainties, the fundamental challenge remains: understanding what our models can tell us, what they cannot, and how to use that knowledge to make better decisions.
Art and text by Loic Merckel. Licensed under CC BY 4.0. Originally published on 619.io. For discussions or engagement, feel free to refer to the LinkedIn version or Medium version. Otherwise, attribute the original source when sharing or reusing.