The Edge Factory

"One must imagine Sisyphus happy." — Albert Camus.
The edge always rolls back. The work is to keep pushing... and to measure each climb.

Note: This piece is also shared on LinkedIn.

In corporate environments, there is often a quiet pressure to prioritize intellectual comfort over hard truths. But if the goal is to allocate capital effectively and capture real value, we have to confront reality early enough to act.

The harder problem is not reflexive negativity. It is the opposite: the posture most reliably rewarded in large organizations is enthusiasm: agreeing, amplifying, adding momentum, not puncturing. Optimism is pleasant to be around and easy to promote. But uncritical enthusiasm is as corrosive to good stewardship as the destructive cynicism it looks nothing like. Both stop us from testing whether what we say is actually true.

This is where I find Dave Snowden’s defense of cynicism useful. In his essay Cynicism, curiosity and context, Snowden makes a point that is rarely said openly in corporate environments: “it is the cynics who care in an organisation because they are still prepared to criticise even though it may not be in their self interest to do so.”

That is not destructive cynicism. Snowden is clear that rejecting everything by default is as bad as sycophantic enthusiasm. The useful position sits in the middle: caring enough to question, challenge, and test whether what we say is true.

This, to me, is part of the cultural upgrade required for a serious data strategy.

In complex environments, cause and effect are not always stable, visible, or fully knowable in advance. The appropriate response is not to impose a universal methodology, but to probe reality through coherent, safe-to-fail experiments and learn from what actually happens.

The consequence is uncomfortable but important: many models may look coherent on historical data and still fail when exposed to the next regime of reality. A train/test split checks coherence with the past. It does not prove future economic value.

A trained model is not a fact extracted from data. It is a hypothesis. In Popperian terms, hypotheses are not meant to be admired for their coherence. They are meant to be exposed to the possibility of being wrong.

Yet most of the time, no real experimentation is carried out. Offline tests are wrongly accepted as sufficient. Causal inference is used too rarely, and when it is used, it is often poorly executed, making the conclusions hard to trust.

Consider an industry I know well: airline data is often ill-behaved. Take fares. The average ticket price an airline reports is a modest number, a few hundred euros. Yet that average sits on top of a wildly dispersed distribution: most of the cabin has paid economy fares, while a handful of premium long-haul itineraries can be worth tens of thousands. Add ancillary revenue, which most customers never buy at all, and you get data that is zero-inflated, highly skewed, long-tailed, possibly fat-tailed, and non-stationary. This is not the clean Gaussian world of textbook examples.

In such settings, standard comparisons of means can become fragile. The Central Limit Theorem is asymptotic, and convergence can be painfully slow. Achieving adequate statistical power with a classical t-test may require group sizes so large that the test becomes operationally unrealistic. The general-purpose experimentation tools that work well for standard digital metrics, especially proportion-like outcomes, can become inadequate, or even misleading, for revenue-based metrics of this shape.

For value, we need causal measurement. But causal measurement is not a one-off stamp of approval. It matters because edges decay.

A model, signal, or treatment can create value only under certain conditions. Those conditions move: users adapt, competitors copy, regulations change, interfaces evolve, and sometimes the act of optimizing against a signal weakens the signal itself.

This is why the factory metaphor matters. As Marcos López de Prado put it: “The money is not in making a car, it is in making a car factory.”

This piece continues an idea from my deck Having an Edge: the durable advantage is not one model, one insight, or one successful experiment. It is the factory that repeatedly discovers, measures, captures, and renews small edges before they fade.

Experimentation is one of the core machines in that factory. It prevents us from mistaking historical coherence, internal conviction, or retrospective attribution for real economic value.

Being data-driven is not feeding data into a model and declaring value. It is building a mechanism that turns hypotheses into evidence, evidence into decisions, and decisions into measured economic value.

None of this happens because people set out to deceive. The drift is structural. Jason Zweig once relayed a three-part rule from his father on the ways to get paid; the first was simply that telling people what they want to hear is the most rewarding of the three. Organizations do not intend to reward that. But incentives, applause, and promotion cycles often do… quietly, and without anyone deciding to lie.

To resist this drift and actually build that factory, teams need a cultural shift:

Accepting that most tested ideas will fail. Among firms with world-class experimentation capabilities, failure rates of 80% or more are normal. It would be quite pretentious for any team to assume it is naturally better than Google, Amazon, Booking.com, Microsoft, or Airbnb.
Treating failed tests as learning, not embarrassment. Not as a virtue-signaling statement in a slide deck, but as something observed and felt in daily operations: in decisions, incentives, governance, and resource allocation.
Preferring measured economic value over claimed value. Questionable revenue attribution schemes are not evidence. A spectacular claim, such as a reported return far above what the underlying activity could plausibly produce, should trigger Twyman’s Law before it triggers applause: if a number looks unusually good, the first hypothesis should be that something may be wrong with the measurement.
Reducing bureaucratic theater when it slows learning. If processes, however well-intentioned, increase coordination cost, slow down testing, and protect plans from reality, then it is not agility. It is ceremony, not progress.
Designing for non-stationarity. User behavior, markets, regulation, competitors, interfaces, and technology keep moving. A signal that creates value today may decay tomorrow. The wheel has to keep turning.

In that sense, the challenge is not only technical. It is organizational. We do not need more rituals that certify comfort. We need a system that helps us discover small edges, capture them while they exist, and keep the wheel turning when they decay.