The Kutta Condition of AI: Engineering Ships Before Theory Catches Up

Aeroplanes worked before anyone understood why wings produced lift.

The Wright brothers flew in 1903. The Kutta-Joukowski theorem — the mathematical framework that correctly explains the lift generated by airfoils — was formalised around the same time, but the complete theoretical understanding of boundary layer behaviour and circulation theory wasn’t settled until decades later. Engineers built aircraft that reliably flew using heuristics, empirical testing, and iterative refinement. The theory caught up to the practice long after the practice was already changing the world.

AI is in the same position, and we haven’t fully accepted what that means.

Large language models work. They work reliably enough to be deployed in production systems that millions of people depend on. They produce outputs that are useful, accurate, and sometimes remarkable. What we don’t have is a theoretical account of why they work — a satisfying explanation of what is happening inside the model that produces these outputs, what the representations mean, how the capabilities emerge from the architecture and training process. The interpretability research is serious and is making real progress, but we are not close to a complete account.

The engineering has lapped the theory, exactly as it did in aeronautics.

This creates a specific kind of discomfort for people who are accustomed to deploying systems they understand. A database query planner has a theory. A sorting algorithm has a proof. A financial model has explicit assumptions that can be examined and challenged. When something goes wrong, the theory tells you where to look. With current AI systems, the equivalent of “the wing doesn’t produce lift anymore” can happen without a clear theoretical account of what changed or why.

The response to this discomfort tends toward one of two failure modes. The first is to wait for the theory before deploying — to treat the interpretability gap as sufficient reason to slow deployment significantly. This position is coherent but increasingly untenable as the competitive and operational pressure to deploy grows. It also misapplies the lesson from aeronautics: waiting for complete theory before deploying would have meant no aviation until well into the twentieth century.

The second is to pretend the theory exists — to treat the engineering confidence we have in specific use cases as theoretical understanding we don’t actually have. This is the failure mode that produces AI governance frameworks that look rigorous but don’t account for the possibility of failures that theory can’t anticipate, and model risk processes that treat “passes validation” as equivalent to “we understand why it works.”

The more honest position is the one aeronautical engineers occupied for several decades: we have reliable empirical methods, we’ve accumulated enough operational experience to know what the failure modes look like in practice, and we’re building governance and oversight structures calibrated to that uncertainty rather than to a theoretical confidence we don’t have.

What this means practically is building more robustly around the theory gap than you would around a gap in a system you understood. Extensive operational monitoring to catch empirical failures before they cascade. Conservative deployment scopes that limit the blast radius when edge cases emerge. Human oversight concentrated at the points where failure is most consequential. Testing regimes that explore the distribution of inputs the system will actually encounter, not just the distribution it was built for.
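The monitoring piece of that list can be made concrete. A common lightweight technique is to compare the distribution of live inputs against the distribution the system was validated on, using something like the Population Stability Index. The sketch below is illustrative only — the function, thresholds, and data are hypothetical, not taken from any particular monitoring stack — and assumes inputs have been reduced to scores in [0, 1]:

```python
import math
from collections import Counter

def psi(expected, observed, bins=10, lo=0.0, hi=1.0):
    """Population Stability Index between two samples of scores in [lo, hi].

    Higher values mean larger distribution shift; a common rule of thumb
    treats PSI above roughly 0.2 as drift worth investigating.
    """
    def bucket(xs):
        counts = Counter(
            min(int((x - lo) / (hi - lo) * bins), bins - 1) for x in xs
        )
        total = len(xs)
        # Smooth empty buckets so the log ratio stays finite.
        return [(counts.get(i, 0) + 0.5) / (total + 0.5 * bins)
                for i in range(bins)]

    e, o = bucket(expected), bucket(observed)
    return sum((oi - ei) * math.log(oi / ei) for ei, oi in zip(e, o))

# Hypothetical data: validation-time scores vs. what production now sees.
baseline = [0.1, 0.2, 0.2, 0.3, 0.4, 0.5, 0.5, 0.6, 0.7, 0.8]
live     = [0.6, 0.7, 0.7, 0.8, 0.8, 0.9, 0.9, 0.9, 1.0, 1.0]

drift = psi(baseline, live)
if drift > 0.2:
    print(f"input drift detected (PSI={drift:.2f}) — widen human review")
```

The point is not this particular statistic: it is that in the absence of theory, an empirical tripwire on the inputs — checked continuously, with a human escalation path behind it — is what stands in for the theoretical account of when the system leaves the regime where it is known to work.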

None of this is exotic. It’s the standard engineering response to operating in a regime of incomplete theory: build conservatively, monitor intensively, learn from operation, update accordingly. Aeronautics did this for decades and built a safety record that eventually became one of the best in any transportation mode.

The theory will catch up. Interpretability research is genuine and progress is real. But the appropriate question now isn’t “do we understand this well enough to deploy it?” The question is “do we have the operational discipline to deploy it responsibly before the theory arrives?” That’s a governance question, not a capabilities question. And it has a known answer from the history of engineering.


P.S. The most interesting implication of the Kutta condition analogy: the theory that eventually arrives may be surprising. The popular explanation of wing lift — that air must travel faster over the curved top surface because it has farther to go, the "equal transit time" argument — turned out to be wrong, and the circulation-based account that correctly explains lift is far less intuitive. There's no particular reason to believe that the theory that eventually explains LLM behaviour will match the intuitions we've built about it during the empirical period.