I use LLMs every day. I watch them solve problems I’d struggle with, then fail at things a child could do. The same model that writes a nuanced analysis of regulatory risk will confidently tell me 7 × 8 = 54.
So I keep asking: does it understand anything?
The honest answer is I don’t know. And when I try to define what “understanding” even means, I get stuck.
“Understanding means you have a model of how something works.” But LLMs have something — they can generalise, transfer knowledge, reason about novel situations. Not reliably, but not never. Is that a model? A partial one?
“Understanding means you know why, not just what.” But when I ask an LLM why rain causes wet streets, it gives the right causal chain. Sure, it's seen that chain in its training data. But when I explain the same thing, I’m also drawing on things I’ve learned. How different is that, really?
“Understanding means knowing what you don’t know.” This one feels closer. When I don’t understand something, I feel the boundary. LLMs don’t — they confabulate with the same fluency whether they’re right or wrong. That absence of felt uncertainty might be the clearest sign that something is missing.
But here’s what makes it hard: I’ve never had to define understanding before. With other humans, I just assume it — they work like me inside, so if they can explain something clearly, they probably understand it. With LLMs, I can’t make that assumption. The behaviour looks like understanding. The failures look like its absence. And I have no way to peek inside.
Maybe understanding isn’t binary. Maybe LLMs are somewhere on a spectrum we don’t have language for — more than pattern matching, less than whatever we mean by “real” understanding. Something genuinely new.
What I do know: being fluent isn’t the same as understanding. The model that helped me think through this question assembled its arguments from training data. I arrived at the question from ten years of building AI systems and watching them surprise and disappoint me in equal measure. The outputs might look similar. The process isn’t.
I don’t know if that difference matters. But I think it does.