Cognitive offloading is real, and it works in both directions.
When we offload a cognitive task to a tool — navigation to GPS, arithmetic to a calculator, recall to a search engine — we free up mental resources that can be redirected to other things. This is the beneficial version: the GPS frees attention that would have gone to wayfinding, which can go to conversation or observation. The calculator frees working memory that would have tracked intermediate steps, which can go to understanding what the calculation is trying to measure.
The problematic version happens when the offloaded skill atrophies in a way that reduces performance on tasks that depend on it. Drivers who stop navigating by memory don’t just get worse at finding routes; habitual GPS users show changes in spatial cognition that affect how they process spatial information generally. The skill that was offloaded had connections to other skills that weren’t being offloaded, and those connections weakened.
AI assistants are the most powerful cognitive offloading tool in history, and the atrophy question is real. Not the amateur version — “people will stop knowing things because AI tells them things” — but the more specific version: which cognitive skills are being offloaded, what other skills do they connect to, and are those connections weakening in ways that matter?
The answer depends heavily on what AI is being used for and how. Offloading first-draft generation to AI while keeping the evaluative judgment that assesses the draft is probably fine. The generative skill may weaken; the evaluative skill is being exercised more intensively than before. But offloading the evaluation too — accepting AI outputs without substantive scrutiny because the scrutiny feels unnecessary — is where the atrophy risk becomes serious. The skill being offloaded is judgment, and judgment doesn’t have a clean boundary with other skills.
In financial services, this manifests specifically in the review and analysis functions where AI deployment is most tempting. A credit analyst using AI to generate first-pass risk assessments, then scrutinising the assessment carefully and updating based on information the AI missed, is probably exercising their analytical judgment more efficiently than before. The same analyst accepting the AI assessment with a light review because the AI is usually right is potentially degrading the judgment that makes their review valuable in the first place.
The governance response to this tends to be “keep humans in the loop,” which is true but insufficient. A human in the loop who has learned to treat AI outputs as ground truth pending contradiction isn’t exercising meaningful oversight. The formal structure of human review is present; the cognitive engagement that makes review valuable is not.
The more specific governance response is to design for deliberate cognitive exercise rather than passive sign-off. This means rotating high-judgment tasks between AI-assisted and unassisted completion, so the skill is exercised regularly in conditions that require genuine engagement. It means building review processes that require the reviewer to generate their own assessment before seeing the AI output, so that independent judgment is exercised before the AI’s answer can anchor it. And it means tracking not just whether humans are reviewing AI outputs, but whether the reviews are catching genuine errors: a review process that never finds errors is more likely waving them through than reviewing outputs that contain none.
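To make the second mechanism concrete, here is a minimal sketch of what enforcing assess-before-reveal could look like in software. Everything in it is hypothetical: the `BlindFirstReview` class, the stage names, and the `revision_rate` metric are illustrative, not a real system. The point is the guard in `reveal_ai_output`, which refuses to show the AI’s output until the reviewer’s independent assessment is on record, and the revision-rate figure, which operationalises the “reviews that never find errors” warning sign.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum, auto


class Stage(Enum):
    OPENED = auto()     # case created; AI output still hidden
    ASSESSED = auto()   # reviewer's independent assessment on record
    REVEALED = auto()   # AI output shown to the reviewer
    CLOSED = auto()     # final decision recorded


@dataclass
class ReviewCase:
    case_id: str
    ai_output: str                           # held back until the reviewer commits
    stage: Stage = Stage.OPENED
    independent_assessment: str | None = None
    final_decision: str | None = None
    revised_ai_output: bool = False          # did review actually change anything?
    closed_on: date | None = None


class BlindFirstReview:
    """Enforce assess-before-reveal ordering and track revision rates."""

    def __init__(self) -> None:
        self.cases: dict[str, ReviewCase] = {}

    def open_case(self, case_id: str, ai_output: str) -> None:
        self.cases[case_id] = ReviewCase(case_id, ai_output)

    def record_assessment(self, case_id: str, assessment: str) -> None:
        case = self.cases[case_id]
        if case.stage is not Stage.OPENED:
            raise RuntimeError("independent assessment must come first, exactly once")
        case.independent_assessment = assessment
        case.stage = Stage.ASSESSED

    def reveal_ai_output(self, case_id: str) -> str:
        case = self.cases[case_id]
        if case.stage is not Stage.ASSESSED:
            # The anchoring guard: no AI output until independent judgment is on record.
            raise RuntimeError("record your own assessment before viewing the AI output")
        case.stage = Stage.REVEALED
        return case.ai_output

    def close_case(self, case_id: str, final_decision: str, revised: bool) -> None:
        case = self.cases[case_id]
        if case.stage is not Stage.REVEALED:
            raise RuntimeError("close only after assess-then-reveal")
        case.final_decision = final_decision
        case.revised_ai_output = revised
        case.closed_on = date.today()
        case.stage = Stage.CLOSED

    def revision_rate(self) -> float:
        """Share of closed cases where review changed the AI output.

        A rate pinned near zero is the warning sign above: the review
        step may be ceremony rather than oversight.
        """
        closed = [c for c in self.cases.values() if c.stage is Stage.CLOSED]
        if not closed:
            return 0.0
        return sum(c.revised_ai_output for c in closed) / len(closed)
```

The design choice worth noting is that the guard lives in the workflow, not in policy: a reviewer cannot see the AI output early even if they want to, which is what makes the recorded assessments genuinely independent.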
None of this is anti-AI. The goal is to capture the efficiency benefits of AI assistance while preserving the judgment capabilities that make the human oversight meaningful. The two aren’t in tension — but they require deliberate design, not just the assumption that keeping humans in the loop is sufficient.
The cognitive offloading literature suggests that the skills most at risk are the ones most seamlessly replaced. Arithmetic is the cautionary case: the calculator’s substitution was so complete that many people stopped exercising arithmetic at all, and the connection between arithmetic practice and mathematical intuition weakened with it. The skills that survived were the ones still being exercised in contexts where the tool didn’t fully substitute.
AI won’t fully substitute for most high-judgment professional tasks anytime soon. But it will seamlessly handle enough of the component skills that the connections between those components and the higher-order judgment could weaken if the governance design doesn’t specifically preserve them.
That’s the specific problem. It has a specific governance response. Most current frameworks haven’t gotten there yet.
P.S. The most useful proxy for whether a review process is preserving genuine judgment: ask reviewers what the last AI output was that they substantially revised, and why. If no one can answer this, the review process is probably ceremony rather than oversight.
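And if reviews are logged along the lines of the earlier sketch, the same question can be checked mechanically rather than anecdotally. A small, equally hypothetical helper:

```python
from datetime import date


def last_substantive_revision(closed_cases: list[tuple[date, bool]]) -> date | None:
    """Given (closed_on, was_revised) pairs from a review log, return the
    most recent date a reviewer actually changed an AI output.

    None, or a date months in the past, across a whole team is the P.S.'s
    diagnosis in data form: ceremony rather than oversight.
    """
    dates = [closed_on for closed_on, was_revised in closed_cases if was_revised]
    return max(dates, default=None)
```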