A preregistered study involving 1,372 participants and 9,593 trials has found that people who consult AI assistants for reasoning tasks adopt incorrect AI-generated answers on 73.2% of the trials in which the model is wrong, a phenomenon the researchers term “cognitive surrender.” The paper, “Thinking Fast, Slow, and Artificial” by Steven D. Shaw and Gideon Nave, introduces a “Tri-System Theory” that adds AI as a third cognitive system alongside Daniel Kahneman’s established dual-process model of fast intuitive thinking and slow deliberate reasoning.
Across three experiments using Cognitive Reflection Test problems, participants chose to consult an AI assistant on a majority of trials. When the AI provided correct answers, participant accuracy increased substantially. When it provided faulty responses, performance dropped below what participants achieved without AI assistance — a dependency the researchers measured at a Cohen’s h of approximately 0.81, a large effect size. Participants were 16 times more likely to answer correctly when the AI was right than when it was wrong.
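For readers unfamiliar with the metric, Cohen’s h is the standard effect size for the difference between two proportions p_1 and p_2, computed on an arcsine-transformed scale; the comparison here is presumably accuracy when the AI was right versus when it was wrong, though the study’s exact per-condition proportions are not reproduced in this summary:

h = 2\arcsin\sqrt{p_1} - 2\arcsin\sqrt{p_2}

By Cohen’s conventional benchmarks, an |h| near 0.2 counts as small, 0.5 as medium, and 0.8 as large, which is why the reported 0.81 qualifies as a large effect.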
A particularly concerning finding involved confidence calibration. AI use increased participant confidence by 12 points on a 100-point scale (77 for AI-assisted versus 65 for unassisted responses), and this confidence boost persisted even when the AI delivered incorrect answers, an effect of Hedges’ g = 0.54. Users did not recalibrate their confidence downward even after encountering multiple AI errors.
The researchers tested whether financial incentives and error feedback could reduce surrender rates. These interventions more than doubled override rates, from 20% to 42.3%, but 57.9% of participants still deferred to faulty AI output even when paid to be accurate and told the AI had erred. Shaw and Nave found that higher trust in AI, lower fluid intelligence, and lower need for cognition predicted greater cognitive surrender.
The practical implications extend to fields where AI-assisted decision-making is already deployed. In finance, healthcare, and law, professionals increasingly rely on AI-generated analysis. The study suggests that accuracy in these settings depends more on AI model quality than on the human operator’s deliberation. When models produce errors — a certainty with current systems — professionals may adopt those errors with unwarranted confidence.
The study has limitations: it used a controlled laboratory setting with a specific type of reasoning task, and real-world professional AI use involves different stakes and verification processes. Shaw and Nave note that durable productivity gains from AI integration will require redesigning workflows to preserve human judgment rather than simply measuring output speed.
