ANALYSIS

A 34-Year-Old Used Claude and NotebookLM to Catch His Mom’s Cancer Misdiagnoses — Her Doctors Missed All 3

Elena Volkov · Apr 15, 2026 · 7 min read
Engine Score 9/10 — Critical

This story rates critical because it shows consumer AI tools directly catching medical errors with life-or-death stakes, offers a concrete, replicable blueprint for patient empowerment, and signals a significant shift in who performs diagnostic oversight.

Pratik Desai, a 34-year-old technologist, documented in early 2026 how he built an AI-assisted oversight system for his mother’s Stage 4 duodenal adenocarcinoma — and in doing so, caught three CT-scan misdiagnoses her oncology team had not flagged. The system combined daily exports from Epic’s patient portal with Google’s NotebookLM and Anthropic’s Claude, creating a parallel diagnostic layer that cost nothing beyond existing subscriptions. It is the most detailed public account yet of consumer AI tools functioning as clinical oversight by a non-physician caregiver.

The Workflow: Epic, NotebookLM, and Claude as a Diagnostic Stack

Desai exported his mother’s complete medical records daily from Epic’s MyChart portal — lab results, imaging reports, pathology notes, treatment summaries — and uploaded them into NotebookLM as a persistent, queryable knowledge base. He then used Claude to interrogate that knowledge base, flag inconsistencies across reports, and surface specific concerns to bring to her oncology team.

The technical lift was minimal. Epic’s MyChart supports PDF and structured data exports. NotebookLM ingests PDFs natively. Claude handled the synthesis. The entire stack cost nothing beyond a Claude Pro subscription (~$20/month) and Google’s NotebookLM, which remains free. The architecture requires no programming, no API integration, and no medical training to operate.

What made it effective was not the AI’s diagnostic capability in isolation — it was the combination of longitudinal context (months of records in one queryable knowledge base) with a model capable of spotting inconsistencies across dozens of documents. Radiologists reviewing individual scans don’t routinely cross-reference a patient’s six-month imaging history line by line. Desai’s system did, automatically, every day.
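The record-keeping half of this stack can be scripted even though the AI querying stays manual. Below is a minimal sketch, assuming hypothetical folder names and filenames (not Desai's actual setup): it sweeps each day's MyChart PDF exports into a dated archive folder and writes a manifest of what was added to the NotebookLM knowledge base.

```python
# Sketch of the daily record-keeping step: move exported PDFs into a
# dated folder and record a manifest. Folder layout and filenames are
# illustrative assumptions, not the workflow described in the article.
from datetime import date
from pathlib import Path
import shutil

def archive_exports(download_dir: str, archive_root: str) -> Path:
    """Move today's exported PDFs into archive_root/YYYY-MM-DD/ and
    return the path to a manifest.txt listing what was archived."""
    day_dir = Path(archive_root) / date.today().isoformat()
    day_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for pdf in sorted(Path(download_dir).glob("*.pdf")):
        dest = day_dir / pdf.name
        shutil.move(str(pdf), str(dest))  # relocate export into the archive
        moved.append(dest.name)
    manifest = day_dir / "manifest.txt"
    manifest.write_text("\n".join(moved) + "\n")
    return manifest
```

The manifest doubles as an upload checklist: anything listed for a given date should also appear as a source document in the NotebookLM notebook.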

Three Misdiagnoses, Three Interventions That Changed the Outcome

According to Desai’s documented account, the AI workflow flagged three separate errors in his mother’s radiology reports over the course of her treatment — each of which led to a direct clinical intervention.

The first involved a CT scan that characterized a lesion as “stable” — a finding the AI identified as inconsistent with measurements from two prior scans showing a measurable increase in volume. Desai raised the documented discrepancy with her oncologist, who ordered a re-read. The lesion was reclassified as progressive, prompting a treatment adjustment.

The second catch was a missed pleural effusion — fluid accumulation around the lung — that appeared in the imaging data but was absent from the radiologist’s written report. The documentation gap had delayed a drainage intervention by several days. Once flagged and escalated by Desai, the clinical team acted.

The third involved a medication dosing inconsistency between two concurrent treatment protocols, where the AI identified a potential interaction that had not been reconciled in her chart. The oncology pharmacist confirmed the interaction and the regimen was adjusted.

In each case, Desai presented specific, documented discrepancies to the clinical team. The AI provided the analysis. The physicians made the decisions. That division — AI as alert mechanism, physician as decision-maker — is central to why the workflow operated within ethical boundaries rather than outside them.

Why Oncology Creates the Structural Conditions for This Kind of Miss

The errors Desai caught are not anomalous. Radiology miss rates for CT findings range from 3% to 5% in controlled studies, according to research published in the American Journal of Roentgenology. In complex oncology cases involving multiple specialists, imaging sessions, and co-existing conditions, the cumulative probability of documentation errors is substantially higher.

The structural cause is fragmented care. An oncologist reviewing a treatment plan may not have the radiology report open. A radiologist reading a current scan may not have the patient’s three-month-old imaging pulled for comparison. Epic stores all this data — but surfacing cross-document inconsistencies requires manual effort that clinical workflows rarely accommodate.

This is not a physician failure. It is a systems architecture failure — and it is precisely the class of problem that AI document reasoning is built to address. Just as AI has reshaped consumer applications by synthesizing data sources that were always available but never cross-referenced in real time, it is beginning to do the same for personal health records.

How to Replicate Desai’s Workflow (Step-by-Step)

Any caregiver with Epic MyChart access and a Claude Pro or Gemini Advanced subscription can replicate this setup in under an hour. The architecture is straightforward:

  1. Export records from Epic MyChart: Navigate to Health → Download My Data. Select all categories: labs, imaging, visit notes, medications. Export as PDF or FHIR bundle.
  2. Create a NotebookLM notebook: Upload all exported PDFs as source documents. Add new records after each appointment or scan, ideally the same day.
  3. Query with structured prompts: Ask specific, document-grounded questions — “Are there any inconsistencies between the imaging reports from January and March?” or “Does the current medication list reflect the treatment plan from the last oncology note?”
  4. Log every flagged concern: Maintain a running document of AI-identified discrepancies with source citations, organized by date, to bring to clinical appointments.
  5. Frame every concern as a question: “I noticed these two reports seem inconsistent on X — can we review that?” gets a physician’s attention. “The AI said this is wrong” does not.

Desai estimated several hours per week to maintain the workflow during active treatment. The bottleneck is not the technology — it is the time investment and willingness to engage directly with a clinical team that may initially resist caregiver-sourced discrepancies.

The Medical Ethics Argument Nobody Is Making Clearly

The standard objection is that AI-assisted review by non-physicians generates more anxiety-inducing false positives than genuine catches. This deserves direct engagement rather than dismissal.

False-positive risk is real. LLMs misinterpret medical terminology, lack clinical context, and occasionally hallucinate. But Desai was not reading imaging. He was querying text-based radiology reports — a task that does not require clinical training and that large language models handle with demonstrably high accuracy on standardized benchmarks. He was performing consistency checks on documentation, not performing diagnostic interpretation. That distinction defines the appropriate scope of this workflow.

The deeper issue is access asymmetry. A technically literate caregiver with disposable time and a $20/month subscription can run a parallel oversight layer. Most caregivers cannot. If AI-assisted oversight demonstrably improves oncology outcomes — and this case provides documented evidence it can — then access to that oversight becomes a healthcare equity issue, not merely a technology story.

The Humans First movement, which centers on concerns about AI displacing human judgment, rarely addresses cases where AI is filling oversight gaps that human systems have structurally left open. Desai’s workflow is the latter. The distinction matters enormously for how policymakers, health systems, and ethicists should respond.

What Hospitals Should Learn From One Man’s Workflow

Patients and caregivers are already building AI oversight layers independently, using tools that health systems do not control and cannot audit. Epic’s MyChart export function was designed for patient access, not for feeding AI knowledge bases. That is exactly how it is now being used.

Health systems that ignore this will find their documentation standards stress-tested from outside their quality-assurance processes — by caregivers with free tools and enough motivation to read every report twice. The more productive response is structured integration: patient-facing AI tools with standardized interfaces to EHR data, designed explicitly for caregiver oversight.

Anthropic’s Claude agent architecture is among the frameworks now under evaluation at academic medical centers for formal clinical integration. But those are institution-controlled deployments. What Desai built was entirely patient-driven — a category that does not yet have a formal name, a regulatory framework, or a standardized design. That gap is itself a policy problem.

The most important thing hospitals can do today costs nothing: create structured channels for caregivers to raise AI-flagged concerns without being dismissed. That requires no technology investment. It requires a policy decision.

What Claude and NotebookLM Actually Got Right

Claude’s contribution in this case was document reasoning, not medical expertise. Claude 3.7 Sonnet, which Desai reported using, scores in the 95th percentile on standardized reading comprehension benchmarks and maintains coherent context across hundreds of pages of dense, cross-referenced text. NotebookLM’s contribution was retrieval — surfacing the specific document segments Claude needed to perform its analysis.

Neither tool had access to the imaging itself. Every catch was a documentation catch, not a diagnostic catch. That boundary defines both the current utility of this workflow and its limits — and it is the most important thing for caregivers considering a similar approach to understand.

MegaOne AI tracks 139+ AI tools across 17 categories. The combination of a retrieval-augmented knowledge base (NotebookLM) with a capable reasoning model (Claude) is emerging as the highest-value architecture for document-intensive personal workflows. Healthcare — with its layered, cross-referenced, chronologically sensitive records — is the most document-intensive domain most people will ever personally encounter.
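The retrieval-plus-reasoning division of labor described above can be illustrated in a few lines. This toy sketch stands in for both halves: a retriever that scores stored document chunks by simple word overlap with the query (where NotebookLM uses real semantic retrieval), and an answer step that assembles the retrieved context (where Claude would actually reason over it).

```python
# Toy illustration of the retrieval-augmented pattern: retrieve the most
# relevant document chunks, then hand only those to the reasoning step.
# Word-overlap scoring is a deliberate simplification of real retrieval.
def retrieve(chunks: list, query: str, k: int = 2) -> list:
    """Return the k chunks sharing the most words with the query."""
    q = set(query.lower().split())
    scored = sorted(chunks,
                    key=lambda c: len(q & set(c.lower().split())),
                    reverse=True)
    return scored[:k]

def answer(chunks: list, query: str) -> str:
    """Stand-in for the reasoning model: bundle query with retrieved context."""
    context = retrieve(chunks, query)
    return f"Query: {query}\nContext: " + " | ".join(context)
```

The point of the pattern is the narrowing: the reasoning model never sees the whole record store, only the handful of chunks the retriever judged relevant, which is what keeps months of records queryable at all.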

Desai’s workflow is not a template health systems can officially endorse, and caregivers should not mistake it for a substitute for clinical judgment. What it is: documented proof that the gap between what EHR systems contain and what clinical workflows surface is wide enough for a technically capable non-physician to find meaningful, life-extending signal. Health systems have a choice — design tools that close that gap themselves, or continue watching caregivers close it with whatever subscriptions they can afford.
