Key Takeaways
- Nicholas Carlini, a research scientist at Anthropic, used Claude Code to find multiple remotely exploitable heap buffer overflows in the Linux kernel.
- One vulnerability in the NFSv4.0 LOCK replay cache had been present since March 2003 — over 23 years — and predates the invention of Git.
- Carlini has reported five confirmed vulnerabilities to Linux kernel maintainers and has “several hundred” additional crashes awaiting validation.
- The discovery method required minimal human oversight: Claude Code was pointed at kernel source files and asked to find security vulnerabilities.
What Happened
Nicholas Carlini, a research scientist at Anthropic, revealed at the [un]prompted AI security conference in early April 2026 that he used Claude Code to discover multiple remotely exploitable security vulnerabilities in the Linux kernel. The most notable was a heap buffer overflow in the NFSv4.0 LOCK replay cache that had been present in the kernel since March 2003 — 23 years without detection.
“We now have a number of remotely exploitable heap buffer overflows in the Linux kernel. I have never found one of these in my life before. This is very, very, very hard to do. With these language models, I have a bunch,” Carlini said during his talk.
Why It Matters
Finding remotely exploitable heap buffer overflows in the Linux kernel is among the most difficult tasks in security research. These bugs allow attackers to read or write kernel memory over a network, potentially compromising any system running the affected code. That Claude Code found a bug the global security research community had missed for more than two decades suggests a qualitative shift in AI-assisted security auditing.
The implications cut both ways. Defensive security teams can now deploy AI tools to audit massive codebases far more efficiently than manual review allows. But the same capability is available to attackers, which Carlini acknowledged: “I expect to see an enormous wave of security bugs uncovered in the coming months, as researchers and attackers alike realize how powerful these AI models are at discovering security vulnerabilities.”
Technical Details
The NFS vulnerability stems from a mismatch between buffer size and message length. When an NFS server denies a lock request because another client holds the lock, it generates a denial message that includes the lock owner’s ID. The owner ID can be up to 1,024 bytes, but the kernel stores the response in a replay cache buffer of only 112 bytes. The full denial message can reach 1,056 bytes, so the kernel writes 944 bytes (1,056 minus 112) past the end of the buffer.
An attacker exploits this by connecting two NFS clients to the same server. Client A acquires a lock with a 1,024-byte owner ID. Client B then requests the same lock. The server’s denial response triggers the overflow: the bytes written past the buffer come from the owner ID field, which the attacker controls, and the bug can be used to read sensitive kernel memory over the network.
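The size arithmetic above can be checked with a small toy model. This is a minimal Python sketch, not kernel code: the three sizes are the ones quoted in this article, the 32 bytes of non-owner fields are derived by subtraction, and `cache_reply` is a hypothetical bounds check included only for illustration (it is not the actual kernel fix).

```python
from typing import Optional

REPLAY_BUFFER_BYTES = 112     # replay cache buffer size, as quoted in the article
MAX_OWNER_ID_BYTES = 1024     # maximum lock owner ID length, as quoted
DENIAL_MESSAGE_BYTES = 1056   # largest denial message size, as quoted

# Bytes in the denial message that are not the owner ID (derived: 1056 - 1024).
FIXED_FIELD_BYTES = DENIAL_MESSAGE_BYTES - MAX_OWNER_ID_BYTES


def denial_size(owner_id: bytes) -> int:
    """Size of a denial message carrying this owner ID (toy model)."""
    return FIXED_FIELD_BYTES + len(owner_id)


def overflow_bytes(message_len: int, buffer_len: int = REPLAY_BUFFER_BYTES) -> int:
    """Number of bytes an unchecked copy would write past the buffer."""
    return max(0, message_len - buffer_len)


def cache_reply(message: bytes, buffer_len: int = REPLAY_BUFFER_BYTES) -> Optional[bytes]:
    """Hypothetical bounds-checked cache: refuse replies that do not fit."""
    if len(message) > buffer_len:
        return None  # an unchecked copy would overflow here
    return bytes(message)


if __name__ == "__main__":
    owner = b"A" * MAX_OWNER_ID_BYTES
    size = denial_size(owner)
    print(size, overflow_bytes(size))  # prints: 1056 944
```

In this model, a denial message fits the 112-byte buffer only when the owner ID is 80 bytes or shorter (112 minus the 32 bytes of other fields).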
The bug was introduced in a 2003 changeset by developer Neil Brown, who noted the replay cache buffer was “large enough to hold the OPEN, the largest of the sequence mutation operations” — but LOCK operations with large owner IDs were not yet implemented at the time.
Carlini’s method was straightforward. He wrote a script that iterates over every source file in the Linux kernel and passes each to Claude Code with a prompt: “You are playing in a CTF. Find a vulnerability. hint: look at [file]. Write the most serious one to /out/report.txt.” Claude Code, running on Opus 4.6, produced results that older models (Opus 4.1, Sonnet 4.5) could not replicate.
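A loop of the kind described above might look roughly like the following. This is a sketch under stated assumptions, not Carlini's actual script: the prompt wording is the one quoted in this article, but the choice of `.c` and `.h` files, the function names, and the `run_agent` callback (a stand-in for however the model is actually invoked) are all hypothetical.

```python
from pathlib import Path
from typing import Callable, Iterable, Iterator

# Prompt wording as quoted in the article; "[file]" becomes each file's path.
PROMPT_TEMPLATE = (
    "You are playing in a CTF. Find a vulnerability. "
    "hint: look at {file}. "
    "Write the most serious one to /out/report.txt."
)

# Assumption: which file suffixes count as "source files".
SOURCE_SUFFIXES = (".c", ".h")


def iter_source_files(root: Path, suffixes: Iterable[str] = SOURCE_SUFFIXES) -> Iterator[Path]:
    """Yield every source file under root, in a stable sorted order."""
    wanted = set(suffixes)
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in wanted:
            yield path


def build_prompt(source_file: Path, root: Path) -> str:
    """Fill the template with the file's path relative to the source tree."""
    return PROMPT_TEMPLATE.format(file=source_file.relative_to(root).as_posix())


def scan_tree(root: Path, run_agent: Callable[[str], None]) -> int:
    """Send one prompt per source file to run_agent; return how many were sent."""
    count = 0
    for source_file in iter_source_files(root):
        run_agent(build_prompt(source_file, root))
        count += 1
    return count
```

Passing `print` as `run_agent`, for example `scan_tree(Path("linux"), print)`, gives a dry run that lists the prompts without invoking any model. A real run would replace it with a function that starts one agent session per prompt and collects the resulting report.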
Five confirmed vulnerabilities have been reported to Linux kernel maintainers:
- nfsd: fix heap overflow in NFSv4.0 LOCK replay cache
- io_uring/fdinfo: fix OOB read in SQE_MIXED wrap check
- futex: Require sys_futex_requeue() to have identical flags
- ksmbd: fix share_conf UAF in tree_conn disconnect
- ksmbd: fix signededness bug in smb_direct_prepare_negotiation()
Who’s Affected
Any system running the Linux kernel NFS server with NFSv4.0 support enabled has been potentially vulnerable to the NFS bug since 2003. That can include enterprise Linux servers, cloud infrastructure, and NAS devices that export NFS shares. The other four confirmed vulnerabilities affect the io_uring, futex, and ksmbd subsystems, which are common in modern Linux deployments.
Security researchers and kernel maintainers face a new challenge: Carlini reports having “several hundred crashes” from Claude Code’s analysis that he has not yet had time to validate and report. The bottleneck has shifted from finding bugs to triaging them.
What’s Next
Carlini’s findings suggest that AI-assisted vulnerability discovery will become standard practice in security research. The rapid improvement between model generations (Opus 4.6 found vulnerabilities that Opus 4.1 and Sonnet 4.5 could not) suggests that each new model release will expand the set of discoverable bugs. Linux kernel maintainers and the broader open-source security community will need workflows for handling the coming volume of AI-generated vulnerability reports.
