4mo ago

ChatGPT o3 found a Linux kernel vulnerability. "The future" has an 8% success rate and a 28% chance of false positives.

sean.heelan.io How I used o3 to find CVE-2025-37899, a remote zeroday vulnerability in the Linux kernel’s SMB implementation

In this post I’ll show you how I found a zeroday vulnerability in the Linux kernel using OpenAI’s o3 model. I found the vulnerability with nothing more complicated than the o3 API …

This blog post has been reported on and distorted by a lot of tech news sites using it to wax delusional about AI's future role in vulnerability detection.

But they all gloss over the critical bit: in fairly ideal circumstances where the AI was being directed to the vuln, it had only an 8% success rate, and a whopping 28% false positive rate!
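To put those two numbers side by side, here's a rough back-of-the-envelope sketch. It assumes both percentages are per-run rates over the same set of runs (my reading, not something spelled out above), and the function name is made up for illustration:

```python
# Assumption: 8% and 28% are both per-run rates over the same set of runs.
def report_precision(true_positive_rate: float, false_positive_rate: float) -> float:
    """Of the runs that flagged a bug, the fraction that flagged the real one."""
    flagged = true_positive_rate + false_positive_rate
    if flagged == 0:
        raise ValueError("no runs reported anything")
    return true_positive_rate / flagged


precision = report_precision(0.08, 0.28)
print(f"{precision:.0%} of reports point at the real bug")  # prints "22% ..."
```

Under that assumption, roughly four out of every five reports someone has to triage would be noise.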