LAUNCHES

OpenAI Launches Safety Bug Bounty Program to Identify AI Abuse and Agentic Risks

M megaone_admin Mar 26, 2026 2 min read
Engine Score 8/10 — Important

OpenAI's safety bug bounty has a high industry impact, directly enabling security researchers to improve AI safety. This initiative, while not entirely novel in tech, is a significant and actionable development for a leading AI company.

Editorial illustration for: OpenAI Launches Safety Bug Bounty Program to Identify AI Abuse and Agentic Risks

OpenAI launched a Safety Bug Bounty program on March 25, 2026, expanding its vulnerability reporting framework beyond traditional security flaws to cover AI abuse patterns and safety risks that could cause real-world harm. The program complements the company’s existing Security Bug Bounty, which has been running since April 2023, and pays between $250 for lower-priority findings and $7,500 for critical safety issues.

The new program targets a category of problems that conventional security audits miss. A traditional bug bounty rewards researchers for finding exploits like SQL injection or authentication bypass. OpenAI’s Safety Bug Bounty instead rewards identification of ways the AI could be manipulated to produce harmful outcomes, even when no technical vulnerability exists. This includes prompt injection attacks that bypass safety filters, agentic workflows that take unintended actions, and systematic methods to extract harmful content.

For agentic risks specifically, where AI agents take actions in the real world, the program requires that submissions be reproducible at least 50 percent of the time. This threshold acknowledges that agentic systems are inherently less deterministic than traditional software, while still demanding consistency before paying a bounty. As AI agents gain capabilities like computer use, web browsing, and code execution, the surface area for unintended behavior expands dramatically.

The launch coincides with OpenAI’s broader safety push. The company’s Codex Security tool, released around March 9, identified over 11,000 high-severity and critical flaws including 792 critical vulnerabilities across 1.2 million scanned commits during its first 30 days of testing. These were found in widely used open-source projects including OpenSSH, GnuTLS, PHP, and Chromium. On March 24, OpenAI published a Teen Safety Policy Pack on GitHub with guidelines for developers building age-appropriate AI experiences.

The program is managed through Bugcrowd, the same platform handling OpenAI’s security program. The maximum payout for the security program was increased to $100,000 in March 2025, though the safety program’s current ceiling is lower at $7,500, a gap that may narrow as the program matures.

The program establishes a framework that other AI companies will likely need to replicate. As models become more capable and autonomous, the distinction between security vulnerabilities and safety risks blurs. A model that can be reliably manipulated into generating harmful medical advice represents a real risk that someone needs to find and report before it causes harm at scale.

Share

Enjoyed this story?

Get articles like this delivered daily. The Engine Room — free AI intelligence newsletter.

Join 500+ AI professionals · No spam · Unsubscribe anytime

M
MegaOne AI Editorial Team

MegaOne AI monitors 200+ sources daily to identify and score the most important AI developments. Our editorial team reviews 200+ sources with rigorous oversight to deliver accurate, scored coverage of the AI industry. Every story is fact-checked, linked to primary sources, and rated using our six-factor Engine Score methodology.

About Us Editorial Policy