LAUNCHES

OpenAI Launches Safety Bug Bounty Program for AI Abuse and Risk Reporting

megaone_admin · Mar 26, 2026 · 2 min read
Engine Score 8/10 — Important

OpenAI's safety bug bounty carries high industry impact, directly enlisting security researchers to improve AI safety. While bug bounties are not novel in tech, extending one to safety risks is a significant, actionable step from a leading AI company.


OpenAI launched a public Safety Bug Bounty program on March 25, 2026, expanding its existing security bounty to cover AI abuse and safety risks that could cause tangible harm even when they do not represent traditional security vulnerabilities. The program offers rewards ranging from $250 for low-priority issues to $7,500 for critical safety flaws, with submissions managed through Bugcrowd.

The safety bounty differs from conventional bug bounties in scope. Traditional programs focus on unauthorized access, data leaks, and code execution vulnerabilities. The new program targets misuse patterns — ways that AI systems could be manipulated to cause harm through normal usage channels. For agentic AI risks, the program requires that reported vulnerabilities be reproducible at least 50 percent of the time, acknowledging the probabilistic nature of AI behavior.
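The 50 percent reproducibility bar can be made concrete with a minimal sketch. Nothing below comes from OpenAI's submission guidelines; the `attempts` log and helper function are hypothetical, standing in for repeated probes of a model endpoint:

```python
def reproduction_rate(outcomes):
    """Fraction of probe attempts in which the unsafe behavior appeared."""
    return sum(outcomes) / len(outcomes)

# Hypothetical log of 10 probe attempts against a model endpoint:
# True = the harmful output was reproduced on that attempt.
attempts = [True, False, True, True, False, True, True, False, True, True]

rate = reproduction_rate(attempts)        # 0.7
meets_bar = rate >= 0.5                   # program requires >= 50% reproducibility
```

A deterministic exploit would reproduce on every attempt; for probabilistic AI behavior, a researcher would instead report a rate like the one computed here, which is why the program sets an explicit threshold rather than demanding certainty.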

The launch comes alongside other safety-focused releases. OpenAI published a Teen Safety Policy Pack on March 24, providing policy guidelines and sample prompts for developers building AI experiences for teenagers. The company also released gpt-oss-safeguard, an open-source safety model, as part of its approach to making safety tools available beyond its own products.

OpenAI’s Codex Security tool, an AI-driven application security agent launched around March 9, provides context for why a safety bounty makes sense now. In its first 30 days of research testing, Codex Security identified over 11,000 high-severity and critical flaws — including 792 critical vulnerabilities — across 1.2 million scanned commits in widely used open-source projects including OpenSSH, GnuTLS, PHP, and Chromium.

The program’s reward structure — maxing at $7,500 compared to $100,000 for security bugs — reflects the difficulty of quantifying safety risk. A security vulnerability has clear impact: unauthorized access, data exposure, code execution. A safety vulnerability is harder to assess: an AI that provides subtly biased medical advice or that can be manipulated into generating harmful content through indirect prompting causes real damage, but the severity is contextual and the exploitability is probabilistic.

Whether other AI companies follow with similar programs will indicate how the industry approaches the gap between security (what an attacker can make a system do) and safety (what a system might do even when used as intended). The distinction matters as AI tools gain the ability to take actions in the real world.


MegaOne AI Editorial Team

MegaOne AI monitors 200+ sources daily to identify and score the most important AI developments. Every story is fact-checked, linked to primary sources, and rated using our six-factor Engine Score methodology.
