OpenAI Unveils GPT-5.5-Cyber to Revolutionize Automated Patching

OpenAI is significantly escalating the AI arms race in cybersecurity with the full release of GPT-5.5-Cyber, a specialized model designed to outperform existing industry benchmarks. By moving beyond simple vulnerability detection to automated patch generation, OpenAI aims to bridge the critical gap between finding flaws and fixing them.

Benchmarking Success: GPT-5.5-Cyber vs. Mythos

The release of GPT-5.5-Cyber marks a major milestone in specialized LLM performance. According to OpenAI, the model sets new highs across critical security benchmarks, notably outperforming Anthropic's Mythos 5. On the CyberGym benchmark, which measures an agent's ability to reproduce known flaws, GPT-5.5-Cyber scored 85.6%, ahead of Mythos 5's 83.8% and the standard GPT-5's 81.8%.

Even more striking is the performance on ExploitGym, where GPT-5.5-Cyber reached 39.5%, roughly one and a half times the 25.95% recorded by the base GPT-5 model. On SEC-bench Pro, which evaluates long-term vulnerability discovery, the model scored 69.8%, ahead of previous iterations. For comparison, Claude Opus 4 scored 73.1% on CyberGym, well behind GPT-5.5-Cyber's 85.6% on that benchmark. These numbers suggest that GPT-5.5-Cyber is specifically fine-tuned for the nuanced logic required in offensive and defensive security research.
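A quick way to put these percentages side by side is to compute relative gains from the quoted scores. The sketch below uses only the figures cited above; the function name relative_gain is illustrative and not part of any benchmark tooling.

```python
# Scores as quoted in the article, in percent.
SCORES = {
    "CyberGym": {"GPT-5.5-Cyber": 85.6, "Mythos 5": 83.8, "GPT-5": 81.8},
    "ExploitGym": {"GPT-5.5-Cyber": 39.5, "GPT-5": 25.95},
}


def relative_gain(new: float, old: float) -> float:
    """Return how much larger `new` is than `old`, as a percentage of `old`."""
    return (new - old) / old * 100.0


if __name__ == "__main__":
    for benchmark, table in SCORES.items():
        top = table["GPT-5.5-Cyber"]
        for model, score in table.items():
            if model == "GPT-5.5-Cyber":
                continue
            gain = relative_gain(top, score)
            print(f"{benchmark}: GPT-5.5-Cyber vs {model}: +{gain:.1f}% relative")
```

On these figures, the ExploitGym improvement over base GPT-5 is about 52% in relative terms, while the CyberGym margins over Mythos 5 and GPT-5 are in the low single digits.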

Closing the Loop with Codex Security

A central component of the Daybreak cybersecurity initiative is the updated Codex Security plugin. While many tools focus solely on scanning, it manages the entire pipeline from discovery to patch generation. Since its research preview in March, the plugin has scanned over 30 million commits across 30,000 codebases, resulting in 500,000 automatically flagged fixes.

The plugin functions as a virtual security engineer, performing deep scans of entire codebases, conducting attack path analysis, and checking whether vulnerable code is actually reachable. It also fits into existing developer workflows by exporting findings as SARIF files or CodeQL queries. To guard against "hallucinated" security fixes, OpenAI emphasizes that human engineers must still sign off on every change.
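For readers unfamiliar with SARIF, the sketch below shows how a downstream script might read findings from such a file. It assumes the standard SARIF 2.1.0 layout (runs, results, locations); the sample document, the tool name example-scanner, the rule ID EX001, and the function summarize_sarif are hypothetical and do not reflect actual Codex Security output.

```python
import json

# Hypothetical SARIF 2.1.0 document, for illustration only.
SAMPLE_SARIF = """
{
  "version": "2.1.0",
  "runs": [
    {
      "tool": {"driver": {"name": "example-scanner"}},
      "results": [
        {
          "ruleId": "EX001",
          "level": "error",
          "message": {"text": "Possible SQL injection"},
          "locations": [
            {
              "physicalLocation": {
                "artifactLocation": {"uri": "src/app.py"},
                "region": {"startLine": 42}
              }
            }
          ]
        }
      ]
    }
  ]
}
"""


def summarize_sarif(sarif: dict) -> list[tuple[str, str, str, int]]:
    """Flatten a SARIF log into (rule_id, level, file, line) tuples."""
    findings = []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            rule_id = result.get("ruleId", "unknown")
            # Treat a missing level as "warning" in this sketch.
            level = result.get("level", "warning")
            for location in result.get("locations", []):
                physical = location.get("physicalLocation", {})
                uri = physical.get("artifactLocation", {}).get("uri", "")
                line = physical.get("region", {}).get("startLine", 0)
                findings.append((rule_id, level, uri, line))
    return findings


if __name__ == "__main__":
    for rule_id, level, uri, line in summarize_sarif(json.loads(SAMPLE_SARIF)):
        print(f"[{level}] {rule_id} at {uri}:{line}")
```

Because SARIF is a common interchange format for static analysis results, a script like this could feed findings into an existing review queue, where a human engineer would still approve each change.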

A Global Defense Ecosystem

OpenAI is not building this in isolation; it is constructing a massive partner network through the Daybreak Cyber Partner Program. The program includes industry giants such as CrowdStrike, Cisco, Cloudflare, Palo Alto Networks, IBM, and SentinelOne. These firms can integrate GPT-5.5 with "Trusted Access for Cyber" directly into their proprietary security products.

The initiative also extends into the public sector and open-source security. OpenAI has established Trusted Access partnerships with governments including Australia, Canada, France, Germany, Japan, and the UK. On the open-source front, the "Patch the Planet" initiative, run in partnership with Trail of Bits and HackerOne, is already working on critical projects like cURL, Go, and Python to secure the foundation of the internet.

Key Takeaways

  • Superior Benchmarks: GPT-5.5-Cyber leads key industry tests like CyberGym and ExploitGym, outperforming both Anthropic's Mythos 5 and the standard GPT-5 model.
  • End-to-End Automation: The Codex Security plugin automates the transition from vulnerability discovery to patch generation, supporting deep scans and attack path analysis.
  • Vetted Access Only: To mitigate risks, the highly permissive GPT-5.5-Cyber model is restricted to verified defenders under strict monitoring and guardrails.