OpenAI and Broadcom Unveil Jalapeño: A Custom Chip for LLM Inference

AI-assisted draft.

OpenAI is moving beyond software into custom silicon with the announcement of "Jalapeño," a dedicated accelerator the company calls an Intelligence Processor. Developed in partnership with Broadcom, the chip is designed to optimize large language model (LLM) inference at massive scale.

A Purpose-Built Architecture for Modern LLMs

Unlike many current solutions that rely on modified general-purpose GPUs, Jalapeño is a ground-up design engineered specifically for LLM inference. The goal is to address two primary bottlenecks of modern AI hardware: data movement and underutilization of the available compute. By pushing utilization closer to the theoretical maximum, OpenAI aims to significantly improve performance per watt compared to current state-of-the-art hardware.
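To make the link between utilization and performance per watt concrete, here is a minimal sketch. The function name and every number in it are hypothetical placeholders chosen for illustration, not figures for Jalapeño or any other chip, and it assumes power draw stays fixed while utilization changes, which real hardware only approximates.

```python
def perf_per_watt(peak_tokens_per_s: float, utilization: float, power_watts: float) -> float:
    """Delivered throughput per watt (tokens/s per W, i.e. tokens per joule).

    Delivered throughput is modeled as peak throughput scaled by the
    fraction of peak the hardware actually sustains.
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    if power_watts <= 0:
        raise ValueError("power_watts must be positive")
    return peak_tokens_per_s * utilization / power_watts


# Hypothetical accelerator: same peak throughput and power budget in both cases.
PEAK_TOKENS_PER_S = 20_000.0  # assumed, for illustration only
POWER_WATTS = 1_000.0         # assumed, for illustration only

low_util = perf_per_watt(PEAK_TOKENS_PER_S, utilization=0.4, power_watts=POWER_WATTS)
high_util = perf_per_watt(PEAK_TOKENS_PER_S, utilization=0.8, power_watts=POWER_WATTS)

# Under this simple model, efficiency scales linearly with utilization.
gain = high_util / low_util
print(f"low: {low_util:.1f} tok/J, high: {high_util:.1f} tok/J, gain: {gain:.1f}x")
```

Under these assumptions, doubling sustained utilization at a fixed power budget doubles tokens per joule, which is why idle compute and stalls on data movement are treated as efficiency problems rather than only speed problems.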

OpenAI has not yet released a finalized technical report or independent benchmarks, but early internal tests suggest substantial efficiency gains. Engineering samples are already being used in lab environments to run complex machine learning workloads, including the GPT-5.3-Codex-Spark model, which currently relies on Cerebras hardware for inference.

A Multi-Company Powerhouse Collaboration

The development of Jalapeño is a multi-partner effort that spans the entire hardware stack. OpenAI leads the chip design and uses its own AI models to accelerate the development cycle; the path from design to tape-out reportedly took just nine months. Broadcom provides silicon manufacturing expertise and networking technology, including its Tomahawk networking chips. Celestica is responsible for the boards, racks, and full system integration.

This collaboration represents a strategic shift for OpenAI, from a company focused solely on models and products to one that controls the underlying hardware stack. By owning the silicon, OpenAI could in principle run its models faster, more reliably, and at a much lower cost than competitors that rely on third-party providers.

Scaling to Gigawatt Levels by 2026

The roadmap for Jalapeño is ambitious. Broadcom CEO Hock Tan has indicated that the first deployment is planned for late 2026, with the intention of operating at gigawatt scale alongside Microsoft and other strategic partners. The size of this rollout underscores the infrastructure requirements of next-generation AI.

The partnership reportedly includes significant commercial commitments, with Microsoft expected to guarantee the purchase of 40 percent of the initial chip production to secure the first phase. This combination of vertical integration and guaranteed demand signals a move toward stabilizing the high-cost, high-energy supply chain that large-scale AI depends on.

Key Takeaways

Custom Silicon Strategy: Jalapeño is an "Intelligence Processor" designed from scratch for LLM inference, aiming to outperform general-purpose hardware in performance per watt.
Rapid Development: Using its own AI models to speed up the process, OpenAI reportedly went from design to tape-out in nine months, an unusually short cycle for a high-performance ASIC.
Massive Scale Deployment: The first large-scale deployment is targeted for late 2026 at gigawatt scale, alongside Microsoft and other strategic partners.
