OpenAI and Broadcom unveil a custom chip built for AI inference at scale

OpenAI and Broadcom have announced a custom chip designed specifically to run large language models at scale, according to Ars Technica, marking one of the most significant moves yet by an AI developer to build its own silicon. The collaboration pairs OpenAI's knowledge of how its models behave with Broadcom's experience designing high-end semiconductors and bringing them to production through manufacturing partners.
The chip is aimed squarely at inference, the term for the work a model does when it generates a response to a user, as opposed to training, the resource-intensive process of building the model in the first place. As AI services reach hundreds of millions of users, inference has become the dominant ongoing cost, and the economics of serving each query now shape the entire business.
That distinction explains the strategy. A chip optimised purely for inference can strip away features needed for training and concentrate transistors on the specific operations that running a model requires. In principle, that focus can deliver more performance per watt and per dollar than a general-purpose processor handling the same task.
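To make the per-watt and per-dollar argument concrete, the rough sketch below folds a chip's purchase price and its electricity bill into a single cost per million queries. Every figure and name in it is a hypothetical placeholder chosen for illustration, not a specification of any real OpenAI, Broadcom or Nvidia part, and the model deliberately ignores cooling, networking, utilisation and staffing.

```python
from dataclasses import dataclass


@dataclass
class Accelerator:
    """Hypothetical chip profile; none of these values come from the report."""
    name: str
    price_usd: float           # up-front purchase price
    power_watts: float         # average power draw under load
    queries_per_second: float  # sustained inference throughput


def cost_per_million_queries(chip: Accelerator,
                             lifetime_years: float = 4.0,
                             usd_per_kwh: float = 0.08) -> float:
    """Amortised hardware cost plus energy cost, per one million queries."""
    seconds = lifetime_years * 365 * 24 * 3600
    total_queries = chip.queries_per_second * seconds
    # Energy in kWh = kilowatts drawn multiplied by hours of operation.
    energy_kwh = (chip.power_watts / 1000) * (seconds / 3600)
    total_cost = chip.price_usd + energy_kwh * usd_per_kwh
    return total_cost / total_queries * 1_000_000


if __name__ == "__main__":
    # Illustrative comparison only: both profiles are invented.
    general = Accelerator("general-purpose", 30_000.0, 700.0, 50.0)
    custom = Accelerator("inference-only", 20_000.0, 500.0, 60.0)
    for chip in (general, custom):
        print(f"{chip.name}: ${cost_per_million_queries(chip):.2f} per million queries")
```

Under this simplified model, a chip that is cheaper, draws less power or serves more queries per second lowers the cost of each query, and the saving is multiplied across every query a large service handles.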
The move also speaks to the industry's dependence on Nvidia, whose graphics processors have powered the bulk of the AI boom. That reliance has left major AI companies exposed to supply constraints and high prices, and several have responded by designing their own accelerators to gain leverage over pricing and a more predictable supply of their most important input.
Broadcom's role is central. The company is not a household name in consumer technology, but it is a powerhouse in custom chip design, helping large customers turn their requirements into working silicon. Partnering with Broadcom lets OpenAI pursue bespoke hardware without building a semiconductor operation from scratch.
Designing a chip is only part of the challenge. Custom silicon needs a software stack that lets models actually run on it efficiently, and much of the industry's tooling has grown up around Nvidia's ecosystem. Making an in-house chip competitive means investing heavily in the compilers and libraries that translate AI workloads onto the new hardware.
The potential payoff is substantial. If OpenAI can run its models on chips tailored to its exact needs, it could lower the cost of every interaction, ease its dependence on a single supplier, and gain the freedom to optimise hardware and models together. Those advantages compound at the scale OpenAI now operates.
There are risks as well. Cutting-edge chip projects are expensive and slow, and the field moves quickly enough that a design can be overtaken before it ships in volume. Custom silicon also locks in particular assumptions about how models work, which could become a constraint if the underlying technology shifts.
The announcement fits a broader pattern across big technology firms, several of which have built their own AI accelerators rather than relying entirely on off-the-shelf parts. The trend reflects a recognition that, at sufficient scale, controlling the hardware becomes as strategically important as the models themselves.
For the wider market, the Ars Technica report suggests, the significance lies less in any single specification than in the direction of travel. As inference costs dominate the economics of AI, the companies that run the largest services are increasingly determined to design the chips those services depend on, reshaping the balance of power in the semiconductor industry.