OpenAI x Broadcom — The OpenAI Podcast Ep. 8
CHAPTERS
OpenAI and Broadcom announce a custom chip + full-system partnership
Andrew Mayne opens with Sam Altman, Greg Brockman, and Broadcom’s Hock Tan and Charlie Kawwas to announce a new partnership. The core news: they’ve been co-developing a custom AI chip and—unexpectedly—expanded into designing a full integrated system together.
What “10 gigawatts” means and why vertical integration matters
Sam explains that “10 incremental gigawatts” of deployment is massive—yet still likely insufficient as demand grows. He frames the effort as end-to-end optimization from transistor fabrication to the tokens returned by ChatGPT, aiming for large efficiency and cost gains.
Why Broadcom is the partner: compute as the engine of frontier models
Hock Tan argues the fit is natural: OpenAI pushes frontier models, and Broadcom can deliver leading semiconductor and system expertise. Compute is positioned as the critical ingredient for progress toward better models and eventually superintelligence.
Workload-driven design: building chips and systems around inference needs
Sam describes how inference capacity requirements triggered a shift toward a chip tailored for specific workloads, and then to full-system co-design. The chapter emphasizes designing to actual AI workload shapes rather than relying solely on general-purpose accelerators.
Using AI to help design the chip: faster schedules and area reductions
Greg explains that OpenAI’s models have been applied to the chip design process itself, producing optimizations and helping compress timelines. Human experts often recognize the optimizations as reasonable in hindsight, but the models accelerate discovery and iteration.
From chat to agents: why demand for compute will explode
Greg describes the product shift from interactive chat to always-on agents working in the background. He argues that ideally everyone would have a 24/7 agent, but current compute availability forces gating such features to higher tiers—illustrating vast unmet demand.
AI infrastructure as historic-scale critical utility (railroads/internet analogy)
The group compares AI infrastructure to railroads and the internet: a foundational utility enabling entire ecosystems rather than a “chip project.” They stress global collaboration, multi-industry coordination, and the need for openness/standards to scale to billions.
Why OpenAI is designing chips now: control, niche workloads, and roadmap influence
Greg says custom silicon is about serving underserved workloads and gaining leverage over the direction of hardware roadmaps. He recalls early frustration when accelerator startups ignored OpenAI’s feedback about where models were headed, motivating deeper in-house involvement.
Training vs inference: different chips for different phases of AI
Hock and Greg outline why training chips prioritize raw compute and networking, while inference leans more on memory and memory bandwidth. The chapter frames a future of multiple specialized chips and platforms rather than a one-size-fits-all accelerator.
Compute becomes central to AGI: lessons from scaling and Dota 2
Greg recounts OpenAI’s shift in 2017 from “ideas-first” to empirically validated scaling laws, first seen in large-scale RL (Dota 2). This experience pushed OpenAI to treat compute as a core lever, not a secondary implementation detail.
Energy and efficiency: ‘intelligence per watt’ as the gating factor
Sam summarizes the goal as maximizing intelligence output per unit of energy—‘wringing out’ efficiency across model, chip, and system design. The conversation acknowledges GPUs’ flexibility and importance while arguing that optimized systems will deliver better efficiency for known workloads.
The hardware roadmap: 3D stacking, optics, and faster scaling curves
Charlie outlines technical directions: multi-die packaging in 2D, moving to 3D stacking, and integrating optical switching (including a cited 100 Tbps optics integration). The aim is to improve cluster performance and energy efficiency, with rapid iteration cycles potentially accelerating progress.
Timeline, deployment ramp, and the mission for compute abundance
They give a concrete schedule: first results by end of next year, followed by rapid deployment over about three years. Greg emphasizes the massive execution challenge and ties it to OpenAI’s mission—making AI broadly accessible by shifting the world from compute scarcity to compute abundance.
Closing: partnership appreciation and future check-ins
Andrew wraps by noting the excitement and inviting future updates as the project develops. The guests thank each other and reiterate shared commitment to the partnership and the long-term roadmap.