No Priors

No Priors Ep. 53 | With AMD CTO Mark Papermaster

Compute is the fuel for the AI revolution, and customers want more chip vendors. AMD CTO Mark Papermaster joins Sarah and Elad on No Priors to discuss AMD’s strategy, their newest GPUs, where inference workloads will live, the chip software stack, how they are thinking about supply chain issues, and what we can expect from AMD in 2024.

Sign up for new podcasts every week. Email feedback to show@no-priors.com.

Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil

Show Notes:
0:00 Introduction and Mark’s background
2:35 AMD background and current markets
4:40 AMD shifting to AI space
8:54 AI applications coming out of AMD
10:57 Software investment
15:15 The benefits of open-source stacks
16:58 Evolving GPU market
20:21 Constraints on GPU production
24:11 Innovations in chip technology
27:57 Chip supply chain
30:18 Future of innovative hardware products
35:42 What’s next for AMD

Sarah Guo (host) · Mark Papermaster (guest) · Elad Gil (host)
Feb 29, 2024 · 39m

CHAPTERS

  1. 0:00 – 0:52

    Conviction’s Embed Accelerator plug + setting up the episode with AMD CTO Mark Papermaster

    Sarah opens with a brief announcement about Conviction’s Embed Accelerator, including funding and compute credits. She then introduces Mark Papermaster, highlighting his leadership background across major hardware companies and teeing up a conversation about GPUs and industry competition.

    • Embed Accelerator: $150k uncapped SAFE plus compute/API credits
    • Mark Papermaster introduced as AMD CTO
    • Mentions Mark’s prior roles at IBM, Apple, and Cisco
    • Episode focus: GPUs, competition, and the evolving compute landscape
  2. 0:52 – 2:46

    Mark’s career arc: from early CMOS at IBM to Apple, then AMD as Moore’s Law slows

    Mark recounts entering chip design during the early shift to CMOS at IBM, where he worked across microprocessors and large systems. He describes moving to Apple to lead iPhone/iPod engineering before joining AMD in 2011 at a moment when Moore’s Law was clearly decelerating and innovation needed to come from system-level design.

    • Early hands-on chip design during the CMOS transition at IBM
    • Work spanning PowerPC, mainframes, and RISC servers
    • Recruited to Apple to run iPhone/iPod engineering
    • Joined AMD in 2011 as Moore’s Law began to slow, raising the innovation bar
  3. 2:46 – 4:42

    AMD’s portfolio and market footprint: CPUs, gaming, cloud, embedded, and acquisitions

    Mark gives a high-level overview of AMD’s evolution from second-source roots to a broad compute portfolio. He emphasizes AMD’s competitiveness mandate over the last decade and how acquisitions like Xilinx and Pensando expanded AMD into embedded and networking/interconnect needs for modern scale-out workloads.

    • AMD’s shift from second-source heritage to broad compute portfolio
    • Turnaround mandate under CEO Lisa Su and leadership team
    • EPYC CPUs in cloud deployments; major presence in gaming consoles
    • Xilinx acquisition expands embedded/FPGA footprint
    • Pensando adds networking/interconnect for scaled workloads
  4. 4:42 – 9:13

    Why AMD leaned into AI: heterogeneous compute strategy and the road to MI300

    Elad asks how AMD’s AI focus emerged; Mark traces the GPU-driven breakthrough era (e.g., image recognition and NLP acceleration) and explains AMD’s deliberate sequencing. AMD first rebuilt CPU competitiveness with Zen, then scaled heterogeneous CPU+GPU systems for HPC and AI, culminating in the MI300 platform launch.

    • AI acceleration via GPUs recognized as pivotal early on
    • AMD prioritized rebuilding CPU leadership first (Zen in 2017)
    • Long-term bet on heterogeneous compute (CPU + GPU) enabled by ATI acquisition
    • HPC supercomputer wins as a proving ground for hardware + software co-design
    • MI300 announced as flagship for HPC and AI training/inference
  5. 9:13 – 11:00

    MI300 workloads and performance claims: training, inference, efficiency, and memory bandwidth

    Mark details where AMD is most bullish: large-model training and especially inference at scale. He argues MI300 is competitive in training and leads in inference, attributing gains to optimized math engines plus high-bandwidth memory capacity that improves performance per watt and density per rack.

    • Primary demand driver: LLM training and inference
    • MI300 positioned as “halo” product to compete head-on at the top end
    • Emphasis on inference leadership and FP16/vLLM-style throughput metrics
    • Performance-per-watt and rack-space efficiency highlighted
    • Memory bandwidth/capacity framed as critical to real-world efficiency
  6. 11:00 – 15:39

    Software stack and developer adoption: ROCm, frameworks, and real deployment feedback loops

    Sarah asks about competing beyond hardware—especially software ecosystems like CUDA vs ROCm. Mark explains AMD’s approach: support GPU semantics and key frameworks (PyTorch, ONNX, TensorFlow), partner with platforms like Hugging Face for continuous testing, and use early customer deployments to harden “easy to deploy” experiences.

    • Competition is multi-dimensional: hardware, efficiency, and software ecosystem
    • ROCm positioned as the enabling stack; focus on ease of deployment
    • Deep involvement with PyTorch/ONNX/TensorFlow; PyTorch Foundation participation
    • Hugging Face model testing on AMD hardware as part of release process
    • Customer deployments (e.g., Lamini) used to refine usability and reliability
  7. 15:39 – 17:16

    Open-source philosophy: avoiding lock-in and winning on merit

    Mark explains why AMD emphasizes open-source tooling and stacks, from LLVM to ROCm. He frames it as a customer-choice strategy—eschewing proprietary “walled gardens”—and argues open ecosystems accelerate collaboration and keep the industry from stagnating due to lack of competition.

    • Open source as a cultural commitment and collaboration lever
    • LLVM and ROCm highlighted as core open components
    • Goal: customer choice rather than proprietary lock-in
    • Competition viewed as necessary to avoid industry stagnation
    • Xilinx acquisition further strengthened AMD’s open-source posture
  8. 17:16 – 20:46

    How AI compute markets expand: from hyperscalers to specialized clouds, edge, and AI PCs

    Elad asks about the cloud compute market and how it changes as GPU supply constraints ease. Mark predicts constraints will eventually abate while demand continues to balloon, driving a proliferation of model sizes and tailored clusters and pushing inference outward—from hyperscalers to tier-2 providers, edge deployments, and AI-enabled PCs and embedded devices.

    • GPU supply constraints expected to ease over time, while demand expands rapidly
    • Shift from hyperscaler-only clusters to more diverse data center operators
    • Rise of smaller/fine-tuned models enables more varied compute configurations
    • Inference moving to the edge for latency and data locality (factory floor, etc.)
    • AI accelerators integrated into PCs; embedded pull via Xilinx portfolio
  9. 20:46 – 24:28

    What constrains GPU supply: wafers, substrates, advanced packaging, and looming power limits

    The discussion turns to what’s actually limiting GPU availability. Mark explains that it’s not just fab capacity: substrates and advanced packaging (chiplets, HBM integration, vertical/lateral interconnect) are key bottlenecks, and he flags data center power availability as a major longer-term constraint—driving a stronger focus on energy efficiency each generation.

    • Supply-demand management lessons from the pandemic era
    • Constraints include substrates and advanced packaging capacity, not only wafers
    • MI300 exemplifies complex chiplet + HBM packaging and interconnect requirements
    • AMD’s fabless model and partnerships (e.g., TSMC) emphasized
    • Power availability in data centers described as a key future constraint; efficiency becomes top priority
  10. 24:28 – 28:05

    Post–Moore’s Law innovation: chiplets, heterogeneous engines, packaging, and end-to-end co-optimization

    Sarah asks what innovation looks like when transistor scaling no longer does “most of the heavy lifting.” Mark describes a holistic design approach: mix specialized compute engines, use chiplets on the most suitable process nodes, advance packaging/interconnect (including 3D concepts), and optimize up through the software stack and application requirements.

    • Moore’s Law slowing increases cost and reduces automatic power/perf gains
    • Holistic design replaces node-shrink-only progress
    • Heterogeneous compute: right engine for the right workload (CPU/GPU/dedicated accelerators)
    • Chiplets enable mixing nodes and functions for better economics/performance
    • Packaging/interconnect plus software-stack awareness are core to next-gen gains
  11. 28:05 – 30:41

    Supply chain and geopolitics: geographic diversification of fabs and packaging ecosystems

    Sarah raises strategic concerns around supply chain concentration and geopolitical risk. Mark outlines AMD’s approach: collaborate with governments and partners to expand and diversify manufacturing and packaging footprints across regions, noting that semiconductor supply chain shifts take years compared to software product cycles.

    • Semiconductor supply continuity framed as national-security relevant
    • Support for fab expansion and geographic diversification (e.g., US and global builds)
    • Packaging and broader ecosystem diversification required—not just foundries
    • Acknowledges historical global “pockets of expertise” and need to rebalance
    • Long lead times: supply chain changes take years, unlike software iteration
  12. 30:41 – 36:03

    New AI-first hardware wave: what predicts success in devices like AR/VR and assistants

    Elad asks about the resurgence of consumer hardware experimentation (Vision Pro, Rabbit, Humane, robotics). Mark credits shrinking, low-power compute as an enabler but says success depends on product-market pull—devices must deliver a genuinely loved capability and often create a new category rather than incremental improvement.

    • Compute and power efficiency advancements enable new device categories
    • AR/VR success requires low latency to avoid discomfort and deliver presence
    • Technology is necessary but not sufficient for product success
    • Successful devices solve real user needs and deliver “loveable” experiences
    • AI in PCs could feel like a new category due to local, low-latency capabilities (e.g., live translation)
  13. 36:03 – 39:03

    AMD’s 2024 focus: AI across the full portfolio and a major deployment year

    Sarah closes by asking what AMD wants engineers and founders to know about 2024. Mark describes 2024 as a deployment year: AMD has AI-enabled its portfolio from cloud to edge to PCs and gaming, and now aims to be recognized in AI based on delivered performance, ecosystem readiness, and breadth of real-world deployments.

    • 2024 positioned as a major rollout/deployment year for AMD’s AI capabilities
    • AI enablement across cloud, edge, PCs, embedded, and gaming (e.g., AI upscaling)
    • Goal: broader recognition beyond the incumbent leader via results and value
    • Emphasis on partner ecosystem and end-to-end experiences spanning cloud + client
    • Episode wrap-up and thanks
