Raschka & Lambert on Lex Fridman: Why Post-Training Won 2025
RLVR and inference-time scaling, not architecture, drove 2025 AI gains. DeepSeek's open-weight releases showed frontier performance need not be closed-source.
At a glance
WHAT IT’S REALLY ABOUT
AI in 2026: scaling, post-training, open models, agents, geopolitics, compute
- Lex Fridman hosts Sebastian Raschka and Nathan Lambert to map the “state of AI” entering 2026, using the “DeepSeek moment” as a turning point for open-weight Chinese models and intensified global competition.
- They argue architectures remain largely transformer-based, while major progress now comes from post-training (RLVR, RLHF, inference-time scaling), better data pipelines, and systems/compute optimizations rather than radical architectural change.
- The conversation contrasts US product dominance (ChatGPT/Gemini/Claude/Grok) with China’s surge in open-weight releases and friendlier licenses, discussing how this could reshape adoption, policy, and business models.
- They explore emerging directions (tool use, agents, long context, continual learning, diffusion text models), plus societal issues: work culture, hype bubbles, safety, education/learning, developer jobs, and the long-run trajectory toward (or away from) AGI narratives.
IDEAS WORTH REMEMBERING
5 ideas
No one “owns” unique ideas anymore; budget and compute are the moat.
Raschka argues researcher mobility makes technical ideas diffuse quickly; the differentiator becomes hardware, capital, and operational excellence rather than secret breakthroughs.
2026 progress is driven more by post-training and inference-time scaling than new architectures.
Both guests emphasize transformers are still the core; large capability jumps come from RLVR-style training, better post-training pipelines, and letting models “think” longer at inference.
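One common form of inference-time scaling mentioned above is spending more compute per query rather than training a bigger model. A minimal sketch of the best-of-N variant follows; `generate` and `score` are hypothetical stand-ins, not any lab's actual API:

```python
# Hedged sketch of one form of inference-time scaling: best-of-N sampling.
# Spend extra compute at inference by drawing N candidates and keeping the
# one a scorer prefers. `generate` and `score` are illustrative stand-ins.

import random

def generate(prompt: str, seed: int) -> str:
    """Stand-in for sampling one completion from a model."""
    random.seed(seed)
    return f"candidate answer {random.randint(0, 9)} for: {prompt}"

def score(completion: str) -> float:
    """Stand-in for a verifier or reward model scoring a completion."""
    return float(completion.split()[2])  # toy: score by the embedded digit

def best_of_n(prompt: str, n: int) -> str:
    """More samples (more inference compute) -> a better best candidate."""
    candidates = [generate(prompt, seed=i) for i in range(n)]
    return max(candidates, key=score)

print(best_of_n("2 + 2 = ?", n=8))
```

The point of the sketch is the knob: raising `n` buys quality with compute at serving time instead of at training time, which is why the guests treat it as a distinct scaling axis.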
Open-weight Chinese models may reshape global adoption through licenses and distribution, not just quality.
Lambert notes that security concerns keep US companies from paying for Chinese APIs, so open weights let Chinese labs gain mindshare and usage via US hosting, especially with fewer licensing “strings” attached than some Western releases.
Serving costs dominate training costs at scale—business incentives shape model design.
They highlight that training may be millions, but serving hundreds of millions of users can be billions; this pushes routing, smaller models, speed/intelligence tradeoffs, and product-level monetization experiments.
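The millions-versus-billions claim above is easy to check with back-of-envelope arithmetic. Every number in this sketch is hypothetical, chosen only to show how recurring serving spend dwarfs a one-time training run at consumer scale:

```python
# Back-of-envelope sketch (all numbers hypothetical) comparing a one-time
# training cost with recurring serving cost at consumer scale.

TRAINING_COST_USD = 50e6          # hypothetical one-time training run

users = 300e6                     # hypothetical daily active users
queries_per_user_per_day = 5
tokens_per_query = 2000           # prompt + generated tokens
cost_per_million_tokens = 0.50    # hypothetical blended serving cost (USD)

daily_tokens = users * queries_per_user_per_day * tokens_per_query
annual_serving_cost = daily_tokens / 1e6 * cost_per_million_tokens * 365

print(f"annual serving cost: ${annual_serving_cost / 1e9:.2f}B")
print(f"serving / training ratio: {annual_serving_cost / TRAINING_COST_USD:.0f}x")
```

Under these assumptions serving runs roughly an order of magnitude above training each year, which is the incentive behind the routing, distillation, and speed/intelligence tradeoffs the guests describe.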
RLVR is the 2025–2026 post-training workhorse, but evaluation contamination remains a major scientific problem.
RLVR works best in verifiable domains (math/code) and scales well, yet both warn that benchmark leakage (e.g., the Qwen math contamination case) can mislead conclusions and distort research claims.
WORDS WORTH SAVING
5 quotes
I don't think nowadays, 2026, that there will be any company who is... having access to a technology that no other company has access to.
— Sebastian Raschka
Extended thinking and inference time scaling is just a way to make the models marginally smarter, and I will always edge on that side.
— Nathan Lambert
One of the best ways to solve hallucinations is to not try to always remember information or make things up... why not use a calculator app or Python?
— Sebastian Raschka
Our GPUs are hurting... we're releasing this because we can use your GPUs.
— Nathan Lambert (paraphrasing Sam Altman’s rationale for open model distribution)
I'm hoping that we, society drowns in slop enough to snap out of it.
— Nathan Lambert
High-quality AI-generated summary created from a speaker-labeled transcript.