
State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490
Lex Fridman (host), Sebastian Raschka (guest), Nathan Lambert (guest)
AI in 2026: scaling, post-training, open models, agents, geopolitics, compute
Lex Fridman hosts Sebastian Raschka and Nathan Lambert to map the “state of AI” entering 2026, using the “DeepSeek moment” as a turning point for open-weight Chinese models and intensified global competition.
They argue architectures remain largely transformer-based, while major progress now comes from post-training (RLVR, RLHF, inference-time scaling), better data pipelines, and systems/compute optimizations rather than radical architectural change.
The conversation contrasts US product dominance (ChatGPT/Gemini/Claude/Grok) with China’s surge in open-weight releases and friendlier licenses, discussing how this could reshape adoption, policy, and business models.
They explore emerging directions (tool use, agents, long context, continual learning, diffusion text models), plus societal issues: work culture, hype bubbles, safety, education/learning, developer jobs, and the long-run trajectory toward (or away from) AGI narratives.
Key Takeaways
No one “owns” unique ideas anymore; budget and compute are the moat.
Raschka argues researcher mobility makes technical ideas diffuse quickly; the differentiator becomes hardware, capital, and operational excellence rather than secret breakthroughs.
2026 progress is driven more by post-training and inference-time scaling than new architectures.
Both guests emphasize transformers are still the core; large capability jumps come from RLVR-style training, better post-training pipelines, and letting models “think” longer at inference.
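One common form of inference-time scaling the guests allude to is self-consistency: sample many answers and take the majority vote, trading extra compute for reliability. A minimal sketch, where `sample_answer` is a hypothetical stand-in for an LLM call:

```python
# Toy illustration of inference-time scaling via self-consistency
# (majority voting over N sampled answers). `sample_answer` is a
# hypothetical stand-in for a real model call.
import random
from collections import Counter

def sample_answer(question: str, rng: random.Random) -> str:
    # Stand-in model: correct ("42") 70% of the time, otherwise a wrong guess.
    return "42" if rng.random() < 0.7 else str(rng.randint(0, 9))

def self_consistency(question: str, n_samples: int, seed: int = 0) -> str:
    rng = random.Random(seed)
    votes = Counter(sample_answer(question, rng) for _ in range(n_samples))
    return votes.most_common(1)[0][0]  # majority answer wins

print(self_consistency("What is 6 x 7?", n_samples=32))
```

Spending more samples (more inference compute) makes the majority answer far more reliable than a single greedy sample, which is the basic economics behind "letting models think longer."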
Open-weight Chinese models may reshape global adoption through licenses and distribution, not just quality.
Lambert notes that security concerns keep US companies from paying for Chinese APIs, so open weights let Chinese labs gain mindshare and usage via US hosting, especially with fewer licensing “strings” than some Western releases.
Serving costs dominate training costs at scale—business incentives shape model design.
They highlight that training may be millions, but serving hundreds of millions of users can be billions; this pushes routing, smaller models, speed/intelligence tradeoffs, and product-level monetization experiments.
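The routing idea above can be made concrete with a toy cost model: send cheap-to-answer queries to a small model and escalate the rest. Model names, the routing heuristic, and per-token prices here are all hypothetical:

```python
# Toy sketch of cost-aware model routing. Names and prices are invented
# for illustration; real routers use learned classifiers, not heuristics.

PRICE_PER_TOKEN = {"small-model": 0.1, "large-model": 2.0}  # hypothetical units

def route(query: str) -> str:
    # Naive heuristic: long or code-bearing queries go to the large model.
    hard = len(query.split()) > 20 or "```" in query
    return "large-model" if hard else "small-model"

def serving_cost(queries: list[str], avg_tokens: int = 500) -> float:
    return sum(PRICE_PER_TOKEN[route(q)] * avg_tokens for q in queries)

easy = ["what time is it"] * 90                    # most traffic is simple
hard = ["refactor this ```long module``` " + "x " * 30] * 10
cost_routed = serving_cost(easy + hard)
cost_all_large = len(easy + hard) * PRICE_PER_TOKEN["large-model"] * 500
```

With 90% of traffic routed to the small model, serving cost drops by an order of magnitude, which is why serving economics push labs toward routers and smaller distilled models.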
RLVR is the 2025–2026 post-training workhorse, but evaluation contamination remains a major scientific problem.
RLVR works best on verifiable domains (math/code) and scales well, yet both warn that benchmark leakage into training data undermines trust in reported evaluation results.
Tool use reduces hallucinations by offloading memory to external systems, but introduces trust, integration, and UX challenges.
Tool calling (search, Python, CLI) can improve factuality and reliability, but users fear granting system access; closed labs benefit from deep integration, while open models face fragmentation across tool ecosystems.
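The tool-calling loop described here can be sketched in a few lines. This is a minimal illustration, with `fake_model` as a hypothetical stand-in for an LLM that either requests a tool or returns a final answer; offloading arithmetic to a calculator tool is exactly the hallucination-avoidance pattern Raschka describes:

```python
# Minimal sketch of a tool-calling agent loop. `fake_model` is a
# hypothetical stand-in for a real LLM API.

def calculator(expression: str) -> str:
    # Restrict eval to a tiny arithmetic grammar for this sketch.
    allowed = set("0123456789+-*/(). ")
    if not set(expression) <= allowed:
        raise ValueError("unsupported expression")
    return str(eval(expression))

TOOLS = {"calculator": calculator}

def fake_model(messages: list[dict]) -> dict:
    # Stand-in: asks for the calculator once, then composes the answer.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "calculator", "args": "123 * 456"}
    result = [m for m in messages if m["role"] == "tool"][-1]["content"]
    return {"answer": f"123 * 456 = {result}"}

def run_agent(user_msg: str) -> str:
    messages = [{"role": "user", "content": user_msg}]
    while True:
        step = fake_model(messages)
        if "answer" in step:
            return step["answer"]
        output = TOOLS[step["tool"]](step["args"])  # execute the tool call
        messages.append({"role": "tool", "content": output})

print(run_agent("What is 123 * 456?"))
```

The trust problem the guests raise lives in the `TOOLS[step["tool"]](step["args"])` line: a real agent executing model-chosen commands against a real system needs sandboxing and user consent.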
Coding assistance is increasingly agentic; “English as programming” is a real skill shift.
They compare IDE copilots (Codex plugin/Cursor) vs terminal agents (Claude Code) and argue the interface/workflow can matter more than the raw model; senior devs may ship more AI-generated code because they can better specify and review it.
Long context will keep growing, but real gains may come from smarter context management (compaction, sparse attention, recursion).
They expect incremental context-length increases but emphasize techniques like sparse/sliding attention, recursive task decomposition, and agent-controlled summarization to maintain performance at lower cost.
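Agent-controlled summarization (compaction) can be sketched simply: when the conversation history exceeds a token budget, collapse the older turns into a summary and keep only the recent ones verbatim. The `summarize` function here is a hypothetical stand-in for a model call, and the whitespace token counter is a deliberate simplification:

```python
# Sketch of context compaction: replace old turns with a summary once
# the history exceeds a token budget. `summarize` stands in for a
# real model-generated summary.

def count_tokens(text: str) -> int:
    return len(text.split())  # crude whitespace tokenizer for the sketch

def summarize(messages: list[str]) -> str:
    # Stand-in: a real system would prompt the model to summarize.
    return f"SUMMARY({len(messages)} earlier turns)"

def compact(history: list[str], budget: int, keep_recent: int = 2) -> list[str]:
    total = sum(count_tokens(m) for m in history)
    if total <= budget or len(history) <= keep_recent:
        return history
    old, recent = history[:-keep_recent], history[-keep_recent:]
    return [summarize(old)] + recent

history = [f"turn {i}: some long discussion about topic {i}" for i in range(10)]
compacted = compact(history, budget=30)
```

The design tradeoff is what the guests hint at: compaction keeps cost flat as conversations grow, at the price of losing detail that only lived in the summarized turns.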
Continual learning is desirable for “employee-like” adaptation, but may be economically impractical; context + memory may substitute.
Lambert frames continual learning as weight updates from feedback; Raschka argues global updates already happen via model releases, while per-user training is expensive—making in-context learning/memory and lightweight adapters (LoRA) more practical.
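The LoRA idea mentioned here is easy to see in miniature: keep the pretrained weight matrix frozen and train only a low-rank pair of factors, so per-user adaptation costs a tiny fraction of full fine-tuning. A minimal NumPy sketch (dimensions chosen arbitrarily for illustration):

```python
# Sketch of a LoRA adapter: freeze W (d x d), train only A (d x r) and
# B (r x d); the effective weight is W + A @ B. Adaptation then costs
# 2*d*r parameters instead of d*d.
import numpy as np

rng = np.random.default_rng(0)
d, r = 512, 8

W = rng.normal(size=(d, d))          # frozen pretrained weight
A = rng.normal(size=(d, r)) * 0.01   # trainable low-rank factor
B = np.zeros((r, d))                 # zero init: adapter starts as a no-op

def forward(x: np.ndarray) -> np.ndarray:
    return x @ W + x @ A @ B         # base path plus low-rank update

x = rng.normal(size=(1, d))
# With B zeroed, the adapter leaves the base model's output unchanged.
assert np.allclose(forward(x), x @ W)

fraction_trained = (2 * d * r) / (d * d)  # 0.03125 here, i.e. ~3% of weights
```

That parameter ratio is why the guests see lightweight adapters as the economically plausible form of per-user adaptation, compared with updating all weights per user.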
Text diffusion models could win niches where speed and long outputs matter, but struggle with interactive tool-use loops.
They describe diffusion-style parallel generation as promising for large diffs and rapid output, yet tool-augmented workflows interrupt generation in ways that fit autoregressive chains better.
The “one model to rule everything” dream is weakening; multi-agent, multi-model workflows are rising.
Lambert suggests future systems will be many specialized agents/models coordinated by UX and orchestration rather than one universal chatbot, aligning with current multi-subscription and multi-tool habits.
Open models are strategically important for US innovation and talent pipelines; policy attention is increasing.
Lambert’s “ATOM Project” argues open models drive research ecosystems; they cite the White House AI Action Plan’s open-weight section and note that banning open models is unrealistic without extreme internet control.
Notable Quotes
“I don't think nowadays, 2026, that there will be any company who is... having access to a technology that no other company has access to.”
— Sebastian Raschka
“Extended thinking and inference time scaling is just a way to make the models marginally smarter, and I will always edge on that side.”
— Nathan Lambert
“One of the best ways to solve hallucinations is to not try to always remember information or make things up... why not use a calculator app or Python?”
— Sebastian Raschka
“Our GPUs are hurting... we're releasing this because we can use your GPUs.”
— Nathan Lambert (paraphrasing Sam Altman’s rationale for open model distribution)
“I'm hoping that we, society drowns in slop enough to snap out of it.”
— Nathan Lambert
Questions Answered in This Episode
DeepSeek’s “multi-head latent attention” and other attention tweaks: which ones matter most in practice, and why?
You argue ideas diffuse fast but compute is the moat—what specific compute bottlenecks (power, networking, HBM, yield) most constrain 2026 progress?
RLVR works great for math/code; what’s the most credible path to extending it to open-ended domains without “LLM-as-judge” reward hacking?
How should we interpret “aha moments” in reasoning traces—useful emergent behavior or mostly amplification of pretraining patterns?
If benchmark contamination is so pervasive, what evaluation protocol would you trust for 2026 frontier models (e.g., post-cutoff secret tests, live competitions, CASP-like)?
Transcript Preview
The following is a conversation all about the state-of-the-art in artificial intelligence, including some of the exciting technical breakthroughs and developments in AI that happened over the past year, and some of the interesting things we think might happen this upcoming year. At times, it does get super technical, but we do try to make sure that it remains accessible to folks outside the field without ever dumbing it down. It is a great honor and pleasure to be able to do this kind of episode with two of my favorite people in the AI community, Sebastian Raschka and Nathan Lambert. They are both widely respected machine learning researchers and engineers who also happen to be great communicators, educators, writers and Twitterers, X posters. Sebastian is the author of two books I highly recommend for beginners and experts alike. First is Build a Large Language Model From Scratch and Build a Reasoning Model From Scratch. I truly believe in the machine learning computer science world, the best way to learn and understand something is to build it yourself from scratch. Nathan is the post-training lead at the Allen Institute for AI and author of the definitive book on reinforcement learning from human feedback. Both of them have great X accounts, great Substacks. Sebastian has courses on YouTube, Nathan has a podcast, and everyone should absolutely follow all of those. This is the Lex Fridman Podcast. To support it, please check out our sponsors in the description, where you can also find links to contact me, ask questions, give feedback, and so on. And now, dear friends, here's Sebastian Raschka and Nathan Lambert. So I think, uh, one useful lens to look at all of this through is the DeepSeek, so-called DeepSeek moment. 
This happened about a year ago, in January 2025, when the open-weight Chinese company, DeepSeek, released DeepSeek-R1 that, uh, I think it's fair to say, surprised everyone with, uh, near or at state-of-the-art performance, with allegedly much less compute for much cheaper. And from then to today, the AI competition has gotten insane, both on the research level and the product level. It's just been accelerating. Let's discuss all of this today, and maybe let's start with some spicy questions if we can. [chuckles] Uh, who is winning at the international level? Would you say it's the set of companies in China or the set of companies in the United States? And Sebastian, Nathan, it's good to see you guys. Uh, so Sebastian, who do you think is winning?
Um, so winning is [chuckles] a very broad, uh, you know, term. I, I would say you mentioned the DeepSeek moment, and I do think DeepSeek is definitely winning the hearts of the people who work on open-weight models because they share these as open models. Um, winning, I think, has multiple timescales to it. We have today, we have next year, we have in ten years. One thing I know for sure is that, um, I don't think nowadays, 2026, that there will be any company who is, let's say, having access to a technology that no other company has access to. And that is mainly because researchers are frequently changing jobs, changing labs, they, uh, rotate. So I don't think there will be a clear winner in terms of technology access. However, I do think there will be, uh, the differentiating factor will be budget and hardware constraints. So I don't think the ideas will be proprietary, but the way or the resources that are needed to implement them. And so I don't see currently a take-it-all scenario where a winner takes it all. I, I can't see that at the moment.