
Ilya Sutskever – We're moving from the age of scaling to the age of research
Ilya Sutskever (guest), Dwarkesh Patel (host)
In this episode of the Dwarkesh Podcast, host Dwarkesh Patel talks with Ilya Sutskever about why the age of scaling is giving way to a new age of research.
Ilya Sutskever: Beyond scaling laws toward deeply generalizing superintelligence
Ilya Sutskever argues that the era of simply scaling pre‑training is ending and we are re‑entering an era where genuine research and new training recipes matter more than raw compute. He highlights a glaring gap between benchmark performance and real‑world usefulness, blaming overfitting to evals, weak generalization, and poorly understood RL fine‑tuning. Much of the discussion contrasts human learning and robustness with current models, exploring value functions, emotions, evolution, and why humans generalize so much better from far less data. Sutskever outlines SSI’s bet on a different technical path to human‑like continual learners, the societal implications of such systems, and his views on alignment, superintelligence, and what “AI going well” might require.
Key Takeaways
Benchmark‑driven RL can cause models to overfit evals while underperforming in reality.
Teams design RL environments inspired by public benchmarks, so models become like hyper‑specialized competition coders: great on targeted tests, but surprisingly brittle and repetitive in open‑ended workflows.
Pre‑training reached diminishing returns; future gains demand new recipes, not just more scale.
Pre‑training was a clear, low‑risk scaling recipe—add data, compute, parameters—but data is finite, compute is now huge, and 100× more of the same is unlikely to radically transform capabilities, pushing the field back into exploratory research.
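To make the diminishing-returns point concrete, here is a minimal numeric sketch (not from the episode) using the Chinchilla-style scaling law of Hoffmann et al. (2022), L(N, D) = E + A/N^alpha + B/D^beta. The constants are that paper's reported fit; the specific model sizes are illustrative assumptions, and the point is the shape of the curve, not the exact numbers.

```python
# Illustrative only: Chinchilla-style scaling law L(N, D) = E + A/N**alpha + B/D**beta.
# Constants are the published fit from Hoffmann et al. (2022); model sizes are made up.

E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28

def loss(n_params, n_tokens):
    """Predicted pre-training loss for N parameters trained on D tokens."""
    return E + A / n_params**alpha + B / n_tokens**beta

base = loss(7e10, 1.4e12)                 # a 70B-parameter model on 1.4T tokens
scaled = loss(100 * 7e10, 100 * 1.4e12)   # 100x more of everything
print(f"base: {base:.2f}, 100x scaled: {scaled:.2f}, floor: {E}")
```

Both correction terms shrink only polynomially while the irreducible floor E stays fixed, so in this sketch a 100x scale-up moves the predicted loss from roughly 1.94 to about 1.75, which is the sense in which "100x more of the same" stops being transformative.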
Generalization is the core unsolved problem separating current models from human‑like intelligence.
Humans learn deeply and robustly from tiny amounts of data—even in domains like math and coding that didn’t shape our evolution—while today’s models require massive data and still fail in simple but off‑distribution situations.
Value functions and richer intermediate feedback could make RL vastly more compute‑efficient.
Instead of only rewarding final outcomes after long trajectories, learning robust value estimates for partial progress (as humans do with emotions and gut feelings) could massively reduce wasted exploration and improve stability.
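As a rough illustration of the mechanism, here is a minimal sketch of standard temporal-difference (TD) value learning, not SSI's method or anything shown in the episode. The toy environment and all function names are hypothetical; the sketch only shows how a learned value function gives graded feedback on partial progress instead of a single end-of-trajectory reward.

```python
# Minimal sketch (illustrative): a learned value function assigns credit to
# intermediate states via TD(0) bootstrapping, even though the environment
# only pays a sparse reward at the very end of a long trajectory.

import random

class ChainEnv:
    """Toy task: walk right along a chain of N states; reward only at the end."""
    def __init__(self, n_states=10):
        self.n = n_states

    def rollout(self, policy):
        """Run one episode; return visited states and the final sparse reward."""
        state, states = 0, [0]
        while state < self.n - 1:
            state += policy(state)   # policy returns +1 (step right) or 0 (stay)
            states.append(state)
        return states, 1.0           # reward arrives only on reaching the goal

def td0_value_estimates(env, policy, episodes=500, alpha=0.1, gamma=0.99):
    """Learn V(s) with TD(0): each step bootstraps from the next state's
    estimate, so middle-of-trajectory states get credit long before the end."""
    V = [0.0] * env.n
    for _ in range(episodes):
        states, final_reward = env.rollout(policy)
        for s, s_next in zip(states, states[1:]):
            r = final_reward if s_next == env.n - 1 else 0.0
            V[s] += alpha * (r + gamma * V[s_next] - V[s])
    return V

if __name__ == "__main__":
    env = ChainEnv()
    policy = lambda s: 1 if random.random() < 0.9 else 0
    print(td0_value_estimates(env, policy))
    # Intermediate states acquire graded values (roughly gamma**steps_to_goal),
    # giving dense feedback analogous to a human "gut feeling" of progress.
```

The relevance to the takeaway: with bootstrapped value estimates, feedback arrives at every step rather than once per long trajectory, so far less exploration is wasted on rollouts whose outcome was already predictably bad, which is the compute-efficiency argument Sutskever gestures at.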
Alignment may be easier if advanced AIs care about sentient life, including themselves.
Sutskever suggests that an AI which models itself as a sentient being could more naturally extend empathy to other sentient beings, analogous to human mirror neurons and empathy, though this may not perfectly align with specifically human interests.
Continual, human‑like learners deployed across the economy could trigger rapid growth.
Instead of a single monolithic AGI that ‘knows everything,’ Sutskever envisions powerful learners that can quickly master any job, whose many instances specialize, learn on the job, and potentially aggregate their knowledge, driving very fast economic expansion.
Future safety strategy will likely converge once systems feel unmistakably powerful.
He predicts that as models begin to clearly feel powerful to their creators (not just impressive on paper), frontier labs and governments will become more paranoid, collaborate more on safety, and seek common strategies for constraining superintelligent systems.
Notable Quotes
“The models seem smarter than their economic impact would imply.”
— Ilya Sutskever
“Up until 2020 it was the age of research; from 2020 to 2025 it was the age of scaling; now it’s back to the age of research again, just with big computers.”
— Ilya Sutskever
“These models somehow just generalize dramatically worse than people, and it’s super obvious.”
— Ilya Sutskever
“I think the fact that people are like that is proof it can be done.”
— Ilya Sutskever
“There are more companies than ideas by quite a bit.”
— Ilya Sutskever
Questions Answered in This Episode
If generalization is the core bottleneck, what concrete research directions could most improve it beyond today’s architectures and training regimes?
How can we design RL and eval systems that incentivize real‑world robustness instead of benchmark overfitting and ‘reward hacking’ by researchers?
What would a practical implementation of an AI value function that mirrors human emotions and gut judgments actually look like in modern ML systems?
How realistic—and desirable—is the idea of advanced AIs that explicitly care about sentient life, given conflicts between human and non‑human interests?
In a world of many superintelligent, continually learning agents, what governance or technical mechanisms could robustly cap their power and prevent destructive competition?
Transcript Preview
You know what's crazy?
Uh-huh.
That all of this is real.
Yeah? Meaning what?
Don't- don't you think so?
Meaning what?
Like all this AI stuff, and all this Bay-
Like, it actually happened?
...Area? Yeah. That it's happe- like, isn't it straight out of science fiction?
Yeah. I- i- another thing that's crazy is, like, how normal this low takeoff feels. The idea that we'd be investing 1% of GDP in AI, like, I feel like it would ha- felt like a bigger deal, you know? But right now, it just feels like-
We get used to things pretty fast, turns out, yeah. But also, it's kinda like it's abstract, like, what does it mean? What it means that you see it in the news-
Yeah.
...that such and such company announced such and such dollar amount.
Right.
That's- that's all you see.
Right.
It's not really felt in any other way, so far.
No. Should we actually begin here? I think this is an interesting discussion.
Sure.
I think your point about, well, from the average person's point of view, nothing is that different, will continue being true, even into the singularity.
No, I don't think so.
Okay. Interesting.
So, the thing which I was referring to not feeling different is, okay, so such and such company announced some, uh, difficult to comprehend dollar amount of investment.
Right.
I don't think anyone knows what to do with that.
Yeah.
But I think that the impact of AI is gonna be felt. AI is going to be diffused through the economy. There are very strong economic forces for this, and I think the impact is going to be felt very strongly.
When do you expect that impact? I think the models seem smarter than their economic impact would imply.
Yeah, this is one of the very confusing things about the models right now, how to reconcile the fact that they are doing so well on evals.
Mm-hmm.
And you look at the evals, and you go, "Those are pretty hard evals."
Right.
They're doing so well. But the economic impact seems to be dramatically behind.
Yes.
And it's almost like it's- it's very difficult to make sense of, how can the model, on the one hand, do these amazing things-
Yeah.
...and then, on the other hand, like, repeat itself twice in some situation, in a kind of a... An- an example would be, let's say you use vibe coding to do something, and you go to some place, and then you get a bug. And then you tell the model, "Can you please fix the bug?"
Yeah.
And the model says, "Oh, my God. You're so right, I have a bug. Let me go fix that." And it produces a second bug.