Dwarkesh Podcast: John Schulman (OpenAI Cofounder) — Reasoning, RLHF, & Plan for 2027 AGI
At a glance
WHAT IT’S REALLY ABOUT
OpenAI Cofounder John Schulman on Training, Alignment, and Near-Term AGI
- John Schulman explains how large language models are first pre-trained to imitate the internet and then post-trained (via RLHF and related methods) into helpful, safe assistants with a narrower, chat-focused persona.
- He expects rapid capability gains over the next few years, including long-horizon task execution, better coding and research assistance, and agents that can work on entire projects rather than single prompts.
- Schulman discusses alignment and safety plans if AGI arrives sooner than expected, emphasizing careful evaluation, possible pauses, and coordination among major labs to avoid unsafe racing dynamics.
- He also details how RLHF actually works in practice, why post-training has dramatically improved GPT‑4 since launch, and how future systems may learn online, use richer memory, and act more like persistent colleagues than search engines.
IDEAS WORTH REMEMBERING
5 ideas
Pre-training creates a broad, calibrated world-model; post-training shapes it into a specific assistant persona.
Base models learn to predict the next token across internet-scale data and can imitate many styles, while RLHF and fine-tuning narrow this into a helpful, instruction-following chatbot optimized for human approval rather than raw imitation.
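As a concrete anchor for that pre-training objective, here is a minimal sketch of next-token prediction with a cross-entropy loss. It assumes a PyTorch-style `model` that maps token ids to per-position vocabulary logits; the function name and shapes are illustrative, not OpenAI's code.

```python
import torch.nn.functional as F

def next_token_loss(model, token_ids):
    """token_ids: LongTensor of shape (batch, seq_len).

    `model` is assumed to return logits of shape (batch, seq_len - 1, vocab_size)
    for the given input ids; this is a sketch of the objective, not a real API.
    """
    inputs = token_ids[:, :-1]    # context the model conditions on
    targets = token_ids[:, 1:]    # each position's "next token" label
    logits = model(inputs)
    # Average cross-entropy over every predicted position in the batch
    return F.cross_entropy(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
```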
Near-term models will likely handle full projects, not just single prompts.
Schulman anticipates that within a couple of years models will write multi-file codebases, test, iterate, and recover from errors, enabling sustained collaboration on tasks that today require human project management.
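To make that workflow concrete, here is a hedged sketch of the generic write/test/iterate loop such project-level collaboration implies. `propose_patch`, `apply_patch`, and `run_tests` are hypothetical caller-supplied hooks for illustration, not an API Schulman describes.

```python
def work_on_project(task, propose_patch, apply_patch, run_tests, max_rounds=10):
    """Generic agentic coding loop: draft a change, run the tests, feed failures back.

    All three callables are hypothetical stand-ins, supplied by the caller.
    """
    feedback = None
    for _ in range(max_rounds):
        patch = propose_patch(task, feedback)  # model drafts a multi-file edit
        apply_patch(patch)                     # write it into the working tree
        passed, errors = run_tests()           # compile and run the test suite
        if passed:
            return patch                       # tests pass: task complete
        feedback = errors                      # model sees its failures next round
    raise RuntimeError("no passing solution within the retry budget")
```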
Long-horizon training could unlock big jumps, but won’t alone solve AGI.
Extending coherence over hours to months via long-horizon RL may trigger phase-like transitions in capability, yet he expects remaining deficits—taste, ambiguity-handling, UI/real-world affordances—that still limit fully human-level performance.
Alignment will likely require incremental deployment, strong evaluations, and cross-lab coordination.
If AGI arrives earlier than expected, Schulman favors slowing new training and large-scale deployment, running intensive evals and red-teaming, monitoring deployed systems, and coordinating limits across big labs to avoid unsafe races.
Post-training is a major performance lever and a potential moat.
Most of GPT‑4’s improvement since launch (e.g., gains in chatbot Elo ratings) comes from better post-training—data quality, iterative RLHF, and process improvements—which requires complex infrastructure, tacit organizational knowledge, and skilled raters.
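The "iterative RLHF" lever starts from a reward model trained on pairwise rater comparisons. Below is a minimal sketch of the standard pairwise (Bradley–Terry-style) loss, assuming a `reward_model` that returns one scalar reward per response; it illustrates the technique, not OpenAI's actual code.

```python
import torch.nn.functional as F

def reward_model_loss(reward_model, chosen_ids, rejected_ids):
    """Pairwise preference loss: push the reward of the rater-preferred
    response above the rejected one.

    `reward_model` is an assumed callable mapping a batch of token-id
    sequences to a (batch,) tensor of scalar rewards.
    """
    r_chosen = reward_model(chosen_ids)
    r_rejected = reward_model(rejected_ids)
    # -log sigmoid(r_chosen - r_rejected) is minimized when the
    # chosen response consistently outscores the rejected one
    return -F.logsigmoid(r_chosen - r_rejected).mean()
```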
WORDS WORTH SAVING
5 quotes
I think even in one or two years, you could imagine having the models carry out a whole coding project… moving away from using the model like a search engine and more towards having a whole project that I'm doing in collaboration with the model.
— John Schulman
We might not wanna jump to having AIs run whole firms immediately, even if the models are good enough to actually run a successful business themselves.
— John Schulman
It seems like then you should be planning for the possibility you would have AGI very soon…
— Dwarkesh Patel
Yeah, I think that would be reasonable.
— John Schulman
Right now it's hard to get the models to do anything coherent. But if they started to get really good, I think we would have to take some of these questions seriously.
— John Schulman