At a glance
WHAT IT’S REALLY ABOUT
Inside OpenAI Deep Research: Building a Broadly Capable Research Agent
- The episode features Isa Fulford discussing Deep Research, OpenAI’s agentic product that uses reinforcement learning, web browsing, and tools to perform complex, multi-step research tasks. She explains how the team moved from an internal demo to a production system by designing new datasets, tools, and evaluations grounded in real-world knowledge work. The conversation covers when reinforcement fine-tuning (RFT) is worth doing, how human experts and synthetic data shaped the model’s capabilities, and how Deep Research is already used across domains from science to fashion and travel. Fulford also outlines the path toward unified, trustworthy agents that can both research and take actions, along with the safety, memory, and UX challenges that must be solved first.
IDEAS WORTH REMEMBERING
5 ideas
Ground agent training in concrete, high-value user tasks.
Rather than focusing on flashy transactional demos (like ordering food), the team started from real knowledge-work tasks—literature reviews, product comparisons, travel planning—and built datasets and evals specifically around those practical outcomes.
Use reinforcement fine-tuning when tasks are critical, niche, or out-of-distribution.
RFT is most worth the effort when a task is either central to your business and needs a substantial quality boost, or so different from a model’s training distribution (e.g., specialized genomics workflows) that prompting alone cannot reach acceptable performance.
Human experts plus synthetic data are key to high-quality agent behavior.
Deep Research relied on expert-generated browsing trajectories and outcomes across many domains, supplemented with synthetic datasets, to teach the model what good research looks like without hand-specifying every step of the process.
Tooling must evolve alongside models to unlock richer capabilities.
Today’s Deep Research uses a text-based browser (with PDFs and images) and Python for analysis; future agents will need wider toolsets and training data that force them to choose, combine, and backtrack across tools to solve complex, multi-step problems.
Preventing and exposing hallucinations remains critical, especially as answers get longer.
Even though Deep Research hallucinates less than previous OpenAI models, comprehensive, well-written answers can be over-trusted, so features like citations are essential to let users inspect and verify sources.
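The "choose, combine, and backtrack across tools" idea above can be pictured as a dispatch loop that records a trace of every tool call so failures can be inspected and retried. This is a minimal illustrative sketch under assumed names (`search_tool`, `python_tool`, `run_agent` are hypothetical), not Deep Research's actual architecture:

```python
# Illustrative sketch of a tool-selecting agent loop.
# Tool names and the trace format are assumptions for this example,
# not OpenAI's API or Deep Research's real design.

def search_tool(query: str) -> str:
    """Stand-in for a text-based browser search."""
    return f"results for: {query}"

def python_tool(expression: str):
    """Stand-in for a Python analysis step (restricted eval for the demo)."""
    return eval(expression, {"__builtins__": {}}, {})

TOOLS = {"search": search_tool, "python": python_tool}

def run_agent(steps):
    """Execute a planned sequence of (tool, argument) steps, keeping a
    full trace so a failed step can be flagged and replanned."""
    trace = []
    for tool_name, arg in steps:
        tool = TOOLS.get(tool_name)
        if tool is None:
            # A real agent would backtrack and replan here.
            trace.append((tool_name, arg, "unknown tool"))
            continue
        try:
            trace.append((tool_name, arg, tool(arg)))
        except Exception as exc:
            trace.append((tool_name, arg, f"error: {exc}"))
    return trace

trace = run_agent([("search", "RFT for genomics"), ("python", "2 + 2")])
```

The trace-keeping is the point: training data that forces multi-step tool use only helps if the system can see where a trajectory went wrong.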
WORDS WORTH SAVING
5 quotes
If you can't write a literature review, you're not gonna be able to write a new scientific paper.
— Isa Fulford
We just are gonna go for max thinking time every time.
— Isa Fulford
Everybody does kind of see a pretty clear path to this broadly capable agent.
— Isa Fulford
Anything that you would delegate to a coworker, it should be able to do.
— Isa Fulford
It really was one of those things where we thought that training on browsing tasks would work… but actually the first time you train a model on a new dataset and seeing it actually working was pretty incredible.
— Isa Fulford
High-quality AI-generated summary created from a speaker-labeled transcript.