
No Priors Ep. 107 | With Physical Intelligence Co-Founder Chelsea Finn
Sarah Guo (host), Chelsea Finn (guest), Narrator
Chelsea Finn on building general-purpose robots through data and hierarchy
Chelsea Finn, Stanford professor and co-founder of Physical Intelligence (PI), discusses her decade-long journey in robotics and her company’s mission to build general-purpose AI models that can control many different robots in the physical world.
PI focuses on scaling diverse real-world robot data, sharing hardware designs, and developing foundation models that generalize across tasks, environments, and embodiments rather than optimizing for a single narrow application.
Finn explains their architecture combining transformers with pretrained vision-language models, the importance of teleoperated data collection, and their new hierarchical interactive robot system that decomposes long-horizon tasks and incorporates natural language interaction.
She reflects on challenges in robotics versus self-driving, the underrated complexity of motor control, likely form factors for future robots, and why openness and community-building may be essential for the field to succeed at all.
Key Takeaways
Generalist robot models require massive, diverse real-world data, not just more compute.
Finn emphasizes that the primary bottleneck is breadth and diversity of robot experience—many buildings, objects, and tasks—rather than simply scaling model size or FLOPs.
Teleoperated, robot-specific data is irreplaceable, even when leveraging human and internet videos.
While web and human demonstration data help with concepts and semantics, robots must still practice using their own bodies to acquire low-level motor competence, analogous to how humans learn physical skills.
Cross-embodiment learning can recycle data and accelerate progress across robot platforms.
PI’s research shows that policies trained on pooled data from many different robots can outperform per-lab models, and that changing hardware no longer forces you to discard past datasets.
Hierarchical and language-interactive control is key for long, complex tasks.
Their HI robot system separates high-level step selection (conditioned on natural language prompts) from low-level motor control, enabling tasks like sandwich-making with on-the-fly user corrections.
Openness—sharing code, weights, and even hardware designs—is a deliberate strategic bet.
PI believes the bigger risk is that no one solves general-purpose robotics; building a robust ecosystem, attracting top researchers, and seeding better hardware outweighs IP-protection concerns at this phase.
Motor control and physical intelligence are deeply complex, often underestimated aspects of AI.
Finn notes that evolution spent enormous effort on hands and movement; tasks like pouring water or tying shoelaces demand nuanced control and tiny margins of error, unlike many perception-only AI tasks.
Future robot ecosystems will likely be diverse in hardware, powered by shared models.
Rather than one humanoid design dominating, Finn anticipates a “Cambrian explosion” of specialized robot forms—kitchen arms, laundry bots, etc. ...
Notable Quotes
“We're trying to build a big neural network model that could ultimately control any robot to do anything in any scenario.”
— Chelsea Finn
“The number one thing is just getting more diverse robot data.”
— Chelsea Finn
“I think the biggest risk with this bet is that it won't work. I'm not really worried about competitors; I'm more worried that no one will solve the problem.”
— Chelsea Finn
“People underestimate how much intelligence goes into motor control.”
— Chelsea Finn
“I would love if we could give our robots skin.”
— Chelsea Finn
Questions Answered in This Episode
How can the robotics community systematically measure and compare the “diversity” of robot training data across different projects and companies?
What are the most promising ways to safely deploy early generalist robots in the real world, given their inevitable error rates and lack of human-like robustness?
How might adding memory and temporal context to current vision-based policies qualitatively change what robots can do compared to today’s frame-by-frame control?
Where is the tipping point at which internet-scale vision-language knowledge meaningfully substitutes for scarce robot-specific experience, and where does it clearly fail?
If a “Cambrian explosion” of robot form factors occurs, who will coordinate standards and interfaces so that one foundation model can realistically power such heterogeneous hardware?
Transcript Preview
Hi, listeners. Welcome to No Priors. This week, we're speaking to Chelsea Finn, co-founder of Physical Intelligence, a company bringing general purpose AI into the physical world. Chelsea co-founded Physical Intelligence alongside a team of leading researchers and minds in the field. She's an associate professor of computer science and electrical engineering at Stanford University, and prior to that, she worked at Google Brain and was at Berkeley. Chelsea's research focuses on how AI systems can acquire general purpose skills through interactions with the world. So, Chelsea, uh, thank you so much for joining us today on No Priors.
Yeah. Thanks for having me.
You've done a lot of, um, really important storied work, um, in robotics between your work at Google, at Stanford, et cetera, so I would, I would just love to hear a little bit firsthand your background in terms of your path in the world of robotics, what drew you to it initially, and some of the work that you've done.
Yeah. It's been a long road. At the beginning, I was really excited about the impact that robotics could have in the world, but at the same time, I was also really fascinated by this problem of developing perception and intelligence in machines, uh, and robots embody all of that. Uh, and also there's... Sometimes there's some cool math that you can do as well that makes... Keeps your brain active, makes you think, uh, and so I think all of that is really fun about working in the field. I started working more seriously in robotics more than 10 years ago at this point, uh, at the start of my PhD at Berkeley, and we were working on neural network control, uh, trying to train neural networks that map from image pixels directly to motor torques on a robot arm. Uh, at the time, this was not very popular, uh, and we've come a long way, and it's-
Mm-hmm.
... a lot more accepted in robotics and also just generally something that a lot of people are excited about. Since that beginning point, it was very clear to me that we could train robots to do pretty cool things, but that getting the robot to do one of those things in many scenarios with many objects was a major, major challenge. So 10 years ago, we were training robots to, like, screw a cap onto a bottle and use a spatula to lift an object into a bowl and kind of do a tight insertion or hang up, um, like, a hanger on a clothes rack. Uh, and so pretty cool stuff, uh, but actually getting the robot to do that in many environments with many objects, that's where a big part of the challenge comes in. And, uh, I've been thinking about ways to make broader data sets, train on those broader data sets, and also different approaches for learning, whether it be reinforcement learning, video prediction, uh, imitation learning, uh, all, all those things. Uh, and so yeah. Moved from... Um, spent a year at, at Google Brain, uh, in between my PhD and joining Stanford. Uh, became a professor at Stanford. Started a lab there. Did a lot of work, um, along all these lines. Uh, and then recently started Physical Intelligence, uh, almost a year ago at this point, so I've been on leave from Stanford for that. And it's been really exciting to be able to try to execute on the vision that, uh, the co-founders, uh, that we collectively have and, um, and do it with a lot of resources and so forth. And, um, I'm also still advising students at Stanford as well.