No Priors Ep. 117 | With Co-Director of Stanford's HAI & Founder of World Labs Dr. Fei-Fei Li

No Priors | Jun 5, 2025 | 35m

Sarah Guo (host), Fei-Fei Li (guest), Elad Gil (host)

Definition and importance of spatial intelligence and 3D world models

World Labs’ mission: 3D generation as a foundation model problem

Frontiers beyond language models: emotional intelligence and physical/robotic intelligence

Simulation, haptics, and robot morphology in future robotics

Creative and commercial applications of generative 3D worlds (XR, design, content)

Fei-Fei Li’s research journey: ImageNet, captioning, and academic moonshots

Human-centered AI and AI’s role in augmenting healthcare and society

In this episode of No Priors, hosts Sarah Guo and Elad Gil speak with Dr. Fei-Fei Li, co-director of Stanford's HAI and founder of World Labs, about spatial intelligence, world models, and fearless AI.

Fei-Fei Li on spatial intelligence, world models, and fearless AI

Dr. Fei-Fei Li discusses why she founded World Labs to build 3D world-model foundation models and argues that spatial intelligence is as fundamental to AI as language. She explains spatial intelligence as understanding, reasoning about, and generating plausible 3D environments that respect geometry and physics, enabling applications from robotics to creative tools and AR/VR. The conversation covers unsolved frontiers like emotional intelligence, the role of simulation and haptics in robotics, and the importance of diverse robot morphologies optimized for specific tasks. Li also reflects on her ImageNet and captioning work, advocates for fearless research and entrepreneurship outside mega-corporate labs, and articulates a vision of human-centered AI that augments people, especially in domains like healthcare.

Key Takeaways

Spatial intelligence and 3D world models are missing pillars of current AI.

Li argues that AI is incomplete without robust spatial intelligence—the capacity to understand, reason about, and generate 3D worlds that are geometrically and physically plausible, just as evolution endowed humans and animals with such capabilities.

Building 3D foundation models will unlock new classes of applications.

World Labs is focused on solving 3D generation as a foundation model problem, which could power applications in design, navigation, simulation, robotics, and immersive AR/VR/XR by providing realistic, editable 3D environments.

Data scarcity and productization are major challenges for 3D AI.

Unlike language models, which benefit from abundant web text, 3D models require sophisticated data acquisition, synthesis, and engineering, and must overcome the friction of delivering 3D as an intuitive, everyday medium.

Simulation and haptics are undervalued components in training robots.

Li believes simulation and synthetic data are crucial for robotics, and that haptic sensing—how systems perceive touch and force—must be tightly integrated with vision and spatial perception to enable robust manipulation, not just navigation.

Robotic forms will diversify to optimize for task efficiency and energy.

She anticipates a wide variety of robot morphologies tailored to their environments and tasks (e. ...

Fearlessness is essential for impactful research and startups in AI.

Drawing from her own path with ImageNet and now World Labs, Li emphasizes being “rationally bold”: choosing big, uncertain problems, accepting that some ideas seem crazy at first, and not being constrained by the scale of incumbent efforts or training budgets.

Human-centered AI should superpower people, especially in under-served domains like healthcare.

Li envisions an AI future in which technology augments people rather than displacing them and upholds human-centered values such as love, justice, and prosperity. She sees it helping address unmet needs in areas like diagnosis, precision medicine, aging, and mental health, where human capacity is scarce rather than in excess.

Notable Quotes

“Without spatial intelligence, AI would be incomplete.”
— Fei-Fei Li

“We are the first company we know of that is solving this 3D generation foundation model problem.”
— Fei-Fei Li

“My hypothesis is that the requirements of different tasks are so vast that having very few forms is energy inefficient.”
— Fei-Fei Li

“Sometimes fearless is this very interesting position where you're somewhat delusional and crazy, but somewhat just rationally bold.”
— Fei-Fei Li

“I think AI is a tool to help people… I want to build a world that AI collaborates and superpowers people.”
— Fei-Fei Li

Questions Answered in This Episode

How might robust 3D world models change everyday creative workflows for designers, marketers, and game developers within the next five years?

Dr. ...

What new kinds of safety, ethics, or bias issues arise when AI systems can generate and manipulate highly realistic 3D environments and simulations?

In practice, how can startups or academic labs compete on spatial intelligence when large companies have far more compute and proprietary 3D data?

What concrete steps are needed to integrate haptics, vision, and world models into a unified robotics stack that can perform complex manipulation tasks?

How can policymakers and institutions operationalize a “human-centered AI” framework so that technologies like world models and robotics genuinely improve healthcare access and outcomes?

Transcript Preview

Sarah Guo

(music plays) Hi listeners, and welcome back to No Priors. Today's guest is Dr. Fei-Fei Li, a pioneer in computer vision and deep learning. She created ImageNet, the groundbreaking dataset that helped spark the deep learning revolution. Fei-Fei is a Stanford professor and the co-director of the Stanford Institute for Human-Centered AI. She's also led AI at Google Cloud, advised international policymakers, and recently co-founded World Labs, a company dedicated to developing spatially intelligent AI. Fei-Fei, thank you for joining us today.

Fei-Fei Li

Well, thanks for inviting me. This is gonna be fun.

Sarah Guo

So, you have made extraordinary contributions to, um, science and policy over the past two, two decades. I'll start with the biggest question. Like, why start a company now?

Fei-Fei Li

Because in my heart I wanna build. I see this as such a critical and fun and exciting moment to build some extraordinary technology that everybody can use, and I believe so much in spatial intelligence and the kind of 3D world models that can empower so many people as well as so many use cases, and I think that's just, it's gonna be really, um, exciting and I can do that with an extraordinarily, uh, uh, e- extraordinarily brilliant group of young technologists.

Sarah Guo

I wanna come back to, you know, the people you're working with because I, I, uh, know some of your co-founders and was, uh, you know, trying to convince them desperately to start a company a while back and then they were like, "Oh no, we have a bigger mission now with Fei-Fei." What is spatial intelligence? Can you define it for a broader audience?

Fei-Fei Li

Spatial intelligence, to me, is the ability to, um, to understand, reason, and interact and generate 3D worlds because our world fundamentally, no matter how you say we can project it, fundamentally it's 3D and it's 3D because physically it's 3D and digitally if there is a true 3D representation, then we can make a lot of things happen more easily whether it's designing or curation or navigation or, or simulation or, or the experiencing of, um, uh, AR/VR. All this, to me, is part of spatial intelligence. And again, I think it's, what really excites me is humans have spatial intelligence. We are... it's part of our, uh, core intelligent capabilities. Animals have a spatial intelligence. The, the entire journey of evolution also is, um, deeply intertwined with the evolution of spatial intelligence, so it's so fundamental. Without spatial intelligence, AI would be incomplete.

Elad Gil

How does that translate into what you're doing with your company or is there anything you can share in terms of what that means relative to what you're building?

Fei-Fei Li

Yeah, so we're cracking one of the hardest problem in AI which is actually, um, making world models that are fundamentally 3D because once you can, um, crack that problem, you can unlock a lot of spatial intelligence problems, so we are the first company we know of that is solving this, uh, the 3D generation, uh, foundation model problem.
