YC Root Access: Robots Don't Need More Compute. They Need This.
At a glance
WHAT IT’S REALLY ABOUT
Encord’s bet: physical AI progress hinges on better data pipelines
- Encord positions itself as an AI-native “universal data layer” that helps physical AI teams ensure the right data enters model training and the wrong data stays out.
- The company evolved from computer-vision annotation automation to a broader multimodal platform after ChatGPT increased trust in AI-assisted data workflows.
- Physical AI faces the inverse constraint of LLMs: compute is abundant, but real-world embodied data collection, curation, and evaluation are hard and costly.
- Encord is expanding into earlier (data collection/pre-training) and later (post-deployment observability and exception handling) stages to create a full data flywheel for robotics teams.
- The founders argue the economic upside is massive because most of the world’s economy involves physical work, and they aim to become the default infrastructure layer for physical AI data.
IDEAS WORTH REMEMBERING
Physical AI is bottlenecked by data, not compute.
LLMs benefited from internet-scale text data; robotics needs embodied, real-world data that must be intentionally captured, curated, and validated to reach similar “scaling law” gains.
Small data errors can become big real-world failures.
In physical systems, mistakes can cause safety-critical outcomes (e.g., vehicles, drones), so the tolerance for noisy labels and poor curation is much lower than in many digital-only AI products.
A single, consolidated view of the data flywheel is a competitive edge.
Encord argues that indexing, curation, annotation, model-assisted pre-labeling, and evaluation in one platform enables automation across the pipeline and faster iteration to production.
ChatGPT didn’t just create demand—it changed trust dynamics.
The founders claim early customers were reluctant to let AI touch their data; widespread consumer trust in AI after ChatGPT made AI-assisted workflows (including annotation automation) more acceptable.
Multimodality is operationally harder than text-only AI.
Coordinating video, sensor streams, audio, and text at scale is harder to visualize, QA, and collaborate on, increasing the need for specialized tooling beyond typical LLM data stacks.
WORDS WORTH SAVING
So ultimately, a model is only as good as the data it's trained on, and even the slightest errors in the data set can influence and impact how the model actually works in the real world.
— Ulrik Waage
They thought that the Icelandic dating market was going to be bigger than the AI market.
— Eric Landau
What it proved was that if you throw data and compute at a problem, then these systems can be extremely performant... With physical AI, it's actually the opposite. Now we have all the compute infrastructure, but you need the data to actually get to the scaling law.
— Eric Landau
But if you have a model in the real world hallucinate, right? That's a self-driving car. That could be a drone that falls down from the sky.
— Ulrik Waage
One thing that was, I think, surprising to us is how much it costs to not make a decision. So there's a big opportunity cost of just indecision, and you're constantly paying interest on decisions that you don't make.
— Eric Landau
High quality AI-generated summary created from speaker-labeled transcript.