The Twenty Minute VC

Turing CEO Jonathan Siddharth: Who Wins in Data Labelling & Why 99% of Knowledge Work Will Disappear

Jonathan Siddharth is Founder and CEO of Turing, one of the fastest-growing AI companies advancing frontier models. Jonathan has led the company to an astonishing $300M ARR with just $225M raised, while keeping the company profitable. A Stanford-trained AI scientist, Jonathan previously helped pioneer natural language search at Powerset, which was acquired by Microsoft.

Timestamps:
00:00 Intro
00:51 Redefining “Talent Marketplaces” Today
03:46 Data, Compute, Algorithms: What is Most Abundant?
16:59 The Biggest Challenges Enterprises Have with AI Adoption
20:57 Why 99% of Knowledge Work Will Be Gone in 10 Years
28:53 How Will Data-Driven Feedback Loops Replace Technology as the Moat
34:20 Is Revenue BS in Data Labelling? Are Players Calling GMV Revenue?
43:43 Are We in an AI Bubble?
52:22 Why is SaaS Dead in a World of AI?
01:00:32 Will the Phone be the Primary User Interface to an AI World?
01:07:46 Quick-Fire Round

Subscribe on Spotify: https://open.spotify.com/show/3j2KMcZTtgTNBKwtZBMHvl?si=85bc9196860e4466
Subscribe on Apple Podcasts: https://podcasts.apple.com/us/podcast/the-twenty-minute-vc-20vc-venture-capital-startup/id958230465
Follow Harry Stebbings on X: https://twitter.com/HarryStebbings
Follow Jonathan Siddharth on X: https://twitter.com/jonsidd
Follow 20VC on Instagram: https://www.instagram.com/20vchq
Follow 20VC on TikTok: https://www.tiktok.com/@20vc_tok
Visit our Website: https://www.20vc.com
Subscribe to our Newsletter: https://www.thetwentyminutevc.com/contact

#20vc #harrystebbings #jonathansiddharth #turing #datalebelling #ai #data #saas

Jonathan Siddharth (guest), Harry Stebbings (host)
Dec 1, 2025 · 1h 17m

At a glance

WHAT IT’S REALLY ABOUT

Turing’s CEO on research accelerators, agents, and SaaS’s decline

  1. Siddharth argues the “data labeling” era is ending as frontier labs now need complex, expert-generated data and reinforcement-learning environments to train agentic systems that perform real workflows.
  2. He describes Turing as a “research accelerator” partnering with most frontier labs while also deploying fine-tuned, smaller on-prem models in enterprises to capture real-world failure modes and improve systems.
  3. He predicts slow-but-steady AI takeoff rather than sudden discontinuity, with major near-term value coming from unlocking a “model capability overhang” via scaffolding, context, evals, and partial-autonomy UX.
  4. He contends durable advantage shifts from software features to data-driven feedback loops created by deployment, where usage generates the gradients that continuously improve agents and workflows.
  5. He believes SaaS as traditionally packaged GUIs and apps is structurally threatened by cheap custom software creation, foundation model vendors moving up-stack, and agents using tools/APIs directly rather than human-oriented interfaces.

IDEAS WORTH REMEMBERING

5 ideas

Frontier labs increasingly want workflow realism, not labeled trivia.

As base models get smarter, marginal gains require harder-to-produce data: expert-level tasks, domain nuance, and multi-step work outputs rather than simple prompt/response pairs or “cat pictures.”

Agents change the data pipeline: RL environments become core infrastructure.

Training agents relies less on imitation alone (SFT/RLHF) and more on reinforcement learning in simulated business “mini-worlds” with tool clones, verifiers, and curricula that produce trajectories and feedback.
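The "mini-world" idea above can be illustrated with a toy sketch. Everything here is hypothetical and invented for illustration (the `TicketEnv` tool clone, the `verifier` rule, the scripted policy); it is not Turing's or any lab's actual setup, only a minimal picture of how a tool clone plus a verifier yields a trajectory and a reward signal.

```python
from dataclasses import dataclass, field


@dataclass
class TicketEnv:
    """Hypothetical mini-world: a toy clone of a support-ticket tool."""
    tickets: dict = field(default_factory=lambda: {"T1": "open", "T2": "open"})
    trajectory: list = field(default_factory=list)

    def step(self, action, ticket_id):
        """Apply one tool call and record it in the trajectory."""
        if action == "close" and ticket_id in self.tickets:
            self.tickets[ticket_id] = "closed"
            result = "ok"
        else:
            result = "error"
        self.trajectory.append((action, ticket_id, result))
        return result


def verifier(env):
    """Assumed success rule: reward 1.0 only if every ticket is closed."""
    return 1.0 if all(s == "closed" for s in env.tickets.values()) else 0.0


def run_episode(policy):
    """Run one episode; return the recorded trajectory and its reward."""
    env = TicketEnv()
    for action, ticket_id in policy(env):
        env.step(action, ticket_id)
    return env.trajectory, verifier(env)


def scripted_policy(env):
    """Stand-in for an agent: close every ticket it can see."""
    return [("close", tid) for tid in env.tickets]


trajectory, reward = run_episode(scripted_policy)
```

In a real pipeline the scripted policy would be replaced by a model choosing tool calls, and the (trajectory, reward) pairs would feed a reinforcement-learning update; a curriculum would vary the tasks by difficulty.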

Enterprise deployments are a data strategy, not just a go-to-market motion.

Siddharth argues “touching reality” reveals where models break; those failures become new training/eval data, creating a compounding loop that improves performance and defensibility over time.

The biggest blocker to enterprise AI is operational “schlep,” not model IQ.

He highlights messy, siloed data; lack of eval infrastructure; workflow redesign for partial autonomy; and change management (including tandem human+AI periods) as the practical work that makes pilots succeed.

Small, fine-tuned models can beat giant models for specific regulated workflows.

For tasks like insurance underwriting, firms may prefer on-prem, fine-tuned models (≈0.5B–10B parameters) for speed, accuracy, and data sovereignty, distilling institutional judgment without leaking it to competitors.

WORDS WORTH SAVING

5 quotes

All knowledge work is going to be automated. It's only a matter of time.

Jonathan Siddharth

I think the era of data labeling companies is over. Uh, Turing is a research accelerator, and it's now the era of research accelerators.

Jonathan Siddharth

If a human's job involves looking at a computer, analyzing what's on the screen, using different tools, using a keyboard and a mouse, it's going to be automated. It's only a matter of time.

Jonathan Siddharth

I don't see an AI bubble. Like, I feel like these models are incredibly powerful today. Like, GPT-5 is, like, fucking awesome. I don't know what people were talking about when they're talking about... You know, I know there was some chatter. I think we've just gotten used to magic.

Jonathan Siddharth

SaaS, as we know it, I think is over. I feel like quite a few SaaS apps, um, were built at a time when, um, software was relatively hard to build, uh, and complex to build.

Jonathan Siddharth

From talent marketplaces to “research accelerators”
Complex data generation vs simple labeling
Chatbots to agents: tool use and RL environments
RL environments as synthetic-but-grounded workflow data
Enterprise AI adoption constraints: “first-mile” and “last-mile” schlep
Data-driven feedback loops as the new moat
Revenue quality and concentration in AI data services
No AI bubble vs “capability overhang”
SaaS defensibility in an agentic world
Future AI interfaces: wearables, always-on multimodal devices
Robotics/embodied AI as next major data frontier

High quality AI-generated summary created from speaker-labeled transcript.
