The Twenty Minute VC

Turing CEO Jonathan Siddharth: Who Wins in Data Labelling & Why 99% of Knowledge Work Will Disappear

Jonathan Siddharth is Founder and CEO of Turing, one of the fastest-growing AI companies advancing frontier models. Jonathan has led the company to $300M ARR with just $225M raised, and the company is profitable. A Stanford-trained AI scientist, Jonathan previously helped pioneer natural language search at Powerset, which was acquired by Microsoft.

Timestamps:

00:00 Intro
00:51 Redefining “Talent Marketplaces” Today
03:46 Data, Compute, Algorithms: What is Most Abundant?
16:59 The Biggest Challenges Enterprises Have with AI Adoption
20:57 Why 99% of Knowledge Work Will Be Gone in 10 Years
28:53 How Will Data-Driven Feedback Loops Replace Technology as the Moat
34:20 Is Revenue BS in Data Labelling? Are Players Calling GMV Revenue?
43:43 Are We in an AI Bubble?
52:22 Why is SaaS Dead in a World of AI?
01:00:32 Will the Phone be the Primary User Interface to an AI World?
01:07:46 Quick-Fire Round

Subscribe on Spotify: https://open.spotify.com/show/3j2KMcZTtgTNBKwtZBMHvl?si=85bc9196860e4466
Subscribe on Apple Podcasts: https://podcasts.apple.com/us/podcast/the-twenty-minute-vc-20vc-venture-capital-startup/id958230465
Follow Harry Stebbings on X: https://twitter.com/HarryStebbings
Follow Jonathan Siddharth on X: https://twitter.com/jonsidd
Follow 20VC on Instagram: https://www.instagram.com/20vchq
Follow 20VC on TikTok: https://www.tiktok.com/@20vc_tok
Visit our Website: https://www.20vc.com
Subscribe to our Newsletter: https://www.thetwentyminutevc.com/contact

#20vc #harrystebbings #jonathansiddharth #turing #datalebelling #ai #data #saas

Guest: Jonathan Siddharth
Host: Harry Stebbings
Dec 1, 2025 · 1h 17m

CHAPTERS

  1. Turing’s shift from “talent marketplace” to superintelligence data partner

    Jonathan reframes Turing as a company training superintelligence rather than matching developers to jobs. He positions the frontier labs’ needs as a three-pillar stack—research, compute, and data—arguing Turing is focused on the data pillar as it rapidly evolves.

  2. Why frontier-model data has become harder: simple labeling → expert, real-work data

    The conversation explains the transition from easy-to-produce datasets (basic prompts, simple tasks) to sophisticated, domain-specific work artifacts. Jonathan argues that progress now depends on expert humans producing complex outputs that resemble real workplace deliverables.

  3. From chatbots to agents: how training data changes (SFT/RLHF → tool use + RL)

    Jonathan distinguishes chatbot training (SFT and RLHF) from agent training that requires tool-use behavior and multi-step execution. He introduces reinforcement learning environments as a core mechanism for teaching agents to act in realistic workflows.

  4. Building RL environments at scale: the SDR workflow example

    A concrete SDR scenario illustrates how agents learn inside simulated “mini world models” with cloned tools and verifiers. Jonathan describes curriculum design and feedback signals, likening the approach to self-play dynamics from AlphaZero.

  5. The $30T workflow matrix: industry × function × role × workflow

    Jonathan argues knowledge work can be decomposed into workflows that can each be modeled and trained. Turing’s ambition is to generate RL environments across this four-dimensional space, creating coverage across the economy.

  6. How Turing differs from data-labeling vendors: “research accelerator” + enterprise reality checks

    Jonathan positions Turing as a research-oriented partner rather than a labeling shop, emphasizing fast-changing training paradigms. He also highlights enterprise deployments as a way to see where models fail in real conditions and feed improvements back into training.

  7. Why enterprises will want custom (often smaller) models: the insurance underwriting case

    A detailed underwriting example explains why on-prem, fine-tuned smaller models can outperform giant general models for specific tasks. Jonathan argues enterprises want to distill proprietary institutional judgment while protecting sensitive data and integrating internal tools.

  8. AI adoption constraints in enterprises: incentives, change management, and “first-mile/last-mile schlep”

    Harry challenges the 10-year automation timeline by pointing to broken processes and poor data inside incumbents. Jonathan responds that competition will force adoption, especially in revenue-driving front office use cases, while acknowledging the heavy implementation work required.

  9. Labor-to-AI budget transfer: where it’s happening and how to measure value

    They discuss whether AI growth depends on budgets shifting from labor to technology. Jonathan cites early wins in lower-risk domains and references research showing models already reach expert-level performance in many task categories, with big room left in multi-step work.

  10. If knowledge work is automated: productivity explosion, entrepreneurship, and inequality debate

    Jonathan forecasts massive leverage for individuals and a boom in entrepreneurship as “intelligence becomes an API.” Harry questions whether this widens inequality; Jonathan argues cheaper access to expertise narrows gaps relative to hiring expensive humans.

  11. Moats in an AI world: data-driven feedback loops and deployment flywheels

    As software creation gets easier, Jonathan argues durable advantage comes from feedback loops created by real usage. He emphasizes enterprise deployment as the critical path to uncover failure modes and continuously retrain, making “touching reality” the moat.

  12. Revenue quality in data provisioning: GAAP vs GMV, project recurrence, and trust with labs

    They unpack confusion around ‘revenue’ in the data ecosystem and how to interpret it. Jonathan describes lab work as recurring project-based demand, where reliability, secrecy, and operational firewalls are essential to remain a trusted supplier.

  13. Market dynamics: concentration, geopolitics, and why Jonathan rejects the “AI bubble” thesis

    Jonathan argues concentration among a few major buyers is normal (comparing to Nvidia’s customer concentration) and expects massive spending across compute, energy, and data. He predicts sovereign models and government demand, and insists capability is real—deployment friction, not a bubble, is the bottleneck.

  14. Why ‘SaaS is dead’ (and the counterargument), plus the future interface beyond the phone

    Jonathan claims SaaS is threatened by DIY app building, foundation model verticalization, and UI shifts away from human-centric GUIs. Harry argues most companies can’t build/maintain dozens of tools and vertical SaaS remains defensible; they then explore ambient, multimodal devices as the next interface.

  15. Quick-fire: takeoff speed, China, open vs closed models, leadership lessons, and robotics opportunity

    Jonathan summarizes his key beliefs: slow takeoff, serious China competitiveness, nuanced open/closed model tradeoffs, and a more hands-on leadership philosophy. He predicts a few data-market winners with research depth, and highlights embodied AI/robotics as the largest emerging whitespace.
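
Chapters 4 and 5 describe agents learning inside simulated environments with cloned tools and verifiers, and a workflow space indexed by industry × function × role × workflow. The sketch below is only an illustration of that idea, not Turing's actual system: `ToySDREnv`, `verifier`, `scripted_agent`, `lazy_agent`, `run_episode`, and `workflow_grid` are all hypothetical names, the "CRM" is a one-entry dictionary, and the reward is a simple pass/fail check on the final state.

```python
from dataclasses import dataclass, field
from itertools import product


@dataclass
class ToySDREnv:
    """Hypothetical stand-in for a cloned CRM plus an email tool."""
    crm: dict = field(default_factory=lambda: {
        "lead-1": {"email": "ana@example.com", "company": "Acme"},
    })
    outbox: list = field(default_factory=list)

    def lookup_lead(self, lead_id):
        # Tool 1: read a lead record from the mocked CRM.
        return self.crm[lead_id]

    def send_email(self, to, body):
        # Tool 2: record an outgoing email instead of sending it.
        self.outbox.append({"to": to, "body": body})


def verifier(env, lead_id):
    """Return 1.0 if exactly one email went to the lead and names their company."""
    lead = env.crm[lead_id]
    if len(env.outbox) != 1:
        return 0.0
    msg = env.outbox[0]
    ok = msg["to"] == lead["email"] and lead["company"] in msg["body"]
    return 1.0 if ok else 0.0


def scripted_agent(env, lead_id):
    # Uses both tools in order: look up the lead, then write to them.
    lead = env.lookup_lead(lead_id)
    env.send_email(lead["email"], f"Hi, I'd love to show {lead['company']} a demo.")


def lazy_agent(env, lead_id):
    # Skips the lookup, so the verifier should reject the result.
    env.send_email("unknown@example.com", "Hello!")


def run_episode(agent, lead_id="lead-1"):
    # One episode: fresh environment, agent acts, verifier scores the end state.
    env = ToySDREnv()
    agent(env, lead_id)
    return verifier(env, lead_id)


def workflow_grid(industries, functions, roles, workflows):
    # Enumerate every (industry, function, role, workflow) combination.
    return list(product(industries, functions, roles, workflows))


if __name__ == "__main__":
    print(run_episode(scripted_agent), run_episode(lazy_agent))
    grid = workflow_grid(["insurance", "software"], ["sales"], ["SDR"],
                         ["lead research", "outbound email"])
    print(len(grid), grid[0])
```

In this toy setup the verifier's score is the feedback signal. A real environment would presumably involve many more tools, multi-step tasks, and a learned policy in place of the scripted agents.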
