The Twenty Minute VC
Turing CEO Jonathan Siddharth: Who Wins in Data Labelling & Why 99% of Knowledge Work Will Disappear
At a glance
WHAT IT’S REALLY ABOUT
Turing’s CEO on research accelerators, agents, and SaaS’s decline
- Siddharth argues the “data labeling” era is ending as frontier labs now need complex, expert-generated data and reinforcement-learning environments to train agentic systems that perform real workflows.
- He describes Turing as a “research accelerator” partnering with most frontier labs while also deploying fine-tuned, smaller on-prem models in enterprises to capture real-world failure modes and improve systems.
- He predicts a slow-but-steady AI takeoff rather than a sudden discontinuity, with major near-term value coming from unlocking a “model capability overhang” via scaffolding, context, evals, and partial-autonomy UX.
- He contends durable advantage shifts from software features to data-driven feedback loops created by deployment, where usage generates the gradients that continuously improve agents and workflows.
- He believes SaaS as traditionally packaged GUIs and apps is structurally threatened by cheap custom software creation, foundation model vendors moving up-stack, and agents using tools/APIs directly rather than human-oriented interfaces.
IDEAS WORTH REMEMBERING
5 ideas
Frontier labs increasingly want workflow realism, not labeled trivia.
As base models get smarter, marginal gains require harder-to-produce data: expert-level tasks, domain nuance, and multi-step work outputs rather than simple prompt/response pairs or “cat pictures.”
Agents change the data pipeline: RL environments become core infrastructure.
Training agents relies less on imitation alone (SFT/RLHF) and more on reinforcement learning in simulated business “mini-worlds” with tool clones, verifiers, and curricula that produce trajectories and feedback.
Enterprise deployments are a data strategy, not just a go-to-market motion.
Siddharth argues “touching reality” reveals where models break; those failures become new training/eval data, creating a compounding loop that improves performance and defensibility over time.
The biggest blocker to enterprise AI is operational “schlep,” not model IQ.
He highlights messy, siloed data; lack of eval infrastructure; workflow redesign for partial autonomy; and change management (including tandem human+AI periods) as the practical work that makes pilots succeed.
Small, fine-tuned models can beat giant models for specific regulated workflows.
For tasks like insurance underwriting, firms may prefer on-prem, fine-tuned models (≈0.5B–10B parameters) for speed, accuracy, and data sovereignty, distilling institutional judgment without leaking it to competitors.
WORDS WORTH SAVING
5 quotes
All knowledge work is going to be automated. It's only a matter of time.
— Jonathan Siddharth
I think the era of data labeling companies is over. Uh, Turing is a research accelerator, and it's now the era of research accelerators.
— Jonathan Siddharth
If a human's job involves looking at a computer, analyzing what's on the screen, using different tools, using a keyboard and a mouse, it's going to be automated. It's only a matter of time.
— Jonathan Siddharth
I don't see an AI bubble. Like, I feel like these models are incredibly powerful today. Like, GPT-5 is, like, fucking awesome. I don't know what people were talking about when they're talking about... You know, I know there was some chatter. I think we've just gotten used to magic.
— Jonathan Siddharth
SaaS, as we know it, I think is over. I feel like quite a few SaaS apps, um, were built at a time when, um, software was relatively hard to build, uh, and complex to build.
— Jonathan Siddharth
High-quality AI-generated summary created from a speaker-labeled transcript.