Y Combinator
Why Vertical LLM Agents Are The New $1 Billion SaaS Opportunities
At a glance
WHAT IT’S REALLY ABOUT
Legal AI Pioneer Reveals Blueprint For Billion-Dollar Vertical Agents
- The episode features Jake Heller, founder of Casetext, detailing how his decade-old legal tech company pivoted almost overnight to build CoCounsel, a GPT-4-powered legal AI assistant that led to a $650M acquisition by Thomson Reuters.
- He explains the long pre-LLM slog of incremental improvements in legal research, and how early access to GPT-4 transformed their product from marginally better tooling into a fundamental workflow change that lawyers could no longer ignore.
- Jake outlines how they redeployed 120 employees in 48 hours, used test-driven prompt engineering, and deeply modeled expert legal workflows to move from flashy demos (70% reliability) to mission-critical performance that lawyers trust.
- The discussion generalizes Casetext’s lessons to the broader opportunity for vertical AI agents, the importance of domain-specific integrations and evals, and how newer reasoning models like OpenAI o1 may further enhance agentic workflows.
IDEAS WORTH REMEMBERING
5 ideas
Early, decisive pivots around transformative tech can create outsized outcomes.
Within 48 hours of seeing GPT-4, Casetext shifted all 120 employees to build CoCounsel; that bet turned a solid, $100M-valued company into a $650M acquisition in months.
Vertical AI agents win by deeply modeling expert workflows, not by wrapping APIs.
Casetext decomposed real legal tasks into many discrete steps that mirror how top attorneys work (search, read, annotate, outline, draft), then turned each step into carefully tested prompts and logic.
Moving from 70% to near-100% reliability requires rigorous evals and TDD.
They built hundreds to thousands of tests per prompt, practicing test-driven development for prompting to systematically reduce hallucinations and catch regressions instead of relying on “vibes-based” prompt tweaks.
Mission-critical adoption hinges on trust; one bad experience can kill usage.
Because lawyers are conservative and extremely sensitive to errors, Casetext optimized first-week experience and accuracy, knowing that early mistakes would cause users to disengage for a long time.
Real IP lives in data, integrations, and business logic around the model.
Beyond LLM access, Casetext’s value came from proprietary legal datasets, specialized document-management integrations, robust OCR and preprocessing, and complex orchestration logic—making it hard to copy.
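The test-driven approach to prompting described above can be sketched roughly as follows. This is a minimal illustration, not Casetext's actual tooling: the names `PromptTest`, `run_suite`, and `stub_model`, the prompt template, and the sample clauses are all hypothetical, and the model is a keyword-matching stand-in so the example runs without any API access.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class PromptTest:
    """One test case: an input document and a check on the model's answer."""
    name: str
    input_text: str
    check: Callable[[str], bool]


def run_suite(model: Callable[[str], str],
              prompt_template: str,
              tests: List[PromptTest]) -> Dict[str, object]:
    """Run every test against one prompt and report which cases fail."""
    failures = []
    for test in tests:
        output = model(prompt_template.format(document=test.input_text))
        if not test.check(output):
            failures.append(test.name)
    return {
        "passed": len(tests) - len(failures),
        "total": len(tests),
        "failures": failures,
    }


def stub_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call: naive keyword matching."""
    clause = prompt.split("Clause:", 1)[1]
    return "YES" if "indemnif" in clause.lower() else "NO"


# Assumed prompt wording, for illustration only.
TEMPLATE = (
    "Does the following clause create an indemnification obligation? "
    "Answer YES or NO.\n\nClause: {document}"
)

TESTS = [
    PromptTest("explicit indemnity",
               "The Supplier shall indemnify the Buyer against all losses.",
               lambda out: out.strip() == "YES"),
    PromptTest("unrelated clause",
               "This Agreement is governed by the laws of Delaware.",
               lambda out: out.strip() == "NO"),
    PromptTest("hold harmless wording",
               "The Supplier shall hold the Buyer harmless from all claims.",
               lambda out: out.strip() == "YES"),
]

result = run_suite(stub_model, TEMPLATE, TESTS)
print(result)
```

With the stub model, the "hold harmless wording" case fails, which is the point of the exercise: the suite names the exact edge case that needs work, and rerunning it after every prompt change shows whether a fix broke cases that previously passed.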
WORDS WORTH SAVING
5 quotes
It took maybe 48 hours for us to decide to take every single person at the company and shift what they were working on to 100% building this new product based on GPT-4.
— Jake Heller
Until the very end, until CoCounsel, a lot of what we did were, relatively speaking, incremental improvements on the legal workflow—and when there's just an incremental improvement, it's actually pretty easy to ignore.
— Jake Heller
By the time you’ve dealt with all of the edge cases—before you even hit the large language model—there might be dozens of things you’ve built into your application to actually make it work and work well.
— Jake Heller
People will pay $20 a month for the 70%, and maybe $500 or $1,000 a month for something that actually works, depending on the use case.
— Jake Heller
If we’re an example of anything, it’s that there’s a path and you can do it—the jobs aren’t going to go away, they’ll just be more interesting.
— Jake Heller
High quality AI-generated summary created from speaker-labeled transcript.