Aakash Gupta: How AI PMs Ship Features Users Love (Descript CEO Explains)
At a glance
WHAT IT’S REALLY ABOUT
Descript CEO on building AI editing tools and PM leadership
- Descript’s early AI feature strategy focused on packaging reliable, job-based “buttons” (e.g., remove retakes, edit for clarity) rooted in well-understood user workflows rather than novelty prompts.
- The team shipped AI tools using pragmatic, human-driven evaluation against real production data, iterating via public beta, adoption/retention metrics, and whether users exported the AI-modified output.
- As user needs became too parameter-heavy for fixed tools (e.g., users asking for ever more knobs on Create Clips), Descript shifted toward Underlord, an objective-driven, open-ended co-editor agent.
- Underlord’s rollout emphasized tool coverage, representative regression tests, real-customer private alpha feedback, and improved activation—especially helping novices “get over the hump” of video editing.
- Burkhauser frames the PM’s unique value in AI as defining success/failure criteria for evals, while her career advice stresses deep product/customer command, shipping excellence, and humility in founder-led environments.
IDEAS WORTH REMEMBERING
Great AI features start with a concrete workflow pain, not the model.
Descript mapped creator workflows (scripted vs. improvised) and attached AI to specific pains like retakes, eye contact, and clarity—then hid prompts behind dependable, job-based buttons.
Ship “reliable buttons” first; use agents when customization explodes.
Fixed tools work well when inputs are bounded, but feature requests for Create Clips kept piling on parameters; Underlord emerged as the right abstraction once users needed highly customized, conversational control.
Human evals against real data are a valid starting point—if you’re disciplined.
Before formal eval stacks were common, the team tested on production-like content and shipped when results were genuinely usable; later they layered regression tests, A/B tweaks, and more automation.
PMs uniquely own the definition of “quality” for AI outputs.
Burkhauser argues only the PM can codify what “good,” “acceptable,” and “harmful” look like because it requires deep job/context understanding (e.g., judging jump-cut density, not just grammar).
Representative eval data matters more than sophisticated scoring.
Studio Sound quality regressed when evaluators used unrealistically terrible audio; the best model for “disaster audio” differed from the best model for the common “laptop mic” case, so the eval dataset must match the target workflow (see the sketch below).
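A minimal, self-contained sketch of how the last two ideas might combine in practice: the PM writes the “good / acceptable / harmful” rubric (including job-specific criteria like jump-cut density), and it runs over a dataset chosen to look like the common workflow. Every name, field, and threshold here (EditedSample, MAX_JUMP_CUTS_PER_10S, the sample list) is illustrative, not Descript’s actual tooling.

```python
# Illustrative eval-harness sketch; all names, fields, and thresholds are hypothetical.
from dataclasses import dataclass
from collections import Counter

@dataclass
class EditedSample:
    sample_id: str
    duration_s: float
    jump_cuts: int            # cuts the AI edit introduced
    lost_key_content: bool    # did the edit remove something the creator needed?

# PM-defined rubric: job-specific, not just "is the output grammatical?"
MAX_JUMP_CUTS_PER_10S = 3.0   # hypothetical watchability threshold

def grade(sample: EditedSample) -> str:
    """Map one AI edit onto the PM's good / acceptable / harmful buckets."""
    if sample.lost_key_content:
        return "harmful"      # the edit damaged the creator's message
    cuts_per_10s = sample.jump_cuts / (sample.duration_s / 10)
    return "acceptable" if cuts_per_10s > MAX_JUMP_CUTS_PER_10S else "good"

# The eval set should mirror the common workflow (ordinary laptop-mic recordings),
# not worst-case "disaster audio," or the winning model is optimized for the wrong job.
representative_set = [
    EditedSample("talking_head_01", duration_s=120, jump_cuts=18, lost_key_content=False),
    EditedSample("talking_head_02", duration_s=90,  jump_cuts=45, lost_key_content=False),
    EditedSample("webinar_clip_03", duration_s=300, jump_cuts=20, lost_key_content=True),
]

if __name__ == "__main__":
    verdicts = Counter(grade(s) for s in representative_set)
    print(verdicts)   # Counter({'good': 1, 'acceptable': 1, 'harmful': 1})
```

The point of the sketch is the division of labor the episode describes: engineers can automate the scoring loop, but only the PM can say what counts as “good,” “acceptable,” or “harmful” for the job, and which samples are representative enough to score against.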
WORDS WORTH SAVING
The best products out there, they don't just do a job for you. They transform how you feel about yourself.
— Laura Burkhauser
Build them in these prepackaged, parameterized, job-based buttons that can give you a reliable result over and over again.
— Laura Burkhauser
You and only you are qualified to write the eval criteria for what… a good job looks like.
— Laura Burkhauser
What it didn't take into account is… how many jump cuts per 10 seconds are you putting into my video?
— Laura Burkhauser
If you're allowing for emergence, you're also allowing for a lot of, like, whack stuff to happen in your product.
— Laura Burkhauser