Aakash Gupta

How AI PMs Ship Features Users Love (Descript CEO Explains)

Laura Burkhauser went from IC PM to CEO of Descript ($550M AI video editing platform) in just 3 years. Here's every AI feature she shipped along the way, and the exact career playbook to rise through product.

Summary: https://www.news.aakashg.com/p/descript-ceo-laura-burkhauser
Transcript: https://www.aakashg.com/laura-burkhauser-descript-ceo/

----

Timestamps
0:00 Intro
1:35 Laura Welcome
1:49 Features That Led to CEO Promotion
4:19 The Great AI Boom
6:27 Feature Timeline
10:05 Rolling Out AI Tools
12:34 Ads
14:13 Measuring Success (topic after ads)
17:19 PM's Role in AI Features
24:46 Building Underlord
32:02 Ads
34:32 Quantifying Success (topic after ads)
41:31 Career Before Descript
48:25 IC to CEO Progression
53:32 Outro

----

Thanks to our sponsors:
1. Maven: Improve your PM skills with awesome courses. Discount with my link - https://maven.com/x/aakash
2. Pendo: #1 Software Experience Management Platform - http://www.pendo.com/aakash
3. Vanta: Automate compliance across 35+ frameworks like SOC 2 and ISO 27001 - http://vanta.com/aakash
4. NayaOne: Airgapped cloud-agnostic sandbox - https://nayaone.com/aakash/
5. Kameleoon: Leading AI experimentation platform - http://www.kameleoon.com/

----

Key Takeaways
1. Map your user journey BEFORE picking AI features. Descript identified pain points (retakes, eye contact, rambling), then asked "what just became possible with LLMs?" Build that intersection.
2. Build prepackaged buttons, not blank chat boxes. Each Descript AI tool is a carefully crafted prompt behind a single button that delivers reliable results every time.
3. Use human evals on production data before shipping. Test on real customer data and ask "would I use this as a customer?" If yes, ship. If no, don't.
4. The ultimate metric is export rate. If users apply your AI feature and then remove it before exporting, it didn't meet their quality bar.
5. Switch from buttons to chat when you hit 30+ parameters. When users wanted topic selection, speaker choice, and platform optimization, chat became better than buttons.
6. Match your eval data to the actual use case. Descript failed with Studio Sound because they tested on terrible audio (vacuuming, jackhammers) when real users had laptop microphones. Different models handle different quality levels.
7. Test agents with real customer language early. Don't use toy data or employee terminology. Mix sophistication levels (some advanced at video and AI, some complete beginners) to understand how real people prompt.
8. Launch AI agents to new users first. Video editing is hard and many people quit. Descript tested Underlord on activation and it won, so new users got it before existing users.
9. Choose breadth over depth for product-wide agents. Descript chose breadth: Underlord works across all features because "we're not a point solution." This requires more context, tool coverage, and evals, but it serves the product vision.
10. Earn founder trust by getting command, not by being strategic. Use the product extensively. Talk to customers constantly. When you speak, people think "Smart" and invite you to more rooms. Ship features before focusing on strategy.

----

👩‍💼 Where to find Laura Burkhauser:
LinkedIn: https://www.linkedin.com/in/burkhauser/
Company: https://web.descript.com/

👨‍💻 Where to find Aakash:
Twitter: https://www.x.com/aakashg0
LinkedIn: https://www.linkedin.com/in/aagupta/
Newsletter: https://www.news.aakashg.com

#Descript #AIProductManagement #CareerGrowth

---

About Product Growth: The world's largest podcast focused solely on product + growth, with over 187K listeners. Hosted by Aakash Gupta, who spent 16 years in PM, rising to VP of product, this 2x/week show covers product and growth topics in depth. Subscribe and turn on notifications to get more videos like this.

Laura Burkhauser (guest), Aakash Gupta (host)
Dec 15, 2025 · 54m

CHAPTERS

  1. Why Descript feels transformative (and why that matters for AI PMs)

    Laura opens with a product philosophy: the best products don’t just complete a task—they change how users feel about themselves. She and Aakash frame the episode around how shipping beloved AI features can compound into bigger scope and ultimately leadership opportunities.

    • Products that create identity shift ("I’m a video editor now") drive deep loyalty
    • Episode focus: what Laura shipped, how she measured success, and how she rose to CEO
    • Descript’s core promise: editing video like a doc, not a timeline
    • AI features as a lever for both user value and career impact
  2. Descript’s doc-based editor: the foundation that made AI features obvious

    Laura demos the core Descript experience—transcript on the left, video on the right, optional timeline below. This workflow already had strong product-market fit for script-based editing, setting the stage for AI to remove tedious steps.

    • Transcript-based editing replaces scrubbing waveforms and timelines for many workflows
    • Strong PMF for creators doing long-form, scripted or semi-scripted content
    • UI structure: transcript, video preview, timeline as secondary (often hidden)
    • AI value is amplified when the underlying workflow is already language-centric
  3. The Great AI Boom and picking the first LLM-powered editing “buttons”

    Laura explains how Descript was AI-native early, but the LLM wave created new opportunities—and pressure—to integrate more AI. The team focused on LLM strengths (language) and turned prompts into reliable, job-based actions rather than generic chat.

    • LLMs excel at language → perfect fit for script-based video editing
    • Early AI tools: edit for clarity, remove filler words, remove retakes, add chapters
    • Philosophy: prepackaged, parameterized, job-based buttons for repeatable results
    • Example walkthrough: “Remove retakes” stitches best takes and deletes false starts
  4. From idea to build: timelines, context limits, and choosing feasible use cases

    Aakash probes how the team decided what to build first and how they handled early technical limits like small context windows. Laura shares how chunking enabled certain tasks (retakes) while others (full rewrites) were constrained by needing broader context.

    • Context window constraints shaped which features were possible early
    • Chunking transcripts worked for localized tasks like retake detection
    • Long-context tasks (e.g., rewrite) required better models and were delayed
    • Use-case selection came from deep observation of editing pain, not novelty
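
    The chunking idea in this chapter can be sketched as follows. This is a minimal illustration, not Descript's implementation: the function name `chunk_transcript`, the word budget, and the overlap size are all assumptions, and word counts stand in for the model tokens a real system would count.

```python
from typing import List


def chunk_transcript(words: List[str], max_words: int = 200, overlap: int = 20) -> List[List[str]]:
    """Split a transcript into overlapping windows that each fit a small context budget.

    The overlap means a short retake that straddles a chunk boundary is still
    fully visible in at least one chunk. Hypothetical helper for illustration.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be in [0, max_words)")

    chunks: List[List[str]] = []
    step = max_words - overlap  # how far each new window advances
    start = 0
    while start < len(words):
        chunks.append(words[start:start + max_words])
        if start + max_words >= len(words):
            break  # this window already reaches the end of the transcript
        start += step
    return chunks
```

    A localized task such as retake detection can run on each chunk independently. A full rewrite cannot, because it needs the whole transcript in view at once, which matches the delay described above.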
  5. Customer segmentation drives AI feature mapping: scripted vs unscripted creators

    Laura outlines a practical model of creator workflows and the pain points that differ by type. This segmentation guided which AI features mattered most, like Eye Contact for scripted delivery and Edit for Clarity for unscripted “rambling then polishing.”

    • Two creator types: scripted vs improvised; each has distinct editing pains
    • Scripted: eye gaze issues → Eye Contact model integration
    • Scripted with retakes: frequent restarts → Remove Retakes solves a core problem
    • Unscripted: needs concision and structure → Edit for Clarity as primary tool
  6. Shipping approach: public beta + human-driven evals before “evals” were trendy

    Laura explains how they launched the AI toolbar as a public beta and used heavy internal usage plus real production data. Quality gating was simple but strict: if a human editor would use the result, ship; if not, don’t.

    • Rolled out as public beta (not private) due to confidence in straightforward tools
    • Internal dogfooding and real production data were central to testing
    • Human-driven evaluation: “Would I use this as a customer?” as the ship bar
    • Backlog included ideas blocked by tech limits, ready to ship as models improved
  7. Measuring success for AI tools: adoption, retention, and “export with it”

    Success metrics centered on whether users repeatedly used the tools and shipped final content with the AI edits applied. Remove filler words served as a baseline benchmark, with thumbs up/down feedback as an additional quality signal.

    • Primary metrics: adoption and retention (repeat usage proves real value)
    • Baseline comparison: Remove Filler Words as an established hit feature
    • Quality proxy: users exporting final output with the AI change applied
    • Lightweight feedback loops: thumbs up/down to inform training and iteration
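
    The metrics in this chapter could be computed roughly as below. This is a sketch under stated assumptions: the `ToolUse` event schema, its field names, and both function names are hypothetical, and the source does not describe Descript's actual analytics pipeline.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Set


@dataclass(frozen=True)
class ToolUse:
    """One application of an AI tool inside a project (hypothetical event schema)."""
    user_id: str
    project_id: str
    tool: str
    kept_at_export: bool  # True only if the AI edit was still applied when the project was exported


def export_with_it_rate(uses: Iterable[ToolUse], tool: str) -> float:
    """Share of a tool's applications that survived into the exported result."""
    relevant = [u for u in uses if u.tool == tool]
    if not relevant:
        return 0.0
    return sum(u.kept_at_export for u in relevant) / len(relevant)


def repeat_user_rate(uses: Iterable[ToolUse], tool: str) -> float:
    """Share of a tool's users who applied it in more than one project."""
    projects_by_user: Dict[str, Set[str]] = {}
    for u in uses:
        if u.tool == tool:
            projects_by_user.setdefault(u.user_id, set()).add(u.project_id)
    if not projects_by_user:
        return 0.0
    repeaters = sum(1 for projects in projects_by_user.values() if len(projects) > 1)
    return repeaters / len(projects_by_user)
```

    Comparing both numbers for a new tool against the same numbers for Remove Filler Words gives the baseline comparison the chapter mentions.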
  8. The PM’s unique role in AI features: defining what “good” means via eval criteria

    Laura argues PMs remain essential because they’re best positioned to define evaluation criteria grounded in real customer outcomes. She shares a concrete example: Edit for Clarity initially missed an editor-critical metric—too many jump cuts per 10 seconds.

    • PMs translate customer success into pass/high-pass/fail evaluation criteria
    • Critical nuance often isn’t in the text (e.g., jump cut density impacts watchability)
    • PMs shouldn’t necessarily build eval tooling, but must define decision standards
    • Great evals may require domain experts with taste (e.g., audio professionals)
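
    The jump-cut criterion can be turned into an automatic check along these lines. This is a sketch only: the source says the metric was jump cuts per 10 seconds but gives no threshold, so `max_allowed=3` is a placeholder assumption, and both function names are hypothetical.

```python
from typing import List


def max_cuts_in_window(cut_times: List[float], window_s: float = 10.0) -> int:
    """Return the largest number of cuts falling inside any span shorter than `window_s` seconds."""
    times = sorted(cut_times)
    best = 0
    left = 0
    for right, t in enumerate(times):
        # Advance the left edge until the span covers less than window_s seconds.
        while t - times[left] >= window_s:
            left += 1
        best = max(best, right - left + 1)
    return best


def passes_jump_cut_check(cut_times: List[float], max_allowed: int = 3, window_s: float = 10.0) -> bool:
    """Fail an edit whose densest stretch has more cuts than `max_allowed` (assumed threshold)."""
    return max_cuts_in_window(cut_times, window_s) <= max_allowed
```

    A check like this only encodes the criterion once someone has named it. Deciding that cut density matters, and how much is too much, is the PM and domain-expert contribution the chapter describes.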
  9. Failure lesson from Studio Sound: optimize for the real use case, not edge-case data

    Laura describes how Studio Sound quality degraded when evaluation criteria drifted toward extreme “terrible audio” scenarios. The best model for awful audio isn’t the same as the best model for typical laptop mic audio—so the eval set must mirror the target user reality.

    • Different models can be best for different audio distributions and goals
    • Eval datasets skewed toward extreme noise (vacuum/jackhammer) hurt common-case quality
    • Most common user need: making “okay” laptop audio sound professional
    • Key takeaway: define the use case first, then build representative eval samples
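
    One way to keep an eval set representative is to sample it to match the mix of conditions seen in production. The sketch below assumes clips are already labeled by recording condition; the condition names, the shares, and the function `build_eval_set` are illustrative assumptions, not figures or code from the episode.

```python
import random
from typing import Dict, List, Tuple


def build_eval_set(
    pool: Dict[str, List[str]],
    target_mix: Dict[str, float],
    size: int,
    seed: int = 0,
) -> List[Tuple[str, str]]:
    """Sample (condition, clip_id) pairs so each condition's share follows `target_mix`.

    `pool` maps a condition label to available clip ids; `target_mix` maps the same
    labels to shares that sum to 1. Rounding can make the total differ slightly from `size`.
    """
    if abs(sum(target_mix.values()) - 1.0) > 1e-9:
        raise ValueError("target_mix shares must sum to 1")
    rng = random.Random(seed)  # seeded so the eval set is reproducible
    eval_set: List[Tuple[str, str]] = []
    for condition, share in target_mix.items():
        n = round(size * share)
        candidates = pool.get(condition, [])
        if n > len(candidates):
            raise ValueError(f"not enough clips for condition {condition!r}")
        eval_set.extend((condition, clip) for clip in rng.sample(candidates, n))
    return eval_set
```

    With a mix weighted toward typical laptop-microphone audio, a model tuned only for jackhammer-level noise can no longer win the eval while making the common case worse.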
  10. Why Underlord (agent) instead of more buttons: escaping the “30 parameters” trap

    As feature requests piled onto “Create Clips,” the team hit a limit of parameterized UI. Underlord emerged as an objective-driven co-editor that supports customized workflows and emergent use cases without endless knobs and dials.

    • Buttons work until customization demands explode (topics, speakers, platforms, style)
    • Underlord enables objective-based asks (e.g., “get this to ~90 seconds”)
    • Supports repeatable personalized workflows via saved templates
    • Positioning: the user stays creatively in control; Underlord does the user's bidding
  11. Building and rolling out an open-world breadth agent: scope, tools, and report cards

    Underlord was intentionally built as a breadth agent spanning the whole editor, which is harder than a narrow agent. Laura explains the core requirements: giving sufficient context, tool coverage across Descript, and an eval/report-card system to know where it fails and improve over time.

    • Key decision: breadth agent across Descript vs depth agent for one workflow
    • Core build needs: context injection, comprehensive tool access, and evaluation systems
    • Acknowledgement of maturity curve: “wooly mouse” now, “mammoth” later
    • Goal state: systematic learning loop where evals drive continuous improvement
  12. Quantifying and iterating Underlord: regression tests → alpha with real users → activation lift

    Laura lays out a staged approach: start with capability regression tests, then run a private alpha to collect real prompting behavior, then convert that data into better regression sets and bug bashes. They ultimately measured impact by improved new-user activation versus the prior onboarding experience.

    • Early phase: internal regression tests mapped to Descript job framework
    • Private alpha prioritized real customer language and diverse sophistication levels
    • Bug bashes + prompt/tool tweaks improved reliability (with overfitting risk)
    • Go/no-go hinged on activation improvement for new users; then broader opt-in and default rollout
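
    The activation comparison behind the go/no-go decision can be sketched as below. The source says only that Underlord won on new-user activation; the cohort counts in the usage note, the `activation_lift` name, and the two-proportion z-score are assumptions added for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ActivationResult:
    control_rate: float
    treatment_rate: float
    absolute_lift: float  # treatment_rate - control_rate
    relative_lift: float  # absolute lift as a fraction of the control rate
    z_score: float        # two-proportion z statistic with a pooled standard error


def activation_lift(
    control_activated: int,
    control_total: int,
    treatment_activated: int,
    treatment_total: int,
) -> ActivationResult:
    """Compare activation for the prior onboarding (control) and the agent (treatment)."""
    if control_total <= 0 or treatment_total <= 0:
        raise ValueError("both cohorts need at least one user")
    p_c = control_activated / control_total
    p_t = treatment_activated / treatment_total
    pooled = (control_activated + treatment_activated) / (control_total + treatment_total)
    se = math.sqrt(pooled * (1 - pooled) * (1 / control_total + 1 / treatment_total))
    z = (p_t - p_c) / se if se > 0 else 0.0
    # Relative lift is undefined when the control rate is zero; report infinity in that case.
    rel = (p_t - p_c) / p_c if p_c > 0 else math.inf
    return ActivationResult(p_c, p_t, p_t - p_c, rel, z)
```

    For example, `activation_lift(200, 1000, 250, 1000)` compares a 20% control activation rate with a 25% treatment rate. These numbers are invented to show the call, not results from Descript.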
  13. Career path to CEO: earning the founder’s trust through command, humility, and shipping

    The conversation shifts to Laura’s career—from consulting to startups to Twitter—then how she joined Descript by cold outreach driven by genuine product love. She explains the IC-to-CEO arc in founder-led companies: earn trust by mastering product/customers/business and delivering repeatedly, not by forcing “strategy” prematurely.

    • Nonlinear career: German literature → consulting → Amazon/startups → Twitter → Descript
    • Getting hired: cold email based on authentic product conviction and learning goals
    • Founder relationship: humility—founders often know customer/product deeply
    • Path to “the room”: build command, stay close to execution, and ship consistently
