How I AI
The internal AI tool that's transforming how Stripe designs products | Owen Williams
Claire Vo and Owen Williams on how Stripe’s Protodash makes AI prototypes realistic, shareable, and review-ready fast.
In this episode of How I AI, Claire Vo and Owen Williams discuss how Stripe’s Protodash makes AI prototypes realistic, shareable, and review-ready fast. Stripe designers found that off-the-shelf AI design tools produced uncanny, off-brand “blurple slop” because they lacked awareness of Stripe’s Sail design system.
At a glance
WHAT IT’S REALLY ABOUT
Stripe’s Protodash makes AI prototypes realistic, shareable, review-ready fast
- Stripe designers found off-the-shelf AI design tools produced uncanny, off-brand “blurple slop” because they lacked awareness of Stripe’s Sail design system.
- Williams built Protodash as an opinionated prototyping starter kit (React scaffold + Sail components + MCP integration + Cursor/LLM rules) that reliably gets Stripe-native dashboards and flows about 90% of the way to finished.
- Stripe’s devbox infrastructure made prototypes easy to run and share via URLs, enabling “demos, not memos” design reviews where stakeholders click through real interactions instead of viewing slides.
- Protodash Studio extends the system into a browser-based experience with embedded LLM chat, variant generation, self-testing via screenshots, and in-canvas annotation that queues targeted AI fixes.
- Internal-tool customization (review mode, summaries, culture-specific workflows) drove adoption—surprisingly with PMs becoming major power users for early exploration, communication, and faster user testing.
IDEAS WORTH REMEMBERING
5 ideas
Design-system grounding is the difference between usable AI prototypes and “slop.”
Protodash improves realism by forcing the model to consult Stripe’s Sail components (via MCP) and follow strict rules (e.g., don’t invent components, avoid Tailwind unless enabled), so prototypes look and feel like the real product.
Lowering the “NPM/terminal cliff” unlocks prototyping for non-engineers.
Williams designed the workflow so designers/PMs need minimal local setup—often just opening a preconfigured devbox and prompting—turning coding literacy into an accessible, AI-mediated skill.
Shareable URLs rewire reviews from presentations to interactive critique.
Running prototypes on dev boxes lets teams review by clicking through real states and flows, reducing reliance on Figma screenshots/slide decks and making feedback more concrete and actionable.
Browser-native prototyping makes iteration faster than editor-centric workflows.
Protodash Studio embeds the LLM into the prototyping environment so users can request changes, generate variants, deploy/share, and “remix” others’ prototypes without switching tools or dealing with repo setup.
Automated self-testing increases trust in AI-generated UI changes.
Having the agent take screenshots, check console/errors, and iterate until the UI matches intent helps maintain Stripe’s high quality bar and reduces time spent debugging AI’s partial or broken edits.
WORDS WORTH SAVING
5 quotes
So like blurple slop is what I would call them and like they do a really, really good job, but they don't know about your design system, right?
— Owen Williams
It's sort of been this very transformative thing because all of a sudden I'm sitting in these design reviews and like it, it's so convincing that I'm like, "Is this the real product or am I looking at like something fake?"
— Owen Williams
Like, I never want to see a slideshow again. It's like Demos, not memos.
— Owen Williams
Like yelling at Claude Code for 18 months.
— Owen Williams
As soon as I've sent the first shouty prompt, it's time to, like, reset the, like, slash clear and start again.
— Owen Williams
QUESTIONS ANSWERED IN THIS EPISODE
5 questions
How exactly do the Sail MCP server and the Cursor/Studio rules interact, and what’s the step-by-step decision flow when a user pastes a Figma link versus writing a text prompt?
What were the most important “shouty rules” you added to prevent the model from hallucinating Stripe components or drifting into Tailwind/Tailwind-like styling?
Where does Protodash still fail today (e.g., complex interactions, data viz, routing, accessibility, i18n), and what guardrails or tests helped most?
How does “design review mode” work operationally—who comments where, how is the summary generated, and how do you prevent AI follow-ups from misrepresenting stakeholder feedback?
PMs became power users—what are the best practices to ensure PM-built prototypes improve collaboration rather than bypass design craft or create unrealistic expectations?
Chapter Breakdown
Why AI prototypes look wrong in product reviews (and why it matters)
Owen Williams describes sitting in Stripe design reviews where AI-generated prototypes feel immersion-breaking because they don’t match Stripe’s UI patterns. The core issue isn’t that the prototypes are low quality—it’s that they’re inconsistent with the company’s design system, making feedback harder and less trustworthy.
The “blurple slop” problem: Tailwind vibes vs. a real design system
They name the recurring aesthetic failure mode: generic Tailwind indigo/“blurple” styling that clashes with Stripe’s product feel. Owen argues that without design-system awareness, tools will keep producing polished-but-wrong UI.
Protodash V1: a Stripe-specific vibe-coding starter kit for prototypes
Owen introduces Protodash, an internal prototyping environment designed to generate Stripe-like dashboards quickly. V1 is essentially a project scaffold plus a set of AI rules that guide tools like Cursor/Claude Code to use Stripe’s components correctly.
Lowering the bar for designers: terminal fear, NPM friction, and AI as a tutor
Owen explains how his engineering background shaped the project: reduce the cognitive load so designers only need minimal commands (or none). AI changes the learning curve by letting designers ask for help on Git/NPM/terminal usage on demand.
Teaching the model Stripe: Cursor rules + Sail MCP + guardrails
The “secret sauce” is a curated rule set that tells the coding assistant how to behave: check the design system first, avoid hallucinating components, and follow a known order of operations. Owen emphasizes that LLMs will invent APIs unless explicitly constrained.
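The guardrail idea in this chapter can be sketched as a check that runs over generated code. This is a minimal illustration, not Protodash's actual rule set: the allowlist `KNOWN_COMPONENTS`, the function `checkGeneratedJsx`, and the choice to treat any `className` attribute as a sign of Tailwind-style utility classes are all assumptions made for the example.

```typescript
// Hypothetical allowlist. The real Sail component names are not listed in the episode.
const KNOWN_COMPONENTS: string[] = ["Button", "Table", "Badge"];

interface GuardrailReport {
  unknownComponents: string[]; // capitalized JSX tags that are not on the allowlist
  usesUtilityClasses: boolean; // true when className is used while Tailwind is disabled
}

// Scan generated JSX text for invented components and for className usage.
// A regex scan is a rough heuristic; a real check would likely parse the code.
function checkGeneratedJsx(source: string, allowTailwind: boolean = false): GuardrailReport {
  const tagPattern = /<([A-Z][A-Za-z0-9]*)/g;
  const unknownComponents: string[] = [];
  let match: RegExpExecArray | null = tagPattern.exec(source);
  while (match !== null) {
    const name = match[1];
    const isKnown = KNOWN_COMPONENTS.indexOf(name) !== -1;
    const alreadyReported = unknownComponents.indexOf(name) !== -1;
    if (!isKnown && !alreadyReported) {
      unknownComponents.push(name);
    }
    match = tagPattern.exec(source);
  }
  const usesUtilityClasses = !allowTailwind && /className\s*=/.test(source);
  return { unknownComponents, usesUtilityClasses };
}
```

A report with a non-empty `unknownComponents` list or `usesUtilityClasses` set to true could be fed back to the assistant as a reason to retry, which mirrors the "check the design system first" ordering described above.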
Dev boxes and shareable URLs: from local prototypes to “just click this” reviews
The team moves from local-only prototyping to running Protodash in Stripe dev boxes, producing shareable URLs for reviews. This enables “demos, not memos”—participants can interact with prototypes directly instead of watching slides.
Designing with real data states: dashboards, i18n, error paths, and multi-step flows
Claire and Owen discuss why coded prototypes are uniquely valuable for data-heavy products like Stripe: zero states, high-volume states, messy data, and internationalization. Protodash makes it practical to explore real user scenarios and complete journeys rather than single-screen mockups.
Protodash Studio: bringing the whole workflow into the browser
Owen introduces Protodash Studio, a browser-based wrapper that lets users prototype without opening Cursor at all. It adds a home/feed experience, easy boot-up, embedded AI chat, and remixing of others’ prototypes.
Live demo: in-browser variants, chart swapping, and remixable iteration
They demo prompting a new variant that replaces a stacked bar chart with a line chart, created entirely in-browser. The workflow emphasizes rapid iteration, selectable variants, and the ability to mix ideas across multiple directions.
Self-testing prototypes: screenshots, console checks, and auto-fixing mistakes
Protodash can “self-test” changes by driving the UI, taking screenshots, and checking for errors—raising the quality bar for prototypes. Owen frames this as essential for Stripe’s high craft expectations and reducing breakage during demos.
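The self-test loop described here can be sketched as an inspect-and-repair cycle with an attempt limit. This is an illustration under assumptions, not Protodash's implementation: `selfTest`, `CheckResult`, and the `inspect` and `repair` callbacks are hypothetical names, and the callbacks are synchronous to keep the sketch short, whereas real browser automation would be asynchronous.

```typescript
interface CheckResult {
  consoleErrors: string[]; // errors collected from the running prototype
  matchesIntent: boolean;  // verdict from comparing a screenshot with the request
}

interface SelfTestOutcome {
  passed: boolean;
  attempts: number;
  lastErrors: string[];
}

// Inspect the prototype, and if it fails, ask for a repair and inspect again,
// stopping after maxAttempts inspections.
function selfTest(
  inspect: () => CheckResult,
  repair: (result: CheckResult) => void,
  maxAttempts: number
): SelfTestOutcome {
  let lastErrors: string[] = [];
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = inspect();
    lastErrors = result.consoleErrors;
    if (result.consoleErrors.length === 0 && result.matchesIntent) {
      return { passed: true, attempts: attempt, lastErrors };
    }
    // Only repair when another inspection will follow.
    if (attempt < maxAttempts) {
      repair(result);
    }
  }
  return { passed: false, attempts: maxAttempts, lastErrors };
}
```

The attempt limit matters in practice: without it, an agent that cannot fix a problem would keep looping instead of handing the failure back to a person.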
In-canvas feedback: annotate-for-AI and targeted fixes without CSS spelunking
Owen shows an “annotate-for-AI” mode where users select elements on the canvas and attach instructions (padding, tooltip behavior, etc.). This replaces fragile descriptions like class names and enables batching multiple fixes into a single AI task list.
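Batching annotations into one task list can be sketched as follows. The `Annotation` shape, the `elementLabel` field, and `buildTaskList` are hypothetical names for illustration; the episode does not describe how Protodash stores annotations or formats the queued prompt.

```typescript
interface Annotation {
  elementLabel: string; // hypothetical human-readable handle for the selected element
  instruction: string;  // what the user asked for on that element
}

// Turn the queued annotations into a single numbered task list,
// skipping annotations whose instruction is empty.
function buildTaskList(annotations: Annotation[]): string {
  const lines: string[] = [];
  annotations.forEach((annotation) => {
    const instruction = annotation.instruction.trim();
    if (instruction.length === 0) {
      return;
    }
    lines.push(`${lines.length + 1}. [${annotation.elementLabel}] ${instruction}`);
  });
  return lines.join("\n");
}
```

Tying each instruction to a selected element, rather than to a class name the user has to look up, is the point of the annotate mode described above. The single numbered list lets the assistant handle several fixes in one pass.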
Design review mode: comments, summaries, and AI-generated follow-ups
Protodash adds a review workflow: share a URL, collect comments directly on the prototype, summarize feedback with AI, then generate actionable follow-ups. It aims to replace the common “Google Doc + screenshots” pattern and reduce post-review busywork.
Why internal tools win: culture fit, continuous evolution, and designer PRs
Claire and Owen argue that off-the-shelf tools rarely match a company’s unique review culture and workflows. Internal tooling can encode rituals (like vibe checks), evolve quickly via contributions, and shift how teams collaborate.
PMs as power users: unblocking, better communication, and earlier user testing
Owen describes initially feeling nervous about PMs “designing,” but concludes it improves collaboration. PMs can turn PRDs into clickable, Stripe-like prototypes, test earlier with users, and communicate more effectively with design and engineering.
Case study + wrap-up: Radar prototype handoff, lo-fi modes, and lightning round
Owen highlights a high-fidelity Radar (fraud detection) prototype that changed engineering handoff—engineers can use the prototype as source-of-truth. They close with lo-fi fidelity modes (monospace/grayscale/Comic Sans inspiration) and a lightning round on parental-leave side projects and prompting tactics.