Aakash Gupta

The AI-Native PM Operating System [Live Demo]

Aakash Gupta and Mike Bal on building an AI-native PM OS with MCP-connected tools and workflows.

Aakash Gupta (host) · Mike Bal (guest)
Feb 3, 2026 · 1h 1m · Watch on YouTube ↗
- What “AI-native PM” means (thinking in prompts)
- Operating system vs tool stack (central hub)
- Model Context Protocol (MCP) connectors
- Cursor + Claude Code workflow
- CMS/DB/hosting integrations (Sanity, Supabase, Render)
- Design workflows (Figma Make, Figma MCP)
- Prototyping with Google AI Studio → GitHub → Cursor
- Knowledge retrieval (Confluence/Atlassian Rovo) + PRD/design gap checks
- Research and context gathering (Manus vs Claude Research)
- Email/calendar/Drive connectors (Gmail, Google Drive)
- Licensing, permissions, and enterprise constraints
- PM lifecycle usage and common AI mistakes

In this episode, The AI-Native PM Operating System [Live Demo], host Aakash Gupta and guest Mike Bal explore building an AI-native PM operating system with MCP-connected tools and workflows. Their central claim: AI-native PMs “think in prompts,” translating outcomes into steps and selecting the best AI-plus-tool combination to execute them.

At a glance

WHAT IT’S REALLY ABOUT

Building an AI-native PM OS with MCP-connected tools and workflows

  1. AI-native PMs “think in prompts,” translating outcomes into steps and selecting the best AI+tool combination to execute them.
  2. An “operating system” mindset replaces a loose tool stack by using a central hub (Claude Desktop or Cursor) connected to systems of record (Jira/Confluence/Figma/GitHub/CMS) via Model Context Protocol (MCP).
  3. Live demos show AI performing real work across tools—querying/updating a CMS, comparing PRDs to designs, and pulling knowledge from Confluence and Figma without opening those apps.
  4. For prototyping, the workflow emphasizes fast concept-to-code loops (Google AI Studio → export to GitHub → iterate in Cursor) while using Figma Make mainly for design variations and edge-case states.
  5. Research is treated as “context gathering,” with Manus preferred for traceable, multi-asset outputs and controllable ingestion into the main OS to avoid polluting model memory.
  6. Adoption guidance focuses on permissions, licensing tiers, and making the ROI case to leadership by demonstrating velocity gains and measurable impact rather than “AI as a toy.”
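The hub-and-connectors setup in point 2 ultimately reduces to an MCP client configuration. A minimal sketch of a Claude Desktop `claude_desktop_config.json`, assuming the reference GitHub MCP server package; the exact package names, commands, and environment variables vary per connector and should be taken from each connector's own docs:

```json
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": { "GITHUB_PERSONAL_ACCESS_TOKEN": "<your-token>" }
    }
  }
}
```

Each entry registers one system of record (Jira, Confluence, Figma, a CMS, etc.) as a server the hub can call, which is what lets the assistant query and act across tools without leaving one interface.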

IDEAS WORTH REMEMBERING

7 ideas

Treat AI as a workflow hub, not just a chat tool.

They argue the real leverage comes from keeping a “home base” (Claude Desktop/Cursor) and pulling data/actions from Jira/Confluence/Figma/GitHub/CMS into that interface to avoid constant tab-switching.

MCP turns “ask and do” into cross-tool execution.

With MCP and API keys, the assistant can query or modify external tools (e.g., creating a new CMS entry in Sanity) while staying in the same conversational context—often with read/write permission gating.
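The "read/write permission gating" they mention can be sketched as a small dispatch layer in front of tool calls. This is a hypothetical illustration, not a real MCP SDK API: `TOOL_PERMISSIONS` and `dispatch` are invented names showing the shape of the idea.

```python
# Hypothetical sketch of read/write gating in front of MCP-style tool calls.
# Tool names and the dispatch function are illustrative, not a real SDK API.

TOOL_PERMISSIONS = {
    "sanity.query": "read",          # safe: only reads CMS content
    "sanity.create_entry": "write",  # mutates the CMS; gate it
}

def dispatch(tool: str, args: dict, allow_writes: bool = False) -> str:
    """Route a tool call, refusing write-mode tools unless explicitly enabled."""
    mode = TOOL_PERMISSIONS.get(tool)
    if mode is None:
        raise KeyError(f"unknown tool: {tool}")
    if mode == "write" and not allow_writes:
        return f"blocked: {tool} requires write access"
    return f"executed: {tool}"

print(dispatch("sanity.query", {}))                        # reads always pass
print(dispatch("sanity.create_entry", {}))                 # writes blocked by default
print(dispatch("sanity.create_entry", {}, allow_writes=True))
```

The design choice is that writes are opt-in per call, so a misfired prompt can read freely but cannot silently modify a production CMS.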

Use project-level context and memory intentionally.

Mike sets custom instructions per project and uses memory tooling to preserve relationships, but warns against dumping unvetted research into the core environment because model memory can anchor on the wrong assumptions.
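A hypothetical project-level instruction template in that spirit (the wording and names are illustrative, not Mike's actual setup):

```text
Project: Checkout Redesign
- Role: PM assistant; answer from project docs before general knowledge.
- Sources of truth: Confluence space "CHKT", Figma file "Checkout v3", Jira project CHK.
- Do not store research findings in memory unless I say "remember this".
- When a PRD and a design disagree, list the discrepancies; do not guess intent.
```

Keeping rules like these at the project level, rather than in global memory, is what prevents one project's unvetted assumptions from anchoring every other conversation.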

Use Figma Make for design variation, not production-ready code.

They find Figma Make valuable for quickly generating editable, layered variations and edge-case states that designers can reuse, but not reliable enough yet for code you’d ship.

Prototype fastest by moving from AI Studio to a real dev loop.

Google AI Studio is positioned as the quickest place to one-shot prototypes and test new models, then export to GitHub/Cloud Run and continue iteration in Cursor like a standard developer workflow.

Prefer research tools that show work and generate reusable assets.

Manus is favored over Claude Research for longer tasks because it runs independently, provides traces/sources, and outputs multiple artifacts (CSVs, reports, guides) that can be selectively imported into the OS.
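"Selectively imported" can be as simple as vetting an artifact before it enters the main workspace. A minimal sketch, assuming a research CSV where each claim should carry a source URL; file contents and column names are illustrative:

```python
# Hypothetical sketch: vet a research CSV before ingesting it into the
# main OS context -- keep only rows with a source, so claims stay traceable.
import csv
import io

RAW = """claim,source
Competitor A launched MCP support,https://example.com/a
Unsourced speculation,
Competitor B raised prices,https://example.com/b
"""

def vetted_rows(text: str) -> list[dict]:
    """Return only rows whose source field is non-empty."""
    return [row for row in csv.DictReader(io.StringIO(text)) if row["source"].strip()]

rows = vetted_rows(RAW)
print(len(rows))  # 2 -- the unsourced row is dropped before import
```

Filtering at the boundary like this is the "controllable ingestion" idea: the research agent can produce as much as it likes, but only traceable material reaches the model's working context.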

Adopt licenses progressively and justify with demonstrated usage.

Instead of blanket $200/month plans, they recommend starting teams on lower tiers, upgrading based on observed value/limits, and using API/usage-based billing where possible to avoid paying for idle subscriptions.

WORDS WORTH SAVING

5 quotes

AI-native PMs are actually working from what do I need to do, to what are the steps that I need to get done, to what are the best tools to get me those things.

Mike Bal

You were actually operating with a layer of abstraction from UI.

Mike Bal

I don't have to leave the tool that I'm in right now to get an answer for that, which is nice.

Mike Bal

I still feel like my frustration with Claude is chat length limits, and then usage limits.

Mike Bal

The bad thing about a lot of the LLMs is they'll pick and choose what's in the memory to really anchor themselves to.

Mike Bal

QUESTIONS ANSWERED IN THIS EPISODE

5 questions

In your Sanity demo, what safeguards do you recommend before allowing MCP write access (e.g., approval prompts, environment restrictions, audit logs)?

Can you share an example “project-level custom instruction” template you use in Claude Projects to keep context tight without overloading tokens?

What’s your decision rule for when a PM should use Claude Desktop vs Cursor vs a research agent like Manus for the same task?

How do you prevent Figma MCP from “maxing out” context—do you rely on link-to-selection, screenshots, smaller frames, or a specific summarization workflow?

For PRD-vs-design gap checks, what prompts consistently produce the most actionable discrepancy lists (and how do you validate the model’s claims quickly)?
