Aakash Gupta: He Uses 7 Claude Code Agents to Build Apps with 0 Employees
Aakash Gupta and Gabor Mayer on how a Google PM builds iOS apps using Claude Code agents.
In this episode of Aakash Gupta, Aakash Gupta and Gabor Mayer explore how a Google PM builds iOS apps using Claude Code agents. Gabor shows a 21-agent Claude Code setup that mirrors real startup roles (system analyst, CTO, designers, test architect, code maintainability) to produce higher-quality outputs than single-prompt “vibe coding.”
At a glance
WHAT IT’S REALLY ABOUT
A Google PM builds iOS apps using Claude Code agents
- Gabor shows a 21-agent Claude Code setup that mirrors real startup roles (system analyst, CTO, designers, test architect, code maintainability) to produce higher-quality outputs than single-prompt “vibe coding.”
- Mainstream PM practices—clear specs, dependencies, documentation, tickets, and sprints—become the scaffolding that prevents AI-generated spaghetti code and makes projects maintainable.
- He demonstrates an end-to-end workflow: dictate requirements in the Claude consumer app, generate structured Confluence documentation, generate a design system in Figma Make, produce polished screens in Figma via MCP, then auto-create Jira epics/tickets with Figma links/screenshots.
- The agents parallelize work (design wiring, backend setup, ticket creation, implementation, review) and then execute sprint-by-sprint to produce a working Flutter + Firebase + vector-RAG app in the iOS Simulator and upload it to TestFlight.
- The discussion also covers practical constraints and risks: context overload reduces design fidelity, permissions and secret access must be monitored, API keys must be stored in Firebase Secret Manager, and newer tools like Dispatch/Cowork remain fragile compared to Claude Code.
IDEAS WORTH REMEMBERING
5 ideas
Treat Claude Code like a staffed org chart, not a single assistant.
Gabor assigns specialized agents (system analyst, CTO, test architect, maintainability) so each contributes a focused perspective, similar to a real team reviewing specs, tickets, and code.
The system analyst role is the “keystone” agent for quality.
He uses the system analyst to ask clarifying questions first, break down requirements, document dependencies, and generate Confluence docs and Jira tickets—reducing ambiguity before any coding starts.
Up-front scaffolding beats “one big prompt” for maintainability.
Documentation, tickets, and sprint sequencing act as guardrails that prevent unstructured AI output, reduce rework, and make it easier to extend the codebase later.
Dictation materially improves spec depth and outcomes.
Speaking a long, nuanced prompt is faster than typing and encourages richer constraints (privacy, token budgets, fallbacks), which the agents can operationalize into docs and tasks.
Too much context can degrade fidelity—decompose into tickets to preserve details.
He observes that when agents ingest large context blobs, some details get “compressed” (e.g., design palette not fully used), whereas ticket-based breakdown retains specifics.
WORDS WORTH SAVING
5 quotes
AI agents are writing PRDs, designing in Figma, writing Jira tickets, and even shipping code all from 1:00 PM to 4:00 AM.
— Aakash Gupta
If you build a good specification and you break it down appropriately, then you will have a much better quality end product.
— Gabor Mayer
Vibe coding is just the rebranding of unmaintainable low-quality source code.
— Gabor Mayer
If you have a good specification, then you will have a good product. If you have a shit specification, then you will have a subpar product.
— Gabor Mayer
In two years, the gap will be so big between those who build and those who are just productivity AI users that it will be very hard to catch up.
— Gabor Mayer
QUESTIONS ANSWERED IN THIS EPISODE
5 questions
What’s inside your system analyst agent markdown—what exact behaviors, checklists, and failure modes do you encode?
Which of your 21 agents create the biggest quality lift, and which are “nice-to-have” for smaller projects?
When you say “context compression” caused the design palette to be ignored, what were the concrete symptoms and how do you structure tickets/prompts to prevent it?
How do you define the handoffs between Confluence docs → Jira tickets → sprint execution so agents don’t duplicate work or diverge?
What specific rules does your code maintainability (“spaghetti”) agent enforce (naming, modularity, folder structure, circular deps), and how often does it block merges/changes?
Chapter Breakdown
AI agents running a full startup workflow in hours
Aakash frames the episode around a new reality: agents can now handle core startup tasks—PRDs, design, ticketing, coding, and shipping—end to end. Gabor introduces the promise: going from zero to an iOS TestFlight build in a single session using an agent “team.”
The 21-agent “company” inside Claude Code (roles and responsibilities)
Gabor explains how he models Claude Code agents as a real cross-functional org, expanding from 15 to 21 specialized roles. The system analyst is the central node, but he also uses agents for CTO/architecture, brand, design, testing, performance, privacy/data governance, and maintainability.
Under the hood: the System Analyst agent definition and why it’s the linchpin
They open an agent markdown definition to show how the system analyst breaks down requirements, flags ambiguity, asks clarifying questions, and tracks dependencies. Gabor emphasizes that high-quality documentation and ticketing originate here, improving downstream build quality.
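For reference, Claude Code stores subagent definitions as markdown files with YAML frontmatter under `.claude/agents/`. A minimal sketch of a system-analyst agent along the lines described here (the role details are illustrative, not Gabor’s actual file):

```markdown
---
name: system-analyst
description: Breaks requirements into specs and tickets. Use proactively before any implementation work.
tools: Read, Write, Grep
---

You are a senior system analyst.

Before writing any documentation:
- Ask clarifying questions one at a time; do not batch them.
- Do not produce docs or tickets until every open question is resolved.

For each requirement:
- Decompose it into discrete, testable units.
- Flag ambiguity explicitly instead of guessing.
- Record dependencies between units so tickets can be sequenced.
```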
Kickoff from the consumer Claude app: voice-first ideation and role prompting
Instead of starting in a terminal, Gabor begins in the Claude consumer app to make the workflow accessible and voice-friendly. He demonstrates prompting Claude to behave like a “good system analyst” and sets the app idea: an AI hockey rules assistant.
Defining a “good system analyst” and the clarifying-questions-first workflow
Gabor shows how he first asks Claude to define the system analyst role (good vs bad), then uses that to set expectations. He instructs the agent to ask questions one at a time and not to write docs until understanding is complete—preventing premature output and messy direction changes.
Documentation + ticketing backbone: Confluence/Jira via MCP and why PM skills still matter
Gabor connects Atlassian tools via MCP so Claude can write Confluence docs and create Jira issues directly. He argues classic PM skills (requirements, decomposition, documentation) are even more valuable with agents because they prevent unmaintainable “AI spaghetti code.”
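For orientation, one common way to register MCP servers in Claude Code is a project-level `.mcp.json`. A sketch that points at Atlassian’s hosted remote MCP server (the endpoint URL follows Atlassian’s published docs and should be verified before use):

```json
{
  "mcpServers": {
    "atlassian": {
      "type": "sse",
      "url": "https://mcp.atlassian.com/v1/sse"
    }
  }
}
```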
Project setup in Claude Code: agents, permissions, and guardrails
Gabor switches to Claude Code, differentiating “acting like an agent” in chat vs real project agents in Claude Code. A live hiccup forces him to recreate the project folder and re-run setup, highlighting practical guardrails: watch permissions, keep work inside the project directory, and avoid risky access requests.
Dictating the full product spec: stack, RAG sources, limits, privacy, and secrets
Gabor dictates a detailed spec for “Rule Ask”: Flutter iOS app, Firebase backend, IIHF rulebook + situation book as RAG sources with vector embeddings, and a friendly referee persona. He adds cost controls (20k-word limit with 24h cooldown), privacy constraints (no server-side user storage), and strict API key handling via Firebase Secret Manager.
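The episode covers these controls at the spec level rather than in code. A minimal Cloud Functions (TypeScript) sketch of how the two mechanisms could be wired, assuming a Firestore `usage` collection keyed by an anonymous device ID; the function name `askReferee`, the `callLlm` helper, and the counting logic are illustrative, not from the episode:

```typescript
import { onCall, HttpsError } from "firebase-functions/v2/https";
import { defineSecret } from "firebase-functions/params";
import { initializeApp } from "firebase-admin/app";
import { getFirestore } from "firebase-admin/firestore";

initializeApp();

// The LLM API key lives in Firebase Secret Manager, never in client code or the repo.
const llmApiKey = defineSecret("LLM_API_KEY");

const WORD_LIMIT = 20_000;               // per-device word budget from the spec
const COOLDOWN_MS = 24 * 60 * 60 * 1000; // 24h cooldown once the budget is spent

export const askReferee = onCall({ secrets: [llmApiKey] }, async (request) => {
  // Anonymous device ID: no server-side user accounts, per the privacy constraint.
  const deviceId = request.data.deviceId as string;
  const question = request.data.question as string;

  const usageRef = getFirestore().collection("usage").doc(deviceId);
  const usage = (await usageRef.get()).data() ?? { words: 0, windowStart: Date.now() };

  // Reset the budget once the cooldown window has elapsed.
  if (Date.now() - usage.windowStart > COOLDOWN_MS) {
    usage.words = 0;
    usage.windowStart = Date.now();
  }
  if (usage.words >= WORD_LIMIT) {
    throw new HttpsError("resource-exhausted", "Word budget spent; try again after the cooldown.");
  }

  const answer = await callLlm(question, llmApiKey.value());
  const words = answer.split(/\s+/).length;
  // Sketch only: a production version would update the counter in a transaction.
  await usageRef.set({ words: usage.words + words, windowStart: usage.windowStart }, { merge: true });
  return { answer };
});

// Hypothetical stand-in for the real model call; not part of the episode.
async function callLlm(question: string, apiKey: string): Promise<string> {
  void apiKey;
  return `Stub answer for: ${question}`;
}
```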
Auto-generating Confluence documentation from the clarified spec
After questions are resolved, Claude writes structured Confluence pages (product overview, architecture, agent specs) via MCP. Gabor checks the docs primarily for fidelity—whether the generated artifacts accurately reflect the spoken requirements and technical approach.
Visual direction in Figma Make: brand kit, typography, and palette from inspiration
Gabor uses Figma Make to generate a comprehensive design system (typography, colors, buttons, states) using inspiration screenshots. He notes an important prompt constraint—ask for “inspiration, not copying”—to avoid tool refusal and keep the workflow moving.
Claude Code drives Figma: generating screens and a clickable prototype with a UX agent
With MCPs connected (Figma, Chrome DevTools), Claude Code creates the actual Figma screens from the spec and style guide, then adds prototype links (“arrows”) automatically. They observe that agent-driven Figma manipulation produces usable UI quickly and avoids generic ‘AI-looking’ defaults when design context is correctly anchored.
From design to execution: Jira epics/tickets with Figma links, sprints-by-tags, and parallel work
The system analyst and other agents generate Jira epics and detailed tickets, ensuring every frontend ticket includes a screenshot or Figma link so developers don’t drift into generic UI. Backend setup tickets are created first (Firebase basics, domain, secrets), then frontend and backend workstreams parallelize; sprints are approximated using tags due to MCP limitations.
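A sketch of the payload such a frontend ticket might carry (field names follow Jira’s REST create-issue shape; the project key, labels, and Figma link are placeholders):

```typescript
// Sprints are approximated with labels because the MCP lacks sprint operations.
const ticket = {
  fields: {
    project: { key: "RA" },
    issuetype: { name: "Task" },
    summary: "Build chat screen per Figma frame 'Ask the Referee'",
    labels: ["sprint-1", "frontend"],
    description:
      "Implement the chat screen.\n" +
      "Design: https://figma.com/file/<file-id>?node-id=<frame-id>\n" + // placeholder link
      "Acceptance: matches the linked Figma frame and the style-guide palette.",
  },
};
```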
Shipping the hockey rules app: simulator demo, RAG transparency, and TestFlight upload
After sprint execution, the app runs in the iOS Simulator with a working RAG pipeline and an “observer mode” showing retrieval hits, token usage, and latency. They upload the first build to TestFlight, discuss Apple processing/review timelines, and recap the full pipeline from dictation to deployment.
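The demo shows this in the running app rather than in code; a self-contained sketch of the kind of instrumentation “observer mode” implies: cosine-similarity retrieval over precomputed rulebook embeddings that reports hits, latency, and a rough token estimate (all names and the 4-characters-per-token heuristic are assumptions):

```typescript
// Minimal RAG "observer mode" sketch: retrieval plus the metadata the demo surfaces.
interface Chunk { id: string; text: string; embedding: number[] }
interface Hit { id: string; score: number }
interface ObserverReport { hits: Hit[]; latencyMs: number; approxContextTokens: number }

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function retrieveWithObserver(queryEmbedding: number[], corpus: Chunk[], topK = 3): ObserverReport {
  const start = Date.now();
  const hits = corpus
    .map((c) => ({ id: c.id, score: cosine(queryEmbedding, c.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
  // Crude estimate (~4 characters per token) for the retrieved context size.
  const contextChars = hits
    .map((h) => corpus.find((c) => c.id === h.id)!.text.length)
    .reduce((sum, n) => sum + n, 0);
  return { hits, latencyMs: Date.now() - start, approxContextTokens: Math.ceil(contextChars / 4) };
}
```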
Career layer: AI PM certificates vs building, portfolios that prove capability, and getting started
The conversation shifts to careers: Gabor argues certificates matter less than hands-on skill and proof of building. He recommends creating portfolio apps with demonstrable “stories” (debugging retrieval, tuning scoring/thresholds) and advises newcomers to start by asking questions in their preferred LLM, then accelerate with structured programs if needed.