Aakash Gupta: He Uses 7 Claude Code Agents to Build Apps with 0 Employees
Aakash Gupta and Gabor Mayer on how to ship an iOS app using Claude Code agents.
In this episode, "He Uses 7 Claude Code Agents to Build Apps with 0 Employees," Aakash Gupta and Gabor Mayer explore how to ship an iOS app using Claude Code agents. Gabor structures Claude Code as a "startup OS" of ~21 specialized agents that mimic a real product team's division of labor.
At a glance
WHAT IT’S REALLY ABOUT
How to ship an iOS app using Claude Code agents
- Gabor shows how he structures Claude Code as a “startup OS” with ~21 specialized agents (system analyst, CTO, designer, test architect, maintainability reviewer, privacy/data council) to mimic a real product team’s division of labor.
- The workflow starts with voice dictation in the consumer Claude app to generate a high-context PRD, then uses Atlassian MCP to turn clarified requirements into Confluence documentation and Jira epics/tickets with dependencies.
- Design is generated via Figma Make to produce a style guide, then Claude Code uses Figma MCP to create high-fidelity screens and automatically wire them into a clickable prototype, minimizing “generic AI-looking” UI outcomes.
- Development is run sprint-by-sprint in parallel (frontend + backend) using the ticket scaffold, with emphasis on maintainability, testing, privacy constraints (no server-side storage of user chat), and secret management (API keys in Firebase Secret Manager).
- The live demo culminates in a working Flutter + Firebase iOS app (“Rule Ask”) that uses RAG over the IIHF rulebook and situation book, includes an “observer mode” for transparency, runs in Simulator, and is uploaded to TestFlight, followed by career advice for PMs on building portfolios over collecting certificates.
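The episode doesn't show the app's retrieval code, but the fallback order described above (IIHF rulebook → situation book → web search) can be sketched as a simple priority chain. Everything here is a hypothetical illustration: the stub sources stand in for real RAG retrieval over each corpus, and `answer_with_fallback` is not Gabor's implementation.

```python
from typing import Callable, Optional

# Each source is a function that returns an answer string, or None when
# its corpus has no sufficiently relevant passage for the question.
Source = Callable[[str], Optional[str]]

def answer_with_fallback(question: str, sources: list[Source]) -> str:
    """Try each source in priority order; fall through on a miss."""
    for source in sources:
        answer = source(question)
        if answer is not None:
            return answer
    return "No answer found in the rulebook, situation book, or web."

# Stub sources standing in for RAG over the rulebook, the situation
# book, and a last-resort web search.
rulebook = lambda q: "Icing is covered by Rule 81." if "icing" in q.lower() else None
situation_book = lambda q: None
web_search = lambda q: "Web result."

print(answer_with_fallback("What is icing?", [rulebook, situation_book, web_search]))
# → Icing is covered by Rule 81.
```

The design choice is that higher-trust corpora are consulted first, so the web search only fires when both curated sources abstain.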
IDEAS WORTH REMEMBERING
7 ideas
Treat agentic building like running a real software team, not a single prompt.
Gabor’s core claim is that better outcomes come from role specialization (system analyst, CTO, designer, test architect, maintainability reviewer) and structured handoffs, rather than asking one model to “build the whole app” in one shot.
The system analyst agent is the keystone for quality and speed.
He uses the system analyst to force clarifying questions, break requirements into structured documentation, and generate Jira tickets; this reduces ambiguity, prevents rework, and gives coding agents unambiguous tasks.
Up-front scaffolding prevents “AI spaghetti code.”
Gabor argues many “vibe coding” failures are maintainability failures; he adds an explicit maintainability/spaghetti agent to check naming conventions, circular references, and code structure—especially important for non-engineers who can’t easily detect architectural rot.
Context can hurt: too much information leads to compression and missed details.
He notes that when he didn’t break design work into smaller tickets, the design underused parts of the palette from the style guide—his hypothesis is that long context windows still lead to prioritization/compression that drops constraints.
Documentation and ticketing aren’t bureaucracy—they’re the mechanism for repeatability.
By storing decisions in Confluence and turning them into Jira work items (with dependencies and sprint tags), he makes the build reproducible and easier to iterate, similar to how teams maintain velocity over time.
High-quality UI requires explicit linkage from tickets to Figma artifacts.
He emphasizes attaching screenshots or deep Figma links per ticket; otherwise coding agents tend to produce “default AI-looking” UI rather than implementing the intended design system.
Security and cost controls must be first-class requirements in agent workflows.
He repeatedly constrains key handling (Firebase Secret Manager only), limits user usage (20,000 words then 24-hour cooldown), and warns to scrutinize Claude Code permissions—especially requests outside the project folder (e.g., password stores).
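The episode doesn't show how the 20,000-words-then-24-hour-cooldown limit is enforced; a minimal in-memory sketch of that policy might look like the hypothetical `UsageLimiter` below (Gabor's real implementation would presumably persist state client-side or in Firebase rather than in a Python dict).

```python
import time

WORD_LIMIT = 20_000           # words allowed before the cooldown triggers
COOLDOWN_SECONDS = 24 * 3600  # 24-hour cooldown once the limit is hit

class UsageLimiter:
    """Track words consumed per user; once WORD_LIMIT words are used,
    block further requests until COOLDOWN_SECONDS after the limit was hit."""

    def __init__(self, clock=time.time):
        self.clock = clock                       # injectable for testing
        self.words_used: dict[str, int] = {}
        self.cooldown_until: dict[str, float] = {}

    def allow(self, user_id: str, word_count: int) -> bool:
        now = self.clock()
        until = self.cooldown_until.get(user_id)
        if until is not None:
            if now < until:
                return False                     # still in the cooldown
            # Cooldown expired: clear it and reset the word counter.
            del self.cooldown_until[user_id]
            self.words_used[user_id] = 0
        used = self.words_used.get(user_id, 0) + word_count
        self.words_used[user_id] = used
        if used >= WORD_LIMIT:
            # The request that hits the limit still goes through; the
            # cooldown applies to subsequent requests.
            self.cooldown_until[user_id] = now + COOLDOWN_SECONDS
        return True
```

Making the clock injectable keeps the cooldown logic testable without waiting 24 hours.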
WORDS WORTH SAVING
5 quotes
AI agents are writing PRDs, designing in Figma, writing Jira tickets, and even shipping code, all from 1:00 PM to 4:00 AM.
— Aakash Gupta
If you build a good specification and you break it down appropriately, then you will have a much better quality end product.
— Gabor Mayer
Vibe coding is just the rebranding of unmaintainable, low-quality source code.
— Gabor Mayer
Pay for a course for the knowledge, not for the certificate.
— Gabor Mayer
In two years, the gap will be so big between those who built and those who are just productivity AI users that it will be very hard to catch up.
— Gabor Mayer
QUESTIONS ANSWERED IN THIS EPISODE
5 questions
What does your "system analyst agent" template actually include (sections, checklists, and failure modes), and how has it evolved after real builds?
In the hockey rules app, how did you decide the RAG flow order (rulebook → situation book → web) and what heuristics determine when to fall back to web search?
You set a 20,000-words-per-24-hours limit—how would you implement this robustly across devices if you later add accounts, and what metrics would you track to tune it?
Can you share an example of a maintainability/spaghetti agent catching a serious architectural issue early, and what automated checks you’d add beyond LLM review (linting, static analysis, CI)?
What specifically “breaks” when you give agents too much context, and how do you decide when to split work into tickets vs keep it in one conversation?