Y Combinator
Tokenmaxxing: How Top Builders Use AI To Do The Work Of 400 Engineers
CHAPTERS
Control vs. dependency: personal tools in the age of AI
The conversation opens with a framing question: will people control their AI tools, or will the tools control them? Garry uses the “Ferrari” metaphor to capture both the power and brittleness of modern agentic coding systems—fast and exhilarating, but requiring skill to maintain and debug.
Returning to coding after 13 years—why now
Jared sets the premise: Garry returned from a long break and suddenly shipped an extraordinary volume of code while running YC. Garry describes the surprise of reawakening his builder identity and attributes the change to the new capabilities of tools like Claude Code.
Rebuilding Posterous/Posthaven as Garry’s List—at token-scale cost and speed
Garry explains the origin of Garry’s List: building a site to organize people around California policy issues that matter to him. He recounts rebuilding his old blogging platform a third time—this time in days and for roughly the cost of an AI subscription—while adding modern agentic research capabilities.
Software that thinks like a journalist: agentic research, sourcing, and synthesis
The product evolves beyond a publishing tool into a system that performs investigative-journalism-like work. Garry describes agentic retrieval: crawling, cross-referencing sources, extracting quotables, and producing deeply sourced reports using multiple APIs and toolchains.
The rise of “tokenmaxxing”: spending tokens to buy completeness and truth
Garry introduces tokenmaxxing as a philosophy: if more tokens yield more complete, higher-quality work, spend them. He argues this will extend to nearly all knowledge work, with humans supplying goals, values, and agency while machines handle the heavy lifting.
Accidental productization: how GStack emerged from repeated prompts
Garry explains that GStack wasn’t planned—it emerged from noticing repeated interactions and converting them into reusable ‘skills.’ He describes techniques like forcing models to produce ASCII diagrams of data flows and user flows before coding, which improves completeness and reduces confusion.
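GStack itself isn’t public, so the exact form of these skills is unknown; as a minimal sketch, the diagram-first technique amounts to a reusable prompt template that forces planning output before any code. Everything below (the function name, the wording of the instructions) is hypothetical:

```python
def diagram_first_prompt(feature_request: str) -> str:
    """Compose a two-phase prompt (hypothetical) that makes an agent draw
    ASCII diagrams of data flow and user flow before writing any code."""
    return (
        "Before writing any code:\n"
        "1. Draw an ASCII diagram of the data flow (sources -> transforms -> sinks).\n"
        "2. Draw an ASCII diagram of the user flow (screens and transitions).\n"
        "3. List any ambiguities the diagrams expose, then stop for review.\n"
        "Only once the diagrams are approved, implement the feature:\n\n"
        f"{feature_request}\n"
    )

print(diagram_first_prompt("Add an RSS feed to the blog"))
```

The value of the pattern is that the diagrams surface missing pieces (a forgotten screen, an unhandled data path) while they are still cheap to fix, before any tokens are spent on implementation.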
The 400x workflow: queued PRs, plan-first execution, and multi-agent reviews
Garry walks through his day-to-day system for shipping: using plan mode, batching features, and running a sequence of specialized reviews before execution. The workflow is designed to scale output while preserving quality through automated checks and human-in-the-loop decision points.
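The actual review sequence isn’t specified in detail, but the “specialized reviews before execution” step can be sketched as a pipeline of independent checker passes that must all clear before a plan ships. The reviewer names and checks here are illustrative stand-ins, not Garry’s real setup:

```python
from dataclasses import dataclass, field

@dataclass
class Plan:
    description: str
    issues: list = field(default_factory=list)

# Each function stands in for a specialized review agent (security,
# style, etc.); in the real workflow these would be separate model passes.
def security_review(plan: Plan) -> None:
    if "eval(" in plan.description:
        plan.issues.append("security: avoid eval")

def style_review(plan: Plan) -> None:
    if len(plan.description.split()) < 4:
        plan.issues.append("style: plan too vague to review")

REVIEWS = [security_review, style_review]

def run_reviews(plan: Plan) -> bool:
    """Run every specialized review; only issue-free plans reach execution."""
    for review in REVIEWS:
        review(plan)
    return not plan.issues

plan = Plan("Add pagination to the posts index, 20 items per page")
print(run_reviews(plan))  # True: no reviewer flagged an issue
```

The human-in-the-loop decision point then sits at the boundary: plans that pass automated review get queued for execution, plans with issues come back for revision.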
QA automation and the birth of a browser harness (Playwright → Browse/QA)
Manual testing became the bottleneck, so Garry tried to automate QA using browser tooling. Slow MCP browser control pushed him to wrap Playwright, which evolved into a long-lived daemon with a CLI (‘Browse’) and a QA mode that tests UI/data mutations based on branch context.
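Browse itself isn’t public, but the pattern it embodies — a long-lived daemon that keeps one browser session alive across many CLI commands, instead of relaunching per step — can be sketched with the stdlib. A stub session stands in for the real Playwright browser so the control flow is runnable without dependencies; all class and command names are hypothetical:

```python
# Sketch of the "long-lived daemon + thin CLI" pattern behind a tool
# like Browse. The real harness would hold a Playwright browser; a stub
# session stands in here so the example runs without dependencies.
class StubSession:
    def __init__(self):
        self.url = "about:blank"

    def goto(self, url: str) -> str:
        self.url = url
        return f"loaded {url}"

    def snapshot(self) -> str:
        return f"at {self.url}"

class BrowseDaemon:
    """Keeps one session alive across commands, avoiding the cost of
    spinning up a fresh browser for every QA step."""
    def __init__(self):
        self.session = StubSession()

    def handle(self, command: str) -> str:
        verb, _, arg = command.partition(" ")
        if verb == "goto":
            return self.session.goto(arg)
        if verb == "snapshot":
            return self.session.snapshot()
        return f"unknown command: {verb}"

daemon = BrowseDaemon()
print(daemon.handle("goto https://example.com"))
print(daemon.handle("snapshot"))  # session state persists between commands
```

Persistence is the whole point: the slow part of browser QA is startup and page state, so amortizing one session across a whole test run is what made the approach faster than per-call MCP browser control.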
Thin Harness, Fat Skills: where code ends and ‘latent space’ begins
Garry crystallizes his philosophy: don’t rebuild harnesses; invest in high-leverage skills (often in markdown) that guide agents effectively. He contrasts deterministic code (brittle, exact) with LLM ‘latent space’ (contextual, adaptable), and argues the art is choosing the right boundary between them.
Agents are like Ferraris: brittle systems, mechanics mindset, and self-healing loops
The group returns to the Ferrari metaphor: the tools are powerful but can fail in surprising ways. A key shift is that brittleness matters less when another agent (or Claude Code) can continuously repair and maintain the system, accelerating iteration despite instability.
Gbrain and OpenClaw: building a personal AI with better memory than grep
Garry describes moving beyond Claude Code into OpenClaw-based workflows and building ‘Gbrain’ to improve how agents use personal context. He notes that naive approaches (grep over markdown) waste context, so he carries the RAG lessons from Garry’s List—chunking, embeddings, and hybrid retrieval—over into a personal system.
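Gbrain’s internals aren’t public, but the chunk/embed/hybrid-score pipeline named above can be sketched with the stdlib. A toy bag-of-words counter stands in for a real embedding model, and the blend of dense similarity with exact keyword overlap is the core hybrid-retrieval idea; names and weights are illustrative:

```python
import math
from collections import Counter

def chunk(text: str, size: int = 10) -> list:
    """Split text into fixed-size word chunks (real systems chunk by
    document structure, with overlap; this is the simplest version)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real system calls an embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_search(query: str, chunks: list, alpha: float = 0.5) -> str:
    """Blend dense similarity with exact keyword overlap: the reason
    hybrid retrieval beats grep is that it still rewards exact matches
    while also catching semantically related chunks."""
    q = embed(query)
    q_terms = set(query.lower().split())
    def score(c: str) -> float:
        keyword = len(q_terms & set(c.lower().split()))
        return alpha * cosine(q, embed(c)) + (1 - alpha) * keyword
    return max(chunks, key=score)

doc = ("Met with the city council about housing permits and zoning reform. "
       "Separately, drafted notes on transit funding for the ballot measure.")
chunks = chunk(doc, size=10)
print(hybrid_search("housing permits", chunks))
```

The contrast with grep is that grep returns every matching line into the agent’s context window, while retrieval like this returns only the top-scoring chunks, which is what keeps context spend proportional to relevance rather than to corpus size.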
400x output and the lines-of-code controversy: what it measures and what it doesn’t
Garry addresses backlash over citing lines of code as a productivity metric. He argues LoC is imperfect but can be normalized (logical lines), and that AI-directed coding changes incentives versus human LoC ‘padding’; the bigger point is that the tools raise the ceiling on what a capable builder can ship.
The future of personal AI and buying back time with tokens
The episode closes on a vision: everyone will have a personal AI with their own data, integrations, and prompts—or else rely on corporate-controlled feeds and opaque incentives. Token spend becomes a way to ‘buy time’ by borrowing machine labor, turning builders into “time billionaires” through scalable machine work.