I let Codex run for 6 hours. Here’s what happened.

In this 30-minute episode, I walk through my favorite feature in Codex: the /goal command. I show how Goals transform AI from a turn-based assistant that needs constant ‘what’s next?’ prompting into an autonomous agent that can work for hours on complex, multi-step tasks. I share three real examples: eliminating thousands of Sentry errors, cleaning 3,900 emails down to 68, and organizing hundreds of Linear tasks.

*What you’ll learn:*
1. What Goals are and how they differ from standard prompts
2. How I used /goal to eliminate hundreds of error logs in my codebase over a five-hour autonomous run
3. The non-technical use cases that make Goals incredibly powerful: cleaning up 3,900 emails in under four hours and organizing hundreds of project management tasks in Linear
4. How to write effective /goal prompts with measurable outcomes, verification methods, and constraints
5. When not to use Goals and what makes a strong versus weak Goal
6. Why Goals represent a fundamental shift in how we work with AI, from babysitting the model to managing it

*Brought to you by:*
Mercury: Radically different banking loved by over 300K entrepreneurs: https://mercury.com/

*In this episode, we cover:*
(00:00) Introduction
(01:50) What is /goal and when should you use it?
(02:45) The difference between prompts and Goal-based loops
(04:06) Claire’s first five-hour 45-minute autonomous coding task
(05:05) How to manage a Goal lifecycle: view, pause, resume, and clear
(06:06) How to write strong goals: outcomes vs. outputs
(07:34) The six components of effective Goals
(08:57) Example: Reducing P95 checkout latency with /goal
(09:36) Demo: Using /goal to eliminate Sentry errors in ChatPRD
(13:18) Demo: Burning down Vercel API errors
(17:28) Non-technical use case: Cleaning 3,900 emails with /goal
(21:24) Demo: Using /goal to clean up Linear project tasks
(24:41) When not to use /goal
(26:10) Why /goal changes everything

*Tools referenced:*
• Codex: https://openai.com/codex/
• Sentry: https://sentry.io/
• Vercel: https://vercel.com/
• Linear: https://linear.app/

*Other reference:*
• OpenAI blog post “Using Goals in Codex”: https://developers.openai.com/cookbook/examples/codex/using_goals_in_codex

*Where to find Claire Vo:*
ChatPRD: https://www.chatprd.ai/
Website: https://clairevo.com/
LinkedIn: https://www.linkedin.com/in/clairevo/
X: https://x.com/clairevo

_Production and marketing by https://penname.co/._
_For inquiries about sponsoring the podcast, email jordan@penname.co._

Claire Vo, host

May 27, 2026 · 30m

WHAT IT’S REALLY ABOUT

How Codex Goals enable long-running autonomous work with verification loops

Codex Goals differ from normal prompts by running an autonomous loop of execute–verify–decide-next-step until evidence shows the outcome is achieved.
Effective Goals are outcome-based and measurable, pairing a clear finish line with verification methods, constraints/guardrails, scoped boundaries, and an iteration/stop policy.
Claire shares a real five-hour-45-minute autonomous coding run and shows how /goal can systematically burn down recurring production issues instead of applying one-off fixes.
Technical demos focus on using logs (Sentry/Vercel) to categorize errors, identify root causes, ship fixes via PRs, and re-validate against historical events until errors reach zero.
Non-technical demos show /goal as an operations assistant—cleaning thousands of emails and mass-triaging Linear tasks—highlighting that long-running goals can improve everyday workflow hygiene.
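The execute, verify, decide-next-step loop described above can be sketched in a few lines of Python. This is a minimal illustration of the control flow only, not how Codex implements Goals: `run_goal`, `execute_step`, `verify`, and the toy error list are all hypothetical names invented for this sketch.

```python
from typing import Callable


def run_goal(
    execute_step: Callable[[int], None],
    verify: Callable[[], bool],
    max_iterations: int = 50,
) -> int:
    """Repeat execute -> verify until evidence shows the goal is met.

    Returns the number of iterations used. Raises if the iteration
    budget runs out first, which acts as a simple stop policy.
    """
    for iteration in range(1, max_iterations + 1):
        execute_step(iteration)  # do one unit of work
        if verify():             # check measurable evidence
            return iteration
    raise RuntimeError("stop policy hit: goal not verified within budget")


# Toy stand-in for real work: each step resolves one of three open errors.
open_errors = ["timeout", "null id", "bad enum"]


def fix_one(iteration: int) -> None:
    if open_errors:
        open_errors.pop()


def no_errors_left() -> bool:
    return len(open_errors) == 0


iterations_used = run_goal(fix_one, no_errors_left)
print(iterations_used)
```

The point of the sketch is that the loop exits on the verification result rather than on a fixed number of turns, which is the difference from a turn-based prompt that the summary describes.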

IDEAS WORTH REMEMBERING

5 ideas

Use /goal when you’re stuck in “keep going / try next / rerun tests” mode.

Goals shine when you’d otherwise micromanage an agent step-by-step; the loop self-directs and keeps iterating until it can prove completion.

Write goals as measurable outcomes, not vague tasks or refactors.

“Refactor this code” or “make customers happy” lacks a finish line; better goals specify what should be true and how success will be validated (benchmarks, tests, log replays, etc.).
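One way to keep a goal measurable is to write its parts down before composing the prompt. The sketch below assumes the elements named in this summary (outcome, verification, constraints, boundaries, iteration policy); `GoalSpec` and `render` are hypothetical helpers, the 300 ms target is an invented figure, and the rendered text is only something to paste after `/goal`, not an official format.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class GoalSpec:
    outcome: str        # what should be true when the goal is done
    verification: str   # how success will be measured
    constraints: list[str] = field(default_factory=list)
    boundaries: list[str] = field(default_factory=list)
    iteration_policy: str = ""

    def render(self) -> str:
        """Turn the spec into plain prompt text, one element per line."""
        lines = [f"Outcome: {self.outcome}", f"Verify by: {self.verification}"]
        lines += [f"Constraint: {c}" for c in self.constraints]
        lines += [f"In scope: {b}" for b in self.boundaries]
        if self.iteration_policy:
            lines.append(f"Each iteration: {self.iteration_policy}")
        return "\n".join(lines)


# Illustrative values only; the latency target is an assumption.
goal = GoalSpec(
    outcome="P95 checkout latency is at or below 300 ms",
    verification="rerun the latency benchmark after every change",
    constraints=["correctness tests stay green"],
    boundaries=["checkout subsystem only"],
    iteration_policy="report what changed, what the evidence showed, and the next experiment",
)
prompt = goal.render()
print(prompt)
```

Compared with “Refactor this code,” every line of the rendered text names something that can be checked.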

Strong goals combine verification with guardrails to prevent “cheating.”

Claire emphasizes constraints like keeping correctness tests green—otherwise an agent could hit the metric by breaking or removing functionality (e.g., deleting a slow page).
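The guardrail idea can be expressed as a completion check that refuses to pass on the metric alone. This is a hedged sketch: `goal_met`, its parameters, and the numbers in the examples are assumptions made for illustration, not anything shown in the episode.

```python
def goal_met(
    p95_ms: float,
    target_ms: float,
    tests_passed: int,
    tests_total: int,
    baseline_tests: int,
) -> bool:
    """Return True only if the metric is hit AND the guardrails hold."""
    metric_ok = p95_ms <= target_ms
    # Guardrail 1: the suite must not have shrunk since the goal started,
    # otherwise deleting tests (or features) would count as progress.
    suite_intact = tests_total >= baseline_tests
    # Guardrail 2: every remaining test must pass.
    all_green = tests_passed == tests_total
    return metric_ok and suite_intact and all_green


print(goal_met(280, 300, 120, 120, 120))  # metric hit, suite intact and green
print(goal_met(280, 300, 90, 90, 120))    # metric hit, but 30 tests were removed
print(goal_met(280, 300, 119, 120, 120))  # metric hit, but one test fails
```

Only the first call reports success; the other two hit the latency number while violating a guardrail.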

Include boundaries and an iteration policy so the agent knows where to look and how to proceed.

Specify allowed tools/files/systems (Sentry/Vercel/checkout subsystem) and require iteration reporting (what changed, what evidence showed, what experiment is next).
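As a rough illustration of boundaries plus iteration reporting, the sketch below records the three things the summary asks for in each iteration and flags any edited file outside the allowed scope. The path prefixes, `IterationReport`, and the sample values are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical scope: path prefixes the agent is allowed to edit.
ALLOWED_PREFIXES = ("src/checkout/", "tests/checkout/")


def in_scope(path: str) -> bool:
    """True if the path falls under one of the allowed prefixes."""
    return path.startswith(ALLOWED_PREFIXES)


@dataclass
class IterationReport:
    number: int
    files_changed: list[str]  # what changed
    evidence: str             # what the evidence showed
    next_experiment: str      # what will be tried next

    def out_of_scope(self) -> list[str]:
        return [p for p in self.files_changed if not in_scope(p)]

    def summary(self) -> str:
        return (
            f"Iteration {self.number}: changed {len(self.files_changed)} file(s); "
            f"evidence: {self.evidence}; next: {self.next_experiment}"
        )


report = IterationReport(
    number=1,
    files_changed=["src/checkout/cart.py", "src/billing/tax.py"],
    evidence="benchmark improved but is still above target",
    next_experiment="cache the shipping-rate lookup",
)
print(report.summary())
print(report.out_of_scope())
```

A report like this gives a reviewer a per-iteration trail to skim after a multi-hour run, and the scope check surfaces edits that wandered outside the stated boundary.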

Treat production error logs as a backlog you can systematically eliminate.

Her Sentry example uses /goal to categorize root causes, implement fixes, and replay historical events until “invalid operation” errors go to zero—yielding a more coherent framework, not scattered band-aids.
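The categorize, fix, and replay pattern can be sketched without any error-tracker API. Everything below is a made-up stand-in: the event records, `root_cause`, and both handlers are hypothetical, and real events would come from an export of the error tracker rather than a hard-coded list.

```python
from __future__ import annotations

from collections import Counter
from typing import Callable

# Hypothetical historical events, standing in for an error-tracker export.
events = [
    {"id": 1, "op": "rename", "doc_state": "archived"},
    {"id": 2, "op": "rename", "doc_state": "archived"},
    {"id": 3, "op": "share", "doc_state": "deleted"},
    {"id": 4, "op": "rename", "doc_state": "active"},
]


def root_cause(event: dict) -> str:
    """Group events by the condition that triggered the failure."""
    return f"{event['op']} on {event['doc_state']} doc"


def old_handler(event: dict) -> str:
    if event["doc_state"] != "active":
        raise ValueError("invalid operation")
    return "ok"


def fixed_handler(event: dict) -> str:
    # The fix: reject gracefully instead of raising.
    if event["doc_state"] != "active":
        return "rejected"
    return "ok"


def replay(history: list[dict], handler: Callable[[dict], str]) -> list[dict]:
    """Run every historical event through a handler; return those that still fail."""
    failures = []
    for event in history:
        try:
            handler(event)
        except ValueError:
            failures.append(event)
    return failures


before = replay(events, old_handler)
categories = Counter(root_cause(e) for e in before)
after = replay(events, fixed_handler)
print(categories.most_common())
print(len(before), len(after))
```

Grouping by root cause before fixing is what turns scattered band-aids into one coherent change, and the replay gives the evidence-based finish line: the goal is done when the failure list is empty.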

WORDS WORTH SAVING

5 quotes

If you find yourself in that process, using /goal in Codex might be a tool that you wanna add to your toolkit.

— Claire Vo

With Goal, when you give Codex a goal, it actually has something that it can work towards, and it will continue to loop to the next step and verify until it can measure that it has met that goal.

— Claire Vo

But the first time I used Goal, I was actually able to get a coding task running for about five hours and 45 minutes, which is longer than I've ever had anything run before.

— Claire Vo

You really wanna use Goals when you would otherwise find yourself saying the same thing after turn, like, "Keep going," "Try the next thing," "Run it again," "Now run the test," "Continue until it's actually done."

— Claire Vo

Goals are strongest when it has three properties: a durable objective, an evidence-based finish line, and a path that may require several turns of investigation.

— Claire Vo

Prompt vs goal-based loop (autonomy + verification)
Goal lifecycle commands (view/pause/resume/clear)
Outcomes vs outputs in goal writing
Six components of effective goals
Error burn-down using Sentry traces
Vercel log triage to PR workflow
Inbox and project-management cleanup via plugins/MCP

High quality AI-generated summary created from speaker-labeled transcript.
