At a glance
WHAT IT’S REALLY ABOUT
GPT-5.5 Pro excels at autonomous coding, migrations, and device hacking
- GPT‑5.5 and GPT‑5.5 Pro feel meaningfully more capable and token-efficient on complex work, but their pricing makes them an “intelligence tax” that needs clear ROI.
- In ChatGPT, the model can overthink relatively simple tasks (e.g., a kids’ subtraction app), highlighting a mismatch between extreme intelligence and typical consumer workflows.
- In Codex, GPT‑5.5 Pro shines by autonomously executing large, multi-step engineering work like security issue remediation, technical debt cleanup, and complex data migrations.
- A standout example is a long-running, near hands-off, six-hour autonomous testing and validation loop that reduced production errors dramatically and uncovered only one edge case across ~2M rows.
- As a personal “high-tech eval,” GPT‑5.5 helped reverse-engineer a proprietary Bluetooth protocol to programmatically control a Divoom mini display, enabling terminal-driven notifications and custom output.
IDEAS WORTH REMEMBERING
GPT‑5.5 Pro’s best ROI is ambition, not just speed.
Claire argues the model lets her attempt projects she previously avoided because they were too complex or too time-consuming to reliably decompose and execute, especially those with messy edge cases.
ChatGPT may be a poor form factor for “too-smart” models without hard problems.
Her subtraction-app test took ~17 minutes of “thinking,” producing a serviceable result but raising the question of whether most users benefit from that level of reasoning and latency.
Codex + GPT‑5.5 Pro performs well on backlog-style batch work.
Uploading a CSV of security findings and asking it to cluster themes, propose fixes, and implement changes worked well after human/code review, and it helped lead to a clean pen test outcome.
Autonomous, long-running loops are where the model differentiates.
A ~6-hour run built a scalable CLI-based smoke test harness across providers, requiring almost no intervention, and found only one remaining edge case after validating large production-like data.
Complex data migrations with unstructured AI-response history are now tractable.
She describes legacy response-format drift across providers and attachments/tools creating hard-to-sanitize records; GPT‑5.5 Pro produced a near one-shot migration covering ~98% of known edge cases.
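As a rough illustration of the kind of migration described above, here is a minimal sketch that maps records of differing legacy shapes onto one target shape and sets aside anything it does not recognize as an edge case for review. The field names (`text`, `content`, `type`, `url`, `attachments`) and the two record shapes are hypothetical, chosen only for illustration; the video summary does not describe the actual schema or the code the model produced.

```python
from typing import Any


def normalize_record(record: dict[str, Any]) -> dict[str, Any]:
    """Map one legacy record to a single target shape, or raise ValueError."""
    # Hypothetical shape A: the response body is a plain string under "text".
    if isinstance(record.get("text"), str):
        return {"text": record["text"], "attachments": []}

    # Hypothetical shape B: a list of typed parts under "content".
    parts = record.get("content")
    if isinstance(parts, list):
        dict_parts = [p for p in parts if isinstance(p, dict)]
        text = "".join(
            p.get("text", "") for p in dict_parts if p.get("type") == "text"
        )
        attachments = [
            p["url"] for p in dict_parts if p.get("type") == "file" and "url" in p
        ]
        return {"text": text, "attachments": attachments}

    # Anything else is an unknown shape and is left for a human to inspect.
    raise ValueError("unrecognized record shape")


def migrate(
    records: list[dict[str, Any]],
) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
    """Return (migrated records, records that could not be normalized)."""
    migrated: list[dict[str, Any]] = []
    unhandled: list[dict[str, Any]] = []
    for record in records:
        try:
            migrated.append(normalize_record(record))
        except ValueError:
            unhandled.append(record)
    return migrated, unhandled
```

Keeping the unrecognized records in a separate list, rather than dropping or guessing at them, makes the remaining edge cases visible and countable after each run.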
WORDS WORTH SAVING
I’m gonna pay the intelligence tax.
— Claire Vo
I don't know what to do with all this intelligence if you don't have complex problems to solve.
— Claire Vo
This thing will think.
— Claire Vo
Truly, it just banged its head against the wall for six hours, and I did not have to… zero prompts, zero follow-ups, zero steering.
— Claire Vo
GPT 5.5 has hit my intelligence benchmark for can you hack into this Chinese digital screen with proprietary Bluetooth transport mechanisms and bitmap compression.
— Claire Vo
High quality AI-generated summary created from speaker-labeled transcript.