GLM 5.2: why I’m replacing Opus in Claude Code

I put GLM 5.2, the open-weight coding model from Z.AI, through four real tasks inside my actual codebase: a codebase architecture audit, an HTML architecture and roadmap page, a UI redesign, and a 45-minute autonomous bug-hunting session pulling from Sentry and Vercel logs. Total cost: $3.36 for roughly 6 million tokens. For that I got a prioritized bug-fix dashboard I’m actually shipping from and a landing page redesign that matched ChatPRD’s design system on the first try.

*What you’ll learn:*

  1. What “open-weight” actually means and why it matters for cost and vendor independence
  2. How to connect GLM 5.2 to Cursor and Claude Code
  3. How it performs on codebase exploration and autonomous architecture summarization in a real production Next.js app
  4. Whether GLM 5.2 can match an existing design system
  5. How the model handles a 45-minute long-running autonomous task
  6. Where GLM 5.2 stumbled
  7. The actual cost breakdown

*Brought to you by:*

  • Mercury, radically different banking loved by over 300K entrepreneurs: https://mercury.com/

*In this episode, we cover:*

(00:00) What open-weight models are and why GLM 5.2 is worth testing
(01:38) GLM 5.2 model overview
(04:02) Capabilities and benchmark results
(06:02) How to set up GLM 5.2 in Cursor
(08:37) How to set up GLM 5.2 in Claude Code
(11:04) Live test 1: codebase exploration and architecture audit on ChatPRD
(12:43) Live test 2: generating an HTML architecture and roadmap page
(16:37) Live test 3: redesigning the How I AI landing page in Cursor
(20:57) Live test 4: 45-minute autonomous task, pulling Sentry errors and Vercel logs
(22:35) Where it struggled
(23:49) My verdict on the output
(25:23) Cost breakdown

*Tools referenced:*

  • z.ai: https://z.ai
  • GLM 5.2: https://z.ai/blog/glm-5.2
  • OpenRouter: https://openrouter.ai
  • Cursor: https://cursor.com
  • Claude Code: https://docs.anthropic.com/en/docs/claude-code
  • Sentry: https://sentry.io
  • Vercel: https://vercel.com

Other references:

  • SWE-Bench Pro leaderboard (coding benchmark scores referenced in episode): https://www.swebench.com
  • Frontier Suite and Post-Train Bench (additional benchmarks cited): https://scale.com/leaderboard
  • Use Claude Code with OpenRouter: https://openrouter.ai/docs/cookbook/coding-agents/claude-code-integration

*Where to find Claire Vo:*

  • ChatPRD: https://www.chatprd.ai/
  • Website: https://clairevo.com/
  • LinkedIn: https://www.linkedin.com/in/clairevo/
  • X: https://x.com/clairevo

_Production and marketing by https://penname.co/_
_For inquiries about sponsoring the podcast, email jordan@penname.co._
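As a rough sanity check on the cost figure quoted in the description, the sketch below turns "$3.36 for roughly 6 million tokens" into a blended per-million-token rate. The function name `cost_per_million_tokens` is made up for illustration. The description does not say how the tokens split between input and output, or how much caching contributed, so the result is only an average across everything and not a published price.

```python
# Figures quoted in the episode description; the token count is approximate.
TOTAL_COST_USD = 3.36
TOTAL_TOKENS = 6_000_000


def cost_per_million_tokens(total_cost_usd: float, total_tokens: int) -> float:
    """Return the blended cost in USD per one million tokens."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return total_cost_usd / (total_tokens / 1_000_000)


# Blended rate across all tokens (input, output, and cached are not separated).
blended_rate = cost_per_million_tokens(TOTAL_COST_USD, TOTAL_TOKENS)
print(f"Blended rate: ${blended_rate:.2f} per 1M tokens")
```

Because the token count is given only as "roughly 6 million", the per-million figure should be read as an estimate.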

Claire Vo, host
Jun 24, 2026, 27m

Episode Details

EPISODE INFO

Released: June 24, 2026
Duration: 27m
Channel: How I AI

SPEAKERS

  • Claire Vo (host): AI content creator and host of the “How I AI” show.

EPISODE SUMMARY

In this episode of How I AI, “GLM 5.2: why I’m replacing Opus in Claude Code,” host Claire Vo tests GLM 5.2 as a low-cost alternative to Opus for coding. GLM 5.2 is presented as an open-weight, text-only model with modern tooling features (reasoning mode, function calling, caching, structured output) and a 1M-token context window.

RELATED EPISODES

How Mozilla Uses Claude Mythos to find Firefox bugs before hackers do

How to write AI agent loops in Claude Code and Codex

Braintrust CEO: Evals are the new PRD for AI products

Claude Fable 5 - is this Mythos model worth the wait?

How to build an iPhone app with zero technical skills

I let Codex run for 6 hours. Here’s what happened.
