
Picking the right model

Hands-on techniques for testing and comparing models against your use case, so you can make a confident call each time a new release ships.

May 21, 2026 · 31m

Episode Details

EPISODE INFO

Released
May 21, 2026
Duration
31m
Channel
Claude

EPISODE SUMMARY

In this episode from the Claude channel, "Picking the right model" explores how to choose AI models using private evals rather than public benchmarks alone. Public benchmarks and online hot takes are only directionally useful and rarely match real production workloads.
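
To make the idea of a private eval concrete, here is a minimal Python sketch of a harness that scores candidate models against your own test cases. It is an illustration under stated assumptions, not the method shown in the episode: `EvalCase`, `run_eval`, `compare`, and the two stand-in model functions are hypothetical names, and in practice each model function would wrap a call to a real model API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class EvalCase:
    """One private test case: a prompt plus a pass/fail check on the output."""
    prompt: str
    check: Callable[[str], bool]


def run_eval(model: Callable[[str], str], cases: List[EvalCase]) -> float:
    """Return the fraction of cases whose output passes its check."""
    if not cases:
        raise ValueError("need at least one eval case")
    passed = sum(1 for case in cases if case.check(model(case.prompt)))
    return passed / len(cases)


def compare(models: Dict[str, Callable[[str], str]],
            cases: List[EvalCase]) -> Dict[str, float]:
    """Score every candidate model on the same set of cases."""
    return {name: run_eval(fn, cases) for name, fn in models.items()}


# Hypothetical stand-ins for real model calls (assumption: a real version
# would send the prompt to a model API and return the text response).
def model_a(prompt: str) -> str:
    return prompt.upper()


def model_b(prompt: str) -> str:
    return prompt


# Toy cases; real ones would be drawn from your own production workload.
CASES = [
    EvalCase("refund policy", lambda out: "REFUND" in out),
    EvalCase("shipping time", lambda out: "SHIPPING" in out),
    EvalCase("hello", lambda out: out == "hello"),
]

scores = compare({"model_a": model_a, "model_b": model_b}, CASES)
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.0%}")
```

Because every candidate runs against the same cases, rerunning the script when a new release ships gives a like-for-like comparison on your own workload.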

RELATED EPISODES

Before we ship a Claude model, these teams try to break it.

Ship your first Managed Agent

How we Claude Code

Agent Battle: Mine the most diamonds in 45 minutes

Evals for taste: Hill-climbing a slide-generation agent

Agents that remember
