Building more effective AI agents

Anthropic’s Alex Albert (Claude Relations) sits down with Erik (Multi-Agent Research and co-author of our blog post, Building Effective Agents) for a discussion on the evolution of agents over the past six months, including tips for building multi-agent systems, common multi-agent patterns, and best practices for using skills, MCP servers, and tools. 00:00 - Introductions 00:35 - Training Claude to tackle agentic tasks 1:30 - Making Claude more autonomous with code 3:20 - Using the Claude Agent SDK to build agents 5:00 - Tips for using Agent Skills 6:40 - The evolution of workflows and agents (workflows of agents) 8:30 - The value of simple agent architectures 9:30 - Building multi-agent systems: orchestrators, subagents, and tool calling 11:40 - Training Claude to use subagents 12:25 - Multi-agent use design patterns: parallelization, MapReduce, and test-time compute 13:20 - Coordinating problem solving with tools and subagents 14:15 - Common agent failure modes 15:00 - Best practices for getting started with building agents (context engineering, MCPs, and tools) 17:15 - The future of agents: coding, computer use, and beyond Read the original blog post: https://www.anthropic.com/engineering/building-effective-agents Learn more about Agent Skills: https://www.anthropic.com/engineering/equipping-agents-for-the-real-world-with-agent-skills

ErikguestAlex Alberthost

Oct 17, 202518mWatch on YouTube ↗

CHAPTERS

0:00 – 0:35
Why multi-agent systems can act like “test-time compute”
Erik frames multi-agent approaches as a way to boost answer quality by having multiple instances of Claude work on a problem. The premise mirrors how groups of people can outperform a single individual by exploring multiple angles and aggregating results.
0:35 – 1:30
How Claude is trained for agentic, multi-step work
Erik explains that Claude’s strength in agent tasks comes from training that explicitly practices open-ended, tool-using, multi-step problem solving. Reinforcement learning is applied across environments so the model learns to iterate, explore, and correct itself before producing an answer.
1:30 – 3:20
Why “great at coding” transfers to other agent tasks
They discuss why Anthropic has emphasized coding: a strong coding agent can indirectly solve many non-coding tasks by writing programs to take actions. Coding becomes a general leverage point for agents to plan, search, and produce structured artifacts.
3:20 – 5:00
Using code to create artifacts faster than direct generation
Alex and Erik highlight practical examples of Claude producing files by writing and running code, like generating spreadsheets or diagrams. Erik notes that for repetitive or detailed artifacts (e.g., complex SVG diagrams), code generation is faster and more reliable than manual-style output.
5:00 – 6:40
Claude Code / Agent SDK as a ready-made agent loop
Erik describes the Claude Code SDK as a polished scaffold that saves developers from reinventing the agent loop: tool execution, file interaction, and integrations. Although branded for coding, they emphasize it’s a general-purpose agent framework that can be customized with your own tools and logic.
6:40 – 8:30
From CLAUDE.md to reusable “Skills” that bundle resources
They introduce Skills as an evolution of instruction files: not just notes, but any reusable assets an agent can draw on. Skills can include templates, helper scripts, images, and other resources—turning one-off context into a durable capability pack.
8:30 – 9:30
From prompt chains to agent loops—and “workflows of agents”
Erik explains the shift from rigid workflows (single-shot steps) to agent loops that iterate based on feedback, improving quality. A newer pattern is “workflows of agents,” where each step in a larger pipeline is itself a closed-loop agent that verifies and retries before handing off.
9:30 – 11:40
Observability and verification push teams toward simplicity
As agent systems become more complex, tracking behavior and debugging becomes harder. Erik argues for starting with the simplest architecture possible and layering complexity only when necessary to preserve observability and control.
11:40 – 12:25
Multi-agent architecture: orchestrators, subagents, and tool-like calls
Erik distinguishes multi-agent systems from sequential “workflows of agents”: multi-agent means multiple Claudes working concurrently under a delegating orchestrator. Subagents appear to the main model as callable tools, useful for parallel search and for isolating long computations from the main context.
12:25 – 13:20
Training Claude to manage subagents like a good manager
They discuss how Claude must learn delegation: early failures resemble first-time managers giving unclear instructions. Training helps Claude provide more context, be more explicit, and request the right outputs so subagent work composes into a strong overall solution.
13:20 – 14:15
Multi-agent design patterns: parallelization, MapReduce, and tool-bucketing
Erik outlines practical patterns for multi-agent use: splitting output generation across subagents, MapReduce-style decomposition, and using multi-agent as test-time compute. Another pattern is tool-bucketing—assigning subsets of many tools to specialized subagents so each one learns a smaller toolset.
14:15 – 15:00
Failure modes: coordination overhead and “organizational” drag
They warn that multi-agent systems can be overbuilt, spending more time coordinating than progressing. Erik compares this to communication overhead in large companies, motivating research into keeping agent organizations effective with minimal chatter.
15:00 – 17:15
Getting started: context engineering and tool design that matches a UI
Erik’s best practices emphasize starting simple and viewing the system from the agent’s perspective by inspecting logs and tool-call transcripts. He also argues tools should mirror user-facing workflows (UI-level primitives) rather than low-level API endpoints to reduce tool-call friction and confusion.
17:15 – 18:57
Where agents are heading: self-verification, coding + computer use, broader domains
They predict agents will expand first in verifiable domains like software engineering, then improve by closing the loop on testing and verification. With computer use, agents can directly operate within tools like Google Docs, reducing copy/paste friction and unlocking more real-world workflows.

Get more out of YouTube videos.

High quality summaries for YouTube videos. Accurate transcripts to search & find moments. Powered by ChatGPT & Claude AI.

Add to Chrome

Why multi-agent systems can act like “test-time compute”

How Claude is trained for agentic, multi-step work

Why “great at coding” transfers to other agent tasks

Using code to create artifacts faster than direct generation

Claude Code / Agent SDK as a ready-made agent loop

From CL​AUDE.md to reusable “Skills” that bundle resources

From prompt chains to agent loops—and “workflows of agents”

Observability and verification push teams toward simplicity

Multi-agent architecture: orchestrators, subagents, and tool-like calls

Training Claude to manage subagents like a good manager

Multi-agent design patterns: parallelization, MapReduce, and tool-bucketing

Failure modes: coordination overhead and “organizational” drag

Getting started: context engineering and tool design that matches a UI

Where agents are heading: self-verification, coding + computer use, broader domains

Get more out of YouTube videos.

From CLAUDE.md to reusable “Skills” that bundle resources