Beyond the basics with Claude Code

Name: Beyond the basics with Claude Code
Uploaded: 2026-05-22T00:00:00Z
Duration: 47 min 19 s
Description: Out-of-the-box Claude Code is fine for simple coding, but large-scale software engineering needs customization so the agent can access the same information sources and workflows as engineers.

The mechanics that separate basic Claude Code use from real leverage: CLAUDE.md done well, wiring tools in with MCP, packaging team knowledge as skills, and using auto mode safely.


CHAPTERS

0:20 – 3:22
From agentic programming to agentic software engineering (talk framing)
Daisy Holman introduces the session as an advanced look at Claude Code in real software engineering environments—not just small programming tasks. She frames the goal as learning how to customize an “agentic harness” so agents can operate effectively amid real-world constraints and organizational complexity.
- Talk focus: software engineering workflows vs. simple coding tasks
- Why customization becomes necessary as complexity and stakeholders increase
- Speaker background: Claude Code team, with experience in plugins and agent teams
- Thesis: scaling “ideas to production” with agentic systems
3:22 – 6:24
Why customization matters: access, knowledge, and tooling as the core needs
The talk’s main thesis is that an agent can’t truly help with your job if it can’t do what you can do—especially accessing the same information and systems. Daisy groups the requirements into three categories: access to systems, knowledge of conventions and context, and tooling that tightens feedback loops.
- Three categories: access, knowledge, tooling
- If Claude can’t reach what you can, it can’t be a true collaborator
- Out-of-the-box Claude sees mostly repo + shell; that’s insufficient at scale
- Professional engineering relies heavily on context outside source code
6:24 – 8:56
Access: connect Claude to the places decisions and signals live
Daisy explains that high-quality engineering depends on information that rarely lives in code—like Slack discussions, CI results, dashboards, and internal documentation. She shares a practical method for identifying missing connections: work a full day without leaving Claude’s interface and note every time you alt-tab.
- Bring in Slack/team chat to capture decision rationale and constraints
- Connect CI/CD so agents fix failures instead of humans doing it manually
- Use dashboards/production signals to triage incidents agentically
- Feed meeting transcripts/notes to generate follow-up PRs quickly
- Workflow tip: log every external tool you must use, then integrate it
8:56 – 11:29
Knowledge: why fine-tuning isn’t the answer and ICL is the main lever
Institutional memory, internal vocabulary, and fast-changing conventions can’t realistically be trained into frontier models. Daisy argues fine-tuning is often inefficient and can even increase hallucinations, so teams should rely on in-context learning (ICL) and structured repository-based guidance.
- Codebase conventions and recent changes are hard to “bake into” weights
- Fine-tuning is often costly, slow relative to model churn, and may raise hallucinations
- ICL is the practical mechanism: “text files” + structured context
- The “bitter lesson” framing: general models win; customize via context instead
11:29 – 14:31
Tooling: building the agentic equivalent of IDE feedback loops
Claude Code lacks many ergonomic developer aids humans rely on (syntax highlighting, LSP diagnostics, autocomplete). Daisy reframes the goal as creating an agentic VS Code: fast, overridable feedback (like ‘red squigglies’) that nudges agents toward correctness without hard-blocking them.
- Out-of-the-box edit operations are primitive; tooling must add guardrails
- Red-squiggle feedback should be nudges, not rigid constraints
- Post-tool-use hooks are positioned as the agent’s ‘IDE diagnostics’ layer
- Use existing linters/LSPs/dev scripts to tighten the loop instead of reinventing them
- Generated-file reminders are a good example of overridable guidance
14:31 – 16:33
Tools that scale with intelligence vs. tools that compensate for it
Daisy distinguishes between tools that lock the model into rigid behavior (compensating for intelligence) and tools that become more powerful as models improve. The recommendation is to prefer flexible, overridable tools that amplify stronger reasoning rather than constrain it.
- “Scales with intelligence” tools remain valuable as models get smarter
- Overly strict tools can force unnatural coding order and reduce effectiveness
- Aim for feedback and guidance, not prohibitions
- Better agents come from tighter feedback loops more than smarter models
16:33 – 19:36
Context window engineering: fixed budgets and the ‘Arduino running npm’ analogy
Customization must fit into a largely fixed context window budget (e.g., ~1M tokens), and context size isn’t expanding as quickly as model capability. Daisy argues teams must be intentional: prioritize what goes into context, keep it minimal, and avoid paying for information that isn’t used.
- Context windows are large but effectively fixed; optimization matters
- All customizations ultimately consume context in some form
- You can’t dump an entire monorepo/wiki into context; you need targeted retrieval
- Principle: “don’t pay for what you don’t use” applied to context
19:36 – 21:37
Why caching changes the rules: KV cache and stable-vs-volatile prompt layout
Daisy explains that token cost isn’t uniform: KV caching makes earlier prompt changes expensive because they invalidate cached computation. This constrains “eviction” strategies and pushes toward keeping stable, shared instructions at the start and volatile, task-specific data near the end.
- KV cache affects the cost of generating the next tokens; early edits are expensive
- Naive LRU-like tool/context eviction can backfire under KV caching
- Place stable shared content early; per-task volatile content later
- There’s no complete solution yet; it’s an active design constraint
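The stable-first layout can be illustrated with a small sketch. It is not how any particular cache is implemented: it only counts how many leading tokens two successive prompts share, on the assumption that a prefix cache can reuse work for that shared prefix and nothing after it. Whole words stand in for tokens, and the names `shared_prefix_len` and `build_prompt` are hypothetical.

```python
from typing import List


def shared_prefix_len(prev: List[str], curr: List[str]) -> int:
    """Count leading tokens that are identical in both prompts."""
    n = 0
    for a, b in zip(prev, curr):
        if a != b:
            break
        n += 1
    return n


def build_prompt(stable: List[str], volatile: List[str]) -> List[str]:
    """Stable, shared content first; per-task content last."""
    return stable + volatile


stable = ["system", "rules", "tool", "defs"]
turn1 = build_prompt(stable, ["task", "A"])

# Appending at the end keeps the whole previous prompt as a reusable prefix.
turn2 = build_prompt(stable, ["task", "A", "result", "1"])
reused_after_append = shared_prefix_len(turn1, turn2)

# Evicting an early item ("tool") changes everything after that position.
evicted = build_prompt(["system", "rules", "defs"], ["task", "A", "result", "1"])
reused_after_eviction = shared_prefix_len(turn1, evicted)

print(reused_after_append, reused_after_eviction)
```

In this toy case the append reuses all six earlier tokens, while the early eviction reuses only two, which is why evicting from the front of the prompt can cost more than it saves.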
21:37 – 22:38
Plugin primitives under scale: evaluating MCP, skills, hooks, and agents for monorepos
The talk shifts to a scaling lens: what happens when a monorepo has 10,000–100,000 customizations? Daisy introduces four primitives (MCP, skills, hooks, subagents) and evaluates their token footprint, reliability, and operational overhead in large engineering organizations.
- Scaling question: what breaks at 10k/100k integrations?
- Four primitives: MCP, skills, hooks, subagents
- Key metric: how much must live in the system prompt vs. lazy-loaded
- Goal: disseminate info without filling context too quickly
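A rough calculation shows why the always-loaded portion is the key metric. The budget below uses the talk's ~1M-token example; the per-item figure of 30 tokens is purely an assumption for illustration, not a number from the talk, and the function names are hypothetical.

```python
CONTEXT_BUDGET = 1_000_000    # the "~1M tokens" example budget
TOKENS_PER_DESCRIPTION = 30   # ASSUMED average size of one always-loaded line


def always_loaded_tokens(n_items: int, per_item: int = TOKENS_PER_DESCRIPTION) -> int:
    """Tokens spent before any task-specific content is added."""
    return n_items * per_item


def budget_fraction(n_items: int) -> float:
    """Share of the context budget consumed by always-loaded descriptions."""
    return always_loaded_tokens(n_items) / CONTEXT_BUDGET


for n in (10_000, 100_000):
    print(n, always_loaded_tokens(n), budget_fraction(n))
```

Under these assumptions, 10,000 customizations already take roughly a third of the window, and 100,000 would need three times the whole budget, so anything that must sit in the system prompt per item cannot scale linearly.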
22:38 – 27:44
MCP at scale: great for public integrations, heavy for developer-internal workflows
MCP was designed for chatbot-like environments without shell access, making it useful for external integrations and auth/transport handling. But in Claude Code’s developer environment (with a shell), wrapping existing CLIs in MCP often adds overhead, and tool definitions can dominate the context window at scale.
- MCP assumes no shell/files; Claude Code often already has a CLI path
- Good for shipping public integrations; less ideal for internal dev workflows
- Tool definitions (name/description/schema) consume system prompt tokens
- Tool search helps by lazy-loading schemas, but discoverability isn’t free
- Operational overhead: auth + lifecycle complexity can be unnecessary internally
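To make the cost of a tool definition concrete, the sketch below compares the serialized size of a full definition (name, description, input schema) with a name-plus-description index entry of the kind a lazy-loading scheme might keep resident. The tool `ci_get_failures` and its fields are invented for illustration, and character count is used only as a crude stand-in for tokens.

```python
import json

# Hypothetical tool definition with the three parts named above.
tool_definition = {
    "name": "ci_get_failures",
    "description": "List failing CI jobs for a branch.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "branch": {"type": "string", "description": "Branch name"},
            "limit": {"type": "integer", "description": "Max jobs to return"},
        },
        "required": ["branch"],
    },
}


def serialized_size(obj) -> int:
    """Length of the compact JSON form, a rough proxy for prompt cost."""
    return len(json.dumps(obj, separators=(",", ":")))


# What stays resident if the schema is only fetched when the tool is selected.
index_entry = {
    "name": tool_definition["name"],
    "description": tool_definition["description"],
}

full_size = serialized_size(tool_definition)
index_size = serialized_size(index_entry)
print(full_size, index_size)
```

The index entry is smaller but not free, which matches the point that lazy-loading schemas reduces the footprint without eliminating the cost of discoverability.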
27:44 – 30:17
Skills: ‘lazy system prompts’ that help—but still carry token and triggering costs
Skills are folders with markdown guidance and a one-line description in the system prompt, loaded on demand. They’re easy to create and distribute, but at large scale the always-loaded descriptions add up, and reliably triggering the right skill still requires enough description (or explicit user cues).
- Skills = folder + markdown + summary line; easy to author and share
- Body is pay-per-use; summary line is always loaded (non-zero overhead)
- Trigger reliability trades off with description length and specificity
- Lack of hierarchy/subskills limits scalability; improvements are underway
- Governance challenge: easy creation can lead to quality inconsistency
30:17 – 33:20
Hooks: true zero-token-overhead gating and the best ‘red squiggles’ primitive
Hooks run scripts outside the context window and only inject text when needed, enabling massive scale without ballooning prompts. Daisy positions hooks as the most scalable primitive for diagnostics-style feedback loops (linters, type checks, generated-file warnings), though they can be limited by simplistic matching logic unless you pay for more intelligence.
- Hooks execute locally on events and inject context conditionally
- Zero token cost unless they return text; scales to huge numbers of hooks
- Shifts constraints from context window to local compute capacity
- Best fit for automated nudges: linting, typechecks, guardrails
- Trade-offs: naive regex/word matching; subagent-based decisions cost tokens
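The generated-file reminder mentioned earlier is a convenient way to picture the "silent unless needed" behavior. The sketch below shows only the decision logic, under two loud assumptions: that the hook event is a JSON object carrying the edited path at `tool_input.file_path`, and that generated files contain a `DO NOT EDIT` header. Reading the event from stdin and the mechanism by which returned text reaches the agent are left out, since they depend on the hook configuration.

```python
from typing import Optional

# ASSUMED marker text; many code generators emit a header like this.
GENERATED_MARKER = "DO NOT EDIT"


def file_path_from_event(event: dict) -> str:
    """Pull the edited path out of a hook event (assumed payload shape)."""
    return event.get("tool_input", {}).get("file_path", "")


def reminder_for(file_path: str, head: str) -> Optional[str]:
    """Return an overridable nudge if the file looks generated, else None."""
    if GENERATED_MARKER in head:
        return (
            f"Note: {file_path} looks like a generated file. "
            "Consider changing its source and regenerating instead."
        )
    return None  # stay silent, so nothing is added to the context


# Hypothetical event and file contents, for illustration only.
event = {"tool_name": "Edit", "tool_input": {"file_path": "api/client_gen.py"}}
path = file_path_from_event(event)
print(reminder_for(path, "# Code generated by a tool. DO NOT EDIT.\n"))
print(reminder_for("app/main.py", "import os\n"))
```

The message is phrased as a suggestion rather than a block, in keeping with the nudge-not-prohibition principle, and the simple substring match is an instance of the naive-matching trade-off listed above.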
33:20 – 37:26
Subagents and ‘what not to do’: avoid unconditional CLAUDE.md injection; memory is different
Subagents can offload work to separate contexts, reducing pressure on the main loop—but their short descriptions still accumulate in the parent prompt at large scale. Daisy also explains why plugins can’t simply ship unconditional CLAUDE.md prompt chunks: it’s deceptively expensive and would not scale; session-start hooks can emulate it with explicit cost. She closes by distinguishing plugins (deliberate context-engineering artifacts) from “memory” (lower-quality, short-lived info).
- Subagents: pay mainly for call/result in main context; prompts live elsewhere
- But parent prompt still includes per-agent descriptions; scaling remains a concern
- Unconditional plugin CLAUDE.md would explode prompt size; every plugin would add one
- Session-start hooks can inject always-on text, making the cost explicit
- Plugins are deliberate context-engineering primitives; ‘memory’ is separate/low fidelity
37:26 – 42:00
How Anthropic uses Claude Code: asynchrony, parallelism, and worktrees as the foundation
Daisy describes the next phase of agentic work as running tasks asynchronously and in parallel, which forces humans to improve context switching. Git worktrees enable multiple isolated Claude Code instances that don’t step on each other, and practical UX tricks (session renaming and color) help humans quickly reorient.
- Future workflow: asynchrony (walk away) + parallelism (many tasks)
- Engineering reality: better context switching becomes essential
- Git worktrees enable multiple independent checkouts/Claude sessions
- Session rename + color reduce human context-switch latency
- Long-lived worktrees reduce setup repetition and preserve agent ‘identity’
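One way to set up a separate checkout per task is `git worktree add`, which creates a new working directory on its own branch. The sketch below only builds the command; the sibling-directory layout and the `claude/<task>` branch naming are assumptions chosen for illustration, not conventions from the talk.

```python
from pathlib import PurePosixPath
from typing import List


def worktree_add_command(repo_root: PurePosixPath, task: str) -> List[str]:
    """Build a `git worktree add` command for an isolated per-task checkout."""
    # Hypothetical layout: a sibling directory named <repo>-<task>.
    worktree_path = repo_root.parent / f"{repo_root.name}-{task}"
    # Hypothetical branch naming scheme.
    branch = f"claude/{task}"
    return [
        "git", "-C", str(repo_root),
        "worktree", "add", "-b", branch, str(worktree_path),
    ]


cmd = worktree_add_command(PurePosixPath("/src/app"), "fix-ci")
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`; each session then runs in its own directory, so parallel edits do not collide in a shared checkout.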
42:00 – 46:33
Multi-Claude coordination and automation: send-message, /loop, permissions mode, agents view, remote control
The talk ends with emerging capabilities for managing many agents: Claudes messaging each other, periodic prompting via /loop (cron), and permission automation for long-running workflows. Daisy highlights the Claude Agents UI for monitoring many sessions and remote control for lightweight check-ins to keep overnight work moving.
- Send-message enables Claude-to-Claude knowledge transfer (with permissions)
- /loop runs recurring prompts; useful for babysitting CI and PR pipelines
- Auto permissions mode reduces prompt fatigue and enables overnight work, at extra token cost
- Claude Agents view centralizes monitoring, triage, and jumping into sessions
- Remote control allows quick phone/desktop check-ins to prevent stalls
46:33 – 47:19
Closing takeaways: give access, mind the box, and choose abstractions that scale
Daisy summarizes three guiding principles for effective Claude Code customization in enterprise settings. The focus is on integrating the information you already use, respecting the hard limits of the context window, and selecting plugin abstractions that continue to work when scaled to monorepos and huge tool ecosystems.
- Give Claude access to the systems where work actually happens
- Mind the context window budget and KV-cache cost dynamics
- Prefer scalable primitives (especially hooks) over prompt-bloating ones
- Design for monorepo-scale: 100k skills/tools/agents as a realistic scenario
- Optimization target: tighter feedback loops over more tokens or bigger prompts
