CHAPTERS
Talk framing: from “agentic programming” to “agentic software engineering”
Daisy Holman introduces the goal of the session: moving past simple coding tasks and into the realities of production software engineering with Claude Code. She frames the talk around constraints like organizational context, scale, and the need for customizable agent “harnesses.”
Why customization is necessary: access, knowledge, and tooling gaps
She lays out a core thesis: Claude can’t truly collaborate on your job unless it can do what you can do, meaning reach the same systems, understand the same rationale, and use the same workflow tools. Out-of-the-box Claude Code is fine for greenfield/leaf tasks, but insufficient for large-scale engineering work.
What “access” really means in practice (and a daily audit technique)
Daisy enumerates the external systems Claude needs to interact with to act like a real teammate, especially for incident response and decision tracking. She suggests a practical method: track every time you alt-tab/copy-paste into Claude, then connect Claude to those tools.
Why fine-tuning isn’t the answer; “ICL” as the real lever
She argues that you can’t realistically train institutional memory and rapidly changing conventions into frontier models. Instead, in-context learning (ICL) and structured context engineering are the practical path, even if it feels like “just text files.”
Tooling for agents: building the “agentic IDE” and tighter feedback loops
Daisy compares the current agent editing experience to pre-modern tooling (the line editor “ed”), arguing that agents need the equivalents of syntax highlighting, LSP feedback, and non-blocking nudges. The key is not a smarter model, but a faster, more informative feedback loop.
Tools that scale with intelligence vs tools that compensate for lack of intelligence
She distinguishes between rigid guardrails that constrain behavior and flexible tooling that becomes more useful as models improve. The recommended approach is to design tooling that supports better decision-making without forcing brittle workflows.
Context window engineering: fixed budgets and “don’t pay for what you don’t use”
Daisy grounds customization in the reality that context windows have largely plateaued at frontier scale, so efficient information packaging is essential. She uses an “npm on an Arduino” analogy: limited memory forces intentional, minimal representations.
KV-cache constraints: why prompt “eviction” and LRU ideas break down
She explains why dynamic tool swapping is expensive: changing early prompt content invalidates the cached computation for everything that follows it, so all later tokens have to be recomputed. This leads to a strategy of keeping shared/stable content early and volatile, task-specific content late.
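The ordering argument can be illustrated with a toy model. This is a minimal sketch, assuming a cache that can only reuse an exact leading prefix of the previous prompt; the segment names and the `recompute_cost` helper are hypothetical and do not come from the talk or from any real API.

```python
from typing import List


def cached_prefix_len(previous: List[str], current: List[str]) -> int:
    """Count the leading segments that match exactly (reusable from cache)."""
    n = 0
    for old, new in zip(previous, current):
        if old != new:
            break
        n += 1
    return n


def recompute_cost(previous: List[str], current: List[str]) -> int:
    """Segments that must be recomputed: everything after the shared prefix."""
    return len(current) - cached_prefix_len(previous, current)


# Hypothetical prompt segments: three stable ones and one that changes per task.
stable = ["system rules", "tool schemas", "repo conventions"]

# Stable content first, volatile content last.
stable_first_cost = recompute_cost(
    stable + ["task: fix bug 1"],
    stable + ["task: fix bug 2"],
)

# Volatile content first: the very first segment differs, so nothing is reused.
volatile_first_cost = recompute_cost(
    ["task: fix bug 1"] + stable,
    ["task: fix bug 2"] + stable,
)

print(stable_first_cost, volatile_first_cost)
```

In this toy model, only the changed task segment is recomputed when stable content comes first, while putting the task first forces every segment to be recomputed.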
Plugin primitives at scale: MCP (Model Context Protocol) strengths and limits
MCP is presented as a broadly useful integration standard, especially for public/transport-agnostic tools and auth. But in developer environments with a shell and CLIs, MCP can be overkill, and it scales poorly when many tools’ schemas land in the system prompt.
Scaling MCP with “tool search” (lazy tool loading) and its trade-offs
She describes a mitigation: include only tool names upfront, and let Claude search for tool details when needed. However, generic tools still need to be available, and sparse descriptions can reduce the likelihood Claude knows to search.
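The lazy-loading idea can be sketched roughly as follows. This is an illustration only, not how Claude Code or MCP implements tool search: `Tool`, `LazyToolRegistry`, and the two example tools are hypothetical names invented for the sketch.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Tool:
    name: str
    description: str
    schema: str  # full parameter schema; the expensive part to keep in context


class LazyToolRegistry:
    """Hypothetical registry: names are listed upfront, details on demand."""

    def __init__(self, tools: List[Tool]) -> None:
        self._tools: Dict[str, Tool] = {t.name: t for t in tools}

    def upfront_prompt(self) -> str:
        # Only the names go into the always-loaded prompt.
        return "\n".join(sorted(self._tools))

    def eager_prompt(self) -> str:
        # Baseline for comparison: every description and schema loaded upfront.
        return "\n".join(
            f"{t.name}: {t.description}\n{t.schema}"
            for t in sorted(self._tools.values(), key=lambda t: t.name)
        )

    def search(self, query: str) -> List[Tool]:
        # Full details are returned only when the model asks for them.
        q = query.lower()
        return [
            t for t in self._tools.values()
            if q in t.name.lower() or q in t.description.lower()
        ]


registry = LazyToolRegistry([
    Tool("pager_lookup", "Look up the on-call engineer for an incident",
         '{"service": "string"}'),
    Tool("ticket_create", "Create a ticket in the issue tracker",
         '{"title": "string", "body": "string"}'),
])

print(len(registry.upfront_prompt()), len(registry.eager_prompt()))
print([t.name for t in registry.search("incident")])
```

The trade-off from the talk shows up here too: a bare name such as `pager_lookup` gives the model little signal that a search for incident tooling would be worthwhile.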
Skills as “lazy system prompts”: easy to create, harder to govern at monorepo scale
Skills are framed as folders with markdown guidance and a short system-prompt description that tells Claude when to load the full content. They’re easy to distribute internally, but at very large scale the always-loaded descriptions and lack of hierarchy become bottlenecks.
Hooks: true zero-overhead context injection and the “red squigglies” home
Hooks are presented as the most scalable primitive because they run outside the context window and only inject text when triggered. This makes it feasible to have huge numbers of potential checks without paying token cost for the ones that don’t apply.
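The zero-overhead property can be sketched like this. It is a simplified model under stated assumptions, not Claude Code's actual hook interface: the `Hook` type, the `run_hooks` function, and the example checks are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Hook:
    name: str
    applies: Callable[[str], bool]  # evaluated outside the context window
    message: str                    # text injected only when the hook fires


def run_hooks(hooks: List[Hook], edited_path: str) -> List[str]:
    """Return the messages to inject; hooks that do not apply cost no tokens."""
    return [h.message for h in hooks if h.applies(edited_path)]


# Hypothetical checks; a large codebase could register many more of these.
hooks = [
    Hook("python-style", lambda p: p.endswith(".py"),
         "Reminder: run the formatter on edited Python files."),
    Hook("sql-review", lambda p: p.endswith(".sql"),
         "Reminder: schema changes need a migration plan."),
    Hook("proto-compat", lambda p: p.endswith(".proto"),
         "Reminder: do not renumber existing protobuf fields."),
]

injected = run_hooks(hooks, "app/models.py")
print(injected)
```

Only the one matching message reaches the context here, so the number of registered hooks can grow without increasing the token cost of edits they do not apply to.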
Subagents/agent teams and why unconditional plugin prompts don’t scale
Subagents allow specialized tasks in separate contexts, reducing pressure on the main context window, but their descriptions still add up at scale. Daisy also explains why plugins can’t just ship unconditional CLAUDE.md content: it’s deceptively expensive and would balloon prompts ecosystem-wide.
Where Claude Code is heading internally: parallelism, asynchrony, and “multiple Claudes” workflows
Daisy shifts to how the Claude Code team uses Claude Code to build itself, emphasizing parallel work and longer-running tasks. She presents practical workflow tactics: git worktrees, session renaming/colors for faster context switching, and enabling Claudes to message each other.
Automation features: /loop, permissions mode, Agents view, and remote control
She highlights features that make long-running and hands-off workflows viable: /loop for periodic re-runs, permissions mode to avoid constant prompts, a centralized Agents UI for managing many sessions, and remote control for quick check-ins away from the computer.
Closing takeaways: access, context discipline, and scalable abstractions
Daisy ends with three guidance points that summarize the talk: connect Claude to the same systems you use, engineer context intentionally, and choose plugin abstractions that won’t collapse at monorepo scale. She encourages thinking in terms of 100k+ skills/tools and the token economics of each choice.