YC Root Access

Advanced Context Engineering for Agents

Dexter Horthy, founder of Human Layer, shares what his team has learned about scaling coding agents in real-world software projects. He walks through why naive back-and-forth prompting fails, how spec-first development keeps teams aligned, and why “everything is context engineering.” From compaction strategies to subagents and planning workflows, he shows how intentional context management takes AI coding from prototypes to production.

Chapters:
00:09 - The Origin of Context Engineering
00:46 - Key Talks and Insights from AI Engineering
01:45 - Challenges with AI in Complex Systems
03:12 - The Shift to Spec-First Development
04:03 - Advanced Context Engineering for Coding Agents
04:48 - Intentional Compaction in Context Management
05:45 - Optimizing Context Utilization
07:27 - The Role of Subagents in Context Control
08:48 - Frequent Intentional Compaction
11:00 - Practical Implementation and Workflow
11:12 - Case Study: Fixing a Rust Code Base
11:59 - Insights on Effective Coding Practices
12:44 - Reviewing Features, Research, and Plans
13:30 - Conclusion and Future Directions

Dexter Horthy, host
Aug 25, 2025 · 14m · Watch on YouTube ↗

CHAPTERS

  1. Why “Context Engineering” exists: Dex’s origin story and Twelve-Factor Agents

    Dexter Horthy explains where the term “context engineering” came from and how his earlier work (Twelve-Factor Agents) tried to formalize reliable LLM application principles before the term became popular. He frames the talk as the “what’s next” after basic agent-building advice.

  2. Key industry insights: spec-first thinking and evidence of AI-driven rework

    Dex highlights two influential talks: Sean Grove’s argument that “the spec is the asset” (not the generated code) and a Stanford study showing AI-assisted development often increases rework. These set up the need for better process and context management rather than more prompting tricks.

  3. Why agents fail in big repos: complex systems, brownfield code, and review overload

    He describes firsthand pain: huge PRs of complex systems code (race conditions, shutdown order) that are practically unreviewable. The mismatch between agent output volume and human review capacity forces a shift in how teams align and validate changes.

  4. The forced pivot: adopting spec-first development to regain alignment

    Dex explains how his team was effectively forced into spec-first development to keep everyone on the same page. Over ~8 weeks, they transitioned to validating specs/tests rather than line-by-line code review, unlocking faster shipping without drowning in diffs.

  5. Naive agent usage vs. deliberate resets: knowing when to restart context

He critiques the common “argue with the model until context runs out” workflow. When an agent goes off the rails, restarting can help, but he argues teams should go beyond ad-hoc resets toward structured context practices.

  6. Intentional Compaction: replacing /compact with curated progress artifacts

Dex introduces “Intentional Compaction”: deliberately deciding what gets persisted to disk or memory to onboard the next agent run. He rejects automatic compaction as low-quality and instead promotes a structured progress file that preserves only what matters.
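A minimal sketch of what such a curated progress artifact could look like in code. The `ProgressArtifact` class, its field names, and the example contents are all hypothetical illustrations, not part of the talk; the idea is that a human or agent fills in only what the next fresh context window needs.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of an "intentional compaction" artifact: instead of an
# automatic summary of the whole history, the important state is curated
# explicitly and persisted to disk for the next agent run.
@dataclass
class ProgressArtifact:
    goal: str                                       # what we are trying to do
    done: list = field(default_factory=list)        # steps already completed
    key_files: list = field(default_factory=list)   # files that matter, and why
    next_steps: list = field(default_factory=list)  # what the next run should do

    def render(self) -> str:
        """Render a compact markdown file to hand to the next context window."""
        lines = [f"# Progress: {self.goal}", "", "## Done"]
        lines += [f"- {d}" for d in self.done]
        lines += ["", "## Key files"]
        lines += [f"- {f}" for f in self.key_files]
        lines += ["", "## Next steps"]
        lines += [f"- {s}" for s in self.next_steps]
        return "\n".join(lines)

# Example contents are invented for illustration.
artifact = ProgressArtifact(
    goal="fix shutdown race in worker pool",
    done=["reproduced the race with a stress test"],
    key_files=["pool.rs:141 - shutdown order set here"],
    next_steps=["reorder channel close before join"],
)
```

The next run reads this small file instead of replaying thousands of lines of prior conversation.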

  7. What burns context window: search, reading, tool output noise, and unnecessary blobs

    He breaks down what consumes tokens during agent work—file hunting, understanding flows, edits, and tool outputs (especially large JSON from MCP tools). This motivates being selective about what the model sees and when.
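One simple defense against the tool-output noise he describes is capping what a tool is allowed to return before it enters the window. This helper is an assumed sketch, not anything from the talk; the character-based cap is a crude stand-in for real token accounting.

```python
def trim_tool_output(raw: str, max_chars: int = 2000) -> str:
    """Sketch: cap noisy tool output (e.g. a large JSON blob from an MCP tool)
    before it is appended to the agent's context window."""
    if len(raw) <= max_chars:
        return raw
    # Keep the head, note how much was dropped so the agent knows it's partial.
    return raw[:max_chars] + f"\n… [truncated {len(raw) - max_chars} chars]"
```

In practice one might also summarize the tail instead of dropping it, but the point stands: the model should not see raw blobs it does not need.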

  8. Context as the main lever: optimizing correctness, completeness, and signal-to-noise

    Dex frames LLMs as “pure functions” where output quality is dominated by input quality. He proposes optimizing context for correctness (no bad info), completeness (no missing info), and minimal noise, with “trajectory” as a softer, vibe-based factor.
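Treating the model as a pure function of its input suggests assembling context deliberately rather than accumulating it. The greedy helper below is a hypothetical sketch: `signal` is an assumed per-item relevance score, and the chars-per-token divisor is a rough heuristic, not a real tokenizer.

```python
def assemble_context(items, budget_tokens):
    """Greedy sketch: keep the highest-signal items that fit the token budget.

    `items` are (text, signal) pairs, where signal is a hypothetical
    relevance score in [0, 1] supplied by whatever ranks candidate context.
    """
    chosen, used = [], 0
    for text, signal in sorted(items, key=lambda it: it[1], reverse=True):
        cost = len(text) // 4  # rough chars-per-token heuristic (assumption)
        if used + cost <= budget_tokens:
            chosen.append(text)
            used += cost
    return chosen
```

The effect is the one he argues for: high-signal items crowd out noisy blobs instead of the other way around.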

  9. Subagents as context control: delegating search and synthesis without bloating the parent

    He explains subagents not as role-play (PM/data scientist personas) but as a mechanism to keep the parent agent’s context lean. Subagents can search and summarize, returning a compact, structured answer so the parent can act without ingesting tons of raw code.
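The shape of a subagent call, sketched under assumptions: in a real system the function would spawn a fresh agent with its own context window; here it is stubbed with invented data. Only the compact structured answer crosses back into the parent's context.

```python
import json

def subagent_find_symbol(question: str) -> str:
    """Hypothetical subagent call: the child may read thousands of lines of
    code in its own context window, but returns only a small structured answer.
    Stubbed here with invented example data."""
    return json.dumps({
        "question": question,
        "answer": "shutdown order is decided in the worker pool",
        "locations": [{"file": "pool.rs", "line": 141}],
    })

# The parent agent only ever sees the compact JSON, never the raw search output.
result = json.loads(subagent_find_symbol("where is shutdown order decided?"))
```

The structured return format matters: file-and-line locations let the parent act directly without re-searching.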

  10. Frequent intentional compaction: a workflow built around staying under 40% context

    Dex presents the team’s core operating model: keep context utilization under ~40% and iterate through research → plan → implement cycles. Instead of carrying giant histories forward, they repeatedly compact into artifacts and start fresh windows.
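The ~40% heuristic from the talk reduces to a trivial check; the function names and the idea of wiring it into a loop are assumptions for illustration.

```python
def utilization(used_tokens: int, window_tokens: int) -> float:
    """Fraction of the context window currently consumed."""
    return used_tokens / window_tokens

def should_compact(used_tokens: int, window_tokens: int,
                   threshold: float = 0.4) -> bool:
    """Per the talk's heuristic: once utilization crosses ~40%, compact the
    state into an artifact and start a fresh window rather than pushing on."""
    return utilization(used_tokens, window_tokens) >= threshold
```

A research → plan → implement loop would call `should_compact` between steps and, when it fires, write a progress artifact and restart rather than carrying the full history forward.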

  11. Research and planning artifacts: file/line-numbered research + explicit change plans

    He details what “good” research and plans look like: research outputs cite file names and line numbers to prevent repeated searching; plans enumerate changes, affected files, and verification steps. If the plan is strong, implementation becomes smooth and low-conflict.
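The file-and-line-number rule lends itself to a mechanical check. This validator is a hypothetical sketch of that rule: it flags research bullets that lack a `file.ext:line` citation, so the implementing agent never has to repeat a search.

```python
import re

# Matches citations like "pool.rs:141" (assumed format for this sketch).
CITATION = re.compile(r"\S+\.\w+:\d+")

def validate_research(doc: str) -> list:
    """Return research bullets that are missing a file:line citation."""
    missing = []
    for line in doc.splitlines():
        if line.startswith("- ") and not CITATION.search(line):
            missing.append(line)
    return missing
```

Running such a check before handing research to the planning step turns "good research" from a vibe into an enforceable property.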

  12. Human review is still central: review specs/plans for mental alignment, not diffs

    Dex reframes code review’s purpose as team alignment—understanding how and why the system changes. Reviewing a 200-line plan or research doc is feasible and catches issues earlier than reviewing thousands of lines of generated code.

  13. Proof it works: Rust brownfield fix, complex feature shipping, and what’s next

    He shares results: a one-shot-style fix in a 300k-line Rust codebase that was merged quickly, and a day-long session shipping ~35k LOC including complex work like Wasm support. He closes by arguing tooling will commoditize, but team/process transformation will be the durable advantage.
