At a glance
WHAT IT’S REALLY ABOUT
Spec-first, compacted context workflows make coding agents reliable at scale
- The talk argues that in an AI-coding future, durable specs and plans matter more than transient chat prompts or even raw generated code.
- Evidence from industry and studies suggests AI coding boosts output but often increases rework, especially in complex or brownfield repositories, unless the workflow is redesigned.
- The core technique is “intentional compaction”: proactively distilling research findings and progress into structured artifacts that preserve correctness while controlling context-window noise.
- A three-phase loop—research, plan, implement—combined with frequent resets and human review aims to keep context utilization under ~40% and prevent drift.
- Subagents are framed less as role-played teammates and more as a context-control mechanism that delegates searching/reading and returns tight, structured summaries to the parent agent.
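The compaction loop above can be sketched in miniature. This is an illustrative toy, not an API from the talk: `StubAgent`, the token accounting, and the summary format are all assumptions; only the pattern (monitor utilization, distill progress into an artifact, restart a fresh agent seeded by it) reflects the technique described.

```python
# Toy sketch of a phase loop with intentional compaction at ~40% utilization.

class StubAgent:
    """Stand-in for a coding agent; context is just accumulated text."""
    def __init__(self, seed=""):
        self.context = seed

    @property
    def tokens_used(self):
        return len(self.context.split())  # crude token count for the demo

    def step(self, note):
        self.context += " " + note

    def compact(self):
        # Intentional compaction: distill progress into a small artifact,
        # keeping only what the next agent needs (stubbed as the last note).
        return "SUMMARY: " + self.context.split()[-1]

CONTEXT_LIMIT = 20        # tiny window so the demo triggers compaction
UTILIZATION_CAP = 0.40    # reset before exceeding ~40% utilization

def run_phase(agent, steps):
    """Run one phase (research, plan, or implement) with frequent resets."""
    artifacts = []
    for note in steps:
        agent.step(note)
        if agent.tokens_used / CONTEXT_LIMIT > UTILIZATION_CAP:
            artifact = agent.compact()          # write the progress file
            artifacts.append(artifact)
            agent = StubAgent(seed=artifact)    # fresh context, seeded by it
    return agent, artifacts

agent, artifacts = run_phase(StubAgent(), [f"step{i}" for i in range(12)])
print(len(artifacts))  # prints 1: one compaction occurred at ~45% utilization
```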
IDEAS WORTH REMEMBERING
5 ideas
Treat specs as the primary artifact, not the generated code.
Dex frames “talking to an agent for hours then committing code” as equivalent to checking in a compiled JAR and discarding source; the enduring asset is the written specification and plan that can be reviewed, reused, and versioned.
AI coding fails in complex systems mainly due to misalignment and rework, not raw capability.
Citing a large Stanford study and practitioner experience, he emphasizes that benefits are often erased by sloppy output and rework—especially in legacy/brownfield codebases—unless you prevent misunderstandings early.
Intentional compaction beats ad-hoc restarts and generic auto-summarization.
Instead of “start over with a fresh context” or relying on /compact, the team writes a deliberate progress/research file that captures only what the next agent needs, minimizing bad or noisy context.
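One possible shape for such a deliberate progress/research file is sketched below. The field names and the example contents (including the file path) are assumptions for illustration, not a format prescribed in the talk; the point is that the artifact is structured and curated, not an auto-generated transcript dump.

```python
# Hypothetical structure for an intentional-compaction artifact.
from dataclasses import dataclass, field

@dataclass
class ProgressFile:
    goal: str
    findings: list = field(default_factory=list)        # verified facts only
    open_questions: list = field(default_factory=list)  # known gaps, not guesses
    next_steps: list = field(default_factory=list)

    def to_markdown(self):
        def section(title, items):
            return f"## {title}\n" + "\n".join(f"- {item}" for item in items)
        return "\n\n".join([
            f"# Goal\n{self.goal}",
            section("Findings", self.findings),
            section("Open questions", self.open_questions),
            section("Next steps", self.next_steps),
        ])

pf = ProgressFile(
    goal="Add rate limiting to the API gateway",
    findings=["Requests enter via gateway/middleware.py"],  # illustrative path
    next_steps=["Draft plan for token-bucket middleware"],
)
print(pf.to_markdown())
```

Writing this file by hand (or reviewing it before the next agent reads it) is what makes the compaction "intentional" rather than generic auto-summarization.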
Optimize the context window for correctness first, then completeness, then size.
He proposes a priority order: the worst context contains wrong info, next is missing info, and only after that does “too much noise” matter; this reframes context management as quality control, not just token trimming.
Use subagents to reduce context load from searching and reading.
Subagents can do repository exploration and return structured results (e.g., where something happens, how data flows) so the parent agent can act without ingesting large volumes of code and tool output.
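A minimal sketch of that delegation pattern, under stated assumptions: `spawn_subagent` is a stub, and its response schema and the example paths are invented for illustration, not an API or repository from the talk. What matters is that the parent ingests a compact structured report rather than raw code and tool output.

```python
# Subagents as context control: delegate search/read, receive a tight summary.

def spawn_subagent(prompt):
    """Stub: a real version would run a fresh agent with its own context
    that searches and reads the repo, then answers in the requested schema."""
    return {
        "locations": [
            {"path": "src/auth/login.py", "lines": "40-88",
             "summary": "validates credentials and issues session token"},
        ],
        "data_flow": "request -> middleware -> login handler -> session store",
    }

def locate(question):
    """Parent-side call: a few hundred tokens in, not thousands of lines."""
    return spawn_subagent(
        f"Find where this happens and how data flows: {question}. "
        "Return only file paths, line ranges, and one-line summaries."
    )

report = locate("Where are user sessions created?")
print(report["data_flow"])
```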
WORDS WORTH SAVING
5 quotes
In the future where AI is writing more and more of our code, the specs, the description of what we want from our software, is the important thing.
— Dexter Horthy
I know it works because I shipped six PRs last Thursday, and I haven't opened a non-markdown file in an editor in almost two months.
— Dexter Horthy
I wanna talk about the most naive way to use a coding agent, which is to shout back and forth with it until you run out of context or you give up or you cry.
— Dexter Horthy
Why are we obsessed with context? Because LLMs are pure functions.
— Dexter Horthy
And so the biggest insight from here that I would ask you to take away is that a bad line of code is a bad line of code. And a bad part of a plan can be hundreds of bad lines of code. And a bad line of research, a misunderstanding of how the system works and how data flows and where things happen can be thousands of bad lines of code.
— Dexter Horthy
High quality AI-generated summary created from speaker-labeled transcript.