At a glance
WHAT IT’S REALLY ABOUT
How to curate AI context for reliable engineering and agents
- LLMs should be treated like software programs whose outputs depend heavily on the instruction set, tools, and information placed in the context window.
- Very long context windows are not a reliable solution today because performance can degrade sharply as token length increases on even simple tasks.
- “Needle-in-a-haystack” tests overstate long-context capability because they require minimal reasoning and attention to only a tiny portion of the input.
- Curating and focusing context (versus dumping full context) can produce large performance gains, motivating an explicit process for finding, removing, and optimizing information.
- For agents, repeated gather/glean cycles and massive histories make compaction critical, and naive summarization often provides little benefit without smarter prompts and strategies.
IDEAS WORTH REMEMBERING
5 ideasContext engineering is simply deciding what goes in the context window.
It includes prompts, retrieved knowledge, tool outputs, and history—anything the model will condition on for the current turn.
Longer context is not equivalent to better performance.
Chroma’s results suggest model performance can drop markedly as token length grows, so “just add more tokens” can reduce reliability on real tasks.
Needle-in-a-haystack success is a weak proxy for real workloads.
These tests require attending to a tiny “needle” with near-zero reasoning, unlike summarization, multi-document synthesis, or agentic tasks that need broad attention and deeper reasoning.
Focused context can outperform full context by a wide margin.
When the model gets only the most relevant information (even via oracle/human curation in evaluations), accuracy improves significantly—implying that context selection is a primary lever.
Use a two-stage “Gather then Glean” process for context quality.
First maximize recall (collect broadly, tolerate noise), then maximize precision (prune distractions) to deliver a small, high-signal context payload.
WORDS WORTH SAVING
5 quotesAnd though people may want to sell this to you as, uh, a techno machine god, um, we believe it is ultimately just software.
— Jeff
What is Context Engineering? It is quite simply deciding what's in the context window. It's that simple.
— Jeff
So broadly speaking, the goal of context engineering is to, number one, find the relevant information, number two, remove the irrelevant information, and then number three, optimize the relevant information.
— Jeff
But our assertion is that most interesting things people are doing with language models today require either more context or more reasoning or both.
— Jeff
And what we find is that like today's approaches don't really work.
— Jeff
High quality AI-generated summary created from speaker-labeled transcript.
Get more out of YouTube videos.
High quality summaries for YouTube videos. Accurate transcripts to search & find moments. Powered by ChatGPT & Claude AI.
Add to Chrome