context engineering, or: why we run five agents instead of one
everyone’s talking about context engineering like it’s a new trick. it isn’t a trick. it’s the whole game now.
prompt engineering was about wording. context engineering is about what the model can actually see while it works: what’s in the window, what’s stale, what’s missing, and who decides. frame it that way and a problem shows up fast. one model holding everything is the wrong tool for the job.
one big context rots
give a single agent a long enough task and you watch it happen live. it loses the thread. it forgets the constraint you set 40 messages ago. it confidently re-adds a bug it already fixed.
researchers even have a name for the gentler version: “lost in the middle.” a model tends to make the best use of info at the start and end of a long context, and is more likely to miss the stuff in between. the more you cram in, the worse the recall tends to get.
so the instinct is bigger window, newer model, smarter prompt. those help at the margins. but you’re still asking one brain to hold the spec, the codebase, the style guide, the review criteria, and its own half-finished work, all at once, all the time. that isn’t something you can prompt your way out of. it’s a shape problem.
five small contexts beat one giant one
the move isn’t to make one context bigger. it’s to make several contexts smaller.
instead of one agent that knows everything badly, you run a few that each know one thing well. a maker that only holds the task and the spec. a reviewer that only holds the output and the criteria, and never sees the maker’s reasoning, so it can’t get talked into a bad answer. a coordinator that holds the plan, not the weeds.
each one runs lean. each one stays sharp. and nobody grades their own homework. the agent that wrote the code is the worst possible judge of whether it’s right, because it’s already convinced. a second agent with fresh eyes and a clean context catches what the first one can’t.
that’s the real unlock. not a smarter prompt. a second opinion.
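a minimal sketch of that split, in python. the class names, the fields, and the reviewer_view helper are all made up for illustration, not the code we actually run. what it shows: the reviewer’s context is built from the output and the criteria only, so the maker’s reasoning has no way in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MakerContext:
    """everything the maker sees: the task and the spec (hypothetical shape)."""
    task: str
    spec: str


@dataclass(frozen=True)
class ReviewerContext:
    """everything the reviewer sees: the output and the criteria, nothing else."""
    output: str
    criteria: str


@dataclass(frozen=True)
class CoordinatorContext:
    """the coordinator holds the plan, not the weeds."""
    plan: tuple[str, ...]


@dataclass(frozen=True)
class MakerResult:
    output: str
    reasoning: str  # stays on the maker's side of the wall


def reviewer_view(result: MakerResult, criteria: str) -> ReviewerContext:
    # deliberately copies the output only; result.reasoning is dropped here
    return ReviewerContext(output=result.output, criteria=criteria)


result = MakerResult(
    output="def add(a, b): return a + b",
    reasoning="trust me, this is fine",
)
view = reviewer_view(result, criteria="must handle ints")
```

the isolation lives in the type, not in a prompt asking the reviewer to ignore things. ReviewerContext simply has no slot for the reasoning.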
we ship this way on purpose
we don’t just write about this. it’s how the work gets done here.
this post, and the page it links to, moved through a chain of agents. one scoped the angle and wrote it. a different one fact-checked every claim against what actually shipped. another sourced the art. a human signed off before it went live. nobody touched a step they weren’t meant to.
you can see the shape of it here: the agentic loops we run. one agent builds, another bounces it back if it’s wrong, and the task only ships once it passes.
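the loop itself is small enough to sketch. this is a toy version under stated assumptions: run_loop, toy_maker, and toy_reviewer are hypothetical names, the “agents” are plain functions standing in for model calls, and max_rounds is an assumed safety cap, not something our pipeline is known to use.

```python
from typing import Callable, Optional, Tuple

# a maker takes the task plus the last round's feedback (None on the first try)
Maker = Callable[[str, Optional[str]], str]
# a reviewer returns None to pass, or a string explaining what's wrong
Reviewer = Callable[[str], Optional[str]]


def run_loop(
    task: str, make: Maker, review: Reviewer, max_rounds: int = 3
) -> Tuple[str, int]:
    """build, review, bounce back with feedback, repeat until it passes."""
    feedback: Optional[str] = None
    for round_no in range(1, max_rounds + 1):
        output = make(task, feedback)
        feedback = review(output)  # the reviewer only ever sees the output
        if feedback is None:
            return output, round_no  # passed: this is what ships
    raise RuntimeError(f"no passing output after {max_rounds} rounds: {feedback}")


def toy_maker(task: str, feedback: Optional[str]) -> str:
    # stand-in for a model call: improves the draft once it gets feedback
    if feedback is None:
        return f"{task}: draft"
    return f"{task}: draft with tests"


def toy_reviewer(output: str) -> Optional[str]:
    # stand-in for a model call: one hard-coded criterion
    return None if "with tests" in output else "missing tests"


shipped, rounds = run_loop("add login", toy_maker, toy_reviewer)
```

with these stand-ins the first draft gets bounced and the second one ships. if nothing passes within the cap, the loop raises instead of shipping something unreviewed.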
the takeaway
context engineering isn’t prompt engineering 2.0. it’s the realization that the limit was never the model’s intelligence. it was how much one context could hold before it started to rot.
the fix isn’t a bigger brain. it’s more of them, each holding less.
five agents. five clean contexts. one that checks the others.
that’s the pitch.