Who's Writing the Context Window?

Right now, somewhere, an agent is mid-loop. Its context window holds tens of thousands of tokens. A person wrote a couple hundred of them — the system prompt, the original task — and walked away. Everything else assembled itself in the last few milliseconds: files read off disk, the results of the last tool call, a compressed memory of earlier turns, the schemas for everything the agent is permitted to touch. On the next cycle most of it will be discarded and built again. The thing writing that window is not a person. It is a program, running the same assembly thousands of times while no one watches.

There is a popular diagram that cannot see this.

You have seen it, or one of its cousins. Three concentric rings. At the center, prompt engineering: how to phrase the instruction. Around it, context engineering: what else to put in the window — retrieved documents, history, tool definitions. Around that, the harness: the loop, the guardrails, the tool execution. The whole thing resolves to a tidy equation. Agent equals model plus harness. It is a fair attempt to order a field moving faster than its own vocabulary. It is also drawn from the design surface, not from the running system. You can tell because the rings are sorted by the wrong thing. They go by scope — small things at the center, big things at the edge.

Scope is the wrong axis. It files identical work in different rings and unrelated work in the same one. That is what a map looks like when it is traced from other maps instead of from the territory. The axis that matters is not how big the unit of work is. It is who performs it, and when.

There are two regimes and a line between them. Above the line, a human sits and composes text for a single exchange. They tune a prompt. They choose what to paste in. They read the output and decide what to send next. This is craft, done by hand, one interaction at a time.

Below the line none of that happens, because there is no one there. A program composes the text. It runs on every turn of a loop that may last hours, and it makes the same calls a person would (what to include, what to cut, what to put first) at a cadence no human could match, on inputs no human will ever see.

Prompt engineering lives above the line. So does the context engineering people actually picture when they say the words: a person deciding what to paste, retrieve, summarize, or drop. The harness lives below it.

And here is what the rings get backwards. Below the line, context engineering is not a smaller thing the harness contains. It is the thing the harness does. What goes in the window, in what order, at what fidelity, and what gets evicted to make room: that is most of a harness's work on any given turn. The harness is not a layer wrapped around context engineering. It is context engineering, automated and set running.

The human does not stop working. The work changes shape. You no longer curate a window; you write the policy by which a window will be curated, on every turn, without you. That is a different job, and it is the one "harness engineering" actually names.
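What "writing the policy" might look like, in miniature. This is a sketch under loud assumptions, not any real harness's code: the `Item` type, the `priority` and `summary` fields, and the four-characters-per-token estimate are all invented for illustration. It shows the four decisions named above: what goes in, in what order, at what fidelity, and what gets evicted.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """One candidate piece of context (hypothetical structure)."""
    name: str
    text: str
    priority: int        # lower number = admitted earlier
    summary: str = ""    # optional lower-fidelity version of the same content


def estimate_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: roughly four characters per token.
    return max(1, len(text) // 4)


def build_window(items: list[Item], budget: int) -> list[tuple[str, str]]:
    """Admit items in priority order until the token budget is spent.

    An item that does not fit at full fidelity falls back to its summary.
    An item that fits in neither form is evicted (simply left out).
    """
    window: list[tuple[str, str]] = []
    used = 0
    for item in sorted(items, key=lambda i: i.priority):
        for candidate in (item.text, item.summary):
            if not candidate:
                continue
            cost = estimate_tokens(candidate)
            if used + cost <= budget:
                window.append((item.name, candidate))
                used += cost
                break
    return window
```

The person who wrote `build_window` never sees the windows it produces. They chose the ordering rule and the fallback once; the program applies them on every turn.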

Watch a harness run and the inversion stops being abstract. Picture a single turn. The window holds the original task, the current to-do list, the diff from the last edit, three source files the model asked to see, and the schema for one shell tool. The model runs a command. It fails. The next turn's window is not the same window. Two of the three source files turned out to be irrelevant and are gone. The error log is in. The previous turn's reasoning has been crushed to a two-line summary, and a different tool is now exposed. The task never changed. The prompt barely moved. The window was rebuilt.

Take the Ralph Loop, Geoffrey Huntley's pattern for driving an agent from a bash loop. Its defining choice is statelessness — not of the work, which persists on disk, but of the context window, which resets every iteration. The agent is handed nothing it remembers, because it remembers nothing: the loop rebuilds the working context from the filesystem, feeds it in fresh, runs one pass, lets the agent write its changes back to disk, and throws the window away. Then it does the whole thing again.

Notice where the continuity of the work lives. The prompt at the top of the loop barely moves between iterations. The model is fixed. What changes — what carries the work forward — is the state on disk and the loop's discipline in reconstituting it each time. Pull the loop out and you have a model that forgets everything between turns. Pull the model out and you still have a precise theory of what context this task needs and how to rebuild it from nothing. The Ralph Loop is a context-engineering machine with the model held almost perfectly still.
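The shape of that loop can be sketched in a few lines. Huntley's pattern is a bash loop around a real agent; what follows is a Python transposition for illustration only, and `rebuild_context`, `ralph_loop`, and the `agent` callable are hypothetical names, with `agent` standing in for a model call that returns the files it wants written.

```python
from pathlib import Path
from typing import Callable


def rebuild_context(prompt: str, workdir: Path) -> str:
    """Reconstitute the window from the fixed prompt and the files on disk."""
    parts = [prompt]
    for path in sorted(workdir.iterdir()):
        if path.is_file():
            parts.append(f"--- {path.name} ---\n{path.read_text()}")
    return "\n\n".join(parts)


def ralph_loop(
    prompt: str,
    workdir: Path,
    agent: Callable[[str], dict[str, str]],  # assumed: context in, {filename: new content} out
    iterations: int,
) -> None:
    for _ in range(iterations):
        # A fresh window every pass, built only from the prompt and the disk.
        context = rebuild_context(prompt, workdir)
        # One pass. The agent sees the window and nothing else.
        changes = agent(context)
        # The work persists on disk, not in the window.
        for name, content in changes.items():
            (workdir / name).write_text(content)
        # Nothing is carried to the next iteration except what was written.
```

Note what the loop does not keep: there is no history list, no running transcript. The only variable that survives an iteration is the directory.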

Skills make the move from the other direction. A skill is a bundle of instructions the agent loads only when the task calls for it; the rest of the time it never enters the window at all. The harness reads the situation, decides this skill belongs and those forty do not, and admits only what the task requires. Where the Ralph Loop rebuilds context by reconstruction, a skill manages it by selective admission — the window stays small not because little exists but because the harness withholds everything that hasn't earned a place in it.
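Selective admission can be sketched the same way. Everything here is an assumption made for illustration: the `Skill` type, the `triggers` field, and above all the keyword match, which stands in for whatever mechanism a real harness uses to decide that a skill belongs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    """A bundle of instructions held outside the window (hypothetical structure)."""
    name: str
    triggers: frozenset[str]   # assumed: words in the task that signal relevance
    instructions: str


def admit_skills(task: str, skills: list[Skill]) -> list[Skill]:
    # Naive whitespace tokenization; punctuation attached to a word will not match.
    words = set(task.lower().split())
    return [s for s in skills if s.triggers & words]


def render_window(task: str, skills: list[Skill]) -> str:
    """Build a window containing the task and only the skills that were admitted."""
    admitted = admit_skills(task, skills)
    sections = [task] + [f"[skill: {s.name}]\n{s.instructions}" for s in admitted]
    return "\n\n".join(sections)
```

Forty skills can sit in the list. A task that mentions only one of them produces a window that contains only one of them.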

Two mechanisms, opposite in motion, identical in kind. Both are the harness deciding what the model gets to see. That is the same act a person performs by hand when they curate a prompt, handed to a program — run at the program's speed, on the program's judgment. There is a reason the region below the line exists, and the field has had it in hand all along. It just filed it in the wrong place.

Meta-prompting was the first version of this move, at the smallest possible scale. The idea was modest: stop writing the prompt by hand, and write something — a template, a model — that writes the prompt for you. It got treated as a clever trick tucked inside prompt engineering. A footnote.

But the footnote was the pattern. Take a piece of context-construction a human had been doing by hand and give it to a program. Meta-prompting did this for the prompt. The harness does it for everything else — the retrieved documents, the memory, the tool results, the ordering, the eviction — and not once but on every turn of a running loop. The harness is to context engineering what meta-prompting is to prompting: the same task, lifted out of human hands and given to a machine.
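At its smallest, the move is a template. The sketch below uses Python's standard `string.Template`; the template wording and the `write_prompt` function are invented for illustration and come from no particular system.

```python
from string import Template

# Hypothetical template: a person writes this once instead of writing each prompt.
PROMPT_TEMPLATE = Template(
    "You are reviewing $language code.\n"
    "Focus on: $focus.\n"
    "Respond with at most $max_items findings."
)


def write_prompt(language: str, focus: list[str], max_items: int = 5) -> str:
    """Produce a prompt from parameters; the program, not a person, writes the text."""
    return PROMPT_TEMPLATE.substitute(
        language=language,
        focus=", ".join(focus),
        max_items=max_items,
    )
```

Swap the parameters for retrieved documents, memory, and tool results, and call the function on every turn instead of once, and the template has become a harness.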

Which means the rings were never levels of scope. They were stills from a migration, and meta-prompting — the footnote — was the first frame. Line the stages up: prompt by hand, then meta-prompting, then context engineering, then the harness. That is not a hierarchy of sizes. It is the same work crossing the same line, one piece of the context window at a time, from the side where a person does it to the side where a program does. The diagram caught the movement at a single instant and mistook the still for a structure.

The line is not fixed. It has been moving since the first time someone automated a prompt, and it moves one way: down. Every year more of the work that used to sit above it — done by a person, by hand, for one interaction — crosses to the side where a program does it continuously and unseen. Retrieval went first. Memory followed. The ordering, the summarization, the eviction are crossing now.

So the question the diagram should have asked, and didn't, is not how the layers stack. It is how much remains on the human side of the line, and what becomes of that side when the harness can assemble its own context better than any policy you could write for it. Nobody is writing that window by hand for much longer.