Directing the agent: what the human actually does all day

1 July 2026

If the agent writes most of the code, the constitution holds the rules, the docs hold the memory, and the gates decide what's correct, a fair question is what the human is for.

More than I expected, and it's a different job than programming. The work that stayed mine was direction — deciding what to build, and, more subtly, reshaping what got asked for into the form that actually fit the project. The reshaping turned out to be the highest-leverage thing I did.

Intent, not specs

I never wrote specs. I wrote intent, usually a sentence or two: "difficulty rises but the player has no way to catch their breath — give them one, without undoing the climb." The agent, given that plus the constitution and the design log, came back with an implementation and often a sharper design than the sentence implied.

That works because the guardrails absorb the ambiguity. I don't have to pre-specify every edge case, because the rules the agent operates under already rule out the bad resolutions of it. Loose intent is safe when the constraints are tight. It's only dangerous when the agent is improvising in a vacuum, and by this point it never was.

The real job: reshaping the dangerous ask

Here's the move I made dozens of times, and the one I'd most want anyone building this way to internalise. A request comes in — often from a player, sometimes from me — that is reasonable on its face and corrosive if taken literally. The human's job is to catch that and turn it into the version that gives the person what they actually wanted without breaking the game.

Three real cases from this project:

"Add more health." The most natural request there is, and a trap. Flat health is raw power, and the whole design grants options rather than power, so taken at face value it would have flattened the game. Reshaped, it became a health stat you invest build points into — durability with an opportunity cost, an option rather than a freebie. The player who wanted to be tanky gets a tanky build; the game's spine holds.
"Let me buy things between runs." A shop selling quantity is a power-creep treadmill, and it lets a player buy past the very choices the game is built on. Reshaped, the currency buys access and options — new tools, a single carried item — never a stockpile of power.
"Let me flee a fight I'm losing." Reasonable, but paying to skip a mandatory fight lets you buy past a matchup you should have to solve. Reshaped, fleeing was limited to optional content, and conceding a mandatory fight was made to score exactly like dying — a mercy that grants no advantage.

In each case the literal request was wrong and the want behind it was legitimate. Finding the version that honours the want and obeys the rules is judgement the agent doesn't have and a gate can't supply. It's taste, and taste stayed human.

Knowing when to change the rules

The other call that couldn't be delegated was knowing when the rules themselves had to give. The clearest instance: the design eventually wanted permanent equipment carried between runs, which cut against the "options, not power" rule.

An agent won't, and shouldn't, unilaterally overturn a core principle — that's an owner's call. So I made it deliberately: not "ignore the rule this once," but "the rule is amended, and here are the exact conditions under which persistent power is allowed" — conditional, bounded, self-countering, folded into the existing guarantees. The amendment went into the constitution, so every later task is measured against the new rule, and the agent then executed faithfully inside that envelope. Changing the constitution is the human's job; obeying it is the agent's. An agent that edits its own core constraints on a whim is an agent with no constraints.

What it felt like

Less like programming, more like editing a fast, tireless, literal-minded writer. I set direction, caught the requests that would quietly rot the thing, made the calls that were mine to make, and let the agent — bounded by the rules, checked by the gates — do the bulk of the construction. My hands were off the keyboard most of the time and my judgement was engaged all of it. The next post pulls back to the shape of the whole effort: why it came out as a hundred and forty-five small pull requests rather than a few large ones.

▶ Play Deep Keep