The definition

Agent-Native Engineering

Agent-native engineering is the practice of shipping production software by operating a fleet of AI agents instead of writing the code yourself. The engineer’s primary output shifts from diffs to directed work: every piece of work moves through the full development lifecycle (idea, research, plan, implement, ship), each phase with its own deterministic machinery rather than one generic check at the end. An idea is ephemeral, a backlog item you grow, merge, or throw away; research is the pass over the code you always make before anyone writes a line; a plan is concrete and reviewable; implementation is gated. What’s left for the engineer is the creative end: deciding what to build and why, judging whether the plan is right and whether the result actually works, and keeping the fleet on the rails as the work rides that conveyor belt.

I work this way daily: on production code at Meta, and on my own infrastructure, where an autobuilder drains my personal project roadmap overnight. This page is the definition. The essays linked below are the mechanics, receipts, and failure stories.

The unit of work is the lifecycle, not the diff

You do not hand an agent a prompt and hope. You hand it a claim on a queue item that already carries research and a plan, and you take back a verified change with the commit recorded against it. The diff is a byproduct. The thing you delegate, track, and review is the whole lifecycle of one task.

Verification is the core craft

Done is not a feeling. In an agent-native system, finishing a task mechanically runs a gauntlet: build, tests, deploy, health checks against the running service, then an acceptance check written before the work started. An agent never gets to declare its own work correct. The gates do. Most of the craft in this discipline is designing gates that catch what matters.

Parallelism replaces typing speed

One engineer’s ceiling used to be how fast they could think and type in a single editor. A fleet runs many tasks at once, in isolated worktrees, around the clock. The new ceiling is how much verified work you can specify and review, and that is a different skill with different failure modes.

Judgment moves upstream

Correcting a plan costs a sentence. Correcting a diff costs a review cycle. Correcting production costs a weekend. Agent-native engineers spend attention at the top of that gradient: they kill bad ideas at the queue, read every plan, and read diffs in proportion to blast radius. Attention is the scarce resource, so it goes where the failure modes are expensive.

What it is not

It is not AI replacing engineers. The judgment load goes up, not down. Someone has to decide what should exist, catch the plan that is subtly wrong, and know what never gets delegated: ambiguous product calls, irreversible migrations, anything security-sensitive.

It is not autocomplete rebranded. Autocomplete accelerates typing inside the old workflow, and typing was never the bottleneck. Autocomplete makes an engineer maybe 5% faster. Agent fleets multiply what a team ships, because they take over the phases where the time actually goes.

And it is not shipping unreviewed AI code. Volume of unreviewed code is not productivity; it is deferred incident response. The discipline exists precisely because verification, not generation, is the hard part.

Read the essays

The deep dive on the whole system, and the right place to start:

Coming next in the series (this page will link each as it ships):

Views my own; not speaking for my employer.