Governance and Approval Ladders - Playbook

The shift

In a normal team, control is distributed: people write code they understand, review each other, and catch things in conversation. When one person directs an AI that writes most of the code, all of that collapses into a single question asked over and over: do I let this through, or not?

Get that question right and a solo operator ships safely at the speed of a team. Get it wrong in either direction (waving everything through, or inspecting everything by hand) and you either ship disasters or grind to a halt. The answer is not "approve more" or "approve less." It is a graduated ladder, where the rigor of the gate matches the cost of being wrong.

The ladder, cheapest gate first

Each rung catches a different class of problem. Most changes clear the cheap rungs automatically, so your attention is reserved for the few that need it.

Tool-permission approval. The agent asks before it acts. You allow or deny per action, and you put the safe, repetitive operations (reads, status checks) on an allowlist so you are not rubber-stamping noise. Plan mode is approval in its purest form: the agent works read-only while it proposes a plan, you approve the plan, and only then does it touch code.
Diff review. You read what changed before it merges. This is why the build loop keeps changes small: a small diff can be reviewed, a thousand-line diff gets rubber-stamped.
Automated gates, fail-closed. The machine approves the mechanical stuff and never gets tired. This project's build fails if a feature ships without its help article, without marketing coverage, with a blank feature visual, or with an em-dash in user copy. These gates approve faster and more reliably than you ever could, on exactly the rules you can encode.
Manual smoke. You approve by using it, signed in, on the real device. This is the rung the machine cannot climb for you (see the verification module). A green build is not approval.
Draft, then explicit merge. The pull request opens as a draft and waits for a deliberate human "merge." That gap between "the work is done" and "it is shipped" is where the last human judgment lives. Nothing reaches production by momentum.
Deploy approval. Know exactly which action triggers a production deploy. Here, merging to the main branch is what deploys, so the merge gate and the deploy gate are the same lever. If you do not know which action deploys, you will ship by accident.
Confirm-first for the irreversible. Sending an email, publishing copy, deleting data, spending real money, running a schema migration against the live database. These get an explicit human yes, every time, because you cannot take them back. Even spend has an automated version of this gate: the cost killswitch is approval-by-default-deny once a budget is hit.
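
The cost killswitch in the last rung can be sketched as a default-deny budget check. This is a minimal illustration, not this project's implementation: the class name SpendGate, the cents-based accounting, and the human_reset method are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class SpendGate:
    """Hypothetical default-deny spend gate: once tripped, it refuses everything."""

    budget_cents: int        # assumed budget for the period
    spent_cents: int = 0
    tripped: bool = False    # once True, stays True until a human resets it

    def approve(self, cost_cents: int) -> bool:
        # Fail closed: a tripped switch or a nonsensical cost is a denial.
        if self.tripped or cost_cents < 0:
            return False
        if self.spent_cents + cost_cents > self.budget_cents:
            # The request would break the budget: trip the switch and deny.
            self.tripped = True
            return False
        self.spent_cents += cost_cents
        return True

    def human_reset(self, new_budget_cents: int) -> None:
        # Only an explicit human decision reopens the gate.
        self.budget_cents = new_budget_cents
        self.tripped = False
```

The design choice worth copying is that the switch stays tripped. After the first denial, even a request that would fit in the remaining budget is refused until a person says yes.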

Match rigor to blast radius

The whole system reduces to one rule: how bad is it if this is wrong?

Reversible and small (a copy tweak, a CSS fix): light touch. Allowlist it, glance at the diff, merge.
Reversible and large (a refactor, a new feature): diff review plus a real smoke test plus an explicit merge.
Irreversible, outward-facing, or costly (email, publish, delete, spend, a production migration): a human confirms, always, no exceptions.
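
The three tiers above can be written down as a small lookup, which forces you to answer the blast-radius question explicitly for each change. This is a sketch under assumptions: the Gate names and the three boolean inputs are invented for illustration, and the function returns only the minimum gates the text names for each tier.

```python
from enum import Enum


class Gate(Enum):
    ALLOWLIST = "allowlist"
    DIFF_REVIEW = "diff review"
    SMOKE_TEST = "manual smoke"
    EXPLICIT_MERGE = "explicit merge"
    HUMAN_CONFIRM = "human confirm"


def required_gates(reversible: bool, large: bool, outward_or_costly: bool) -> list[Gate]:
    """Return the minimum gates a change must clear, checking the strictest tier first."""
    # Irreversible, outward-facing, or costly: a human confirms, no exceptions.
    if not reversible or outward_or_costly:
        return [Gate.HUMAN_CONFIRM]
    # Reversible and large: diff review plus a real smoke test plus an explicit merge.
    if large:
        return [Gate.DIFF_REVIEW, Gate.SMOKE_TEST, Gate.EXPLICIT_MERGE]
    # Reversible and small: allowlist it, glance at the diff, merge.
    return [Gate.ALLOWLIST, Gate.DIFF_REVIEW, Gate.EXPLICIT_MERGE]
```

For example, required_gates(reversible=True, large=False, outward_or_costly=False) describes a copy tweak, while a production migration would be called with reversible=False and lands in the human-confirm tier regardless of size.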

In this very build, the most cautious moment was a job-costing feature that carried a live database migration. Everything else moved fast; that one got held, code-reviewed for the specific risk of leaking cost data to customers, and only merged on an explicit go. That is the rule in action: speed everywhere it is safe, a hard stop where it is not.

The anti-patterns

Approval theater. Clicking yes without reading. This is worse than no gate, because it feels safe while catching nothing.
Over-broad allowlists. Auto-approving writes or deploys to cut friction, until something irreversible slips through the gate you widened.
Green-means-go. Treating a passing build or a CI checkmark as approval and skipping the smoke test. (And watch which green light you trust: a deploy can fail while a comment shows a checkmark.)
The approval bottleneck. Parallelizing more agents than you can actually review. Past that point, extra throughput is not speed, just a growing queue of unvetted risk. The right number of parallel agents is the number whose output you can genuinely approve.
No gate on the irreversible. The one place you must never automate away the human.
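
The approval bottleneck is easy to put a number on. The sketch below is back-of-envelope arithmetic, not a measured model: the function name and all three inputs are assumptions, and you would plug in your own review time and agent output.

```python
def max_parallel_agents(review_minutes_per_day: int,
                        minutes_per_review: int,
                        changes_per_agent_per_day: int) -> int:
    """Largest agent count whose daily output fits in the review time available."""
    if min(review_minutes_per_day, minutes_per_review, changes_per_agent_per_day) <= 0:
        return 0  # no review capacity, or nonsense input: run nothing in parallel
    reviews_possible = review_minutes_per_day // minutes_per_review
    return reviews_possible // changes_per_agent_per_day
```

With two hours of review time a day, fifteen minutes per review, and four changes per agent per day, max_parallel_agents(120, 15, 4) gives 2. A third agent would only add to the unreviewed queue.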

The allowlist is where you encode your judgment

The allowlist is not "trust the AI more." It is you pre-deciding which actions are cheap enough to wave through, so your limited attention is spent only on the actions that are not. A well-tuned allowlist is the difference between approving a hundred trivial things by hand and approving the three that actually matter. Build it deliberately, and revisit it when something slips through.
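
An allowlist of this kind can be as small as a list of patterns and a default of "ask." The sketch below assumes a made-up "category:target" naming scheme for actions and hypothetical pattern lists. Real agent tools have their own permission formats, so treat this as the shape of the idea only.

```python
from fnmatch import fnmatchcase

# Hypothetical allowlist: read-only, repetitive actions only.
ALLOWLIST = ["read:*", "git:status", "git:diff", "test:run"]

# Hypothetical patterns that always go to the human, even if the
# allowlist above is widened by mistake.
ALWAYS_ASK = ["deploy:*", "email:*", "db:migrate*", "delete:*"]


def decide(action: str) -> str:
    """Return "allow" for pre-approved actions and "ask" for everything else."""
    if any(fnmatchcase(action, pattern) for pattern in ALWAYS_ASK):
        return "ask"
    if any(fnmatchcase(action, pattern) for pattern in ALLOWLIST):
        return "allow"
    return "ask"  # default: anything not pre-decided goes to the human
```

Two choices carry the judgment here. Unknown actions fall through to "ask," and the always-ask list is checked first so that a careless wildcard in the allowlist cannot auto-approve something irreversible.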

The symmetry worth noticing

The flagship feature of this product is a customer quote-approval flow: the contractor's job becomes approving a draft, not writing one, and the customer's job becomes signing, not negotiating from scratch. That is the exact same principle this module is about, pointed at the user instead of at your own build process. The product teaches its own lesson: the highest-value position in any AI-assisted system is the approver's seat. Design your work so you sit in it.

The honest limit

Governance has a cost, and the cost is friction. Too many gates and you lose the speed that made AI-assisted development worth it. The art is the graduated ladder: automate the cheap, frequent approvals so they clear without you, and spend your human judgment only where a mistake is irreversible or user-facing. The gates are the scalable half of governance. You are the half that cannot be scaled, so spend yourself carefully.

Exercise

Write your ladder. List your common AI-assisted actions and sort each into one of three buckets: auto-clear (allowlist), diff-plus-smoke, or human-confirm-always.
Name your irreversibles. Write down every action in your project you can never take back (send, publish, delete, charge, migrate). Those go in the confirm-always bucket, no exceptions.
Encode one rung as a gate. Take your highest-risk repeating mistake and make it fail the build, so that rung approves itself from now on.
Audit your allowlist once a month. If something slipped through, the allowlist was too wide. Tighten it.
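
For the "encode one rung as a gate" step, here is a minimal sketch of a fail-closed check, using the em-dash rule mentioned earlier as the example. It is not this project's actual build script: the function names, the .txt file layout, and the message format are assumptions.

```python
import sys
from pathlib import Path

EM_DASH = "\u2014"


def check_copy(paths: list[Path]) -> list[str]:
    """Return one failure message per problem; an empty list means the gate passes."""
    if not paths:
        # Fail closed: nothing to check means a misconfigured gate, not a pass.
        return ["no copy files found"]
    failures = []
    for path in paths:
        try:
            text = path.read_text(encoding="utf-8")
        except (OSError, UnicodeDecodeError) as exc:
            # Fail closed: a file we cannot read counts as a failure.
            failures.append(f"{path}: unreadable ({exc})")
            continue
        for lineno, line in enumerate(text.splitlines(), start=1):
            if EM_DASH in line:
                failures.append(f"{path}:{lineno}: em-dash in user copy")
    return failures


def run_gate(copy_dir: Path) -> int:
    """Exit code for the build: 0 only when every file was checked and is clean."""
    # Assumed layout: user-facing copy lives in .txt files directly under copy_dir.
    problems = check_copy(sorted(copy_dir.glob("*.txt")))
    for problem in problems:
        print(problem, file=sys.stderr)
    return 1 if problems else 0
```

Wire it into the build with something like sys.exit(run_gate(Path("copy"))), where the directory name is again hypothetical. What makes it fail-closed is that an empty file list and an unreadable file both fail the build. A gate that passes when it could not look is approval theater in script form.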