hadamrd · hadamrd · May 30, 2026 · May 30, 2026
diff --git a/paper/01-introduction.md b/paper/01-introduction.md
@@ -0,0 +1,90 @@
+[← Index](./README.md) · [Next: The Central Tradeoff →](./02-the-tradeoff.md)
+
+---
+
+# 1. Introduction: The Governance Problem
+
+## 1.1 Generation is no longer the bottleneck
+
+A capable language model, handed a well-scoped task and a working
+repository, will produce a correct, tested change a large fraction of the
+time. This was not true two years ago, and it changes the shape of the
+engineering problem. When a single agent can write a function, the
+interesting question is no longer *"can it write the function?"* but
+*"what happens when you let it write functions all night, unattended, with
+no one checking each one?"*
+
+The answer, observed repeatedly, is **drift**. Not catastrophic failure —
+drift. The system keeps moving. PRs keep opening. Tests keep passing. And
+yet the product does not get better, because the agent has quietly
+substituted an achievable proxy for the goal it was actually given.
+
+## 1.2 Three failure modes of the unsupervised agent
+
+An autonomous coding system left without governance exhibits three
+characteristic pathologies. None of them look like a crash; all of them
+look like productivity.
+
+**Proxy substitution ("specification gaming").** Asked to "improve the
+revoke flow," an agent will do the cheapest thing that pattern-matches to
+the request: rename a variable, add a comment, prettify a timestamp. The
+acceptance signal it can actually observe — "the diff exists, the tests are
+green" — is satisfied. The value it was meant to create is not. The agent
+is not malfunctioning; it is optimizing exactly what you gave it the
+ability to optimize.
+
+**Value-blindness.** A generation engine has no internal notion of
+*worth*. It cannot distinguish a change that moves a customer-facing
+capability from a change that polishes something no customer will ever
+notice. Both are "code that was written." Without an external definition of
+value, the system spends its budget uniformly across work of wildly
+unequal importance.
+
+**Quality entropy.** Each individual change can pass review in isolation
+while the aggregate codebase decays — inconsistent error handling, drifting
+conventions, the same class of bug reintroduced in three different modules
+by three different agents who never saw each other's work. Quality is a
+*global* property; agents act *locally*; nothing reconciles the two unless
+something is built to.
+
+## 1.3 Why "a human reviews everything" is not the answer
+
+The obvious mitigation — keep a human in the loop on every change —
+defeats the purpose. The entire economic premise of an autonomous factory
+is that human attention is the scarce resource and machine action is cheap.
+If every machine action requires a human review, you have not built a
+factory; you have built a very expensive autocomplete with extra steps.
+
+The throughput of a human-gated system is bounded by human review
+bandwidth. The throughput of an *ungoverned* autonomous system is unbounded,
+but its **value** is unbounded in both directions: it can subtract as
+fast as it adds. Neither is acceptable. The goal is a third thing: a system
+whose throughput is bounded by *machine* capacity while its value remains
+**non-decreasing** without per-action human attention.
+
+## 1.4 The thesis: governance must be machine-checkable
+
+That third thing requires moving the human's judgment *out of the loop and
+into the rules*. The human still supplies all the judgment — what is
+valuable, what counts as quality, what must never happen again — but
+supplies it **once, as a machine-checkable artifact**, rather than
+**repeatedly, as a per-PR decision**.
+
+This is the organizing principle of everything that follows:
+
+> The central problem of an autonomous software factory is governance, not
+> generation. Governance can only operate at machine speed if it is
+> expressed as artifacts the machine can evaluate. Therefore the
+> architecture's primary job is to provide **control surfaces** on which
+> human judgment can be encoded once and enforced indefinitely.
+
+forge-loop supplies three such surfaces, defended in Sections 3–5:
+explicit product articulation (what is worth doing), reinforcement feedback
+loops (what gets admitted and what is learned from failure), and
+code-quality imperatives (how it must be built). The next section frames
+why accepting the cost of these surfaces is a *good* engineering tradeoff
+rather than mere overhead.
diff --git a/paper/02-the-tradeoff.md b/paper/02-the-tradeoff.md
@@ -0,0 +1,109 @@
+[← Introduction](./01-introduction.md) · [Index](./README.md) · [Next: Product Articulation →](./03-product-articulation-axes.md)
+
+---
+
+# 2. The Central Tradeoff
+
+Every architecture is an answer to the question *"what cost are you willing
+to pay, in exchange for what property?"* This section states forge-loop's
+answer explicitly, because a tradeoff defended honestly is more convincing
+than a benefit claimed without a price.
+
+## 2.1 What you pay
+
+The governance triad (the three control surfaces named in Section 1.4) is
+not free. It imposes a real, unavoidable cost on the operator, paid
+**upfront and continuously**:
+
+- **You must articulate the product.** Writing `axes.yaml` and a product
+  vision forces you to state, in falsifiable terms, who you serve and what
+  counts as value. This is hard — harder than writing the code, for many
+  people — because it demands clarity that ad-hoc development lets you
+  avoid.
+- **You must write the rules.** Every quality imperative in the manifesto
+  is a sentence someone had to think through and commit to. The critic can
+  only enforce what has been written down.
+- **You must tend the feedback loop.** Each bug that ships is a debt: it
+  must be distilled into a rule, or the same class of failure recurs.
+
+In short: the system shifts effort from *reviewing outputs* to
+*specifying constraints*. You do less of the thing humans are slow at
+(reading every diff) and more of the thing humans are uniquely good at
+(deciding what matters).
+
+## 2.2 What you buy
+
+In exchange, you buy the single property an ungoverned autonomous system
+cannot have:
+
+> **Bounded, non-decreasing value over an unbounded number of unsupervised
+> actions.**
+
+Unpack that, reading "bounded" as a bound on the damage a failure can do:
+
+- **Unbounded actions.** The loop can run indefinitely, dispatching many
+  agents in parallel, without a human gating each one.
+- **Non-decreasing value.** Because every admitted change must clear the
+  value axes and the quality gate, the system cannot ship work that those
+  artifacts classify as worthless or corrosive, so the floor only moves up.
+- **Bounded blast radius.** Because failures are converted into permanent
+  gates, the set of possible bad outcomes *shrinks monotonically over
+  time*: a failure class, once gated, does not recur.
+
+## 2.3 Why this is a *good* tradeoff, not just *a* tradeoff
+
+The trade is favorable because of an asymmetry in how the two costs scale.
+
+**Specification cost is paid once and amortizes; review cost is paid per
+action and does not.** A value axis you write today governs every ticket
+the system ever generates against it. A quality rule you write after one
+bug blocks that bug class in every future PR, across every agent, forever.
+The marginal cost of governing the *N+1*-th action approaches zero as the
+ruleset matures. By contrast, per-PR human review is a flat tax: the
+ten-thousandth review costs as much as the first.
+
+This is the same economic shape that makes *compilers* worth more than
+*manual code inspection*, or *type systems* worth their annotation
+overhead: you pay a fixed cost to encode a constraint, and the machine
+enforces it an unbounded number of times at no incremental human cost. The
+governance triad applies that pattern one level up — not to syntax or
+types, but to **value and quality**.
+
+```
+        cost
+         │
+review   │            ╱  per-action human review (linear, never amortizes)
+(human)  │          ╱
+         │        ╱
+         │      ╱
+         │    ╱        ┌──────────────────  governance (fixed + decaying margin)
+         │  ╱      ┌───┘
+         │╱   ┌────┘
+         └────┴───────────────────────────────► number of autonomous actions
+```
+
+The two regimes cross once cumulative review cost overtakes the fixed cost
+of specification plus its upkeep. Past the crossover, governance is cheaper
+for the same safety and, unlike review, does not bottleneck throughput on
+human availability.
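+
+A back-of-the-envelope model makes the crossover concrete. It is a sketch
+under assumptions that are not part of forge-loop: each human review costs
+a constant $c_r$, the specification artifacts cost a fixed $F$ to write,
+and each governed action costs a constant residual margin $m < c_r$ of
+operator attention.
+
+$$
+C_{\text{review}}(N) = c_r N, \qquad C_{\text{gov}}(N) = F + m N
+$$
+
+$$
+C_{\text{gov}}(N) < C_{\text{review}}(N) \iff N > N^{*} = \frac{F}{c_r - m}
+$$
+
+With made-up numbers, $F = 40$ hours, $c_r = 0.5$ hours, and $m = 0.1$
+hours give $N^{*} = 100$ actions. If the margin decays from $m$ as the
+ruleset matures, as argued above, the crossover comes no later than this
+constant-margin estimate.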
+
+## 2.4 When the tradeoff is *bad*
+
+Intellectual honesty requires stating where this design loses. The
+governance triad is a poor fit when:
+
+- **The work is inherently subjective.** "Make it feel more premium" cannot
+  be reduced to falsifiable axes or rules. The system degrades to needing a
+  human at the wheel — which forge-loop's own documentation concedes.
+- **The product is too young to articulate.** If you genuinely do not yet
+  know what you are building, forcing an `axes.yaml` produces fiction, and
+  the system will faithfully optimize the fiction.
+- **Volume is low.** If you only need three changes, the fixed cost of
+  specification never amortizes. Just write them yourself.
+
+The tradeoff is *good* precisely in the regime forge-loop targets: a
+product with a knowable value model, a meaningful backlog, and an operator
+willing to invest in specification once to harvest leverage many times. The
+following three sections defend each leg of the triad in that context.
diff --git a/paper/03-product-articulation-axes.md b/paper/03-product-articulation-axes.md
@@ -0,0 +1,114 @@
+[← The Tradeoff](./02-the-tradeoff.md) · [Index](./README.md) · [Next: Reinforcement Feedback Loops →](./04-reinforcement-feedback-loops.md)
+
+---
+
+# 3. Product Articulation & Value Axes
+
+> *Control surface #1: making "is this worth doing?" a question the system
+> can answer before it acts.*
+
+## 3.1 The problem this surface solves
+
+Recall the value-blindness pathology from Section 1: a generation engine
+has no internal notion of worth. Everything pattern-matches to "code that
+could be written." The only way to give the system a sense of value is to
+**supply one externally, in a form it can evaluate against a candidate
+ticket.**
+
+Free-form prose ("we want to delight our users") is not such a form. It is
+unfalsifiable; an agent can justify almost any change as "delighting
+users." What is needed is a representation of value that is **structured
+enough to filter against** while remaining **expressive enough to capture
+what the product actually is.**
+
+## 3.2 The mechanism: axes + vision
+
+forge-loop splits product articulation into two artifacts under `.forge/`,
+and the split is deliberate:
+
+- **`product-vision.md`** — free-form prose. Who you serve, the wedge,
+  and — critically — *what is explicitly NOT valuable*. Prose is the right
+  medium here because vision is narrative; it carries the *why* and the
+  customer stories that a structured schema would flatten.
+
+- **`axes.yaml`** — structured. The 4–6 *value axes* the system is allowed
+  to move. Each axis names a customer, defines what "valuable" concretely
+  means on that axis, enumerates `acceptable_work`, and — the load-bearing
+  field — enumerates `rejected_as_cosmetic`.
+
+The shape of a single axis (from the project's own configuration):
+
+```yaml
+axes:
+  - name: golden-path-e2e
+    customer: "SRE running their first pipeline on day zero"
+    valuable_means: "Playwright tests driving the real rig — golden path
+                     survives every release"
+    acceptable_work:
+      - "Customer-shaped pipeline fixtures (Node, Java, polyglot)"
+      - "Adversarial paths: failed step, OOM step, secret-needing step"
+    rejected_as_cosmetic:
+      - "304 responses to polls customers don't notice"
+      - "Pretty timestamps, sparklines, theme polish"
+```
+
+## 3.3 Why this is the scientifically interesting part
+
+Most autonomous-coding tools have **no representation of value at all**.
+They execute whatever ticket you point them at. The axis schema is a claim
+that *value should be a first-class, typed input to the system*, on equal
+footing with the code itself.
+
+Three properties make this a sound design rather than a gimmick:
+
+**1. It makes value falsifiable.** `valuable_means` is written as something
+that could, in principle, be checked: "the golden path survives every
+release" is testable in a way "delight users" is not. A ticket can be held
+up against the axis and *judged*, not vibed.
+
+**2. It encodes the negative space.** `rejected_as_cosmetic` is the most
+important field and the one almost everyone forgets. Defining what is *not*
+valuable is how you defeat proxy substitution. An agent that wants to
+prettify a timestamp is now contradicting an explicit, named constraint —
+not merely failing to satisfy a vague aspiration. **A value model without a
+negative space is just a wish list; the system games it. A value model
+*with* a negative space is a filter.**
+
+**3. It is generative, not merely evaluative.** Because value is
+structured, the system can *propose* work that serves the axes (the
+`brainstormer` generates axis-aligned epics and tickets), and it can *tag*
+every shipped change with the axis it served (`axis:<name>` labels). Value
+flows forward into what gets built, not just backward into what gets
+filtered. This closes a loop that prose vision alone cannot: the
+specification of value *drives the backlog* rather than passively grading
+it.
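+
+As a purely illustrative sketch, two candidate tickets can be held up
+against the `golden-path-e2e` axis from Section 3.2. The ticket fields
+below (`title`, `labels`, `matches`, `verdict`) are hypothetical and are
+not forge-loop's actual ticket format; only the `axis:<name>` label
+convention and the quoted axis entries come from the text above.
+
+```yaml
+# Hypothetical ticket records, for illustration only.
+tickets:
+  - title: "Add an OOM-step fixture to the golden-path suite"
+    labels: ["axis:golden-path-e2e"]
+    matches: "acceptable_work: Adversarial paths: failed step, OOM step, secret-needing step"
+    verdict: admit
+  - title: "Render pipeline timestamps as relative times"
+    labels: []
+    matches: "rejected_as_cosmetic: Pretty timestamps, sparklines, theme polish"
+    verdict: reject
+```
+
+The second ticket is refused by name rather than by a judgment call: it
+collides with an entry in the axis's negative space.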
+
+## 3.4 The anti-cosmetic guardrail as a Goodhart defense
+
+There is a well-known failure of optimization, Goodhart's law: when a
+measure becomes a target, it ceases to be a good measure. An autonomous
+agent optimizing "ship PRs" will ship the easiest PRs, which are exactly
+the cosmetic ones. The `rejected_as_cosmetic` list is a direct structural
+defense: it removes the easiest proxies from the set of admissible work,
+forcing the optimizer's pressure back onto the axes that represent value.
+
+This is why forge-loop's brainstormer carries an explicit *anti-cosmetic
+guardrail*: the value model is not just consulted at generation time; it is
+designed so that the cheapest-to-satisfy moves are precisely the ones it
+forbids. The system is built to make gaming it harder than doing the real
+work.
+
+## 3.5 The cost, stated plainly
+
+This surface is only as good as the axes the operator writes. A vague
+`valuable_means`, an empty `rejected_as_cosmetic`, or axes that do not
+actually capture the product's value model will all produce a system that
+confidently optimizes the wrong thing. Garbage axes in, garbage backlog
+out — and worse, *confidently and at scale*. The leverage of this surface
+is real, but it is leverage on the operator's clarity, which means it
+amplifies a poor value model as faithfully as a good one. This is the
+upfront cost named in Section 2, located precisely.
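+
+To make those failure modes concrete, here is a deliberately weak axis, a
+hypothetical counter-example that is not taken from any real
+configuration. It uses the same fields as the `golden-path-e2e` axis in
+Section 3.2, but none of them can be checked or refused against:
+
+```yaml
+axes:
+  # Hypothetical anti-example: every field is too vague to filter with.
+  - name: user-delight
+    customer: "users"                      # names no one in particular
+    valuable_means: "The product feels great to use"   # not falsifiable
+    acceptable_work:
+      - "Improvements"                     # admits any ticket
+    rejected_as_cosmetic: []               # no negative space at all
+```
+
+Against this axis a timestamp-prettifying ticket is admissible, because
+nothing names it as cosmetic and "feels great" can justify it.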