docs: T1 two-phase lifecycle, T2 Lead Architect, shared assumptions, conflict resolution

2026-03-16 20:41:13 -04:00
parent 1ed7023c08
commit b54436f474
1 changed files with 62 additions and 18 deletions
@@ -1,6 +1,6 @@
 # Tiered Agent Team System — Design Document

-_Started: 2026-03-14. Last updated: 2026-03-16._
+_Started: 2026-03-14. Last updated: 2026-03-16 (evening)._

 ---

@@ -63,12 +63,27 @@ T1 is not just a decomposer — it is the dispatch planner. Its output declares:

 T1 does not prescribe how each tier operates internally. That is the tier's own concern.

+### T1 Lifecycle — Two Explicit Phases
+
+T1 is invoked twice per run, each with a distinct prompt and purpose:
+
+**Phase 1 — Plan:**
+1. T1 produces initial dispatch plan (workstreams, tier paths, parallelism, retry budget)
+2. T1 self-critiques its own plan in a single follow-up pass ("what could go wrong, what did I miss?") and amends
+3. Amended plan surfaces to Andrew for approval — no T2s spawn until approval is given
+
+**Phase 2 — Accept:**
+After the full T2→T3→T4→T5 pipeline completes, T1 is re-invoked with the final output. It validates that output against the original goal anchor and either accepts it (opens a PR) or rejects it (sends the work back down the tiers).
+
+Both phases are named explicitly in the task brief schema and tracked on the blackboard.
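
The two phases can be sketched as a pair of functions that record their progress on a blackboard. This is only an illustration under stated assumptions: `Blackboard`, `plan_phase`, `accept_phase`, the status strings, and the callback signatures are all hypothetical names, not part of the task brief schema described here.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Callable, List, Optional, Tuple


class Phase(Enum):
    PLAN = "plan"
    ACCEPT = "accept"


@dataclass
class Blackboard:
    # Hypothetical stand-in for the real blackboard: an append-only event log.
    events: List[Tuple[str, str]] = field(default_factory=list)

    def record(self, phase: Phase, status: str) -> None:
        self.events.append((phase.value, status))


def plan_phase(
    draft: dict,
    critique: Callable[[dict], dict],
    approve: Callable[[dict], bool],
    board: Blackboard,
) -> Optional[dict]:
    """Draft, exactly one self-critique pass, then the human approval gate."""
    board.record(Phase.PLAN, "drafted")
    amended = critique(draft)  # single pass, deliberately not a loop
    board.record(Phase.PLAN, "amended")
    if not approve(amended):
        board.record(Phase.PLAN, "not_approved")
        return None  # caller must not spawn any T2s
    board.record(Phase.PLAN, "approved")
    return amended


def accept_phase(
    final_output: Any,
    goal_anchor: Any,
    validate: Callable[[Any, Any], bool],
    board: Blackboard,
) -> bool:
    """Validate the pipeline's final output against the goal anchor."""
    ok = validate(final_output, goal_anchor)
    board.record(Phase.ACCEPT, "accepted" if ok else "rejected")
    return ok
```

Returning `None` from `plan_phase` is one way to make the approval gate hard to bypass: the caller has no plan to hand to a T2 until approval is given.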
+
 ### Each Tier Owns the Layer Below

 Control flow is distributed, not centralised:

 - T1 manages its T2s
- T2 manages its T3s
+- T2 Lead manages T2 specialists and their domain boundaries
+- T2 specialists each own their T3s
 - **T3 manages its T4s** — including dependency graph, parallelism, and T5 commissioning
 - The runner is thin: bootstrap T1, monitor the blackboard, handle final result and notifications

@@ -88,29 +103,48 @@ Different tiers suit different internal coordination patterns. These are baked i

 | Tier | Pattern | Rationale |
 |------|---------|-----------|
-| T1 | Single agent | Must be authoritative; no committee |
-| T2 | Group chat / round-table | Specialist architects (security, perf, data, API) debate and reach consensus before committing to a design |
-| T3 | Light mesh | Peer coordination to negotiate task boundaries and avoid T4 conflicts before dispatch |
+| T1 | Single agent, two phases | Must be authoritative; plan phase + accept phase |
+| T2 Lead | Coordinator | Spawned first; defines boundaries + shared assumptions; drives conflict resolution; produces canonical architecture |
+| T2 Specialists | Parallel fan-out | Each works independently within its domain; reads Lead's boundaries + shared assumptions doc before starting |
+| T3 | Light mesh | Peer coordination within same T2 domain to negotiate task boundaries before T4 dispatch |
 | T4 | Swarm + pipeline hybrid | Independent tasks run as swarm; dependent tasks pipeline (T4-A's output feeds T4-B). T3 declares which is which. |
 | T5 | Parallel fan-out + consensus | Each T5 reviews its slice independently, then compares notes for a joint verdict — catches both artifact bugs and integration issues |

+### T2 Flow in Detail
+
+1. T1 spawns **T2 Lead Architect** with goal + workstream context
+2. Lead defines explicit **domain boundaries** (who owns what, hard edges)
+3. Lead publishes **shared assumptions doc** — cross-cutting concerns, key conventions, architectural constraints (auth approach, data formats, API patterns, etc.)
+4. T1 spawns **T2 specialists** with boundaries + shared assumptions baked into their briefs
+5. Specialists work in parallel, each within their defined domain
+6. Lead reads all proposals, drives **conflict resolution** with relevant specialists if needed (cycle limit in config — fixed, not per-workstream)
+7. Lead produces **canonical architecture** → written to blackboard as distinct artifact
+8. T1 (Accept phase) validates canonical architecture against goal anchor
+9. Canonical architecture becomes T3 briefs — each T2 specialist hands off to its own T3s
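
Step 6 above, conflict resolution under a fixed cycle limit, could look roughly like the following. It is a minimal sketch, not the actual runner logic: `resolve_conflicts`, the callback shapes, and the value of `MAX_RESOLUTION_CYCLES` are assumptions, and what happens when the limit is reached is left to the caller because the document does not specify it.

```python
from typing import Callable, Dict, List, Tuple

Proposals = Dict[str, str]   # specialist name -> proposal text
Conflict = Tuple[str, str]   # pair of specialists whose proposals clash

# Assumed value; the document only says the limit is fixed config.
MAX_RESOLUTION_CYCLES = 2


def resolve_conflicts(
    proposals: Proposals,
    find_conflicts: Callable[[Proposals], List[Conflict]],
    revise: Callable[[str, str, List[Conflict]], str],
    max_cycles: int = MAX_RESOLUTION_CYCLES,
) -> Tuple[Proposals, List[Conflict]]:
    """Run at most max_cycles rounds; return proposals and any leftover conflicts."""
    current = dict(proposals)
    for _ in range(max_cycles):
        conflicts = find_conflicts(current)
        if not conflicts:
            return current, []
        # Only specialists named in a conflict receive a targeted brief.
        involved = {name for pair in conflicts for name in pair}
        for name in sorted(involved):
            current[name] = revise(name, current[name], conflicts)
    # Limit reached: report whatever is still unresolved.
    return current, find_conflicts(current)
```

Specialists not involved in any conflict keep their proposals untouched, which matches the idea of the Lead working only "with relevant specialists".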
+
 ---

 ## Horizontal Scaling Within Tiers

 ```
-T1 (1 agent — authoritative)
-├── T2: Backend Architect  ─┐
-├── T2: Frontend Architect  ├─ round-table consensus
-└── T2: Infra Architect    ─┘
+T1 — Phase 1: Plan (self-critique → Andrew approval)
 │
-    └── T3: Squad Lead (per workstream)  ─┐
-            │                             ├─ light mesh across T3s
-            ├── T4: Worker A  ─┐          │
-            ├── T4: Worker B  ─┼─ swarm / pipeline (T3 decides)
-            └── T4: Worker C  ─┘
+├── T2: Lead Architect (boundaries + shared assumptions first)
+│   ├── T2: Backend Architect  ─┐
+│   ├── T2: Frontend Architect  ├─ parallel, within defined domains
+│   └── T2: Infra Architect    ─┘
+│       │
+│       └── (Lead synthesises → conflict resolution if needed → canonical architecture)
 │
-                    └── T5: Verifier(s) — fan-out + consensus
+├── T2 Backend Architect owns:
+│   ├── T3: API Squad Lead  ─┐
+│   └── T3: DB Squad Lead   ─┴─ light mesh within domain
+│           ├── T4: Worker A  ─┐
+│           ├── T4: Worker B  ─┼─ swarm / pipeline (T3 decides)
+│           └── T4: Worker C  ─┘
+│                   └── T5: Verifier(s) — fan-out + consensus
+│
+└── T1 — Phase 2: Accept (validates against goal anchor → PR)
 ```

 ---
@@ -216,13 +250,23 @@ T4 and T5 default to the **coding agent runtime** when available. Falls back to

 **T1 dynamic dispatch** — T1 assesses scope and prescribes tier path and workstream parallelism. It does not prescribe internal tier coordination patterns.

+**T1 two-phase lifecycle** — T1 has two explicitly named phases: Plan and Accept. The Plan phase includes a single self-critique pass, then a human approval gate before any T2s spawn. The Accept phase validates the final output against the goal anchor. Each phase has its own prompt, and both are tracked on the blackboard.
+
+**T1 self-critique** — Single pass only. Further self-critique iterations give diminishing returns; the human review that follows is the real safety net. Self-critique catches obvious gaps; Andrew catches strategic ones.
+
 **Distributed ownership** — Each tier owns the layer below it. Runner is thin. Tradeoff: distributed control makes the system extensible but debugging requires good blackboard tooling, not central runner traces.

 **T5 always mandatory** — No skipping verification. Things should work and work well before surfacing to T1.

 **T3 owns T4 and T5** — T3 manages its T4s (dependency graph, swarm vs pipeline, parallelism) and commissions T5 verification of T4 outputs. Runner does not orchestrate T4/T5 centrally.

-**Orchestration patterns** — Baked into tier prompts and runner tier-handling logic, not prescribed by T1. T2: round-table. T3: light mesh. T4: swarm+pipeline. T5: fan-out+consensus.
+**T2 Lead Architect** — Dedicated T2 role, not a new tier. Spawned first by T1. Owns: domain boundary definition, shared assumptions doc, conflict resolution between specialists, canonical architecture synthesis. Specialists spawn after Lead publishes boundaries + assumptions. Each T2 specialist owns its own T3s — no T3 spans T2 domains.
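
The rule that no T3 spans T2 domains lends itself to a mechanical check. The sketch below assumes ownership is available as a plain mapping from T2 specialist to the T3 squads it claims; `find_spanning_t3s` and that mapping shape are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


def find_spanning_t3s(ownership: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return each T3 claimed by more than one T2 specialist, with its claimants."""
    owners: Dict[str, Set[str]] = defaultdict(set)
    for specialist, squads in ownership.items():
        for squad in squads:
            owners[squad].add(specialist)
    return {
        squad: sorted(names)
        for squad, names in owners.items()
        if len(names) > 1
    }
```

An empty result means every T3 has exactly one owning specialist; any entry names a squad whose ownership needs to be settled before dispatch.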
+
+**T2 conflict resolution** — Lead sends targeted briefs back to conflicting specialists. Cycle limit is a fixed config value (not per-workstream). This parallels the single-pass T1 self-critique: the limit is fixed, not variable.
+
+**T2 shared assumptions** — Lead publishes cross-cutting concerns (auth, data formats, API conventions, etc.) before specialists start. Specialists design against a shared baseline, so implicit dependencies are pre-empted rather than caught in synthesis.
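
One way to picture boundaries and shared assumptions being "baked into" a specialist brief is shown below. The field names, `SharedAssumptions`, and `build_specialist_brief` are illustrative assumptions; the real brief schema is not shown in this change.

```python
from dataclasses import asdict, dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SharedAssumptions:
    # Hypothetical fields mirroring the examples named above.
    auth_approach: str
    data_format: str
    api_pattern: str


def build_specialist_brief(
    specialist: str,
    boundaries: Dict[str, List[str]],
    assumptions: SharedAssumptions,
) -> dict:
    """Combine a specialist's domain boundary with the shared baseline."""
    if specialist not in boundaries:
        # A specialist without a boundary should not be spawned at all.
        raise ValueError(f"no domain boundary defined for {specialist!r}")
    return {
        "specialist": specialist,
        "owns": list(boundaries[specialist]),
        "shared_assumptions": asdict(assumptions),
    }
```

Making `SharedAssumptions` frozen reflects the intent that specialists read the baseline rather than amend it; changes would go back through the Lead.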
+
+**Orchestration patterns** — Baked into tier prompts and runner tier-handling logic, not prescribed by T1. T2: Lead + parallel specialists. T3: light mesh within T2 domain. T4: swarm+pipeline. T5: fan-out+consensus.

 **Output / review** — Nothing merges to main without Andrew's explicit approval. T1 opens a PR and surfaces it to Andrew. Notification is dual: Hans messages Andrew directly + PR opened on VCS. Merge is gated on human sign-off.