docs: T1 two-phase lifecycle, T2 Lead Architect, shared assumptions, conflict resolution

2026-03-16 20:41:13 -04:00
parent 1ed7023c08
commit b54436f474
1 changed files with 62 additions and 18 deletions
@@ -1,6 +1,6 @@
 # Tiered Agent Team System — Design Document
-_Started: 2026-03-14. Last updated: 2026-03-16._
+_Started: 2026-03-14. Last updated: 2026-03-16 (evening)._
 ---
@@ -63,12 +63,27 @@ T1 is not just a decomposer — it is the dispatch planner. Its output declares:
 T1 does not prescribe how each tier operates internally. That is the tier's own concern.
 ### T1 Lifecycle — Two Explicit Phases
 T1 is invoked twice per run, each with a distinct prompt and purpose:
 **Phase 1 — Plan:**
 1. T1 produces initial dispatch plan (workstreams, tier paths, parallelism, retry budget)
 2. T1 self-critiques its own plan in a single follow-up pass ("what could go wrong, what did I miss?") and amends
 3. Amended plan surfaces to Andrew for approval — no T2s spawn until approval is given
 **Phase 2 — Accept:**
 After the full T2→T3→T4→T5 pipeline completes, T1 is re-invoked with the final output. It validates that output against the original goal anchor and either accepts (opens a PR) or rejects (sends the work back down the tiers).
 Both phases are named explicitly in the task brief schema and tracked on the blackboard.
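
The two phases above can be sketched as a pair of functions. This is a minimal illustration, not the project's implementation: `DispatchPlan`, `plan_phase`, `accept_phase`, and the callback parameters are all hypothetical names, and the real task brief schema and blackboard are not modelled here.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List


class T1Phase(Enum):
    """The two named T1 phases (names assumed for illustration)."""
    PLAN = "plan"
    ACCEPT = "accept"


@dataclass
class DispatchPlan:
    """Hypothetical, heavily reduced stand-in for T1's dispatch plan."""
    workstreams: List[str]
    retry_budget: int
    amendments: List[str] = field(default_factory=list)
    approved: bool = False


def plan_phase(
    draft: DispatchPlan,
    self_critique: Callable[[DispatchPlan], List[str]],
    human_approves: Callable[[DispatchPlan], bool],
) -> DispatchPlan:
    """Phase 1: exactly one self-critique pass, then the human approval gate."""
    # Single pass by design: there is no loop around the critique call.
    draft.amendments = self_critique(draft)
    # The amended plan goes to the human; nothing downstream runs without a yes.
    draft.approved = human_approves(draft)
    return draft


def may_spawn_t2(plan: DispatchPlan) -> bool:
    """No T2s spawn until approval is given."""
    return plan.approved


def accept_phase(
    final_output: str,
    goal_anchor: str,
    meets_goal: Callable[[str, str], bool],
) -> str:
    """Phase 2: validate the final output against the original goal anchor."""
    return "open_pr" if meets_goal(final_output, goal_anchor) else "reject"
```

Passing the critique, approval, and validation steps in as callables keeps the sketch independent of how the agents are actually invoked.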
 ### Each Tier Owns the Layer Below
 Control flow is distributed, not centralised:
 - T1 manages its T2s
- T2 manages its T3s
+- T2 Lead manages T2 specialists and their domain boundaries
 - T2 specialists each own their T3s
 - **T3 manages its T4s** — including dependency graph, parallelism, and T5 commissioning
 - The runner is thin: bootstrap T1, monitor the blackboard, handle final result and notifications
@@ -88,29 +103,48 @@ Different tiers suit different internal coordination patterns. These are baked i
 | Tier | Pattern | Rationale |
 |------|---------|-----------|
-| T1 | Single agent | Must be authoritative; no committee |
+| T1 | Single agent, two phases | Must be authoritative; plan phase + accept phase |
-| T2 | Group chat / round-table | Specialist architects (security, perf, data, API) debate and reach consensus before committing to a design |
+| T2 Lead | Coordinator | Spawned first; defines boundaries + shared assumptions; drives conflict resolution; produces canonical architecture |
-| T3 | Light mesh | Peer coordination to negotiate task boundaries and avoid T4 conflicts before dispatch |
+| T2 Specialists | Parallel fan-out | Each works independently within its domain; reads Lead's boundaries + shared assumptions doc before starting |
 | T3 | Light mesh | Peer coordination within same T2 domain to negotiate task boundaries before T4 dispatch |
 | T4 | Swarm + pipeline hybrid | Independent tasks run as swarm; dependent tasks pipeline (T4-A's output feeds T4-B). T3 declares which is which. |
 | T5 | Parallel fan-out + consensus | Each T5 reviews its slice independently, then compares notes for a joint verdict — catches both artifact bugs and integration issues |
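
One way to picture the T4 swarm + pipeline hybrid is as a dependency graph that T3 declares and that is then split into waves: tasks in the same wave have no dependencies on each other (swarm), and later waves consume earlier outputs (pipeline). The sketch below assumes T3 expresses dependencies as a plain mapping from task name to prerequisites; `schedule_t4` and the task names are hypothetical, and it uses Python's standard `graphlib` module.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def schedule_t4(dependencies: Dict[str, Set[str]]) -> List[List[str]]:
    """Group T4 tasks into waves.

    `dependencies` maps each task to the set of tasks it depends on.
    Tasks within a wave are independent and could run as a swarm;
    each wave only starts after the previous one has finished.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave as complete
    return waves


# T4-A's output feeds T4-B; T4-C is independent of both.
example_waves = schedule_t4({"T4-A": set(), "T4-B": {"T4-A"}, "T4-C": set()})
# example_waves == [["T4-A", "T4-C"], ["T4-B"]]
```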
 ### T2 Flow in Detail
 1. T1 spawns **T2 Lead Architect** with goal + workstream context
 2. Lead defines explicit **domain boundaries** (who owns what, hard edges)
 3. Lead publishes **shared assumptions doc** — cross-cutting concerns, key conventions, architectural constraints (auth approach, data formats, API patterns, etc.)
 4. T1 spawns **T2 specialists** with boundaries + shared assumptions baked into their briefs
 5. Specialists work in parallel, each within their defined domain
 6. Lead reads all proposals, drives **conflict resolution** with relevant specialists if needed (cycle limit in config — fixed, not per-workstream)
 7. Lead produces **canonical architecture** → written to blackboard as distinct artifact
 8. T1 (Accept phase) validates canonical architecture against goal anchor
 9. Canonical architecture becomes T3 briefs — each T2 specialist hands off to its own T3s
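
Step 6 (bounded conflict resolution) can be sketched as a loop with a fixed cycle limit. Everything here is an assumption for illustration: the constant `MAX_RESOLUTION_CYCLES` and its value, the `find_conflicts` and `rebrief` callbacks, and the representation of proposals as a mapping from specialist to proposal text. The document does not say what happens when the limit is reached, so the sketch only reports whether the proposals converged and leaves that decision to the caller.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical fixed config value; the same limit applies to every workstream.
MAX_RESOLUTION_CYCLES = 3

Proposals = Dict[str, str]  # specialist name -> proposal text


def resolve_conflicts(
    proposals: Proposals,
    find_conflicts: Callable[[Proposals], List[str]],
    rebrief: Callable[[str, Proposals], str],
    cycle_limit: int = MAX_RESOLUTION_CYCLES,
) -> Tuple[Proposals, bool]:
    """Run at most `cycle_limit` resolution cycles.

    Returns the latest proposals and a flag saying whether they are
    conflict-free. The input mapping is not modified.
    """
    current = dict(proposals)
    for _ in range(cycle_limit):
        conflicting = find_conflicts(current)
        if not conflicting:
            return current, True
        # Targeted briefs go only to the specialists involved in a conflict.
        for specialist in conflicting:
            current[specialist] = rebrief(specialist, current)
    # Limit reached: report the final state instead of looping further.
    return current, not find_conflicts(current)
```

Only a conflict-free result would feed the Lead's canonical architecture in step 7; an unresolved result needs a policy this sketch does not define.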
 ---
 ## Horizontal Scaling Within Tiers
 ```
-T1 (1 agent — authoritative)
+T1 — Phase 1: Plan (self-critique → Andrew approval)
 ├── T2: Backend Architect  ─┐
 ├── T2: Frontend Architect  ├─ round-table consensus
 └── T2: Infra Architect    ─┘
 │
-    └── T3: Squad Lead (per workstream)  ─┐
+├── T2: Lead Architect (boundaries + shared assumptions first)
-            │                             ├─ light mesh across T3s
+│   ├── T2: Backend Architect  ─┐
-            ├── T4: Worker A  ─┐          │
+│   ├── T2: Frontend Architect  ├─ parallel, within defined domains
-            ├── T4: Worker B  ─┼─ swarm / pipeline (T3 decides)
+│   └── T2: Infra Architect    ─┘
-            └── T4: Worker C  ─┘
+│       │
 │       └── (Lead synthesises → conflict resolution if needed → canonical architecture)
 │
-                    └── T5: Verifier(s) — fan-out + consensus
+├── T2 Backend Architect owns:
 │   ├── T3: API Squad Lead  ─┐
 │   └── T3: DB Squad Lead   ─┴─ light mesh within domain
 │           ├── T4: Worker A  ─┐
 │           ├── T4: Worker B  ─┼─ swarm / pipeline (T3 decides)
 │           └── T4: Worker C  ─┘
 │                   └── T5: Verifier(s) — fan-out + consensus
 │
 └── T1 — Phase 2: Accept (validates against goal anchor → PR)
 ```
 ---
@@ -216,13 +250,23 @@ T4 and T5 default to the **coding agent runtime** when available. Falls back to
 **T1 dynamic dispatch** — T1 assesses scope and prescribes tier path and workstream parallelism. It does not prescribe internal tier coordination patterns.
 **T1 two-phase lifecycle** — T1 has two explicit named phases: Plan and Accept. Plan phase includes self-critique (single pass) then human approval gate before T2s spawn. Accept phase validates final output against goal anchor. Both phases tracked on blackboard with distinct prompts.
 **T1 self-critique** — Single pass only. Diminishing returns on multiple self-critique iterations; the human review after is the real safety net. Self-critique catches obvious gaps; Andrew catches strategic ones.
 **Distributed ownership** — Each tier owns the layer below it. Runner is thin. Tradeoff: distributed control makes the system extensible but debugging requires good blackboard tooling, not central runner traces.
 **T5 always mandatory** — No skipping verification. Things should work and work well before surfacing to T1.
 **T3 owns T4 and T5** — T3 manages its T4s (dependency graph, swarm vs pipeline, parallelism) and commissions T5 verification of T4 outputs. Runner does not orchestrate T4/T5 centrally.
-**Orchestration patterns** — Baked into tier prompts and runner tier-handling logic, not prescribed by T1. T2: round-table. T3: light mesh. T4: swarm+pipeline. T5: fan-out+consensus.
+**T2 Lead Architect** — Dedicated T2 role, not a new tier. Spawned first by T1. Owns: domain boundary definition, shared assumptions doc, conflict resolution between specialists, canonical architecture synthesis. Specialists spawn after Lead publishes boundaries + assumptions. Each T2 specialist owns its own T3s — no T3 spans T2 domains.
 **T2 conflict resolution** — Lead sends targeted briefs back to conflicting specialists. Cycle limit is a fixed config value (not per-workstream). This mirrors T1's single self-critique pass: a fixed limit, not a variable one.
 **T2 shared assumptions** — Lead publishes cross-cutting concerns (auth, data formats, API conventions, etc.) before specialists start. Specialists design with shared baseline; implicit dependencies pre-empted rather than caught in synthesis.
 **Orchestration patterns** — Baked into tier prompts and runner tier-handling logic, not prescribed by T1. T2: Lead + parallel specialists. T3: light mesh within T2 domain. T4: swarm+pipeline. T5: fan-out+consensus.
 **Output / review** — Nothing merges to main without Andrew's explicit approval. T1 opens a PR and surfaces it to Andrew. Notification is dual: Hans messages Andrew directly + PR opened on VCS. Merge is gated on human sign-off.