From b54436f47428d50309ee0139307021b25c9995d1 Mon Sep 17 00:00:00 2001
From: Hans Heinemann
Date: Mon, 16 Mar 2026 20:41:13 -0400
Subject: [PATCH] docs: T1 two-phase lifecycle, T2 Lead Architect, shared assumptions, conflict resolution

---
 docs/design.md | 80 ++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 62 insertions(+), 18 deletions(-)

diff --git a/docs/design.md b/docs/design.md
index c0c7d5c..fe6228c 100644
--- a/docs/design.md
+++ b/docs/design.md
@@ -1,6 +1,6 @@
 # Tiered Agent Team System — Design Document
 
-_Started: 2026-03-14. Last updated: 2026-03-16._
+_Started: 2026-03-14. Last updated: 2026-03-16 (evening)._
 
 ---
 
@@ -63,12 +63,27 @@ T1 is not just a decomposer — it is the dispatch planner. Its output declares:
 
 T1 does not prescribe how each tier operates internally. That is the tier's own concern.
 
+### T1 Lifecycle — Two Explicit Phases
+
+T1 is invoked twice per run, each with a distinct prompt and purpose:
+
+**Phase 1 — Plan:**
+1. T1 produces initial dispatch plan (workstreams, tier paths, parallelism, retry budget)
+2. T1 self-critiques its own plan in a single follow-up pass ("what could go wrong, what did I miss?") and amends
+3. Amended plan surfaces to Andrew for approval — no T2s spawn until approval is given
+
+**Phase 2 — Accept:**
+After the full T2→T3→T4→T5 pipeline completes, T1 is re-invoked with the final output. It validates against the original goal anchor and either accepts (opens PR) or rejects (escalates back down).
+
+Both phases are named explicitly in the task brief schema and tracked on the blackboard.
+
 ### Each Tier Owns the Layer Below
 
 Control flow is distributed, not centralised:
 
 - T1 manages its T2s
-- T2 manages its T3s
+- T2 Lead manages T2 specialists and their domain boundaries
+- T2 specialists each own their T3s
 - **T3 manages its T4s** — including dependency graph, parallelism, and T5 commissioning
 - The runner is thin: bootstrap T1, monitor the blackboard, handle final result and notifications
 
@@ -88,29 +103,48 @@ Different tiers suit different internal coordination patterns. These are baked i
 
 | Tier | Pattern | Rationale |
 |------|---------|-----------|
-| T1 | Single agent | Must be authoritative; no committee |
-| T2 | Group chat / round-table | Specialist architects (security, perf, data, API) debate and reach consensus before committing to a design |
-| T3 | Light mesh | Peer coordination to negotiate task boundaries and avoid T4 conflicts before dispatch |
+| T1 | Single agent, two phases | Must be authoritative; plan phase + accept phase |
+| T2 Lead | Coordinator | Spawned first; defines boundaries + shared assumptions; drives conflict resolution; produces canonical architecture |
+| T2 Specialists | Parallel fan-out | Each works independently within its domain; reads Lead's boundaries + shared assumptions doc before starting |
+| T3 | Light mesh | Peer coordination within same T2 domain to negotiate task boundaries before T4 dispatch |
 | T4 | Swarm + pipeline hybrid | Independent tasks run as swarm; dependent tasks pipeline (T4-A's output feeds T4-B). T3 declares which is which. |
 | T5 | Parallel fan-out + consensus | Each T5 reviews its slice independently, then compares notes for a joint verdict — catches both artifact bugs and integration issues |
 
+### T2 Flow in Detail
+
+1. T1 spawns **T2 Lead Architect** with goal + workstream context
+2. Lead defines explicit **domain boundaries** (who owns what, hard edges)
+3. Lead publishes **shared assumptions doc** — cross-cutting concerns, key conventions, architectural constraints (auth approach, data formats, API patterns, etc.)
+4. T1 spawns **T2 specialists** with boundaries + shared assumptions baked into their briefs
+5. Specialists work in parallel, each within their defined domain
+6. Lead reads all proposals, drives **conflict resolution** with relevant specialists if needed (cycle limit in config — fixed, not per-workstream)
+7. Lead produces **canonical architecture** → written to blackboard as distinct artifact
+8. T1 (Accept phase) validates canonical architecture against goal anchor
+9. Canonical architecture becomes T3 briefs — each T2 specialist hands off to its own T3s
+
 ---
 
 ## Horizontal Scaling Within Tiers
 
 ```
-T1 (1 agent — authoritative)
-├── T2: Backend Architect   ─┐
-├── T2: Frontend Architect   ├─ round-table consensus
-└── T2: Infra Architect     ─┘
-         │
-         └── T3: Squad Lead (per workstream)  ─┐
-                  │                             ├─ light mesh across T3s
-                  ├── T4: Worker A  ─┐          │
-                  ├── T4: Worker B  ─┼─ swarm / pipeline (T3 decides)
-                  └── T4: Worker C  ─┘
-                           │
-                           └── T5: Verifier(s) — fan-out + consensus
+T1 — Phase 1: Plan (self-critique → Andrew approval)
+│
+├── T2: Lead Architect (boundaries + shared assumptions first)
+│     ├── T2: Backend Architect   ─┐
+│     ├── T2: Frontend Architect   ├─ parallel, within defined domains
+│     └── T2: Infra Architect     ─┘
+│              │
+│              └── (Lead synthesises → conflict resolution if needed → canonical architecture)
+│
+├── T2 Backend Architect owns:
+│     ├── T3: API Squad Lead  ─┐
+│     └── T3: DB Squad Lead   ─┴─ light mesh within domain
+│              ├── T4: Worker A  ─┐
+│              ├── T4: Worker B  ─┼─ swarm / pipeline (T3 decides)
+│              └── T4: Worker C  ─┘
+│                       └── T5: Verifier(s) — fan-out + consensus
+│
+└── T1 — Phase 2: Accept (validates against goal anchor → PR)
 ```
 
 ---
 
@@ -216,13 +250,23 @@ T4 and T5 default to the **coding agent runtime** when available. Falls back to
 
 **T1 dynamic dispatch** — T1 assesses scope and prescribes tier path and workstream parallelism. It does not prescribe internal tier coordination patterns.
 
+**T1 two-phase lifecycle** — T1 has two explicit named phases: Plan and Accept. Plan phase includes self-critique (single pass) then human approval gate before T2s spawn. Accept phase validates final output against goal anchor. Both phases tracked on blackboard with distinct prompts.
+
+**T1 self-critique** — Single pass only. Diminishing returns on multiple self-critique iterations; the human review after is the real safety net. Self-critique catches obvious gaps; Andrew catches strategic ones.
+
 **Distributed ownership** — Each tier owns the layer below it. Runner is thin. Tradeoff: distributed control makes the system extensible but debugging requires good blackboard tooling, not central runner traces.
 
 **T5 always mandatory** — No skipping verification. Things should work and work well before surfacing to T1.
 
 **T3 owns T4 and T5** — T3 manages its T4s (dependency graph, swarm vs pipeline, parallelism) and commissions T5 verification of T4 outputs. Runner does not orchestrate T4/T5 centrally.
 
-**Orchestration patterns** — Baked into tier prompts and runner tier-handling logic, not prescribed by T1. T2: round-table. T3: light mesh. T4: swarm+pipeline. T5: fan-out+consensus.
+**T2 Lead Architect** — Dedicated T2 role, not a new tier. Spawned first by T1. Owns: domain boundary definition, shared assumptions doc, conflict resolution between specialists, canonical architecture synthesis. Specialists spawn after Lead publishes boundaries + assumptions. Each T2 specialist owns its own T3s — no T3 spans T2 domains.
+
+**T2 conflict resolution** — Lead sends targeted briefs back to conflicting specialists. Cycle limit is a fixed config value (not per-workstream). Single T1 self-critique parallel: fixed limit, not variable.
+
+**T2 shared assumptions** — Lead publishes cross-cutting concerns (auth, data formats, API conventions, etc.) before specialists start. Specialists design with shared baseline; implicit dependencies pre-empted rather than caught in synthesis.
+
+**Orchestration patterns** — Baked into tier prompts and runner tier-handling logic, not prescribed by T1. T2: Lead + parallel specialists. T3: light mesh within T2 domain. T4: swarm+pipeline. T5: fan-out+consensus.
 
 **Output / review** — Nothing merges to main without Andrew's explicit approval. T1 opens a PR and surfaces it to Andrew. Notification is dual: Hans messages Andrew directly + PR opened on VCS. Merge is gated on human sign-off.
 
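
Reviewer sketch (not part of the patch): the T1 two-phase lifecycle described in this change could be tracked roughly as below. This is a minimal Python illustration under stated assumptions, not the project's implementation. Every name here (`T1Run`, `Phase`, `plan_phase`, `approve`, `spawn_t2_lead`, `accept_phase`) is hypothetical, the blackboard is modelled as a plain list, and the goal-anchor validation is a placeholder substring check.

```python
from dataclasses import dataclass, field
from enum import Enum


class Phase(Enum):
    PLAN = "plan"
    ACCEPT = "accept"


@dataclass
class T1Run:
    """Hypothetical tracker for one T1 run; names are illustrative only."""
    goal_anchor: str
    plan: list = field(default_factory=list)
    approved: bool = False
    blackboard: list = field(default_factory=list)  # (Phase, note) entries

    def plan_phase(self, draft, amendments):
        # Phase 1: initial dispatch plan plus exactly one self-critique pass.
        self.plan = list(draft) + list(amendments)
        self.blackboard.append((Phase.PLAN, "amended plan awaiting approval"))

    def approve(self):
        # Human approval gate: nothing at T2 spawns before this is called.
        self.approved = True

    def spawn_t2_lead(self):
        # The Lead Architect is spawned first, and only after approval.
        if not self.approved:
            raise PermissionError("plan not approved; T2s must not spawn")
        return "T2 Lead Architect: " + ", ".join(self.plan)

    def accept_phase(self, final_output):
        # Phase 2: validate the final output against the goal anchor.
        # Assumption: a substring match stands in for real validation.
        ok = self.goal_anchor in final_output
        verdict = "accept: open PR" if ok else "reject: escalate back down"
        self.blackboard.append((Phase.ACCEPT, verdict))
        return verdict
```

Calling `spawn_t2_lead()` before `approve()` raises, which mirrors the rule that no T2s spawn until approval is given. Both phases leave an entry on the blackboard list, so a thin runner could monitor progress without orchestrating it.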