docs: lock in visibility layer, resolve all 5 open design questions

- Resolve T3 mesh mechanics: blackboard-based draft/commit cycle
- Resolve T1 plan output schema: formal JSON structure with workstreams + parallelism groups
- Resolve T5 consensus: T3 aggregates joint verdict (pass/partial/fail), partial retries failed slices only
- Resolve path amendment mechanism: event-based, runner notifies higher tier, no approval gate
- Resolve failure handling: confirmed distributed ownership, runner owns T1 + terminal only

Add run visibility layer:
- Human-readable live log (normal + verbose modes)
- Configurable inspection gates (t1_plan always, t2_synthesis recommended, others optional)
- strict_mode flag for full gating on early runs
- cli/agency.py: run, watch, inspect, approve, reject, pause, resume
- gate_pending halt loop in team_runner, gate_approved/rejected resume
- Expanded blackboard event vocabulary (gate_*, path_amendment, log)
- t3_task_lists table for mesh coordination state
- Inspection gate flow added to buildspec Key Flows

Build order updated: 16 steps (added cli/ step, clarified runner gate responsibilities)
2026-03-30 13:43:19 -04:00
parent 882b769d21
commit a721db63f6
2 changed files with 424 additions and 29 deletions


@@ -1,7 +1,7 @@
# Tiered Agent Team System — Build Spec
-_Started: 2026-03-15. Status: Pre-build._
-_See agent-teams-design.md for the design doc and decisions log._
+_Started: 2026-03-15. Last updated: 2026-03-30._
+_See design.md for the design doc and decisions log._
---
@@ -68,6 +68,9 @@ agent-teams/
│ ├── team.yaml — example run configuration
│ └── role_registry.yaml — maps (tier, domain) → agent personality file
├── cli/
│ └── agency.py — run, watch, inspect, approve, reject, pause, resume
├── runs/ — runtime state, one subdir per run_id
│ └── .gitkeep
@@ -131,12 +134,43 @@ CREATE TABLE events (
event_id TEXT PRIMARY KEY,
run_id TEXT NOT NULL,
brief_id TEXT,
-kind TEXT NOT NULL, -- spawned | completed | failed | escalated | retried
+kind TEXT NOT NULL, -- see event vocabulary below
detail TEXT, -- JSON
created_at TEXT NOT NULL
);
```
**Event kind vocabulary:**
```
-- lifecycle
spawned | completed | failed | escalated | retried
-- visibility / gates
gate_pending -- runner hit an inspection gate, waiting for human
gate_approved -- human approved via CLI or notify
gate_rejected -- human rejected, tier re-invoked
gate_paused -- manual pause via CLI
gate_resumed -- manual resume via CLI
-- amendments / informational
path_amendment -- mid-run tier proposed a tier path change
log -- human-readable log line (detail: {level, message})
```
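One way to keep this vocabulary from drifting between writers and readers of the `events` table is to hold it in a single enum. The following is a minimal sketch only: the names `EventKind`, `GATE_KINDS`, and `parse_kind` are hypothetical and do not appear in the spec.

```python
from enum import Enum


class EventKind(str, Enum):
    """Hypothetical enum mirroring the event kind vocabulary above."""

    # lifecycle
    SPAWNED = "spawned"
    COMPLETED = "completed"
    FAILED = "failed"
    ESCALATED = "escalated"
    RETRIED = "retried"
    # visibility / gates
    GATE_PENDING = "gate_pending"
    GATE_APPROVED = "gate_approved"
    GATE_REJECTED = "gate_rejected"
    GATE_PAUSED = "gate_paused"
    GATE_RESUMED = "gate_resumed"
    # amendments / informational
    PATH_AMENDMENT = "path_amendment"
    LOG = "log"


# All kinds in the gate_* family, derived from the enum rather than listed twice.
GATE_KINDS = frozenset(k for k in EventKind if k.value.startswith("gate_"))


def parse_kind(raw: str) -> EventKind:
    """Validate a raw `kind` string before it is written to the events table."""
    try:
        return EventKind(raw)
    except ValueError:
        raise ValueError(f"unknown event kind: {raw!r}") from None
```

Because `EventKind` subclasses `str`, its `.value` can be stored directly in the `kind TEXT` column.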
**t3_task_lists** *(T3 mesh coordination)*
```sql
CREATE TABLE t3_task_lists (
entry_id TEXT PRIMARY KEY,
run_id TEXT NOT NULL,
workstream_id TEXT NOT NULL,
t3_agent_id TEXT NOT NULL,
status TEXT NOT NULL, -- draft | committed
tasks TEXT NOT NULL, -- JSON array of proposed T4 task descriptors
created_at TEXT NOT NULL,
updated_at TEXT NOT NULL
);
```
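The draft/commit cycle over this table could be sketched with the standard-library `sqlite3` module as below. This is an illustration under assumptions, not the planned `core/blackboard.py`: the helper names (`write_draft`, `commit_entry`, `all_committed`) and the rule that a run is ready once every entry is `committed` are hypothetical.

```python
import json
import sqlite3
import uuid
from datetime import datetime, timezone

# Same columns as the t3_task_lists definition above.
SCHEMA = """
CREATE TABLE IF NOT EXISTS t3_task_lists (
    entry_id      TEXT PRIMARY KEY,
    run_id        TEXT NOT NULL,
    workstream_id TEXT NOT NULL,
    t3_agent_id   TEXT NOT NULL,
    status        TEXT NOT NULL,
    tasks         TEXT NOT NULL,
    created_at    TEXT NOT NULL,
    updated_at    TEXT NOT NULL
);
"""


def _now() -> str:
    return datetime.now(timezone.utc).isoformat()


def write_draft(conn, run_id, workstream_id, t3_agent_id, tasks) -> str:
    """Insert a draft task list and return its entry_id."""
    entry_id = str(uuid.uuid4())
    now = _now()
    conn.execute(
        "INSERT INTO t3_task_lists VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        (entry_id, run_id, workstream_id, t3_agent_id,
         "draft", json.dumps(tasks), now, now),
    )
    return entry_id


def commit_entry(conn, entry_id) -> None:
    """Flip a draft to committed (assumed to be a one-way transition)."""
    conn.execute(
        "UPDATE t3_task_lists SET status = 'committed', updated_at = ? "
        "WHERE entry_id = ?",
        (_now(), entry_id),
    )


def all_committed(conn, run_id) -> bool:
    """True once the run has at least one entry and none remain in draft."""
    total, committed = conn.execute(
        "SELECT COUNT(*), SUM(status = 'committed') "
        "FROM t3_task_lists WHERE run_id = ?",
        (run_id,),
    ).fetchone()
    return total > 0 and total == committed
```

A runner could poll `all_committed` until `t3_mesh_timeout_minutes` elapses and escalate if it never returns true.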
---
## Task Brief Schema
@@ -283,6 +317,19 @@ retry_defaults:
bad_output: 3
partial: 2
blocked: 0 # always escalate immediately
visibility:
strict_mode: false # true = all gates on (recommended for first runs)
log_level: normal # normal | verbose (verbose = per-T4 start/done lines)
inspection_gates:
t1_plan: true # always — required by design
t2_lead: false # optional — review boundaries before specialists spawn
t2_synthesis: true # recommended — review architecture before implementation
t3_plan: false # verbose — useful early on, disable once T3 is trusted
t5_verdict: false # review T5 joint verdict before T3 marks workstream done
gate_timeout_minutes: 60 # auto-reject if no human response within this window
t3_mesh_timeout_minutes: 10 # max time for T3s to commit task lists before runner escalates
```
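The comments in the `visibility` block imply two overrides: `strict_mode` turns every gate on, and `t1_plan` stays on regardless of configuration. A minimal sketch of that resolution follows, assuming the YAML has already been loaded into a plain dict; the function name `effective_gates` is hypothetical.

```python
from typing import Any, Dict

# Gate names as they appear under inspection_gates in the config above.
GATE_NAMES = ("t1_plan", "t2_lead", "t2_synthesis", "t3_plan", "t5_verdict")


def effective_gates(visibility: Dict[str, Any]) -> Dict[str, bool]:
    """Resolve which inspection gates are active for a run."""
    configured = visibility.get("inspection_gates", {})
    strict = bool(visibility.get("strict_mode", False))
    # strict_mode forces every gate on; otherwise use the per-gate setting.
    gates = {
        name: strict or bool(configured.get(name, False))
        for name in GATE_NAMES
    }
    # t1_plan is required by design, so a false value in config is ignored.
    gates["t1_plan"] = True
    return gates
```

Resolving the gates once at run start keeps the runner's gate check down to a dictionary lookup.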
---
@@ -388,7 +435,29 @@ spawn T4 with brief
→ notify T3
```
-### 4. Review Gate
+### 4. Inspection Gate Flow
```
runner reaches configured gate (e.g. t2_synthesis)
→ write event(gate_pending, detail={tier, summary, what_happens_next})
→ notify_adapter.send(tier summary to Andrew via Hans)
→ halt: poll blackboard for gate_approved or gate_rejected
gate_approved:
→ write event(gate_approved)
→ continue run
gate_rejected:
→ write event(gate_rejected, detail={reason})
→ re-invoke tier with rejection reason in brief context
→ loop back to gate_pending when tier completes again
gate_timeout (gate_timeout_minutes elapsed):
→ treat as gate_rejected
→ notify Andrew: "Gate timed out, re-invoking tier"
```
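The halt step in this flow can be sketched as a polling loop with a deadline, where a timeout is folded into the rejection path as described above. This is a sketch under assumptions: `wait_for_gate` and `poll_decision` are hypothetical names, and `poll_decision` stands in for whatever blackboard query returns the latest gate decision (or `None` while still pending). The clock and sleep functions are injectable so the loop can be exercised without waiting.

```python
import time
from typing import Callable, Optional, Tuple


def wait_for_gate(
    poll_decision: Callable[[], Optional[str]],
    timeout_seconds: float,
    poll_interval: float = 2.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[str, str]:
    """Block until the gate is decided; return (event_kind, source).

    source is "human" for an explicit decision and "timeout" when
    the window elapses, in which case the gate counts as rejected.
    """
    deadline = clock() + timeout_seconds
    while True:
        decision = poll_decision()
        if decision in ("gate_approved", "gate_rejected"):
            return (decision, "human")
        if clock() >= deadline:
            # No response within the window: treat as gate_rejected.
            return ("gate_rejected", "timeout")
        sleep(poll_interval)
```

With the default config, a runner would call this with `timeout_seconds=gate_timeout_minutes * 60` and re-invoke the tier whenever the first element is `gate_rejected`.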
### 5. Review Gate
```
T1 completes integration
@@ -412,19 +481,20 @@ T1 completes integration
1. `git submodule add https://github.com/msitarzewski/agency-agents agents/` — pull the talent pool
2. `config/role_registry.yaml` — map tier+domain → agent personality files
-3. `core/task_brief.py` — schema + validation (everything depends on this)
-4. `core/blackboard.py` — SQLite store, all table definitions
+3. `core/task_brief.py` — schema + validation (everything depends on this); include T1 Plan Output Schema
+4. `core/blackboard.py` — SQLite store, all table definitions including `t3_task_lists`; full event kind vocabulary
5. `adapters/base/*` — all four abstract interfaces
6. `adapters/llm/anthropic.py` — first LLM implementation
-7. `core/escalation.py` — retry + failure routing logic
+7. `core/escalation.py` — retry + failure routing logic (called by tiers, not runner centrally)
8. `adapters/runtime/openclaw.py` — wire up sessions_spawn + personality injection
9. `adapters/runtime/claude_code.py` — coding agent runtime, personality via --system-prompt
-10. `core/team_runner.py` — full run lifecycle, runtime + personality selection
-11. `prompts/` — fallback tier prompts (used when no agent_personality set)
-12. `adapters/vcs/github.py` — PR creation + branch management
-13. `adapters/notify/openclaw.py` — Hans notification
-14. `config/team.yaml` — example config
-15. `README.md` — how to run, how to add adapters, how to extend the roster
+10. `core/team_runner.py` — full run lifecycle: gate logic (gate_pending halt loop, gate_approved resume), path amendment monitor, T1 failure + terminal escalation only
+11. `cli/agency.py` — run, watch, inspect, approve, reject, pause, resume; `watch` tails blackboard events and renders live log; `inspect` renders run tree
+12. `prompts/` — fallback tier prompts (used when no agent_personality set)
+13. `adapters/vcs/github.py` — PR creation + branch management
+14. `adapters/notify/openclaw.py` — Hans notification; used for gate surfaces (tier summary to Andrew)
+15. `config/team.yaml` — example config with full visibility block
+16. `README.md` — how to run, how to add adapters, how to extend the roster; include `agency` CLI reference
---