docs: lock in visibility layer, resolve all 5 open design questions

- Resolve T3 mesh mechanics: blackboard-based draft/commit cycle
- Resolve T1 plan output schema: formal JSON structure with workstreams + parallelism groups
- Resolve T5 consensus: T3 aggregates joint verdict (pass/partial/fail), partial retries failed slices only
- Resolve path amendment mechanism: event-based, runner notifies higher tier, no approval gate
- Resolve failure handling: confirmed distributed ownership, runner owns T1 + terminal only

Add run visibility layer:
- Human-readable live log (normal + verbose modes)
- Configurable inspection gates (t1_plan always, t2_synthesis recommended, others optional)
- strict_mode flag for full gating on early runs
- cli/agency.py: run, watch, inspect, approve, reject, pause, resume
- gate_pending halt loop in team_runner, gate_approved/rejected resume
- Expanded blackboard event vocabulary (gate_*, path_amendment, log)
- t3_task_lists table for mesh coordination state
- Inspection gate flow added to buildspec Key Flows

Build order updated: 16 steps (added cli/ step, clarified runner gate responsibilities)
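
The T5 consensus rule in the message above (T3 aggregates a joint verdict of pass/partial/fail, and a partial verdict retries only the failed slices) could be sketched roughly as follows. The function name `aggregate_verdict`, the per-slice `"pass"`/`"fail"` encoding, and the rule that a mix of outcomes yields `partial` are assumptions made for illustration; the commit names only the three joint outcomes and the retry behaviour.

```python
from typing import Dict, List, Tuple


def aggregate_verdict(slice_results: Dict[str, str]) -> Tuple[str, List[str]]:
    """Combine per-slice T5 results into a joint verdict plus slices to retry.

    Assumption: each slice result is the string "pass" or "fail"; the spec
    excerpt does not define how individual T5 results are encoded.
    """
    failed = sorted(name for name, result in slice_results.items() if result != "pass")
    if not failed:
        return "pass", []
    if len(failed) == len(slice_results):
        # Nothing passed: assumed to go through normal failure handling,
        # not through a slice-level retry.
        return "fail", []
    # Mixed outcome: retry only the slices that failed.
    return "partial", failed


verdict, retry = aggregate_verdict({"api": "pass", "ui": "fail", "db": "pass"})
print(verdict, retry)  # partial ['ui']
```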
@@ -1,7 +1,7 @@
 # Tiered Agent Team System — Build Spec

-_Started: 2026-03-15. Status: Pre-build._
-_See agent-teams-design.md for the design doc and decisions log._
+_Started: 2026-03-15. Last updated: 2026-03-30._
+_See design.md for the design doc and decisions log._

 ---

@@ -68,6 +68,9 @@ agent-teams/
 │ ├── team.yaml — example run configuration
 │ └── role_registry.yaml — maps (tier, domain) → agent personality file
 │
+├── cli/
+│ └── agency.py — run, watch, inspect, approve, reject, pause, resume
+│
 ├── runs/ — runtime state, one subdir per run_id
 │ └── .gitkeep
 │
@@ -131,12 +134,43 @@ CREATE TABLE events (
 event_id TEXT PRIMARY KEY,
 run_id TEXT NOT NULL,
 brief_id TEXT,
-kind TEXT NOT NULL, -- spawned | completed | failed | escalated | retried
+kind TEXT NOT NULL, -- see event vocabulary below
 detail TEXT, -- JSON
 created_at TEXT NOT NULL
 );
 ```
+
+**Event kind vocabulary:**
+```
+-- lifecycle
+spawned | completed | failed | escalated | retried
+
+-- visibility / gates
+gate_pending -- runner hit an inspection gate, waiting for human
+gate_approved -- human approved via CLI or notify
+gate_rejected -- human rejected, tier re-invoked
+gate_paused -- manual pause via CLI
+gate_resumed -- manual resume via CLI
+
+-- amendments / informational
+path_amendment -- mid-run tier proposed a tier path change
+log -- human-readable log line (detail: {level, message})
+```
+
+**t3_task_lists** *(T3 mesh coordination)*
+```sql
+CREATE TABLE t3_task_lists (
+entry_id TEXT PRIMARY KEY,
+run_id TEXT NOT NULL,
+workstream_id TEXT NOT NULL,
+t3_agent_id TEXT NOT NULL,
+status TEXT NOT NULL, -- draft | committed
+tasks TEXT NOT NULL, -- JSON array of proposed T4 task descriptors
+created_at TEXT NOT NULL,
+updated_at TEXT NOT NULL
+);
+```

 ---

 ## Task Brief Schema
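
The `t3_task_lists` table added in the hunk above backs the draft/commit cycle named in the commit message. A minimal sketch of that cycle against an in-memory SQLite database is shown here. The helper names (`post_draft`, `commit_entry`, `all_committed`) and the rule that the mesh is settled once no row for the run is still `draft` are assumptions for illustration, not part of the spec.

```python
import json
import sqlite3
import uuid
from datetime import datetime, timezone

SCHEMA = """
CREATE TABLE t3_task_lists (
    entry_id      TEXT PRIMARY KEY,
    run_id        TEXT NOT NULL,
    workstream_id TEXT NOT NULL,
    t3_agent_id   TEXT NOT NULL,
    status        TEXT NOT NULL,   -- draft | committed
    tasks         TEXT NOT NULL,   -- JSON array of proposed T4 task descriptors
    created_at    TEXT NOT NULL,
    updated_at    TEXT NOT NULL
);
"""


def now() -> str:
    return datetime.now(timezone.utc).isoformat()


def post_draft(db, run_id, workstream_id, agent_id, tasks) -> str:
    """A T3 agent publishes its proposed task list as a draft row."""
    entry_id = uuid.uuid4().hex
    ts = now()
    db.execute(
        "INSERT INTO t3_task_lists VALUES (?, ?, ?, ?, 'draft', ?, ?, ?)",
        (entry_id, run_id, workstream_id, agent_id, json.dumps(tasks), ts, ts),
    )
    return entry_id


def commit_entry(db, entry_id) -> None:
    """Flip a draft to committed once the agent is done revising it."""
    db.execute(
        "UPDATE t3_task_lists SET status = 'committed', updated_at = ? "
        "WHERE entry_id = ?",
        (now(), entry_id),
    )


def all_committed(db, run_id) -> bool:
    """True when the run has at least one entry and none is still a draft."""
    total, drafts = db.execute(
        "SELECT COUNT(*), COALESCE(SUM(status = 'draft'), 0) "
        "FROM t3_task_lists WHERE run_id = ?",
        (run_id,),
    ).fetchone()
    return total > 0 and drafts == 0


db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
a = post_draft(db, "run-1", "ws-backend", "t3-a", [{"title": "write migration"}])
b = post_draft(db, "run-1", "ws-frontend", "t3-b", [{"title": "build form"}])
commit_entry(db, a)
half_done = all_committed(db, "run-1")
commit_entry(db, b)
done = all_committed(db, "run-1")
print(half_done, done)  # False True
```

A runner could poll a check like `all_committed` until `t3_mesh_timeout_minutes` elapses and escalate after that, though how the real runner detects completion is not stated in this diff.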
@@ -283,6 +317,19 @@ retry_defaults:
 bad_output: 3
 partial: 2
 blocked: 0 # always escalate immediately
+
+visibility:
+strict_mode: false # true = all gates on (recommended for first runs)
+log_level: normal # normal | verbose (verbose = per-T4 start/done lines)
+inspection_gates:
+t1_plan: true # always — required by design
+t2_lead: false # optional — review boundaries before specialists spawn
+t2_synthesis: true # recommended — review architecture before implementation
+t3_plan: false # verbose — useful early on, disable once T3 is trusted
+t5_verdict: false # review T5 joint verdict before T3 marks workstream done
+gate_timeout_minutes: 60 # auto-reject if no human response within this window
+
+t3_mesh_timeout_minutes: 10 # max time for T3s to commit task lists before runner escalates
 ```

 ---
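
One possible reading of how the `visibility` block above resolves into the set of active gates is sketched below. YAML indentation is not preserved in this diff view, so the nesting used here (gate flags under `inspection_gates`, which sits under `visibility` next to `strict_mode`) is an assumption, as are the function name `effective_gates` and the choice to force `t1_plan` on regardless of its configured value.

```python
from typing import Dict

# Gate names taken from the config hunk above.
GATE_NAMES = ("t1_plan", "t2_lead", "t2_synthesis", "t3_plan", "t5_verdict")


def effective_gates(visibility: dict) -> Dict[str, bool]:
    """Resolve which inspection gates are active for a run."""
    configured = visibility.get("inspection_gates", {})
    strict = bool(visibility.get("strict_mode", False))
    # strict_mode switches every gate on; otherwise use the per-gate flags.
    gates = {name: strict or bool(configured.get(name, False)) for name in GATE_NAMES}
    # The config comment marks t1_plan as always required (assumed to mean
    # it cannot be disabled).
    gates["t1_plan"] = True
    return gates


visibility = {
    "strict_mode": False,
    "log_level": "normal",
    "inspection_gates": {
        "t1_plan": True,
        "t2_lead": False,
        "t2_synthesis": True,
        "t3_plan": False,
        "t5_verdict": False,
    },
    "gate_timeout_minutes": 60,
}

active = [name for name, on in effective_gates(visibility).items() if on]
print(active)  # ['t1_plan', 't2_synthesis']

strict_active = effective_gates({**visibility, "strict_mode": True})
print(all(strict_active.values()))  # True
```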
@@ -388,7 +435,29 @@ spawn T4 with brief
 → notify T3
 ```

-### 4. Review Gate
+### 4. Inspection Gate Flow
+
+```
+runner reaches configured gate (e.g. t2_synthesis)
+→ write event(gate_pending, detail={tier, summary, what_happens_next})
+→ notify_adapter.send(tier summary to Andrew via Hans)
+→ halt: poll blackboard for gate_approved or gate_rejected
+
+gate_approved:
+→ write event(gate_approved)
+→ continue run
+
+gate_rejected:
+→ write event(gate_rejected, detail={reason})
+→ re-invoke tier with rejection reason in brief context
+→ loop back to gate_pending when tier completes again
+
+gate_timeout (gate_timeout_minutes elapsed):
+→ treat as gate_rejected
+→ notify Andrew: "Gate timed out, re-invoking tier"
+```
+
+### 5. Review Gate

 ```
 T1 completes integration
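
The halt loop in the Inspection Gate Flow added above (write `gate_pending`, poll for a decision, treat a timeout as a rejection) might be sketched as follows, using the `events` table from the blackboard schema. The helper names `write_event` and `wait_for_gate`, the use of SQLite's implicit `rowid` to find decisions newer than the pending event, and the short timeouts in the demo are assumptions for illustration; the actual `team_runner` may differ.

```python
import json
import sqlite3
import time
import uuid
from datetime import datetime, timezone

EVENTS_DDL = """
CREATE TABLE events (
    event_id   TEXT PRIMARY KEY,
    run_id     TEXT NOT NULL,
    brief_id   TEXT,
    kind       TEXT NOT NULL,
    detail     TEXT,
    created_at TEXT NOT NULL
);
"""


def write_event(db, run_id, kind, detail=None):
    """Append one event row and return its SQLite rowid (used for ordering)."""
    cur = db.execute(
        "INSERT INTO events VALUES (?, ?, NULL, ?, ?, ?)",
        (uuid.uuid4().hex, run_id, kind, json.dumps(detail or {}),
         datetime.now(timezone.utc).isoformat()),
    )
    return cur.lastrowid


def wait_for_gate(db, run_id, pending_rowid, timeout_s, poll_s=0.01):
    """Block until a decision newer than the gate_pending event appears."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        row = db.execute(
            "SELECT kind FROM events WHERE run_id = ? AND rowid > ? "
            "AND kind IN ('gate_approved', 'gate_rejected') "
            "ORDER BY rowid LIMIT 1",
            (run_id, pending_rowid),
        ).fetchone()
        if row:
            return row[0]
        time.sleep(poll_s)
    return "gate_rejected"  # the flow above treats a timeout as a rejection


db = sqlite3.connect(":memory:")
db.executescript(EVENTS_DDL)

# Gate 1: a decision arrives (written here directly, standing in for the CLI).
pending = write_event(db, "run-1", "gate_pending", {"tier": "t2_synthesis"})
write_event(db, "run-1", "gate_approved")
first = wait_for_gate(db, "run-1", pending, timeout_s=1.0)

# Gate 2: nobody answers, so the short timeout expires.
pending = write_event(db, "run-1", "gate_pending", {"tier": "t3_plan"})
second = wait_for_gate(db, "run-1", pending, timeout_s=0.05)
print(first, second)  # gate_approved gate_rejected
```

Filtering on events newer than the pending one keeps a decision from an earlier gate from being mistaken for the answer to the current gate, which matters because a rejected tier loops back to `gate_pending`.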
@@ -412,19 +481,20 @@ T1 completes integration

 1. `git submodule add https://github.com/msitarzewski/agency-agents agents/` — pull the talent pool
 2. `config/role_registry.yaml` — map tier+domain → agent personality files
-3. `core/task_brief.py` — schema + validation (everything depends on this)
-4. `core/blackboard.py` — SQLite store, all table definitions
+3. `core/task_brief.py` — schema + validation (everything depends on this); include T1 Plan Output Schema
+4. `core/blackboard.py` — SQLite store, all table definitions including `t3_task_lists`; full event kind vocabulary
 5. `adapters/base/*` — all four abstract interfaces
 6. `adapters/llm/anthropic.py` — first LLM implementation
-7. `core/escalation.py` — retry + failure routing logic
+7. `core/escalation.py` — retry + failure routing logic (called by tiers, not runner centrally)
 8. `adapters/runtime/openclaw.py` — wire up sessions_spawn + personality injection
 9. `adapters/runtime/claude_code.py` — coding agent runtime, personality via --system-prompt
-10. `core/team_runner.py` — full run lifecycle, runtime + personality selection
-11. `prompts/` — fallback tier prompts (used when no agent_personality set)
-12. `adapters/vcs/github.py` — PR creation + branch management
-13. `adapters/notify/openclaw.py` — Hans notification
-14. `config/team.yaml` — example config
-15. `README.md` — how to run, how to add adapters, how to extend the roster
+10. `core/team_runner.py` — full run lifecycle: gate logic (gate_pending halt loop, gate_approved resume), path amendment monitor, T1 failure + terminal escalation only
+11. `cli/agency.py` — run, watch, inspect, approve, reject, pause, resume; `watch` tails blackboard events and renders live log; `inspect` renders run tree
+12. `prompts/` — fallback tier prompts (used when no agent_personality set)
+13. `adapters/vcs/github.py` — PR creation + branch management
+14. `adapters/notify/openclaw.py` — Hans notification; used for gate surfaces (tier summary to Andrew)
+15. `config/team.yaml` — example config with full visibility block
+16. `README.md` — how to run, how to add adapters, how to extend the roster; include `agency` CLI reference

 ---

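
Step 11 of the build order lists seven subcommands for `cli/agency.py`. A rough skeleton of that command surface, using only the standard library, could look like the sketch below. The positional arguments (`config`, `run_id`), the `--reason` option, and the `EVENT_FOR_COMMAND` mapping from commands to `gate_*` event kinds are assumptions; the diff gives only the subcommand names and the event vocabulary.

```python
import argparse

# Assumed mapping from CLI commands to the gate_* event kinds in the
# blackboard vocabulary; the spec does not spell this mapping out.
EVENT_FOR_COMMAND = {
    "approve": "gate_approved",
    "reject": "gate_rejected",
    "pause": "gate_paused",
    "resume": "gate_resumed",
}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="agency")
    sub = parser.add_subparsers(dest="command", required=True)

    run = sub.add_parser("run", help="start a run from a config file")
    run.add_argument("config")

    # Commands that act on an existing run take its run_id.
    for name in ("watch", "inspect", "approve", "pause", "resume"):
        sub.add_parser(name).add_argument("run_id")

    reject = sub.add_parser("reject", help="reject the pending gate")
    reject.add_argument("run_id")
    reject.add_argument("--reason", default="")
    return parser


args = build_parser().parse_args(["reject", "run-1", "--reason", "boundaries unclear"])
event_kind = EVENT_FOR_COMMAND.get(args.command)
print(args.command, args.run_id, event_kind)  # reject run-1 gate_rejected
```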