docs: purge OpenClaw/Hans specifics from core design

Portability audit — all platform-specific concerns moved to adapter layer:

- Gate Approval UX (Resolved Mechanics): rewritten as platform-agnostic.
  Core: runner writes gate_pending, calls notify_adapter.send(), polls
  blackboard for gate_approved. Universal path: agency CLI writes directly
  to blackboard. Adapter handles its own inbound response bridge internally.

- pending_gates.json removed from core directory structure and runner
  responsibilities — adapter-internal state, not a core concern.

- 'User → Hans → team_runner.start()' → 'User → team_runner.start()'
  Core has no dependency on a specific caller.

- 'notify_adapter.send(...to Andrew via Hans)' → 'notify_adapter.send()'
  throughout design.md and buildspec.md.

- anthropic.py description: 'via OpenClaw or direct API' → 'direct API'
  (anthropic adapter never goes via OpenClaw)

- Output/review decision: 'Hans messages Andrew' → 'notify_adapter.send()'
- Run visibility decision: 'Andrew via Hans' → 'via notify_adapter.send()'
- Decisions log: gate approval and visibility entries rewritten accordingly

Adapter layer correctly unchanged:
  adapters/notify/openclaw.py — OpenClaw-specific, owns its inbound bridge
  adapters/runtime/openclaw.py — OpenClaw sessions_spawn, correctly isolated
  team.yaml example config — adapter selection is config, not core
2026-03-30 14:31:55 -04:00
parent 8f143e779d
commit 1c99e40f98
2 changed files with 20 additions and 32 deletions


@@ -40,7 +40,7 @@ agent-teams/
 │ │ ├── notify.py — abstract notification interface
 │ │ └── runtime.py — abstract agent runtime interface
 │ ├── llm/
-│ │ ├── anthropic.py — Claude via OpenClaw or direct API
+│ │ ├── anthropic.py — Claude via direct Anthropic API
 │ │ ├── openai.py — GPT / o-series
 │ │ └── ollama.py — local models
 │ ├── vcs/
@@ -74,8 +74,6 @@ agent-teams/
 ├── runs/ — runtime state, one subdir per run_id
 │ └── .gitkeep
-├── pending_gates.json — live file: gates currently awaiting approval (written by runner, read by Hans)
 └── README.md
 ```
@@ -387,7 +385,7 @@ t5:
 ### 1. Run Kickoff
 ```
-User → Hans → team_runner.start(goal, config)
+User → team_runner.start(goal, config) # via CLI or any caller
 → generate run_id
 → init blackboard (create runs/<run_id>/blackboard.db)
 → build T1 brief (goal_anchor = goal, retry_budget from config)
@@ -442,7 +440,7 @@ spawn T4 with brief
 ```
 runner reaches configured gate (e.g. t2_synthesis)
 → write event(gate_pending, detail={tier, summary, what_happens_next})
-→ notify_adapter.send(tier summary to Andrew via Hans)
+→ notify_adapter.send(tier summary + gate context)
 → halt: poll blackboard for gate_approved or gate_rejected
 gate_approved:
@@ -490,11 +488,11 @@ T1 completes integration
 7. `core/escalation.py` — retry + failure routing logic (called by tiers, not runner centrally)
 8. `adapters/runtime/openclaw.py` — wire up sessions_spawn + personality injection
 9. `adapters/runtime/claude_code.py` — coding agent runtime, personality via --system-prompt
-10. `core/team_runner.py` — full run lifecycle: spawn loop (monitors briefs table for `status=pending`, calls runtime_adapter.spawn()), gate logic (gate_pending halt, writes pending_gates.json, gate_approved/rejected resume), path amendment monitor, T3 mesh timeout → T2 escalation, T1 failure + terminal escalation only
+10. `core/team_runner.py` — full run lifecycle: spawn loop (monitors briefs table for `status=pending`, calls runtime_adapter.spawn()), gate logic (gate_pending halt, calls notify_adapter.send(), polls for gate_approved/rejected resume), path amendment monitor, T3 mesh timeout → T2 escalation, T1 failure + terminal escalation only
 11. `cli/agency.py` — run, watch, inspect, approve, reject, pause, resume; `watch` tails blackboard events and renders live log; `inspect` renders run tree
 12. `prompts/` — fallback tier prompts (used when no agent_personality set)
 13. `adapters/vcs/github.py` — PR creation + branch management
-14. `adapters/notify/openclaw.py` — Hans notification; used for gate surfaces (tier summary to Andrew)
+14. `adapters/notify/openclaw.py` — OpenClaw notification adapter; bridges gate summaries and run events to the operator via OpenClaw; manages its own inbound response state for gate approval routing
 15. `config/team.yaml` — example config with full visibility block
 16. `README.md` — how to run, how to add adapters, how to extend the roster; include `agency` CLI reference


@@ -20,7 +20,7 @@ All eight open questions resolved 2026-03-30. Details in Decisions Log.
 6. **Who makes spawn calls for T3+ tiers** → Runner monitors briefs table for `status=pending` rows and makes all spawn calls. "Distributed ownership" means the tier's output determines brief content — runner is the mechanical arm. Gates (hold on `gate_pending`) live naturally in the runner's spawn loop.
-7. **Gate approval UX** → Both Signal reply (via Hans) and direct CLI are supported — both write to the same blackboard. Runner only cares that a `gate_approved` event exists, not who wrote it. Hans maintains `pending_gates.json` in workspace for multi-run disambiguation.
+7. **Gate approval UX** → `agency approve <run_id>` CLI writes `gate_approved` directly to the blackboard — the universal path, works on any platform. Runner only cares that the event exists, not how it got there. Notify adapter implementations handle their own inbound response routing (e.g. bridging a chat reply to a CLI call) as internal adapter state — not a core concern.
 8. **T3 mesh timeout** → Escalate to T2 (domain boundary problem, T2 should re-scope). If T2 also exhausts its retry budget, escalates up the normal ladder to T1 → Andrew gate. No force-commit fallback (would hide the problem and cause bad T4 dispatch).
@@ -224,7 +224,7 @@ T2 Lead → writes integration summary → blackboard
 T1 Accept
 → validate against goal anchor
-→ open PR, notify Andrew via Hans
+→ open PR, notify_adapter.send(pr summary + url)
 ```
 ### Medium Complexity — T1→T3→T4→T5
@@ -363,29 +363,19 @@ This keeps gate logic in one place (the runner's spawn loop), makes all spawn ca
 ### Gate Approval UX
-Two paths, both valid, same outcome — runner only cares that a `gate_approved` event exists in the blackboard:
-**Signal (via Hans):**
-Andrew receives the tier summary from Hans in Signal. Replies "approve" or "reject: reason". Hans resolves which run + gate the reply refers to using `workspace/pending_gates.json` (maintained by runner on each `gate_pending` event), then runs `agency approve <run_id>` or `agency reject <run_id> --reason "..."` on Andrew's behalf. Hans confirms back: "✅ Approved — T3 spawning now."
-**Direct CLI:**
-Andrew runs `agency approve <run_id>` from his terminal. Zero-friction when already at a machine.
-**`pending_gates.json` format:**
-```json
-{
-  "gates": [
-    {
-      "run_id": "abc123",
-      "gate": "t2_synthesis",
-      "pending_since": "2026-03-30T14:00:00Z",
-      "summary": "T2 synthesis ready — canonical architecture written"
-    }
-  ]
-}
-```
-If only one gate is pending, Hans can resolve "approve" without an explicit run_id. If multiple are pending, Hans asks Andrew to specify.
+**Core mechanic (platform-agnostic):**
+1. Runner writes `gate_pending` to blackboard
+2. Runner calls `notify_adapter.send()` with tier summary + gate context (`run_id`, `gate`, `summary`, `what_happens_next`)
+3. Runner polls blackboard for `gate_approved` or `gate_rejected`
+4. `agency approve <run_id>` / `agency reject <run_id> --reason "..."` writes the event directly to the blackboard — the universal approval path, works on any platform with filesystem access
+Runner never reads from a state file, never talks to a notify adapter for inbound responses. It only polls the blackboard.
+**Adapter responsibility:**
+Each notify adapter handles its own inbound response routing. How a human's approval gets translated into an `agency approve` CLI call is entirely the adapter's concern — not core. Example: an OpenClaw adapter bridges a chat reply to the CLI. A Slack adapter wires up a slash command. A webhook adapter listens on an endpoint. All produce the same result: `gate_approved` written to blackboard.
+Any internal state the adapter needs to resolve ambiguous responses (e.g. which run_id an approval refers to when multiple gates are pending) is managed by the adapter, not the core.
 ---
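The core gate mechanic described in this hunk can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: it assumes the blackboard is a SQLite database with a hypothetical `events` table, and the names `init_blackboard`, `write_event`, `wait_at_gate`, and `max_polls` are invented for the example. Only the event names and the payload fields (`run_id`, `gate`, `summary`, `what_happens_next`) come from the design text.

```python
import json
import sqlite3
import time
from typing import Callable, Optional


def init_blackboard(conn: sqlite3.Connection) -> None:
    # Hypothetical schema: one append-only events table per run database.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "run_id TEXT NOT NULL, type TEXT NOT NULL, detail TEXT)"
    )
    conn.commit()


def write_event(conn: sqlite3.Connection, run_id: str, event_type: str,
                detail: Optional[dict] = None) -> int:
    # Append an event and return its row id.
    cur = conn.execute(
        "INSERT INTO events (run_id, type, detail) VALUES (?, ?, ?)",
        (run_id, event_type, json.dumps(detail or {})),
    )
    conn.commit()
    return cur.lastrowid


def wait_at_gate(conn: sqlite3.Connection, run_id: str, gate: str,
                 summary: str, what_happens_next: str,
                 send: Callable[[dict], None],
                 poll_interval: float = 0.5,
                 max_polls: Optional[int] = None) -> Optional[str]:
    payload = {"run_id": run_id, "gate": gate, "summary": summary,
               "what_happens_next": what_happens_next}
    # Step 1: record that the gate is pending.
    pending_id = write_event(conn, run_id, "gate_pending", payload)
    # Step 2: outbound notification only; the runner never reads replies from it.
    send(payload)
    # Step 3: poll the blackboard for a decision written after the pending event.
    polls = 0
    while max_polls is None or polls < max_polls:
        row = conn.execute(
            "SELECT type FROM events WHERE run_id = ? AND id > ? "
            "AND type IN ('gate_approved', 'gate_rejected') "
            "ORDER BY id LIMIT 1",
            (run_id, pending_id),
        ).fetchone()
        if row is not None:
            return row[0]
        polls += 1
        time.sleep(poll_interval)
    return None  # no decision within the polling budget
```

In this sketch, the effect of `agency approve <run_id>` would be a single `write_event(conn, run_id, "gate_approved")` call from any process that can open the same database file, which is why the runner does not need to know who approved.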
@@ -566,7 +556,7 @@ Log level `verbose` adds per-T4-start/done lines. Default is `normal` (tier-leve
 Configurable pause points. When the runner hits a gate, it:
 1. Writes a `gate_pending` event to the blackboard
-2. Fires `notify_adapter.send()` with a tier summary to Andrew (via Hans)
+2. Fires `notify_adapter.send()` with the tier summary + gate context
 3. Halts — no next tier spawns until `gate_approved` or `gate_rejected` is written
 The tier summary surfaced at each gate includes:
@@ -660,7 +650,7 @@ Run abc123 — "Build webhook ingestion system"
 **Orchestration patterns** — Baked into tier prompts and runner tier-handling logic, not prescribed by T1. T2: Lead + parallel specialists. T3: light mesh within T2 domain. T4: swarm+pipeline. T5: fan-out+consensus.
-**Output / review** — Nothing merges to main without Andrew's explicit approval. T1 opens a PR and surfaces it to Andrew. Notification is dual: Hans messages Andrew directly + PR opened on VCS. Merge is gated on human sign-off.
+**Output / review** — Nothing merges to main without explicit human approval. T1 opens a PR and fires `notify_adapter.send()` with the PR summary. Merge is gated on human sign-off. The notify adapter implementation determines how the notification is delivered.
 **Platform agnosticism** — Core is provider and platform agnostic. Capability levels (`reasoning-heavy`, `capable`, `fast-cheap`) map to models in config. Mixing providers across tiers is supported.
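As a rough illustration of the platform agnosticism entry, the capability-to-model mapping might look like the sketch below. The tier assignments, the model names, and the `resolve_model` helper are all hypothetical placeholders. Only the three capability levels and the provider names (taken from the `llm/` adapter directory) appear in the design.

```python
from typing import Dict, Tuple

# Hypothetical config: capability level -> (provider, model name).
# Model names are placeholders, not real model identifiers.
CAPABILITY_MODELS: Dict[str, Tuple[str, str]] = {
    "reasoning-heavy": ("anthropic", "example-large"),
    "capable": ("openai", "example-medium"),
    "fast-cheap": ("ollama", "example-small"),
}

# Hypothetical tier assignments; the design does not fix these.
TIER_CAPABILITY: Dict[str, str] = {
    "t1": "reasoning-heavy",
    "t2": "reasoning-heavy",
    "t3": "capable",
    "t4": "capable",
    "t5": "fast-cheap",
}


def resolve_model(tier: str) -> Tuple[str, str]:
    # Two lookups: tier -> capability level -> (provider, model).
    # Raises KeyError for an unknown tier or capability level.
    return CAPABILITY_MODELS[TIER_CAPABILITY[tier]]
```

Because tiers name a capability level rather than a model, swapping providers for one tier would only touch the config mapping, which is one way to read "mixing providers across tiers is supported".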
@@ -674,7 +664,7 @@ Run abc123 — "Build webhook ingestion system"
 **Spawn call ownership** — Runner is the single point of contact with the runtime adapter. Tiers write `status=pending` child briefs to the blackboard; runner's spawn loop detects and spawns them. Gate logic (hold on `gate_pending`) lives in the spawn loop — no gate plumbing needed in agents. Agents only need blackboard read/write access.
-**Gate approval UX** — Both Signal reply (Hans as bridge) and direct `agency approve` CLI are supported. Same blackboard write either way; runner doesn't care which path was used. Hans maintains `pending_gates.json` in workspace to resolve ambiguous replies when multiple gates are pending. Single pending gate → "approve" is unambiguous.
+**Gate approval UX** — `agency approve <run_id>` CLI is the universal approval path — writes `gate_approved` directly to blackboard. Runner only polls blackboard; it does not depend on any specific notification platform. Each notify adapter handles its own inbound response bridge as internal adapter state. Core has no `pending_gates.json` or platform-specific approval logic.
 **T3 mesh timeout** — Escalate to T2 (the specialist that owns the domain). Timeout means T3s can't agree on task boundaries — a domain boundary problem T2 should fix by re-scoping. If T2 exhausts its retry budget, normal escalation ladder handles it (T1 → Andrew gate). No force-commit fallback.
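The gate approval UX decision in this hunk leaves inbound routing to each notify adapter. A minimal sketch of what that adapter-internal state could look like is given here, assuming a hypothetical `ChatBridgeAdapter` class and a simple reply grammar (`approve [run_id]`, `reject [run_id] reason`). The `approve` and `reject` callables stand in for invoking `agency approve <run_id>` and `agency reject <run_id> --reason "..."`. None of these names are from the actual adapters.

```python
from typing import Callable, Dict


class ChatBridgeAdapter:
    """Hypothetical notify adapter that tracks pending gates internally."""

    def __init__(self, approve: Callable[[str], None],
                 reject: Callable[[str, str], None]) -> None:
        self._approve = approve
        self._reject = reject
        self._pending: Dict[str, str] = {}  # run_id -> gate name

    def send(self, payload: dict) -> None:
        # Remember the gate so a later bare "approve" can be resolved.
        # Actual message delivery is platform-specific and omitted here.
        self._pending[payload["run_id"]] = payload["gate"]

    def on_reply(self, text: str) -> str:
        parts = text.strip().split()
        if not parts or parts[0].lower() not in ("approve", "reject"):
            return "unrecognized"
        verb, rest = parts[0].lower(), parts[1:]
        if rest and rest[0] in self._pending:
            # Explicit run_id given in the reply.
            run_id, rest = rest[0], rest[1:]
        elif len(self._pending) == 1:
            # A single pending gate makes a bare reply unambiguous.
            run_id = next(iter(self._pending))
        else:
            return "specify run_id"
        if verb == "approve":
            self._approve(run_id)
        else:
            self._reject(run_id, " ".join(rest))
        del self._pending[run_id]
        return f"{verb} {run_id}"
```

The core never sees `_pending`. From the runner's side, the only observable effect is the `gate_approved` or `gate_rejected` event that the callables eventually write to the blackboard.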
@@ -688,4 +678,4 @@ Run abc123 — "Build webhook ingestion system"
 **Failure handling (distributed)** — Confirmed distributed ownership (2026-03-30). `escalation.py` is logic tiers execute (or runner executes on tier's behalf on timeout/crash), not a central runner concern. Runner only owns: T1 failure, terminal human escalation. See updated Failure Handling table.
-**Run visibility layer** — Added 2026-03-30. Human-readable live log, configurable inspection gates, and `cli/agency.py` inspection/control commands. Designed for debugging and quality evaluation at each tier during early runs. `strict_mode: true` enables all gates. Gates surface tier artifacts + "what happens next" summary to Andrew via Hans. Resolves Q3 (T5 consensus surfaces as gate event with human-readable summary). T5 gate (optional) lets Andrew review joint verdict before T3 marks workstream done.
+**Run visibility layer** — Added 2026-03-30. Human-readable live log, configurable inspection gates, and `cli/agency.py` inspection/control commands. Designed for debugging and quality evaluation at each tier during early runs. `strict_mode: true` enables all gates. Gates surface tier artifacts + "what happens next" summary via `notify_adapter.send()` — platform-agnostic. Resolves Q3 (T5 consensus surfaces as gate event with human-readable summary). T5 gate (optional) lets the operator review joint verdict before T3 marks workstream done.