docs: purge OpenClaw/Hans specifics from core design

Portability audit: all platform-specific concerns moved to the adapter layer.

- Gate Approval UX (Resolved Mechanics): rewritten as platform-agnostic. Core: runner writes gate_pending, calls notify_adapter.send(), polls blackboard for gate_approved. Universal path: agency CLI writes directly to blackboard. Adapter handles its own inbound response bridge internally.
- pending_gates.json removed from core directory structure and runner responsibilities; it is adapter-internal state, not a core concern.
- 'User → Hans → team_runner.start()' → 'User → team_runner.start()'. Core has no dependency on a specific caller.
- 'notify_adapter.send(...to Andrew via Hans)' → 'notify_adapter.send()' throughout design.md and buildspec.md.
- anthropic.py description: 'via OpenClaw or direct API' → 'direct API' (anthropic adapter never goes via OpenClaw).
- Output/review decision: 'Hans messages Andrew' → 'notify_adapter.send()'.
- Run visibility decision: 'Andrew via Hans' → 'via notify_adapter.send()'.
- Decisions log: gate approval and visibility entries rewritten accordingly.

Adapter layer correctly unchanged:

- adapters/notify/openclaw.py: OpenClaw-specific, owns its inbound bridge
- adapters/runtime/openclaw.py: OpenClaw sessions_spawn, correctly isolated
- team.yaml example config: adapter selection is config, not core
2026-03-30 14:31:55 -04:00
parent 8f143e779d
commit 1c99e40f98
2 changed files with 20 additions and 32 deletions
@@ -40,7 +40,7 @@ agent-teams/
 │   │   ├── notify.py        — abstract notification interface
 │   │   └── runtime.py       — abstract agent runtime interface
 │   ├── llm/
-│   │   ├── anthropic.py     — Claude via OpenClaw or direct API
+│   │   ├── anthropic.py     — Claude via direct Anthropic API
 │   │   ├── openai.py        — GPT / o-series
 │   │   └── ollama.py        — local models
 │   ├── vcs/
@@ -74,8 +74,6 @@ agent-teams/
 ├── runs/                    — runtime state, one subdir per run_id
 │   └── .gitkeep
 │
-├── pending_gates.json       — live file: gates currently awaiting approval (written by runner, read by Hans)
-│
 └── README.md
 ```
@@ -387,7 +385,7 @@ t5:
 ### 1. Run Kickoff
 ```
-User → Hans → team_runner.start(goal, config)
+User → team_runner.start(goal, config)  # via CLI or any caller
  → generate run_id
  → init blackboard (create runs/<run_id>/blackboard.db)
  → build T1 brief (goal_anchor = goal, retry_budget from config)
@@ -442,7 +440,7 @@ spawn T4 with brief
 ```
 runner reaches configured gate (e.g. t2_synthesis)
  → write event(gate_pending, detail={tier, summary, what_happens_next})
-  → notify_adapter.send(tier summary to Andrew via Hans)
+  → notify_adapter.send(tier summary + gate context)
  → halt: poll blackboard for gate_approved or gate_rejected
  gate_approved:
@@ -490,11 +488,11 @@ T1 completes integration
 7. `core/escalation.py` — retry + failure routing logic (called by tiers, not runner centrally)
 8. `adapters/runtime/openclaw.py` — wire up sessions_spawn + personality injection
 9. `adapters/runtime/claude_code.py` — coding agent runtime, personality via --system-prompt
-10. `core/team_runner.py` — full run lifecycle: spawn loop (monitors briefs table for `status=pending`, calls runtime_adapter.spawn()), gate logic (gate_pending halt, writes pending_gates.json, gate_approved/rejected resume), path amendment monitor, T3 mesh timeout → T2 escalation, T1 failure + terminal escalation only
+10. `core/team_runner.py` — full run lifecycle: spawn loop (monitors briefs table for `status=pending`, calls runtime_adapter.spawn()), gate logic (gate_pending halt, calls notify_adapter.send(), polls for gate_approved/rejected resume), path amendment monitor, T3 mesh timeout → T2 escalation, T1 failure + terminal escalation only
 11. `cli/agency.py` — run, watch, inspect, approve, reject, pause, resume; `watch` tails blackboard events and renders live log; `inspect` renders run tree
 12. `prompts/` — fallback tier prompts (used when no agent_personality set)
 13. `adapters/vcs/github.py` — PR creation + branch management
-14. `adapters/notify/openclaw.py` — Hans notification; used for gate surfaces (tier summary to Andrew)
+14. `adapters/notify/openclaw.py` — OpenClaw notification adapter; bridges gate summaries and run events to the operator via OpenClaw; manages its own inbound response state for gate approval routing
 15. `config/team.yaml` — example config with full visibility block
 16. `README.md` — how to run, how to add adapters, how to extend the roster; include `agency` CLI reference
@@ -20,7 +20,7 @@ All eight open questions resolved 2026-03-30. Details in Decisions Log.
 6. **Who makes spawn calls for T3+ tiers** → Runner monitors briefs table for `status=pending` rows and makes all spawn calls. "Distributed ownership" means the tier's output determines brief content — runner is the mechanical arm. Gates (hold on `gate_pending`) live naturally in the runner's spawn loop.
-7. **Gate approval UX** → Both Signal reply (via Hans) and direct CLI are supported — both write to the same blackboard. Runner only cares that a `gate_approved` event exists, not who wrote it. Hans maintains `pending_gates.json` in workspace for multi-run disambiguation.
+7. **Gate approval UX** → `agency approve <run_id>` CLI writes `gate_approved` directly to the blackboard — the universal path, works on any platform. Runner only cares that the event exists, not how it got there. Notify adapter implementations handle their own inbound response routing (e.g. bridging a chat reply to a CLI call) as internal adapter state — not a core concern.
 8. **T3 mesh timeout** → Escalate to T2 (domain boundary problem, T2 should re-scope). If T2 also exhausts its retry budget, escalates up the normal ladder to T1 → Andrew gate. No force-commit fallback (would hide the problem and cause bad T4 dispatch).
@@ -224,7 +224,7 @@ T2 Lead → writes integration summary → blackboard
 T1 Accept
  → validate against goal anchor
-  → open PR, notify Andrew via Hans
+  → open PR, notify_adapter.send(pr summary + url)
 ```
 ### Medium Complexity — T1→T3→T4→T5
@@ -363,29 +363,19 @@ This keeps gate logic in one place (the runner's spawn loop), makes all spawn ca
 ### Gate Approval UX
-Two paths, both valid, same outcome — runner only cares that a `gate_approved` event exists in the blackboard:
-**Signal (via Hans):**
-Andrew receives the tier summary from Hans in Signal. Replies "approve" or "reject: reason". Hans resolves which run + gate the reply refers to using `workspace/pending_gates.json` (maintained by runner on each `gate_pending` event), then runs `agency approve <run_id>` or `agency reject <run_id> --reason "..."` on Andrew's behalf. Hans confirms back: "✅ Approved — T3 spawning now."
-**Direct CLI:**
-Andrew runs `agency approve <run_id>` from his terminal. Zero-friction when already at a machine.
-**`pending_gates.json` format:**
-```json
-{
-  "gates": [
-    {
-      "run_id": "abc123",
-      "gate": "t2_synthesis",
-      "pending_since": "2026-03-30T14:00:00Z",
-      "summary": "T2 synthesis ready — canonical architecture written"
-    }
-  ]
-}
-```
-If only one gate is pending, Hans can resolve "approve" without an explicit run_id. If multiple are pending, Hans asks Andrew to specify.
+**Core mechanic (platform-agnostic):**
+1. Runner writes `gate_pending` to blackboard
+2. Runner calls `notify_adapter.send()` with tier summary + gate context (`run_id`, `gate`, `summary`, `what_happens_next`)
+3. Runner polls blackboard for `gate_approved` or `gate_rejected`
+4. `agency approve <run_id>` / `agency reject <run_id> --reason "..."` writes the event directly to the blackboard — the universal approval path, works on any platform with filesystem access
+Runner never reads from a state file, never talks to a notify adapter for inbound responses. It only polls the blackboard.
+**Adapter responsibility:**
+Each notify adapter handles its own inbound response routing. How a human's approval gets translated into an `agency approve` CLI call is entirely the adapter's concern — not core. Example: an OpenClaw adapter bridges a chat reply to the CLI. A Slack adapter wires up a slash command. A webhook adapter listens on an endpoint. All produce the same result: `gate_approved` written to blackboard.
+Any internal state the adapter needs to resolve ambiguous responses (e.g. which run_id an approval refers to when multiple gates are pending) is managed by the adapter, not the core.
 ---
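Reviewer sketch (not part of this commit): the core gate mechanic described in the hunk above could look roughly like the following. It assumes the blackboard is a SQLite file with an append-only `events` table; the schema, `init_blackboard`, `write_event`, `wait_at_gate`, and the `NotifyAdapter` protocol are all hypothetical names chosen for illustration, and the real `core/team_runner.py` may be structured differently.

```python
import json
import sqlite3
import time
from typing import Optional, Protocol


class NotifyAdapter(Protocol):
    """Hypothetical outbound-only interface; the real notify.py may differ."""

    def send(self, payload: dict) -> None: ...


def init_blackboard(conn: sqlite3.Connection) -> None:
    # Assumed minimal schema: a single append-only events table.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "run_id TEXT NOT NULL, kind TEXT NOT NULL, detail TEXT NOT NULL)"
    )
    conn.commit()


def write_event(conn: sqlite3.Connection, run_id: str, kind: str,
                detail: Optional[dict] = None) -> int:
    """Append one event and return its row id."""
    cur = conn.execute(
        "INSERT INTO events (run_id, kind, detail) VALUES (?, ?, ?)",
        (run_id, kind, json.dumps(detail or {})),
    )
    conn.commit()
    return cur.lastrowid


def wait_at_gate(conn: sqlite3.Connection, notifier: NotifyAdapter,
                 run_id: str, gate: str, summary: str, what_happens_next: str,
                 poll_interval: float = 1.0,
                 timeout: Optional[float] = None) -> str:
    """Write gate_pending, notify once, then poll the blackboard only."""
    context = {"run_id": run_id, "gate": gate, "summary": summary,
               "what_happens_next": what_happens_next}
    pending_id = write_event(conn, run_id, "gate_pending", context)
    notifier.send(context)  # outbound only; the runner never reads replies from it
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        # Only decisions written after this gate_pending event count.
        row = conn.execute(
            "SELECT kind FROM events WHERE run_id = ? AND id > ? "
            "AND kind IN ('gate_approved', 'gate_rejected') "
            "ORDER BY id LIMIT 1",
            (run_id, pending_id),
        ).fetchone()
        if row is not None:
            return row[0]
        if deadline is not None and time.monotonic() >= deadline:
            raise TimeoutError(f"no gate decision for run {run_id}")
        time.sleep(poll_interval)
```

In this sketch the runner has no idea who wrote the decision row. An `agency approve <run_id>` invocation, a chat bridge, or a webhook handler would each end in the same `write_event(..., "gate_approved")` call against the same database.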
@@ -566,7 +556,7 @@ Log level `verbose` adds per-T4-start/done lines. Default is `normal` (tier-leve
 Configurable pause points. When the runner hits a gate, it:
 1. Writes a `gate_pending` event to the blackboard
-2. Fires `notify_adapter.send()` with a tier summary to Andrew (via Hans)
+2. Fires `notify_adapter.send()` with the tier summary + gate context
 3. Halts — no next tier spawns until `gate_approved` or `gate_rejected` is written
 The tier summary surfaced at each gate includes:
@@ -660,7 +650,7 @@ Run abc123 — "Build webhook ingestion system"
 **Orchestration patterns** — Baked into tier prompts and runner tier-handling logic, not prescribed by T1. T2: Lead + parallel specialists. T3: light mesh within T2 domain. T4: swarm+pipeline. T5: fan-out+consensus.
-**Output / review** — Nothing merges to main without Andrew's explicit approval. T1 opens a PR and surfaces it to Andrew. Notification is dual: Hans messages Andrew directly + PR opened on VCS. Merge is gated on human sign-off.
+**Output / review** — Nothing merges to main without explicit human approval. T1 opens a PR and fires `notify_adapter.send()` with the PR summary. Merge is gated on human sign-off. The notify adapter implementation determines how the notification is delivered.
 **Platform agnosticism** — Core is provider and platform agnostic. Capability levels (`reasoning-heavy`, `capable`, `fast-cheap`) map to models in config. Mixing providers across tiers is supported.
@@ -674,7 +664,7 @@ Run abc123 — "Build webhook ingestion system"
 **Spawn call ownership** — Runner is the single point of contact with the runtime adapter. Tiers write `status=pending` child briefs to the blackboard; runner's spawn loop detects and spawns them. Gate logic (hold on `gate_pending`) lives in the spawn loop — no gate plumbing needed in agents. Agents only need blackboard read/write access.
-**Gate approval UX** — Both Signal reply (Hans as bridge) and direct `agency approve` CLI are supported. Same blackboard write either way; runner doesn't care which path was used. Hans maintains `pending_gates.json` in workspace to resolve ambiguous replies when multiple gates are pending. Single pending gate → "approve" is unambiguous.
+**Gate approval UX** — `agency approve <run_id>` CLI is the universal approval path — writes `gate_approved` directly to blackboard. Runner only polls blackboard; it does not depend on any specific notification platform. Each notify adapter handles its own inbound response bridge as internal adapter state. Core has no `pending_gates.json` or platform-specific approval logic.
 **T3 mesh timeout** — Escalate to T2 (the specialist that owns the domain). Timeout means T3s can't agree on task boundaries — a domain boundary problem T2 should fix by re-scoping. If T2 exhausts its retry budget, normal escalation ladder handles it (T1 → Andrew gate). No force-commit fallback.
@@ -688,4 +678,4 @@ Run abc123 — "Build webhook ingestion system"
 **Failure handling (distributed)** — Confirmed distributed ownership (2026-03-30). `escalation.py` is logic tiers execute (or runner executes on tier's behalf on timeout/crash), not a central runner concern. Runner only owns: T1 failure, terminal human escalation. See updated Failure Handling table.
-**Run visibility layer** — Added 2026-03-30. Human-readable live log, configurable inspection gates, and `cli/agency.py` inspection/control commands. Designed for debugging and quality evaluation at each tier during early runs. `strict_mode: true` enables all gates. Gates surface tier artifacts + "what happens next" summary to Andrew via Hans. Resolves Q3 (T5 consensus surfaces as gate event with human-readable summary). T5 gate (optional) lets Andrew review joint verdict before T3 marks workstream done.
+**Run visibility layer** — Added 2026-03-30. Human-readable live log, configurable inspection gates, and `cli/agency.py` inspection/control commands. Designed for debugging and quality evaluation at each tier during early runs. `strict_mode: true` enables all gates. Gates surface tier artifacts + "what happens next" summary via `notify_adapter.send()` — platform-agnostic. Resolves Q3 (T5 consensus surfaces as gate event with human-readable summary). T5 gate (optional) lets the operator review joint verdict before T3 marks workstream done.
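Reviewer sketch (not part of this commit): with `pending_gates.json` gone from core, an adapter that accepts free-text replies needs its own way to map a reply onto a run. A minimal, adapter-internal version might look like this. `InboundBridge`, `note_sent`, and `resolve` are hypothetical names, the "approve" / "reject: reason" reply grammar is borrowed from the removed Signal text, and passing `--reason` only when a reason was given is an assumption about the CLI.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class InboundBridge:
    """Adapter-internal state: gates this adapter has notified about."""

    pending: Dict[str, str] = field(default_factory=dict)  # run_id -> gate

    def note_sent(self, run_id: str, gate: str) -> None:
        # Called by the adapter's own send() after delivering a gate summary.
        self.pending[run_id] = gate

    def resolve(self, reply: str, run_id: Optional[str] = None) -> Optional[List[str]]:
        """Translate a human reply into `agency` CLI argv.

        Returns None when the reply is unrecognised or ambiguous, in which
        case the adapter should ask the operator to name a run_id.
        """
        if run_id is None:
            if len(self.pending) != 1:
                return None  # zero or several pending gates: cannot guess
            run_id = next(iter(self.pending))
        elif run_id not in self.pending:
            return None

        text = reply.strip()
        lowered = text.lower()
        if lowered == "approve":
            argv = ["agency", "approve", run_id]
        elif lowered == "reject" or lowered.startswith("reject:"):
            reason = text[len("reject"):].lstrip(":").strip()
            argv = ["agency", "reject", run_id]
            if reason:
                argv += ["--reason", reason]
        else:
            return None

        del self.pending[run_id]  # gate is decided; forget it
        return argv
```

The returned argv could be handed to `subprocess.run`, so the adapter ends up on the same universal path as an operator at a terminal and the core never sees this state.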