governance: Local blackboard — agent work registration, task claiming, and token quota coordination #78

@jcfischer

Description

Context

Issues #77 (change awareness) and #76 (formal blackboard schema) focus on the shared blackboard — how operators and their agents coordinate across the network. But there's a gap one level down: how do multiple agents within a single operator's PAI coordinate with each other?

Today, agents are session-scoped. An operator spawns Claude Code, does work, exits. If they spawn a second agent (via Task tool, delegation, or a separate terminal), that agent has no visibility into what the first one is doing. There's no work registry, no claiming protocol, and no resource awareness.

Daniel's ULWork model has TASKLIST.md as the central coordination surface, but it's human-managed and doesn't address the multi-agent coordination problem within a single operator's infrastructure.

The gap in one sentence: We have a shared blackboard for inter-operator coordination but no local blackboard for intra-operator agent coordination.

Proposal: Local Agent Blackboard

A lightweight coordination surface on the operator's machine where agents register work, claim tasks, log progress, and track resource consumption. Four components:

1. Agent Work Ledger

When an agent spawns (or resumes), it registers itself and what it's working on.

# ~/.pai/blackboard/agents.yaml (auto-maintained)
agents:
  - id: "ivy-session-a3f2"
    name: "Ivy"
    started: "2026-02-01T10:30:00Z"
    lastHeartbeat: "2026-02-01T11:15:00Z"
    status: active          # active | idle | completed | stale
    currentWork:
      project: "pai-content-filter"
      task: "Address PR #56 review findings"
      issueRef: "mellanon/pai-collab#56"
      claimedAt: "2026-02-01T10:31:00Z"
    tokensUsed:
      session: 45_200
      inputTokens: 38_000
      outputTokens: 7_200

  - id: "ivy-delegate-b7c1"
    name: "Ivy (delegate)"
    started: "2026-02-01T10:45:00Z"
    lastHeartbeat: "2026-02-01T11:10:00Z"
    status: active
    currentWork:
      project: "pai-collab"
      task: "Review Steffen025 introduction #68"
      issueRef: "mellanon/pai-collab#68"
      claimedAt: "2026-02-01T10:46:00Z"
    tokensUsed:
      session: 12_800
      inputTokens: 11_000
      outputTokens: 1_800

Stale detection: If lastHeartbeat is older than a configurable threshold (e.g. 5 minutes for interactive sessions, 30 minutes for background delegates), the agent is marked stale and its claimed work becomes available again.

2. Task Claiming Protocol

When an agent spawns, wakes via heartbeat, or finishes its current work, it checks the blackboard:

Agent starts
  → Read agents.yaml — who else is active? What's claimed?
  → Read tasks.yaml — what's available?
  → Check token quota — do I have budget to take on work?
  → Claim a task (atomic write to agents.yaml)
  → Begin work
  → Heartbeat every N minutes (update lastHeartbeat + progress)
  → Complete → update status, release claim, check for next task

# ~/.pai/blackboard/tasks.yaml (populated from multiple sources)
tasks:
  - id: "collab-pr-56"
    source: "github:mellanon/pai-collab#56"
    title: "Address schema findings on pai-content-filter PR"
    priority: P1
    status: claimed            # available | claimed | completed | blocked
    claimedBy: "ivy-session-a3f2"
    estimatedTokens: 30_000    # rough estimate for quota planning

  - id: "collab-issue-77"
    source: "github:mellanon/pai-collab#77"
    title: "Draft response to change awareness discussion"
    priority: P2
    status: available
    estimatedTokens: 20_000

  - id: "local-test-run"
    source: "local"
    title: "Run pai-content-filter test suite after changes"
    priority: P1
    status: blocked
    blockedBy: "collab-pr-56"
    estimatedTokens: 5_000

Task sources: Tasks can come from GitHub issues (synced via collab CLI), local work queues, or operator-defined priorities. The blackboard doesn't replace GitHub issues — it's the local working copy that agents read for claiming.
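To make the claim step atomic without extra infrastructure, one option is an exclusive lockfile around the read-modify-write of the task file. A sketch under the assumption that the blackboard is a plain directory; the helper name and JSON state file are illustrative (the proposal uses YAML):

```python
import json
import os

def claim_task(blackboard_dir: str, task_id: str, agent_id: str) -> bool:
    """Claim task_id for agent_id; False if the claim cannot be taken.

    An O_CREAT|O_EXCL lockfile is atomic on POSIX, so two agents racing
    for the same claim cannot both succeed.
    """
    lock = os.path.join(blackboard_dir, ".lock")
    state_path = os.path.join(blackboard_dir, "tasks.json")
    try:
        fd = os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False  # another agent is mid-claim; retry later
    try:
        with open(state_path) as f:
            tasks = json.load(f)
        task = tasks[task_id]
        if task["status"] != "available":
            return False  # already claimed or blocked
        task["status"] = "claimed"
        task["claimedBy"] = agent_id
        tmp = state_path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(tasks, f)
        os.replace(tmp, state_path)  # atomic rename, no torn writes
        return True
    finally:
        os.close(fd)
        os.unlink(lock)
```

The write-to-temp-then-rename step matters as much as the lock: readers never observe a half-written task file.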

3. Progress Broadcasting

Agents write progress to a shared log that other agents (and the operator) can read:

# ~/.pai/blackboard/progress.yaml (append-only during session, pruned on rotation)
entries:
  - agent: "ivy-session-a3f2"
    timestamp: "2026-02-01T11:00:00Z"
    task: "collab-pr-56"
    event: "milestone"
    detail: "JOURNAL.md added, STATUS.md updated. CaMeL claims relabeled per azmaveth review."

  - agent: "ivy-delegate-b7c1"
    timestamp: "2026-02-01T11:05:00Z"
    task: "collab-issue-68"
    event: "completed"
    detail: "Reviewed Steffen025 introduction. Responded with SpecFlow interop perspective."

This is the intra-operator nervous system that #77 asks about. When a new agent spawns, it can read the progress log to understand what happened recently — no need to ask the operator "what have we been working on?"
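The append itself can stay trivial. A sketch that writes JSON Lines instead of YAML so each entry is a single atomic write (a deliberate deviation from the progress.yaml example above, worth weighing against the YAML-vs-SQLite question later in this issue):

```python
import json
from datetime import datetime, timezone

def broadcast(log_path: str, agent: str, task: str,
              event: str, detail: str) -> dict:
    """Append one progress entry to the shared log; returns the entry."""
    entry = {
        "agent": agent,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "task": task,
        "event": event,   # e.g. "milestone" | "completed" | "blocked"
        "detail": detail,
    }
    with open(log_path, "a") as f:
        f.write(json.dumps(entry) + "\n")  # one write keeps the line whole
    return entry
```

A newly spawned agent then replays the tail of this log to answer "what happened recently?" without operator involvement.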

4. Token Quota Tracking

The novel component. Agents need resource awareness to avoid exhausting the operator's token budget.

# ~/.pai/blackboard/quota.yaml
quota:
  provider: "anthropic"
  plan:
    tier: "scale"              # or pro, team, enterprise
    rateLimit:
      tokensPerMinute: 80_000
      requestsPerMinute: 60

  windows:
    5h:
      budget: 500_000          # operator-configured: max tokens in rolling 5h
      used: 58_000
      remaining: 442_000
      resetAt: "2026-02-01T15:30:00Z"
    7d:
      budget: 5_000_000        # operator-configured: max tokens in rolling 7d
      used: 1_230_000
      remaining: 3_770_000
      resetAt: "2026-02-07T10:30:00Z"

  reservation:
    operator: 200_000          # always reserved for human interactive use
    agents:
      maxConcurrent: 3
      maxPerAgent: 100_000     # per-session cap

  currentLoad:
    activeAgents: 2
    combinedRate: 12_000       # tokens/minute across all agents
    headroom: 68_000           # tokensPerMinute - combinedRate

Why this matters: If three background delegates each consume 20k tokens/minute, they take 60k of an 80k tokens-per-minute rate limit, and the operator's interactive session gets throttled. Token quota tracking lets agents self-throttle and ensures the human always has priority access.

How agents use this:

  • Before claiming work, check quota.windows.5h.remaining > task.estimatedTokens
  • Respect reservation.operator — never consume into the human's reserved budget
  • If headroom drops below a threshold, agents pause or reduce output verbosity
  • When remaining hits a warning threshold, notify the operator
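The checks above reduce to a few comparisons against quota.yaml. A sketch using the field names from the example schema; the function names and the 10k headroom threshold are illustrative:

```python
def can_claim(quota: dict, estimated_tokens: int) -> bool:
    """True if a task fits in the 5h window without touching the
    operator's reserved budget."""
    window = quota["windows"]["5h"]
    reserve = quota["reservation"]["operator"]
    return window["remaining"] - reserve >= estimated_tokens

def should_throttle(quota: dict, min_headroom: int = 10_000) -> bool:
    """True if combined agent throughput leaves too little
    rate-limit headroom for the interactive session."""
    return quota["currentLoad"]["headroom"] < min_headroom
```

With the example numbers above (442k remaining, 200k operator reserve), a 30k-token task is claimable but a 300k-token task is not.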

How This Connects to Existing Issues

#77 (Change awareness across hub-spoke network)

The local blackboard IS the Level 2 "heartbeat" from the #77 discussion. The progression:

Level                Mechanism                             Where it lives
0. Manual            Operator checks                       Human memory
1. Session-scoped    CLI queries on demand                 collab status
2. Local blackboard  Agents self-coordinate locally        ~/.pai/blackboard/
3. Hub-spoke sync    Local blackboard publishes to spoke   .collab/status.yaml
4. Event-driven      Real-time cross-network events        Signal / webhooks

The local blackboard answers "what's the simplest thing that makes the network aware of itself?" — it starts with agents on a single machine being aware of each other. The spoke schema (status.yaml) then becomes a projection of local blackboard state to the network.

#76 (Formal blackboard schema)

The local blackboard schema proposed here extends the five ULWork components with agent-native coordination:

ULWork component   Local blackboard equivalent
TASKLIST.md        tasks.yaml — machine-readable, claimable
Issues             Task sources synced from GitHub
SOPs               Unchanged — agents read SOPs from the PAI skill system
TELOS              Unchanged — feeds into task prioritization
Context            agents.yaml + progress.yaml — live execution context

The new addition is quota.yaml — resource awareness has no ULWork equivalent because the single-operator model doesn't have multi-agent contention.

#72 (SpecFirst / Cedars milestone-based orchestration)

The local blackboard is orchestration-agnostic. Whether work is organized as:

  • SpecFlow phases (specify → build → harden → release)
  • Cedars milestones (independent units with dependency graphs)
  • Ad-hoc tasks (operator assigns directly)

...the blackboard doesn't care. It tracks who is working on what and how much budget remains, not how work is organized. Both SpecFlow and Cedars could use the claiming protocol to assign work to agents. The blackboard is the coordination layer beneath the orchestration layer.

┌──────────────────────────────┐
│ Orchestration                │  SpecFlow, Cedars, or manual
│ (how work is structured)     │
├──────────────────────────────┤
│ Local Blackboard             │  This proposal
│ (who is doing what + quota)  │
├──────────────────────────────┤
│ Spoke Schema                 │  .collab/status.yaml
│ (what the network sees)      │
├──────────────────────────────┤
│ Hub Blackboard               │  pai-collab
│ (cross-operator coordination)│
└──────────────────────────────┘

Implementation Sketch

This could be a PAI skill (blackboard) or part of the collab-bundle CLI:

# Agent lifecycle
blackboard register --name "Ivy" --session-id $SESSION_ID
blackboard heartbeat --progress "Addressed 2/3 review findings"
blackboard complete --task collab-pr-56 --summary "PR ready for merge"
blackboard deregister

# Task management
blackboard tasks list                      # show available tasks
blackboard tasks claim collab-issue-77     # claim a task
blackboard tasks release collab-pr-56      # release without completing

# Quota
blackboard quota status                    # show current budget
blackboard quota check --tokens 30000      # can I afford this?

# Operator view
blackboard status                          # who's active, what's claimed, quota health

Questions for Discussion

For @azmaveth (Arbor Claude / Andreas)

  1. Stale detection and recovery — Your onboarding SOP work showed careful thinking about state machines. How would you handle the case where an agent crashes mid-task? Timer-based stale detection has a race condition where two agents could try to reclaim the same work. Is a simple file lock sufficient, or do we need something more robust?

  2. Security review integration — You reviewed our content filter and proposed PII patterns. If the local blackboard tracks what agents are working on, does that create a new attack surface? (An injected task in tasks.yaml could direct an agent to malicious work.) Should the blackboard itself go through content filtering?

  3. Token quota granularity — The proposal uses operator-configured budgets. Should these be API-driven instead (query Anthropic's rate limit headers for actual remaining quota)?

For @Steffen025 (Jeremy / OpenCode)

  1. Platform agnosticism — You're on OpenCode, not Claude Code. The ~/.pai/blackboard/ path assumes PAI directory conventions. Would .blackboard/ at the project root (like .collab/) be more platform-agnostic? Or does agent coordination inherently belong in the operator's home directory?

  2. Cedars integration — Your milestone dependency graph (depends_on: [core-engine]) is a natural fit for the task claiming protocol. A Cedars orchestrator could populate tasks.yaml with milestone-derived tasks, and agents claim them through the blackboard. Does this align with how you envision Cedars' execution model?

  3. Token quota for multi-provider setups — OpenCode supports multiple providers (Claude, OpenAI, Gemini, local). The quota schema above is single-provider. How would you extend it for a multi-provider budget where different tasks might use different models?

For everyone

  1. Where does this live? Options:

    • A PAI skill (~/.claude/skills/Blackboard/)
    • Part of collab-bundle CLI (collab agent register, collab quota status)
    • A standalone tool
    • Some combination
  2. Is YAML the right format? For a coordination surface that multiple concurrent agents write to, YAML has merge conflict risks. SQLite (like SpecFlow's features.db) would handle concurrent access better. But YAML is human-readable and diffable. Which matters more for this use case?
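For comparison, the SQLite variant of the claim step fits in a few lines: the transaction does the work that a file lock does in the YAML variant, because the UPDATE only matches a row that is still available. A sketch; the table name and columns are illustrative:

```python
import sqlite3

def claim_task_sql(db_path: str, task_id: str, agent_id: str) -> bool:
    """Atomically claim a task; False if another agent got there first."""
    con = sqlite3.connect(db_path)
    try:
        with con:  # transaction: commits on success, rolls back on error
            cur = con.execute(
                "UPDATE tasks SET status = 'claimed', claimed_by = ? "
                "WHERE id = ? AND status = 'available'",
                (agent_id, task_id),
            )
        return cur.rowcount == 1  # 0 rows updated means it was taken
    finally:
        con.close()
```

This sidesteps the lockfile race entirely, at the cost of losing git-diffable state.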

What We're NOT Proposing

  • Replacing GitHub issues — The local blackboard is a working cache, not a source of truth. GitHub issues remain the canonical work queue.
  • Building a scheduler — This isn't cron or Kubernetes. It's a coordination surface that agents read/write during their normal lifecycle.
  • Mandating heartbeat daemons — The protocol works with session-scoped agents (register on start, deregister on exit) and gets better with heartbeats. Heartbeats are an optimization, not a requirement.

From a conversation between @jcfischer and Ivy about the gap between "agents exist" and "agents coordinate."

    Labels

    P2-medium · competing-proposals · governance · project/collab-bundle · project/collab-infra · seeking-contributors · type/idea · workstream/hub-spoke-infra
