12 changes: 9 additions & 3 deletions SESSION_FOCUS.md
@@ -2,14 +2,20 @@

*Current sprint, SDK status, and active work. Updated by operator and autonomous sessions.*

*Last updated: 2026-05-14 (Sprint 50)*
*Last updated: 2026-05-14 (Sprint 52)*

---

## Current Sprint

**See `docs/SPRINT.md` for full sprint plan and task details.** Do not duplicate sprint content here — SPRINT.md is the source of truth for task scope, status, and dependencies.

### Sprint 52 Summary: Python SDK Conformance Test Wiring (COMPLETE)

| Task | Status | Notes |
|------|--------|-------|
| T1: Wire ATP + Society/Role conformance vectors into SDK tests | DONE | Operator burst-4 shipped 4 conformance JSON suites to `web4-standard/testing/conformance/` but no Python test asserted against them. Sprint 52 wires the two best-aligned suites: ATP (11 vectors + 2 meta → 13 pass, audit "best-aligned pair" confirmed empirically) and Society/Role (9 vectors + 2 meta → 8 pass, 3 strict-xfail with documented divergences citing audit P4, missing assigner predicate, constructor-vs-imperative federation pattern). 2 new test files, 0 product code modifications, 24 new tests (2691 pass + 3 xfail), mypy --strict clean, ruff clean. T3/V3 and R6/R7 conformance deferred. |

### Sprint 51 Summary: Minimum Viable Society Validation + Constraint Alignment (COMPLETE)

| Task | Status | Notes |
@@ -202,9 +208,9 @@ See `docs/SPRINT.md` for full history. Highlights: JSON-LD serialization for all

- **Version**: 0.26.0
- **Modules**: 23 library modules + MCP server entry point (trust, lct, atp, federation, r6, mrh, acp, dictionary, entity, capability, errors, metabolic, binding, society, role, reputation, security, protocol, mcp, attestation, validation, deserialize, generate, mcp_server)
- **Tests**: 2668 passing
- **Tests**: 2691 passing + 3 strict-xfail (Sprint 52 documented divergences: combined-state enum, assigner predicate, imperative federation actions)
- **CLI**: `web4 info/validate/list-schemas/roundtrip/generate/selftest/trust` (7 subcommands)
- **Exports**: 368 symbols via `web4/__init__.py`
- **Exports**: 369 symbols via `web4/__init__.py`
- **from_dict()**: 58 classmethods across 10 modules — all classes with to_dict()/as_dict() have matching from_dict()
- **Dispatcher**: 23 types via `web4.from_jsonld()` (19 class-based + 3 function-based + TrustQuery)
- **Generator**: 23 types via `web4.generate()` — minimal valid JSON-LD documents
63 changes: 62 additions & 1 deletion docs/SPRINT.md
@@ -1,12 +1,73 @@
# Web4 Sprint Plan

**Created**: 2026-03-14
**Updated**: 2026-05-14 (Sprint 51)
**Updated**: 2026-05-14 (Sprint 52)
**Phase**: Development
**Track**: web4 (Legion)

---

## Sprint 52: Python SDK Conformance Test Wiring (2026-05-14)

Operator burst-4 (commits `a2727b45`, `92454d6`, `0c39a9b6` at 12:03–12:04 PDT)
shipped the conformance test corpus to `web4-standard/testing/conformance/`
(4 JSON suites, 35 vectors total). The vectors are declared cross-language —
"Any Web4 implementation MUST produce identical results" — but no Python SDK
test currently asserts this. Sprint 52 wires the two best-aligned suites
(ATP and Society/Role) into the SDK test runner. This addresses Nova GPT's
#1 quick-win request (test vectors + conformance) on the Python side, and
partially advances Kimi's K2 gap (conformance test suite missing).

### T1: Wire ATP and Society/Role conformance vectors into Python SDK tests
**Status**: DONE
**Completed**: 2026-05-14
**Authorized by**: Operator burst-4 (conformance corpus shipped) + Nova/Kimi
cross-reviewer convergence (test vectors + conformance was Nova's #1 quick-win;
K2 named by Kimi rounds 1–4). Policy-reviewed and approved with binding
condition: any failing vector MUST be `pytest.mark.xfail` with reason; no
silent fixes (no assertion weakening, no vector edits, no SDK behavioral
changes to make vectors pass).
**Scope**:
1. **ATP conformance** (`tests/test_conformance_atp.py`): loads
   `testing/conformance/atp-operations.json` (11 vectors across account,
   transfer, sliding-scale categories) and asserts the Python `web4.atp`
   module produces matching outputs. Sprint 49 audit named ATP the
   best-aligned cross-language pair ("identical core semantics"); the expected
   high pass rate was confirmed: **11/11 pass + 2 meta tests = 13 pass, 0 xfail**.
2. **Society/Role conformance** (`tests/test_conformance_society.py`):
   loads `testing/conformance/society-roles.json` (9 vectors across
   bootstrap, role, federation, minimum-viable categories) and asserts
   the Python `web4.role` module produces matching outputs. Of the 11 tests
   (9 vectors + 2 meta), **8 pass + 3 strict-xfail with documented
   divergences**:
   - `soc-002` (5-state lifecycle): Python splits the combined enum into
     `SocietyPhase` (3) + `MetabolicState` (separate axis). Cites audit P4.
   - `role-004` (assigner-permission table): Python `role.py` does not
     encode role-based permission to assign other roles.
   - `fed-001` (imperative join/secede): Python `federation.Society` uses a
     constructor-hierarchy pattern (`parent=Society`, `children` list), not
     imperative join/secede actions.
3. Sprint plan + session focus bookkeeping.
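
As a rough illustration of what "produces matching outputs" means for the scope items above, the sketch below compares a computed value against a single vector, following the exact-versus-approximate convention the ATP sliding-scale vectors use (`expected`, or `expected_approx` with an optional `tolerance`). The helper name `matches_vector` and both sample vectors are hypothetical, made up for this example; they are not taken from `atp-operations.json`.

```python
import math
from typing import Any, Dict


def matches_vector(vector: Dict[str, Any], result: float) -> bool:
    """Return True when `result` satisfies the vector's expectation.

    Hypothetical helper: a vector carries either an exact `expected` value or
    an `expected_approx` value with an optional absolute `tolerance`.
    """
    if "expected" in vector:
        return result == vector["expected"]
    tolerance = vector.get("tolerance", 1e-10)
    # Absolute tolerance only, mirroring a comparison that sets `abs` alone.
    return math.isclose(result, vector["expected_approx"], rel_tol=0.0, abs_tol=tolerance)


# Made-up vectors for illustration only.
exact_vector = {"id": "demo-exact", "expected": 50.0}
approx_vector = {"id": "demo-approx", "expected_approx": 33.3333, "tolerance": 1e-4}

print(matches_vector(exact_vector, 50.0))          # exact comparison
print(matches_vector(approx_vector, 100.0 / 3.0))  # within the 1e-4 tolerance
```

Keeping the tolerance in the vector rather than in the test keeps the comparison rule part of the cross-language contract, so each implementation applies the same bound.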

**Result**: 2 new test files, 0 modifications to product code
(verification-only). 24 new tests (2691 passed + 3 xfailed), mypy --strict
clean, ruff lint/format clean.

**Findings produced by xfails**: the three documented divergences are now
executable test markers, not just documentary audit findings. If the SDK ever
gains the corresponding surfaces (combined-state enum, assigner predicate,
imperative federation actions), the strict xfails will turn into XPASS
failures, forcing review and removal — preventing silent surface drift.
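
The mechanism described above relies on pytest's `strict` option for `xfail`. The sketch below is a simplified, stdlib-only model of how a marked test ends up being reported. It is not pytest's implementation, it ignores options such as `raises` and `run`, and the names `Outcome` and `xfail_outcome` are invented for illustration.

```python
from enum import Enum


class Outcome(Enum):
    XFAILED = "xfailed"  # expected failure happened: the run stays green
    XPASSED = "xpassed"  # unexpected pass, tolerated when strict is off
    FAILED = "failed"    # unexpected pass under strict=True: the run goes red


def xfail_outcome(test_passed: bool, strict: bool) -> Outcome:
    """Model the reported outcome of a test that carries an xfail marker."""
    if not test_passed:
        return Outcome.XFAILED
    return Outcome.FAILED if strict else Outcome.XPASSED


# Today: the SDK lacks the surface, so the vector test fails as expected.
print(xfail_outcome(test_passed=False, strict=True))
# Later: the SDK gains the surface, the test passes, and strict mode flags it.
print(xfail_outcome(test_passed=True, strict=True))
```

Under this model a non-strict marker would let the second case through as a quiet XPASS, which is the silent drift the strict markers are meant to rule out.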

**Out of bounds**: T3/V3 conformance vectors (`tensor-operations.json`) and
R6/R7 conformance vectors (`r6-r7-actions.json`) were NOT wired. Sprint 47
documented 8 T3/V3 divergences between Rust and Python; their conformance
wiring needs a separate sprint that can also catalogue divergences (Python
SDK matches spec, Rust/WASM diverges). R6/R7 vectors may have been authored
before PR #187 Constraint shape changes and need a freshness check before
wiring.

---

## Sprint 51: Minimum Viable Society Validation + Constraint Alignment (2026-05-14)

Resolves two remaining autonomous-actionable items from the Sprint 49
192 changes: 192 additions & 0 deletions web4-standard/implementation/sdk/tests/test_conformance_atp.py
@@ -0,0 +1,192 @@
"""Cross-language conformance tests for ATP/ADP operations.

Loads ``web4-standard/testing/conformance/atp-operations.json`` and asserts
that the Python ``web4.atp`` module produces the documented expected outputs.

The conformance vectors were shipped by the operator (commit 92454d6) and are
declared cross-language: "Any Web4 implementation MUST produce identical results
for these inputs."

Sprint 49 cross-language audit named ATP as the best-aligned pair across Rust
and Python ("identical core semantics"), so a high pass rate is expected.
Where a vector cannot be satisfied without behavioral changes to the SDK, the
test is marked ``xfail`` with a reason citing the specific divergence —
silent fixes (assertion weakening, vector edits, or SDK edits to make vectors
pass) are explicitly forbidden by the Sprint 52 policy review.

Suite version: 0.1.0
"""

from __future__ import annotations

import json
import os
from typing import Any, Dict, List

import pytest

from web4.atp import ATPAccount, sliding_scale, transfer

CONFORMANCE_DIR = os.path.join(os.path.dirname(__file__), "..", "..", "..", "testing", "conformance")


def _load_suite() -> Dict[str, Any]:
    path = os.path.join(CONFORMANCE_DIR, "atp-operations.json")
    with open(path) as f:
        return json.load(f)


SUITE = _load_suite()


# ── Account vectors ──────────────────────────────────────────────


@pytest.mark.parametrize(
    "vector",
    SUITE["account_vectors"],
    ids=lambda v: v["id"],
)
def test_account_vector(vector: Dict[str, Any]) -> None:
    operation = vector["operation"]
    expected = vector["expected"]

    if operation == "new":
        balance = vector["input"]["initial_balance"]
        account = ATPAccount(available=balance, initial_balance=balance)
        assert account.available == expected["available"]
        assert account.locked == expected["locked"]
        assert account.adp == expected["adp"]
        assert account.total == expected["total"]
        assert account.energy_ratio == expected["energy_ratio"]
        return

    initial = vector["initial"]
    account = ATPAccount(
        available=initial["available"],
        locked=initial["locked"],
        adp=initial["adp"],
    )

    if operation == "lock":
        amount = vector["input"]["amount"]
        ok = account.lock(amount)
        assert ok is True
        assert account.available == expected["available"]
        assert account.locked == expected["locked"]
        assert account.total == expected["total"]
        return

    if operation == "commit":
        amount = vector["input"]["amount"]
        committed = account.commit(amount)
        assert committed == amount
        assert account.available == expected["available"]
        assert account.locked == expected["locked"]
        assert account.adp == expected["adp"]
        assert account.total == expected["total"]
        assert account.energy_ratio == expected["energy_ratio"]
        return

    if operation == "rollback":
        amount = vector["input"]["amount"]
        rolled = account.rollback(amount)
        assert rolled == amount
        assert account.available == expected["available"]
        assert account.locked == expected["locked"]
        assert account.adp == expected["adp"]
        return

    if operation == "energy_ratio":
        # atp-005: zero-balance neutral
        assert account.energy_ratio == expected["energy_ratio"]
        return

    pytest.fail(f"Unknown operation in account vector {vector['id']}: {operation}")


# ── Transfer vectors ─────────────────────────────────────────────


@pytest.mark.parametrize(
    "vector",
    SUITE["transfer_vectors"],
    ids=lambda v: v["id"],
)
def test_transfer_vector(vector: Dict[str, Any]) -> None:
    sender = ATPAccount(available=vector["sender"]["available"])
    receiver = ATPAccount(available=vector["receiver"]["available"])

    inp = vector["input"]
    amount = inp["amount"]
    fee_rate = inp.get("fee_rate", 0.05)
    max_balance = inp.get("max_balance")

    expected = vector["expected"]

    if expected.get("error"):
        with pytest.raises(ValueError):
            transfer(sender, receiver, amount, fee_rate=fee_rate, max_balance=max_balance)
        return

    result = transfer(sender, receiver, amount, fee_rate=fee_rate, max_balance=max_balance)

    if "fee" in expected:
        assert result.fee == expected["fee"]
        assert result.actual_credit == expected["actual_credit"]
        assert result.overflow == expected.get("overflow", 0.0)
        assert result.sender_balance == expected["sender_balance"]
        assert result.receiver_balance == expected["receiver_balance"]

    if expected.get("conservation_holds"):
        # invariant: sender_deducted == actual_credit + fee + overflow
        sender_deducted = vector["sender"]["available"] - result.sender_balance
        assert sender_deducted == pytest.approx(result.actual_credit + result.fee + result.overflow)


# ── Sliding-scale vectors ────────────────────────────────────────


@pytest.mark.parametrize(
    "vector",
    SUITE["sliding_scale_vectors"],
    ids=lambda v: v["id"],
)
def test_sliding_scale_vector(vector: Dict[str, Any]) -> None:
    inp = vector["input"]
    result = sliding_scale(
        quality=inp["quality"],
        base_payment=inp["base_payment"],
        zero_threshold=inp["zero_threshold"],
        full_threshold=inp["full_threshold"],
    )

    if "expected" in vector:
        assert result == vector["expected"]
    else:
        tolerance = vector.get("tolerance", 1e-10)
        assert result == pytest.approx(vector["expected_approx"], abs=tolerance)


# ── Suite-level meta ─────────────────────────────────────────────


def test_suite_metadata() -> None:
    """The suite version and shape are part of the conformance contract."""
    assert SUITE["suite"] == "ATP/ADP Operations"
    assert SUITE["version"] == "0.1.0"

    # Counts match the README's documented vector budget.
    assert len(SUITE["account_vectors"]) == 5
    assert len(SUITE["transfer_vectors"]) == 3
    assert len(SUITE["sliding_scale_vectors"]) == 3


def test_all_vectors_have_ids() -> None:
    """Every vector must have a stable id for cross-language reference."""
    seen: List[str] = []
    for category in ("account_vectors", "transfer_vectors", "sliding_scale_vectors"):
        for v in SUITE[category]:
            assert "id" in v, f"vector missing id in {category}: {v}"
            assert v["id"] not in seen, f"duplicate vector id: {v['id']}"
            seen.append(v["id"])
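
The conservation invariant asserted in `test_transfer_vector` (`sender_deducted == actual_credit + fee + overflow`) can be worked through by hand. The sketch below does so with made-up figures. The `conserves` helper and the numbers are hypothetical, and nothing here describes how `web4.atp.transfer` actually computes fees or overflow.

```python
import math


def conserves(
    sender_before: float,
    sender_after: float,
    actual_credit: float,
    fee: float,
    overflow: float,
) -> bool:
    """Check sender_deducted == actual_credit + fee + overflow within float tolerance."""
    sender_deducted = sender_before - sender_after
    return math.isclose(
        sender_deducted,
        actual_credit + fee + overflow,
        rel_tol=1e-9,
        abs_tol=1e-12,
    )


# Made-up figures: the sender's balance drops from 100.0 to 47.5 (52.5 deducted).
print(conserves(100.0, 47.5, actual_credit=40.0, fee=2.5, overflow=10.0))  # 52.5 vs 52.5
print(conserves(100.0, 47.5, actual_credit=50.0, fee=2.5, overflow=10.0))  # 52.5 vs 62.5
```

A tolerance-based comparison is used because the three terms are floats; an exact `==` could fail on rounding alone even when the accounting balances.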