diff --git a/CHANGELOG.md b/CHANGELOG.md index c515e1f..6b67789 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -4,6 +4,24 @@ All notable changes to the OpenArmature specification are documented in this fil The format is adapted from [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) — subsection labels render as bold paragraphs (rather than H3) to keep the rendered docs-site right-rail TOC focused on releases, and there is no `[Unreleased]` section since the spec tags after every acceptance PR. The spec follows [Semantic Versioning](https://semver.org/). +## [0.16.0] — 2026-05-15 + +**Added** + +- **pipeline-utilities §10.10 — new canonical configuration-time category `checkpoint_state_migration_chain_ambiguous`.** Raised when the registered migration set contains an ambiguity that prevents the engine from picking a unique chain. Two cases trigger the category: a duplicate `(from_version, to_version)` pair at registration (per §10.12.1) and multiple distinct shortest paths between a source / target version pair at chain resolution (per §10.12.2). Non-transient. Mutually exclusive with the other three migration-related categories (`checkpoint_record_invalid`, `checkpoint_state_migration_missing`, `checkpoint_state_migration_failed`) on any given resume; chain-ambiguous routes first because it fires at build or load time before any migration runs or deserialization is attempted. ([proposal 0018](proposals/0018-state-migration-chain-ambiguity.md)) +- Conformance fixture `047-state-migration-chain-ambiguous` (pipeline-utilities), covering both the duplicate-pair-at-registration case and the ambiguous-shortest-paths-at-resolution case via the new `expected_chain_ambiguity_error` harness primitive. The primitive accepts the named category surfacing at either build time or during resume, preserving §10.12.2's compile-time-SHOULD / load-time-acceptable carve-out so implementations detecting ambiguity at either point pass the same fixture. 
+ +**Changed** + +- **pipeline-utilities §10.12.1 — duplicate-pair sentence names the category.** "MUST raise a configuration-time error (the chain is ambiguous)" → "MUST raise `checkpoint_state_migration_chain_ambiguous` (per §10.10) at registration or compile time, before any resume attempt." ([proposal 0018](proposals/0018-state-migration-chain-ambiguity.md)) +- **pipeline-utilities §10.12.2 step 2 — multi-shortest-path clause names the category.** "MUST raise a configuration-time error — the same category §10.12.1 raises for duplicate `(from_version, to_version)` pairs" → "MUST raise `checkpoint_state_migration_chain_ambiguous` (per §10.10)." The "Implementations SHOULD detect ambiguity at compile time when feasible" guidance immediately following remains unchanged. ([proposal 0018](proposals/0018-state-migration-chain-ambiguity.md)) +- **pipeline-utilities §10.10 — mutual-exclusion paragraph rewritten** to list all four migration-related categories with the new routing precedence (registry well-formedness → version compatibility → chain application → deserialization). ([proposal 0018](proposals/0018-state-migration-chain-ambiguity.md)) + +**Notes** + +- **Pre-1.0 MINOR bump.** Although v0.15.0 already mandated "a configuration-time error" for both ambiguity cases, naming a canonical category that didn't exist before is implementation-visible: implementations that previously raised an arbitrary configuration error (a language-native `ValueError`, a generic `Error`, etc.) must now surface `checkpoint_state_migration_chain_ambiguous` to pass fixture 047. Matches the precedent set by proposal 0014's category additions (`checkpoint_state_migration_missing` / `_failed`), which shipped as the v0.12.0 MINOR bump. The change is small in scope (rename the category surfaced for one specific case) but is correctly classified MINOR per pre-1.0 SemVer. 
+- Per the "Skip-ahead implementation" governance principle, implementations that have not yet shipped against v0.15.0 may target v0.16.0 directly without implementing v0.15.0 first. + ## [0.15.0] — 2026-05-14 **Added** diff --git a/README.md b/README.md index 8d4d7c8..53e8ca6 100644 --- a/README.md +++ b/README.md @@ -6,7 +6,7 @@ specification text, conformance fixtures, governance rules, and numbered RFC-style proposals. **No implementation code lives here.** Implementations are in sibling repositories. -**Current spec version:** [v0.15.0](CHANGELOG.md) +**Current spec version:** [v0.16.0](CHANGELOG.md) --- @@ -69,7 +69,7 @@ and architecture are in [`docs/openarmature.md`](docs/openarmature.md). | Capability | Introduced | Latest | Fixtures | Scope | |---|---|---|---|---| | [graph-engine](spec/graph-engine/spec.md) | 0.1.0 | 0.11.0 | 21 | Typed state, async nodes, conditional/static edges, reducers, subgraph composition, observer hooks | -| [pipeline-utilities](spec/pipeline-utilities/spec.md) | 0.5.0 | 0.12.0 | 46 | Middleware (canonical retry + timing), parallel fan-out, checkpointing (with state migration), parallel branches | +| [pipeline-utilities](spec/pipeline-utilities/spec.md) | 0.5.0 | 0.16.0 | 47 | Middleware (canonical retry + timing), parallel fan-out, checkpointing (with state migration), parallel branches | | [llm-provider](spec/llm-provider/spec.md) | 0.4.0 | 0.14.0 | 28 | Stateless LLM-provider abstraction with canonical error categories, image content blocks for user messages, structured output via `response_schema`, and OpenAI-compatible wire mapping | | [observability](spec/observability/spec.md) | 0.7.0 | 0.10.0 | 11 | Cross-backend correlation IDs, OpenTelemetry mapping (spans, log correlation, detached trace mode) | | [prompt-management](spec/prompt-management/spec.md) | 0.15.0 | 0.15.0 | 12 | Named/versioned template fetch + render; composite backends with infrastructure-only fallback; PromptGroup tracing primitive; 
strict-undefined-by-default variable injection | diff --git a/docs/index.md b/docs/index.md index 276caa0..b7d42cc 100644 --- a/docs/index.md +++ b/docs/index.md @@ -125,7 +125,7 @@ drift. --- - 118 conformance fixtures across five capabilities. Implementations run + 119 conformance fixtures across five capabilities. Implementations run them; if they pass, behavior matches every other conforming runtime. No "implementation-defined" footguns. diff --git a/docs/proposals.md b/docs/proposals.md index 479937b..893b2ac 100644 --- a/docs/proposals.md +++ b/docs/proposals.md @@ -24,5 +24,6 @@ lifecycle and the proposal template. | [0015](proposals/0015-llm-provider-multimodal-images.md) | Image content blocks for user messages | llm-provider | Accepted | | [0016](proposals/0016-llm-provider-structured-output.md) | Structured output | llm-provider | Accepted | | [0017](proposals/0017-prompt-management-core.md) | Prompt management core | prompt-management | Accepted | +| [0018](proposals/0018-state-migration-chain-ambiguity.md) | State migration chain ambiguity | pipeline-utilities | Accepted | Click any column header to sort. 
diff --git a/docs/proposals/0018-state-migration-chain-ambiguity.md b/docs/proposals/0018-state-migration-chain-ambiguity.md new file mode 120000 index 0000000..d161e36 --- /dev/null +++ b/docs/proposals/0018-state-migration-chain-ambiguity.md @@ -0,0 +1 @@ +../../proposals/0018-state-migration-chain-ambiguity.md \ No newline at end of file diff --git a/proposals/0018-state-migration-chain-ambiguity.md b/proposals/0018-state-migration-chain-ambiguity.md index 0319bce..77202d8 100644 --- a/proposals/0018-state-migration-chain-ambiguity.md +++ b/proposals/0018-state-migration-chain-ambiguity.md @@ -1,9 +1,9 @@ # 0018: Pipeline Utilities — State Migration Chain Ambiguity Category and Fixture -- **Status:** Draft +- **Status:** Accepted - **Author:** Chris Colinsky - **Created:** 2026-05-15 -- **Accepted:** +- **Accepted:** 2026-05-15 - **Targets:** spec/pipeline-utilities/spec.md (modifies §10.10 to add error category; modifies §10.12.1 and §10.12.2 to reference the category by name); spec/pipeline-utilities/conformance/ (adds fixture 047) - **Related:** 0014 (state-migration hooks) - **Supersedes:** diff --git a/spec/pipeline-utilities/conformance/047-state-migration-chain-ambiguous.md b/spec/pipeline-utilities/conformance/047-state-migration-chain-ambiguous.md new file mode 100644 index 0000000..f52346e --- /dev/null +++ b/spec/pipeline-utilities/conformance/047-state-migration-chain-ambiguous.md @@ -0,0 +1,65 @@ +# 047 — State Migration Chain Ambiguous + +Two cases under one fixture exercising the canonical configuration-time +error category `checkpoint_state_migration_chain_ambiguous` (per §10.10). +Both ambiguity rules in §10.12 raise the same category; this fixture +covers both. + +**Spec sections exercised:** + +- §10.10 — `checkpoint_state_migration_chain_ambiguous` (configuration-time, + non-transient). Mutually exclusive with the other three migration-related + categories on any given resume. 
+- §10.12.1 — Two migrations registered with the same `(from_version, + to_version)` pair MUST raise + `checkpoint_state_migration_chain_ambiguous` at registration or + compile time, before any resume attempt. +- §10.12.2 step 2 — When chain resolution finds multiple distinct + shortest paths between a source and target version (same edge + count, different edge sequences), the engine MUST raise + `checkpoint_state_migration_chain_ambiguous`. Implementations + SHOULD detect this at compile time when feasible; load-time + detection is acceptable. + +**New harness primitive:** `expected_chain_ambiguity_error: ` +accepts the named category surfacing at either build time or during +resume. Preserves §10.12.2's compile-time-SHOULD / load-time-acceptable +carve-out so implementations that detect ambiguity at either point pass +the same fixture without forcing the spec to over-tighten to MUST +compile-time. + +**What passes:** + +- **`duplicate_pair_at_registration`** — the + `expected_chain_ambiguity_error` assertion fires when two + migrations register the same `(v1, v2)` pair. Duplicate-pair + detection is independent of function identity per §10.12.1, so + both migrations reference the same `should_not_run` mock; the + ambiguity check fires before either function is invoked. + Implementations that detect at registration time satisfy the + assertion via the build-step exception. +- **`ambiguous_shortest_paths_at_resolution`** — the + `expected_chain_ambiguity_error` assertion fires when the + registered migration set forms a diamond (`v1 → v2 → v4` AND + `v1 → v3 → v4`) and the engine must resolve a chain from `v1` to + `v4`. Implementations that detect at compile time satisfy via the + build-step exception; implementations that defer to load time + satisfy via the resume-step exception. + +**What fails:** + +- The engine silently picks one migration (registration-order, an + arbitrary choice, etc.) 
when faced with duplicate + `(from, to)` pairs — would mean §10.12.1's MUST-raise rule is + not honored. +- The engine silently picks one shortest path when faced with the + diamond migration graph — would mean §10.12.2's MUST-raise rule + is not honored. +- The engine raises a different category (e.g., + `checkpoint_state_migration_missing` because the engine treats + the ambiguity as no-path-found) — would mean the routing + invariant in §10.10 is not honored. +- The engine raises the right category but at neither build nor + resume time (e.g., wraps it inside an unrelated exception path) + — would mean the `expected_chain_ambiguity_error` primitive's + either-timing acceptance is not satisfied. diff --git a/spec/pipeline-utilities/conformance/047-state-migration-chain-ambiguous.yaml b/spec/pipeline-utilities/conformance/047-state-migration-chain-ambiguous.yaml new file mode 100644 index 0000000..2eefa39 --- /dev/null +++ b/spec/pipeline-utilities/conformance/047-state-migration-chain-ambiguous.yaml @@ -0,0 +1,96 @@ +# State migration — chain ambiguity. Two cases: +# (1) Two migrations with identical (from_version, to_version) registered +# against the same compiled graph (per §10.12.1). +# (2) A diamond migration graph yielding multiple distinct shortest paths +# between a source / target version pair (per §10.12.2). +# Both MUST raise the canonical configuration-time category +# `checkpoint_state_migration_chain_ambiguous` (per §10.10). +# +# The `expected_chain_ambiguity_error` primitive accepts the named +# category surfacing at either build time or during resume — preserves +# §10.12.2's compile-time-SHOULD / load-time-acceptable carve-out so +# implementations that detect ambiguity at either point pass the same +# fixture. + +cases: + - name: duplicate_pair_at_registration + description: | + Two migrations register with the same (from_version, to_version) + pair. 
Per §10.12.1, the engine MUST raise + `checkpoint_state_migration_chain_ambiguous` at registration or + compile time, before any resume attempt. + state: + schema_version: "v2" + fields: + x: + type: int + default: 0 + entry: noop + nodes: + noop: + update_pure: {} + edges: + - {from: noop, to: END} + initial_state: {} + checkpointer: sqlite + migrations: + # Two migrations registered against the same (from, to) pair. + # Duplicate-pair detection is independent of function identity + # per §10.12.1 — both reference the same `should_not_run` mock + # because the ambiguity check fires before either function is + # ever invoked. If `should_not_run` is somehow called, an + # implementation has missed the ambiguity check. + - from_version: "v1" + to_version: "v2" + migrate: should_not_run + - from_version: "v1" + to_version: "v2" + migrate: should_not_run + expected_chain_ambiguity_error: checkpoint_state_migration_chain_ambiguous + + - name: ambiguous_shortest_paths_at_resolution + description: | + A diamond migration graph: v1 -> v2 -> v4 AND v1 -> v3 -> v4. Both + paths from v1 to v4 have edge count 2; neither is canonically + shortest. Per §10.12.2, the engine MUST raise + `checkpoint_state_migration_chain_ambiguous`. Implementations + SHOULD detect this at compile time by scanning the registered + migration graph; load-time detection (during the resume attempt) + is acceptable per the same section. The + `expected_chain_ambiguity_error` primitive accepts either timing. + state: + schema_version: "v4" + fields: + x: + type: int + default: 0 + entry: noop + nodes: + noop: + update_pure: {} + edges: + - {from: noop, to: END} + initial_state: {} + checkpointer: sqlite + seeded_record: + schema_version: "v1" + state: {x: 1} + completed_positions: [] + migrations: + # Diamond: v1 -> v2 -> v4 AND v1 -> v3 -> v4. Both shortest paths + # from v1 to v4 have length 2. 
+ - from_version: "v1" + to_version: "v2" + migrate: should_not_run + - from_version: "v2" + to_version: "v4" + migrate: should_not_run + - from_version: "v1" + to_version: "v3" + migrate: should_not_run + - from_version: "v3" + to_version: "v4" + migrate: should_not_run + resume: + from_seeded_record: true + expected_chain_ambiguity_error: checkpoint_state_migration_chain_ambiguous diff --git a/spec/pipeline-utilities/spec.md b/spec/pipeline-utilities/spec.md index ed4fccb..0d54471 100644 --- a/spec/pipeline-utilities/spec.md +++ b/spec/pipeline-utilities/spec.md @@ -10,6 +10,7 @@ Canonical behavioral specification for the OpenArmature pipeline-utilities capab - §10 Checkpointing added by [proposal 0008](../../proposals/0008-pipeline-utilities-checkpointing.md) - §11 Parallel branches added by [proposal 0011](../../proposals/0011-pipeline-utilities-parallel-branches.md) - §10.2 `schema_version` reframed as user-facing; §10.10 `checkpoint_record_invalid` description amended and two new error categories (`checkpoint_state_migration_missing`, `checkpoint_state_migration_failed`) added; §10.12 State migrations added by [proposal 0014](../../proposals/0014-pipeline-utilities-state-migration.md) + - §10.10 gained canonical configuration-time category `checkpoint_state_migration_chain_ambiguous`; §10.12.1 and §10.12.2 updated to reference the category by name; mutual-exclusion paragraph rewritten for four migration-related categories by [proposal 0018](../../proposals/0018-state-migration-chain-ambiguity.md) This specification is language-agnostic. Each implementation (Python, TypeScript, …) maps its own idioms onto the behavioral contract described here. Conformance is verified by the fixtures under `conformance/`. @@ -1003,13 +1004,35 @@ Non-transient (a buggy migration is deterministic; retrying without changing the code will not succeed). 
The error MUST carry the failing migration's `from_version` and `to_version`, and the underlying exception as cause (per the language's idiom). -The three migration-related categories — `checkpoint_record_invalid`, -`checkpoint_state_migration_missing`, `checkpoint_state_migration_failed` — are mutually -exclusive on any given resume: the engine evaluates version compatibility first (routing +New canonical configuration-time category: `checkpoint_state_migration_chain_ambiguous` — +raised when the registered migration set contains an ambiguity that prevents the engine +from picking a unique chain. Two cases trigger this category: + +- **At registration (per §10.12.1).** Two migrations registered with the same + `from_version` AND the same `to_version`. The engine MUST raise this category at + registration time (or at compile time when migrations are bound to the compiled graph, + per the host language's binding semantics) so the configuration error surfaces before + any resume attempt. +- **At chain resolution (per §10.12.2).** A request to resolve a chain from + `from_version` A to `to_version` B finds two or more distinct shortest paths (same + edge count, different edge sequences). Implementations SHOULD detect this at compile + time when feasible by scanning the registered migration graph; load-time detection + is acceptable when compile-time analysis is not. + +Non-transient. The error MUST identify the offending `(from_version, to_version)` pair +(for the registration case) or the source / target version pair and a description of the +conflicting paths (for the resolution case), in a form appropriate to the host language. 
+ +The four migration-related categories — `checkpoint_record_invalid`, +`checkpoint_state_migration_missing`, `checkpoint_state_migration_failed`, and +`checkpoint_state_migration_chain_ambiguous` — are mutually exclusive on any given resume: +the engine evaluates registry well-formedness first (routing through +`checkpoint_state_migration_chain_ambiguous` if a duplicate-pair or multi-shortest-path +ambiguity is detected at build or load time), then version compatibility (routing through `checkpoint_state_migration_missing` if no chain exists), then applies the chain -(routing through `checkpoint_state_migration_failed` if a migration raises), then attempts -deserialization (routing through `checkpoint_record_invalid` if the post-migration state -cannot deserialize). +(routing through `checkpoint_state_migration_failed` if a migration raises), then +attempts deserialization (routing through `checkpoint_record_invalid` if the +post-migration state cannot deserialize). Version mismatches on Checkpointer backends that cannot support state migration (per §10.12.1) bypass the migration system entirely and route directly to @@ -1088,9 +1111,10 @@ checkpoint load. A compiled graph's migration set is **ordered by `(from_version, to_version)` pair**. The order of registration does not affect chain resolution; chains are resolved by version pair, not by registration order. Two migrations with the same `from_version` and same `to_version` -MUST raise a configuration-time error (the chain is ambiguous). Two migrations with the same -`from_version` and different `to_version` define a branched migration graph; chain resolution -(§10.12.2) is responsible for picking a path. +MUST raise `checkpoint_state_migration_chain_ambiguous` (per §10.10) at registration or +compile time, before any resume attempt. Two migrations with the same `from_version` and +different `to_version` define a branched migration graph; chain resolution (§10.12.2) is +responsible for picking a path. 
#### 10.12.2 Chain resolution @@ -1106,13 +1130,12 @@ Chain resolution proceeds: 2. Find the shortest path (fewest edges) from the record's `schema_version` to the current state schema's `schema_version`. Implementations MUST resolve by shortest-path (BFS is the natural algorithm). When multiple distinct shortest paths exist (same edge count, - different edge sequences), this is an ambiguous chain and the engine MUST raise a - configuration-time error — the same category §10.12.1 raises for duplicate - `(from_version, to_version)` pairs. The user MUST restructure their migration graph to - leave a single canonical shortest path between every reachable version pair. - Implementations SHOULD detect ambiguity at compile time when feasible (by scanning the - registered migration graph); load-time detection is acceptable when compile-time - analysis is not. + different edge sequences), this is an ambiguous chain and the engine MUST raise + `checkpoint_state_migration_chain_ambiguous` (per §10.10). The user MUST restructure + their migration graph to leave a single canonical shortest path between every reachable + version pair. Implementations SHOULD detect ambiguity at compile time when feasible (by + scanning the registered migration graph); load-time detection is acceptable when + compile-time analysis is not. 3. If at least one path exists, apply the migrations along the path in order: each migration's output becomes the next migration's input. The final serialized state is passed to the current state class's deserialization step (per §10.1 round-trip integrity).
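The duplicate-pair rule in §10.12.1 and the multi-shortest-path rule in §10.12.2 step 2 can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not part of the spec: the exception classes `ChainAmbiguous` and `MigrationMissing` stand in for the canonical categories, and `build_graph` / `resolve_chain` are illustrative helpers. The sketch assumes migrations are given as plain `(from_version, to_version)` tuples and detects path ambiguity at resolution time for a single requested pair; a compile-time scan over every reachable pair, which the spec recommends when feasible, is not shown.

```python
from collections import deque


class ChainAmbiguous(Exception):
    """Hypothetical stand-in for checkpoint_state_migration_chain_ambiguous."""


class MigrationMissing(Exception):
    """Hypothetical stand-in for checkpoint_state_migration_missing."""


def build_graph(pairs):
    """Build an adjacency map; reject duplicate (from, to) pairs (§10.12.1)."""
    graph = {}
    seen = set()
    for src, dst in pairs:
        if (src, dst) in seen:
            raise ChainAmbiguous(f"duplicate migration pair ({src}, {dst})")
        seen.add((src, dst))
        graph.setdefault(src, []).append(dst)
    return graph


def resolve_chain(graph, source, target):
    """Return the unique shortest version path, or raise (§10.12.2 step 2)."""
    dist = {source: 0}    # BFS depth of each discovered version
    count = {source: 1}   # number of distinct shortest paths to each version
    parent = {}           # one predecessor per version, for path reconstruction
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in dist:
                # First discovery: inherits the path count of its predecessor.
                dist[nxt] = dist[node] + 1
                count[nxt] = count[node]
                parent[nxt] = node
                queue.append(nxt)
            elif dist[nxt] == dist[node] + 1:
                # Another predecessor at the same depth: more shortest paths.
                count[nxt] += count[node]
    if target not in dist:
        raise MigrationMissing(f"no chain from {source} to {target}")
    if count[target] > 1:
        raise ChainAmbiguous(
            f"{count[target]} distinct shortest paths from {source} to {target}"
        )
    path = [target]
    while path[-1] != source:
        path.append(parent[path[-1]])
    return path[::-1]


if __name__ == "__main__":
    diamond = build_graph(
        [("v1", "v2"), ("v2", "v4"), ("v1", "v3"), ("v3", "v4")]
    )
    try:
        resolve_chain(diamond, "v1", "v4")
    except ChainAmbiguous as exc:
        print("ambiguous:", exc)
```

Counting shortest paths during the BFS, rather than enumerating them, keeps the check linear in the size of the migration graph. In this sketch the ambiguity check is scoped to the requested pair, so a diamond elsewhere in the graph does not block an unrelated resume; whether an implementation rejects such graphs earlier is a compile-time policy choice the sketch does not model.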