Signed receipt components for Haystack pipelines. The integration follows the architecture discussed in deepset-ai/haystack#11039:
- Local receipts first: emit an append-only JSONL receipt stream that verifies offline.
- Shadow/enforce mode: default to non-blocking shadow mode; fail closed only for explicitly configured high-risk components.
- Optional anchoring second: asynchronously anchor selected receipt hashes to an external trust domain without making anchoring a runtime dependency.
This is intentionally a standalone prototype, not a Haystack core change. It uses Haystack's documented custom component model: the @component decorator, a run() method, and @component.output_types. If Haystack is not installed, the components still work as plain Python classes for tests and examples.
pip install -e ".[test]"
# Optional, only if testing inside a real Haystack pipeline:
pip install -e ".[haystack,test]"

from haystack_receipts import DocumentReceiptComponent, JsonlReceiptSink, ReceiptSigner
signer = ReceiptSigner.generate(kid="haystack-demo")
sink = JsonlReceiptSink("receipts.jsonl")
receipt = DocumentReceiptComponent(
    component_name="retriever",
    signer=signer,
    sink=sink,
    mode="shadow",
)
# In a real Haystack pipeline, connect retriever.documents -> receipt.documents.
out = receipt.run(documents=[{"content": "example document"}])
assert out["documents"]

Verify the receipt chain offline:
haystack-receipts-verify receipts.jsonl --key haystack-demo=<public-key-hex>

Generic pass-through component for any JSON-serializable value. It emits a receipt and returns the original value unchanged.
Haystack-friendly component for retriever/ranker/document outputs. It accepts and returns documents, so it can be inserted after components that output documents.
Component for generators or text-producing stages. It accepts and returns text.
Each receipt payload includes:
- component_name and component_type
- pipeline_run_id
- input_hash and output_hash
- optional component_version_hash and config_hash
- optional OpenTelemetry trace_id and span_id
- mode: shadow or enforce
- decision: recorded, denied, or error
- sequence and previousReceiptHash for chain integrity
- Ed25519 signature over JCS-canonical payload bytes
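The fields above can be assembled roughly as follows. This is a minimal sketch, not the package's implementation: `canonical_bytes`, `sha256_hex`, and `build_payload` are hypothetical helpers, sorted-key compact JSON stands in for full JCS (RFC 8785) canonicalization, and the Ed25519 signing step is omitted so the example needs only the standard library. What exactly `previousReceiptHash` covers (payload or full envelope) is also an assumption here.

```python
import hashlib
import json
import uuid


def canonical_bytes(value):
    # Approximation of JCS: sorted keys, compact separators, UTF-8.
    text = json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def sha256_hex(value):
    return hashlib.sha256(canonical_bytes(value)).hexdigest()


def build_payload(component_name, component_type, run_id, inputs, outputs,
                  sequence, previous_hash, mode="shadow"):
    # Only hashes of inputs and outputs are recorded, never the raw values.
    return {
        "component_name": component_name,
        "component_type": component_type,
        "pipeline_run_id": run_id,
        "input_hash": sha256_hex(inputs),
        "output_hash": sha256_hex(outputs),
        "mode": mode,
        "decision": "recorded",
        "sequence": sequence,
        "previousReceiptHash": previous_hash,
    }


run_id = str(uuid.uuid4())
docs = [{"content": "example document"}]

# The first receipt has no predecessor; the second links to the first.
first = build_payload("retriever", "DocumentReceiptComponent", run_id,
                      {"query": "q"}, docs, 0, None)
second = build_payload("ranker", "DocumentReceiptComponent", run_id,
                       docs, docs, 1, sha256_hex(first))
```

Because each payload carries the hash of its predecessor, editing or dropping an earlier receipt changes every hash that follows it.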
Raw component inputs and outputs are not stored by default. Only hashes are recorded.
The default sink. Appends each signed receipt envelope to a local JSONL file. Verification does not require any external service.
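To illustrate why an append-only JSONL file can be checked offline, here is a minimal sketch of writing chained receipts and re-walking the chain. `append_receipt`, `receipt_hash`, and `verify_chain` are hypothetical names, the payloads are reduced to the chain fields, and signature verification is left out; the real verifier is the haystack-receipts-verify command.

```python
import hashlib
import json
import os
import tempfile


def receipt_hash(payload):
    # Hash over a sorted-key, compact JSON encoding (stand-in for JCS).
    data = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def append_receipt(path, payload):
    # Append-only: one JSON object per line.
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(payload, sort_keys=True) + "\n")


def verify_chain(path):
    # True if sequence numbers are contiguous from 0 and each receipt
    # points at the hash of the receipt before it.
    previous = None
    with open(path, encoding="utf-8") as fh:
        for expected_seq, line in enumerate(fh):
            payload = json.loads(line)
            if payload["sequence"] != expected_seq:
                return False
            if payload["previousReceiptHash"] != previous:
                return False
            previous = receipt_hash(payload)
    return True


path = os.path.join(tempfile.mkdtemp(), "receipts.jsonl")
prev = None
for seq in range(3):
    payload = {"sequence": seq, "previousReceiptHash": prev, "output_hash": f"h{seq}"}
    append_receipt(path, payload)
    prev = receipt_hash(payload)

chain_ok = verify_chain(path)
```

The check reads only the local file, so it needs no network access or external service.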
Wraps any sink and emits in a background worker. Use this for external anchors so the pipeline is not blocked by network latency.
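The background-worker idea can be sketched with a queue and a thread. This is an illustration of the pattern, not the package's AsyncSink: `BackgroundSink`, `ListSink`, and the `emit`/`close` method names are assumptions made for the example.

```python
import queue
import threading


class ListSink:
    """Hypothetical in-memory sink used only for this example."""

    def __init__(self):
        self.items = []

    def emit(self, receipt):
        self.items.append(receipt)


class BackgroundSink:
    """Hypothetical wrapper: queues receipts and emits them on a worker thread."""

    def __init__(self, inner):
        self._inner = inner
        self._queue = queue.Queue()
        self.errors = []
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def emit(self, receipt):
        # Returns immediately; the caller never waits on the inner sink.
        self._queue.put(receipt)

    def _drain(self):
        while True:
            receipt = self._queue.get()
            try:
                if receipt is None:  # sentinel: stop the worker
                    return
                self._inner.emit(receipt)
            except Exception as exc:
                # Record failures instead of crashing the worker.
                self.errors.append(exc)
            finally:
                self._queue.task_done()

    def close(self):
        # Flush everything queued so far, then stop the worker.
        self._queue.put(None)
        self._worker.join()


inner = ListSink()
sink = BackgroundSink(inner)
for i in range(3):
    sink.emit({"sequence": i})
sink.close()
```

A single worker thread preserves emission order, which matters when the inner sink expects receipts in sequence.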
A minimal optional sink that POSTs receipt hashes to an HTTP endpoint. This can point at Mycelium Trails, Rekor, OpenTimestamps gateways, or any internal evidence anchor.
An optional Mycelium Trails adapter. It POSTs a compact hash commitment to a Mycelium-compatible /trails endpoint using the public TrailRecord shape:
- payment_hash is the receipt hash
- action_ref is SHA-256(agent_id:operation:scope:timestamp_seconds)
- claims.evidence_hash repeats the receipt hash for verifier lookup
- component metadata is included as claims, not raw inputs or outputs
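The action_ref formula above could be computed along these lines. This is a sketch under assumptions: the fields are joined with ":" and UTF-8 encoded, the digest is rendered as lowercase hex, and the `operation` and `scope` values shown are made up for illustration. The adapter's actual encoding may differ.

```python
import hashlib


def action_ref(agent_id, operation, scope, timestamp_seconds):
    # Assumption: colon-joined fields, UTF-8 bytes, lowercase hex digest.
    material = f"{agent_id}:{operation}:{scope}:{int(timestamp_seconds)}"
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


# Hypothetical operation and scope values.
ref = action_ref("my-rag-pipeline", "retrieve", "documents", 1700000000)
```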
Use it behind AsyncSink so external anchoring never blocks the pipeline in shadow mode:
from haystack_receipts import AsyncSink, JsonlReceiptSink, MyceliumAnchorSink
local = JsonlReceiptSink("receipts.jsonl")
anchor = AsyncSink(MyceliumAnchorSink(agent_id="my-rag-pipeline"))

The local JSONL receipt stream remains canonical. Mycelium anchoring is an optional upgrade for teams that want a timestamped external trust domain.
- shadow: signing/sink failures are recorded in the returned metadata but never block the pipeline.
- enforce: signing/sink failures raise ReceiptError. Use this only for high-risk components that must fail closed.
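The difference between the two modes can be sketched as a small wrapper around the emit step. `emit_with_mode` and the metadata keys are hypothetical, and the `ReceiptError` class below is a local stand-in for the package's exception, not an import of it.

```python
class ReceiptError(Exception):
    """Local stand-in for the package's ReceiptError."""


def emit_with_mode(emit, receipt, mode="shadow"):
    # `emit` is any callable that signs and writes the receipt.
    try:
        emit(receipt)
        return {"receipt_emitted": True, "receipt_error": None}
    except Exception as exc:
        if mode == "enforce":
            # Fail closed: surface the failure to the pipeline.
            raise ReceiptError(str(exc)) from exc
        # Shadow: record the failure and let the pipeline continue.
        return {"receipt_emitted": False, "receipt_error": str(exc)}


def broken_sink(receipt):
    # Simulates a sink failure.
    raise OSError("disk full")


shadow_meta = emit_with_mode(broken_sink, {"sequence": 0}, mode="shadow")

try:
    emit_with_mode(broken_sink, {"sequence": 0}, mode="enforce")
    enforce_raised = False
except ReceiptError:
    enforce_raised = True
```

In shadow mode the caller still gets its output plus a record of the failure; in enforce mode the exception stops the component.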
Run the local example without Haystack installed:
python examples/local_pipeline.py
python -m haystack_receipts.verify examples/out/receipts.jsonl --key haystack-demo=<printed-public-key>

This prototype is not a compliance certification. It provides evidence that can support audit logging and incident reconstruction. Whether it satisfies a legal obligation depends on deployment, retention, access controls, and applicable regulatory guidance.
Apache-2.0.