Open
Labels
- area:core (fd5 core library)
- audit-trail (Audit trail / provenance chain feature)
- epic (Parent issue tracking multiple sub-issues)
- priority:high (Should be done in the current milestone)
Description
Summary
Transform fd5's edit capability into a tamper-evident audit trail embedded inside the HDF5 file. Every attribute change becomes a logged, immutable entry — like git commits for HDF5 metadata — optionally tied to verified identity (ORCID, GitHub, email).
Design
Audit log storage
- Root attribute _fd5_audit_log: JSON array of commit entries
- Included in the Merkle tree hash (NOT in EXCLUDED_ATTRS) → tamper-evident automatically
- Each entry records parent_hash (the content_hash before the edit), NOT the new hash (avoids circular dependency)
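A minimal sketch of the storage layer described above, assuming the log is serialized as a JSON string under the root attribute. The helper names `read_audit_log` and `append_audit_entry` are hypothetical, and the functions work on any dict-like attribute mapping; h5py's `File.attrs` is expected to fit that interface, but this sketch has not been run against a real HDF5 file.

```python
import json
from typing import Any, MutableMapping

AUDIT_ATTR = "_fd5_audit_log"  # root attribute name from the design above


def read_audit_log(attrs: MutableMapping[str, Any]) -> list:
    """Return the commit entries stored in the root attribute, oldest first."""
    raw = attrs.get(AUDIT_ATTR)
    if raw is None:
        return []  # a file that has never been edited has no log yet
    if isinstance(raw, bytes):
        raw = raw.decode("utf-8")  # some HDF5 bindings return bytes
    entries = json.loads(raw)
    if not isinstance(entries, list):
        raise ValueError(f"{AUDIT_ATTR} must hold a JSON array")
    return entries


def append_audit_entry(attrs: MutableMapping[str, Any], entry: dict) -> None:
    """Append one commit entry; existing entries are never rewritten."""
    entries = read_audit_log(attrs)
    entries.append(entry)
    # Canonical form (sorted keys, no extra whitespace) so that the same
    # log always hashes to the same bytes. This is an assumption of the
    # sketch, not a documented fd5 requirement.
    attrs[AUDIT_ATTR] = json.dumps(entries, sort_keys=True, separators=(",", ":"))


root_attrs: dict = {}  # stand-in for the file's root attributes
append_audit_entry(root_attrs, {"parent_hash": "sha256:aa", "message": "first"})
append_audit_entry(root_attrs, {"parent_hash": "sha256:bb", "message": "second"})
```

Because the attribute is not listed in EXCLUDED_ATTRS, any later rewrite of this string would change the file's content hash, which is what makes the log tamper-evident.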
Commit entry schema
{
  "parent_hash": "sha256:abc...",
  "timestamp": "2026-03-02T14:30:00Z",
  "author": {
    "type": "orcid",
    "id": "0000-0002-1825-0097",
    "name": "Lars Gerchow"
  },
  "message": "Updated calibration factor",
  "changes": [
    {
      "action": "edit",
      "path": "/sensors/temperature",
      "attr": "calibration_factor",
      "old": "1.0",
      "new": "1.05"
    }
  ]
}
Hash chain
State S0 ──edit──▶ State S1 ──edit──▶ State S2
   H0                 H1                 H2 (= current content_hash)
- Entry N records parent_hash = H_{N-1}
- The new hash H_N is never stored in entry N; it is implicitly the next entry's parent_hash or, for the latest entry, the current content_hash
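The chaining rule can be illustrated with a toy model. In the sketch below, `toy_content_hash` is a stand-in (SHA-256 over canonical JSON of a flat attribute dict) and is not fd5's Merkle tree hash; `commit_edit` is likewise a hypothetical helper, and the `path`, `timestamp`, and `author` fields are omitted for brevity. It shows why an entry can record only the parent hash: the log is part of the hashed content, so the new hash does not exist until the entry has been written.

```python
import hashlib
import json

AUDIT_ATTR = "_fd5_audit_log"


def toy_content_hash(attrs: dict) -> str:
    """Stand-in for the real content hash: SHA-256 over canonical JSON."""
    blob = json.dumps(attrs, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return "sha256:" + hashlib.sha256(blob).hexdigest()


def commit_edit(attrs: dict, attr: str, new_value: str, message: str) -> str:
    """Apply one attribute edit, log it, and return the new content hash."""
    parent_hash = toy_content_hash(attrs)  # H_{N-1}, taken before any change
    entry = {
        "parent_hash": parent_hash,
        "message": message,
        "changes": [
            {"action": "edit", "attr": attr, "old": attrs.get(attr), "new": new_value}
        ],
    }
    attrs[attr] = new_value
    log = json.loads(attrs.get(AUDIT_ATTR, "[]"))
    log.append(entry)
    attrs[AUDIT_ATTR] = json.dumps(log, sort_keys=True)
    # H_N covers the log itself, so it could not have been stored in the entry.
    return toy_content_hash(attrs)


state = {"calibration_factor": "1.0"}
h0 = toy_content_hash(state)
h1 = commit_edit(state, "calibration_factor", "1.05", "Updated calibration factor")
h2 = commit_edit(state, "calibration_factor", "1.10", "Second adjustment")
log = json.loads(state[AUDIT_ATTR])
```

After the two edits, `log[0]["parent_hash"]` equals `h0`, `log[1]["parent_hash"]` equals `h1`, and `h2` is the current content hash that no entry records.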
Identity (~/.fd5/identity.toml)
[identity]
type = "orcid"
id = "0000-0002-1825-0097"
name = "Lars Gerchow"
Supported types: orcid, github, email, anonymous
Chain verification
Extend verify to validate audit chain integrity:
- Walk entries, check parent_hash continuity
- Final entry's implicit new hash = current content_hash
- Detect gaps, tampered entries, broken chains
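Since entries store only parent_hash, one way to check continuity is to replay the log backwards: undo each entry's changes, drop the entry from the log, and compare the recomputed hash with that entry's parent_hash. The sketch below takes this approach on a toy model; it is one interpretation of the checks listed above, not the planned implementation. `toy_content_hash`, `commit_edit`, and `verify_chain` are hypothetical names, and the hash is SHA-256 over canonical JSON of a flat dict rather than fd5's Merkle tree hash.

```python
import hashlib
import json

AUDIT_ATTR = "_fd5_audit_log"


def toy_content_hash(attrs: dict) -> str:
    """Stand-in for the real content hash: SHA-256 over canonical JSON."""
    blob = json.dumps(attrs, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return "sha256:" + hashlib.sha256(blob).hexdigest()


def commit_edit(attrs: dict, attr: str, new_value: str, message: str) -> str:
    """Toy writer: apply an edit, append a log entry, return the new hash."""
    entry = {
        "parent_hash": toy_content_hash(attrs),
        "message": message,
        "changes": [
            {"action": "edit", "attr": attr, "old": attrs.get(attr), "new": new_value}
        ],
    }
    attrs[attr] = new_value
    log = json.loads(attrs.get(AUDIT_ATTR, "[]"))
    log.append(entry)
    attrs[AUDIT_ATTR] = json.dumps(log, sort_keys=True)
    return toy_content_hash(attrs)


def verify_chain(attrs: dict, stored_content_hash: str) -> list:
    """Return a list of problems; an empty list means the chain is intact."""
    problems = []
    # The latest entry's implicit new hash must equal the current content_hash.
    if toy_content_hash(attrs) != stored_content_hash:
        problems.append("current content_hash mismatch")
    state = dict(attrs)  # work on a copy, never mutate the caller's data
    log = json.loads(state.get(AUDIT_ATTR, "[]"))
    for index in range(len(log) - 1, -1, -1):
        entry = log[index]
        for change in reversed(entry["changes"]):
            if state.get(change["attr"]) != change["new"]:
                problems.append(f"entry {index}: 'new' value does not match state")
            if change["old"] is None:
                state.pop(change["attr"], None)  # attribute did not exist before
            else:
                state[change["attr"]] = change["old"]
        remaining = log[:index]
        if remaining:
            state[AUDIT_ATTR] = json.dumps(remaining, sort_keys=True)
        else:
            state.pop(AUDIT_ATTR, None)  # before the first edit there was no log
        if toy_content_hash(state) != entry["parent_hash"]:
            problems.append(f"entry {index}: parent_hash mismatch")
    return problems


state = {"calibration_factor": "1.0"}
commit_edit(state, "calibration_factor", "1.05", "Updated calibration factor")
current = commit_edit(state, "calibration_factor", "1.10", "Second adjustment")
intact = verify_chain(state, current)

# Rewrite history in the first entry and recompute the stored hash to hide it.
tampered = dict(state)
forged_log = json.loads(tampered[AUDIT_ATTR])
forged_log[0]["message"] = "rewritten history"
tampered[AUDIT_ATTR] = json.dumps(forged_log, sort_keys=True)
detected = verify_chain(tampered, toy_content_hash(tampered))
```

In this toy run the forged first entry is caught even though the stored hash was recomputed, because the second entry's parent_hash was taken over a log that still contained the original first entry.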
Sub-issues
- Python: Audit log data model + read/write
- Python: Identity system (~/.fd5/identity.toml)
- Python: fd5 edit CLI command with audit logging
- Python: fd5 log CLI command
- Python: Chain verification in fd5 validate
- Rust: Audit log data model + read/write in fd5 crate
- Rust: Identity system
- Rust: Edit with audit logging in fd5 crate
- Rust: Chain verification in fd5 crate
- h5v: :log command to display audit history
- h5v: :edit with audit trail integration
- h5v: :identity command
Approach
- RED-GREEN TDD: write failing tests first, then implement
- Python and Rust tracks in parallel (same spec, independent implementations)
- h5v depends on Rust fd5 crate changes