Archive Utils Final PR #59
Open
adeshusa wants to merge 41 commits into
…he completion of method 1 and 2
…d outputs to correct types
added methods 1 and 2 as well as additional methods to convert vizfol…
Method 7
Methods 7 and 8
Merging divergent commits.
Refactoring method 8
rebranched outline.py into 3
Renamed archive + also turned it into an external module
new spec
…confirmed it works now
Fixed the old code, made bug changes, created new updated test file.
Created a base line demo that we can use for the video
Added more info into demo
Assessment of Goals and Implementation (Technical)
The implementation achieves the core objective: a modular archive subsystem that supports incremental writes, deterministic reads, and policy-driven validation on top of a stable VizFold 1.0 Zarr hierarchy.
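To make "policy-driven validation" concrete, here is a minimal sketch of a strict versus lenient check. It is not the PR's `validate_archive`: the function name `validate_archive_sketch`, the exception class, and the assumption that all four top-level groups named in this description (`metadata/`, `representations/`, `attention/`, `structure/`) are required are illustrative only, and the archive is modelled as a plain list of group names rather than a Zarr store.

```python
# Hypothetical sketch: the real validate_archive may check far more than this.
REQUIRED_GROUPS = ("metadata", "representations", "attention", "structure")


class ArchiveValidationError(Exception):
    """Raised in strict mode when at least one required group is missing."""


def validate_archive_sketch(present_groups, strict=True):
    """Return a list of problems; in strict mode, raise if the list is non-empty."""
    present = set(present_groups)
    problems = [
        f"missing group: {name}/"
        for name in REQUIRED_GROUPS
        if name not in present
    ]
    if problems and strict:
        raise ArchiveValidationError("; ".join(problems))
    return problems


# Lenient mode reports problems without failing, which suits partially
# written archives produced by incremental writes.
issues = validate_archive_sketch(["metadata", "attention"], strict=False)
```

Under these assumptions, the lenient policy lets a caller inspect an archive that is still being filled in, while the strict policy is the one to run before treating the archive as complete.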
Architecture Responsibility
The system is logically decoupled into three primary modules:
core.py: Defines normalization, addressing, and validation primitives.
store.py: Performs canonical, typed writes into the archive layout.
load.py: Handles deterministic reads and external ingestion (.pkl, text attention).
Goal Coverage
….txt, .pkl) into a canonical schema using tensor_to_numpy.
…metadata/, representations/, attention/, and structure/.
End-to-End Flow
cli.py, demo.py, or API calls ingest/store methods.
Result: A robust contract providing many input forms, one canonical representation, and deterministic read semantics.
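The "many input forms, one canonical representation, deterministic reads" contract can be sketched with the standard library alone. Everything here is an assumption made for illustration: `to_canonical` stands in for `tensor_to_numpy` and produces nested lists of floats instead of NumPy arrays, `InMemoryArchive` stands in for the Zarr store, and the path pattern `representations/single/layer_NN` is hypothetical.

```python
def to_canonical(value):
    """Normalize a tensor-like value to nested lists of floats.

    Stand-in for tensor_to_numpy: objects exposing tolist() (as NumPy arrays
    and torch tensors do) are converted first, then sequences are walked.
    """
    if hasattr(value, "tolist"):
        value = value.tolist()
    if isinstance(value, (list, tuple)):
        return [to_canonical(item) for item in value]
    return float(value)


class InMemoryArchive:
    """Dictionary-backed stand-in for the archive; paths are hypothetical."""

    def __init__(self):
        self._data = {}

    @staticmethod
    def _single_path(layer_index):
        # Reject bad layer indices before any mutation takes place.
        if isinstance(layer_index, bool) or not isinstance(layer_index, int):
            raise ValueError("layer_index must be an int")
        if layer_index < 0:
            raise ValueError("layer_index must be non-negative")
        return f"representations/single/layer_{layer_index:02d}"

    def store_single(self, layer_index, value):
        self._data[self._single_path(layer_index)] = to_canonical(value)

    def load_single(self, layer_index):
        return self._data[self._single_path(layer_index)]


archive = InMemoryArchive()
archive.store_single(0, ((1, 2), (3, 4)))  # tuples of ints as input
layer = archive.load_single(0)             # [[1.0, 2.0], [3.0, 4.0]]
```

Whatever form the caller supplies, the stored value has one shape and one element type, so a read for a given layer index returns the same result every time.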
Method Intent (One-liners)
Core & Storage
tensor_to_numpy: Normalize tensor-like objects to NumPy before persistence.
tensor_to_zarr_array: Write normalized arrays to canonical Zarr dataset paths.
_validate_layer_index: Enforce valid layer addressing prior to mutation.
validate_archive: Perform strict or lenient archive integrity checks.
store_metadata: Persist run/config provenance under metadata/.
store_single_representation: Write one single-representation layer.
store_pair_representation: Write one pair-representation layer.
store_attention: Write one attention tensor for {attention_type, layer_index}.
store_structure_coordinates: Write structure outputs (atom_positions, atom_mask, ptm).
Ingestion & Loading
ingest_attention_txt: Parse text attention artifacts into canonical tensors.
_extract_best_matching_array: Score and select best candidate arrays from pickle payloads.
ingest_output_pkl: Ingest pickle outputs with key-match traceability metadata.
load_metadata: Read metadata as Python-native values.
load_single_representation: Read one single layer deterministically.
load_pair_representation: Read one pair layer deterministically.
load_attention_head: Read one attention head slice for analysis.
ArchiveOrchestrator: Coordinate staged ingest/store/validate operations and summarize state.
Technical Quality Notes
…key_matches.
Summary: The implementation is functionally complete for the stated goals and architecturally robust for future extension.
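As an illustration of the key-match scoring and traceability described for `_extract_best_matching_array` and `ingest_output_pkl`, the sketch below scores the keys of a pickle-like payload against a target name and records which key was chosen. The scoring rules, the function names `score_key` and `extract_best_match`, and the shape of the returned record are assumptions; the PR's actual heuristics and its `key_matches` format are not shown in this description.

```python
def score_key(key, target):
    """Score how well a payload key matches a target name (higher is better)."""
    key_l, target_l = key.lower(), target.lower()
    if key_l == target_l:
        return 3  # exact match
    if target_l in key_l or key_l in target_l:
        return 2  # one name contains the other
    key_tokens = set(key_l.replace("-", "_").split("_"))
    target_tokens = set(target_l.replace("-", "_").split("_"))
    return 1 if key_tokens & target_tokens else 0  # shared word, else no match


def extract_best_match(payload, target):
    """Return (value, record); value is None when no key scores above zero."""
    # Sort by descending score, then by key name, so ties resolve deterministically.
    scored = sorted(
        ((score_key(key, target), key) for key in payload),
        key=lambda pair: (-pair[0], pair[1]),
    )
    record = {
        "target": target,
        "candidates": [{"key": k, "score": s} for s, k in scored if s > 0],
        "selected": None,
    }
    if not record["candidates"]:
        return None, record
    record["selected"] = record["candidates"][0]["key"]
    return payload[record["selected"]], record


payload = {"single": [1], "pair_repr": [2], "msa": [3]}
value, record = extract_best_match(payload, "pair")
# value is [2] and record["selected"] is "pair_repr"
```

Keeping the full candidate list alongside the selected key is what makes the ingestion traceable: a reviewer can see why a given array was chosen and which alternatives were considered.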