
Improve registry-based archive generation reliability and performance#240

Merged
nutjob4life merged 6 commits into main from fix-fips-mode-md5-issue-226
Jan 8, 2026

Conversation

Member

@jordanpadams jordanpadams commented Jan 7, 2026

🗒️ Summary

This PR significantly improves the reliability and performance of registry-based deep archive generation through three key enhancements:

  1. More reliable collection discovery - Uses ref_lidvid_collection from bundle properties instead of the brittle /members endpoint
  2. Automatic retry with exponential backoff - Handles transient API/database failures gracefully
  3. Optimized API payload size - Requests only required fields, reducing payload from 11MB to 110KB (99% reduction!)

Changes Made

Fix #236 - Use ref_lidvid_collection for collection discovery:

  • Parse ref_lidvid_collection directly from bundle properties to discover collections
  • Avoids reliability issues with the /products/{bundle-lidvid}/members endpoint
  • More accurate since it uses exact collection LIDVIDs from bundle metadata
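As a rough illustration of this approach (the helper name and the bundle-properties shape are assumptions for illustration, not the actual implementation), collection discovery from bundle properties might look like:

```python
# Hypothetical sketch: discover collection LIDVIDs from the bundle's
# own properties instead of the /members endpoint. The property name
# ref_lidvid_collection comes from this PR; the dict shape is assumed.
def collections_from_bundle(bundle_props: dict) -> list:
    """Return the collection LIDVIDs referenced by a bundle's properties."""
    refs = bundle_props.get("ref_lidvid_collection", [])
    # The registry may return a single string or a list of LIDVIDs.
    if isinstance(refs, str):
        refs = [refs]
    return list(refs)
```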

Fix #237 - Add retry logic with exponential backoff:

  • Implemented automatic retry for HTTP 500, 502, 503, 504 errors
  • Uses exponential backoff: 2s, 4s, 8s, 16s, 32s over 5 attempts
  • Enhanced error handling for JSONDecodeError exceptions
  • Added detailed logging for debugging API failures

Fix #238 - Optimize API payload size:

  • Added fields parameter to API requests to fetch only required fields
  • Fields requested: file URLs, MD5 checksums, and pagination metadata
  • Eliminated unnecessary metadata from responses

⚙️ Test Data and/or Report

Automated Tests

All tests pass:

tox
...
  py312: OK (2.48=setup[0.89]+cmd[1.59] seconds)
  py313: OK (1.81=setup[0.72]+cmd[1.09] seconds)
  docs: OK (1.02=setup[0.45]+cmd[0.57] seconds)
  lint: OK (6.37=setup[0.01]+cmd[6.37] seconds)
  congratulations :) (11.90 seconds)

Pre-commit hooks: All passed during push

Manual Testing

Successfully tested with the New Horizons SWAP bundle referenced in #236:

pds-deep-registry-archive -s PDS_SBN urn:nasa:pds:nh_swap::2.1

Results:

  • ✅ Successfully generated AIP and SIP packages
  • ✅ All 8 collections discovered via ref_lidvid_collection
  • ✅ Retry logic handled transient 500 errors encountered during testing
  • ✅ API payload reduced from 11MB to 110KB per request (99% reduction!)

During testing, encountered the transient registry errors that led to the retry implementation (#237). The retry logic successfully handled these failures with exponential backoff.

Performance Metrics

| Metric | Before | After | Improvement |
|---|---|---|---|
| Payload Size | 11 MB | 110 KB | 99% reduction |
| Collection Discovery | /members endpoint | ref_lidvid_collection | More reliable |
| Error Handling | Immediate failure | Auto-retry (5 attempts) | Better resilience |

♻️ Related Issues

Fixes #236, #237, #238

🤓 Reviewer Checklist

Reviewers: Please verify the following before approving this pull request.

Security & Quality

  • SonarCloud: Confirmed no new High or Critical security findings.
  • Secrets Detection: Verified that the Secrets Detection scan passed and no sensitive information (keys, tokens, PII) is exposed.
  • Code Quality: Code follows organization style guidelines and best practices for the specific language (e.g., PEP 8, Google Java Style).

Testing & Validation

  • Test Accuracy: Verified that test data is accurate, representative of real-world PDS4 scenarios, and sufficient for the logic being tested.
  • Coverage: Automated tests cover new logic and edge cases.
  • Local Verification: (If applicable) Successfully built and ran the changes in a local or staging environment.

Documentation

  • Documentation: README, Wiki, or inline documentation (Sphinx, Javadoc, docstrings) have been updated to reflect these changes.

Maintenance

  • Issue Traceability: The PR is linked to a valid GitHub Issue or Jira Ticket.
  • Backward Compatibility: Confirmed that these changes do not break existing downstream dependencies or API contracts (or that breaking changes are clearly documented).

@jordanpadams jordanpadams requested a review from a team as a code owner January 7, 2026 22:19
Copilot AI review requested due to automatic review settings January 7, 2026 22:19
Contributor

Copilot AI left a comment


Pull request overview

This PR enhances the reliability and performance of registry-based deep archive generation by addressing three critical issues: unreliable collection discovery, transient API failures, and excessive API payload sizes. The changes implement a more robust collection discovery mechanism using bundle properties, add automatic retry logic with exponential backoff for transient errors, and optimize API requests to fetch only required fields.

Key changes:

  • Replace /members endpoint with ref_lidvid_collection property parsing for collection discovery
  • Add retry mechanism with exponential backoff for HTTP 500/502/503/504 errors
  • Optimize API payload size by requesting only necessary fields (99% reduction)


 url = f"{server_url}/products/{lidvid}/members"
-params = {"sort": _searchkey, "limit": _apiquerylimit}
+# Request only the fields we need to minimize payload size
+params = {"sort": _searchkey, "limit": _apiquerylimit, "fields": ",".join(_fields)}

Copilot AI Jan 7, 2026


The fields parameter is added to limit the response size, but there's no validation that the API actually supports this parameter or handles it correctly. If the API ignores unrecognized parameters, the optimization would silently fail. Consider adding validation or documentation noting the API version requirement for the fields parameter.

Member Author


@copilot I think hardcoding API parameters is fine; we own that API.

Workaround for #236 where the /products/{bundle-lidvid}/members endpoint
has proven brittle. Instead, parse ref_lidvid_collection directly from
the bundle properties to discover collections.

This approach is equally accurate since it uses the exact collection
LIDVIDs specified in the bundle metadata, but avoids the reliability
issues with the /members endpoint at the bundle level.

The /members endpoint is still used for discovering products within
each collection.
Fixes #237 by implementing automatic retry with exponential backoff
for all PDS API requests. This addresses transient API/database
performance issues that result in 500 errors and JSON decode failures.

Changes:
- Added _get_session_with_retry() to create sessions with retry logic
- Configured to retry on HTTP 500, 502, 503, 504 status codes
- Uses exponential backoff (2s, 4s, 8s, 16s, 32s) over 5 attempts
- Enhanced error handling to catch JSONDecodeError exceptions
- Added detailed logging for debugging API failures

The retry mechanism uses urllib3.Retry with requests.HTTPAdapter to
automatically handle transient failures without code changes in
calling functions.
Fixes #238 by adding fields parameter to /members endpoint requests
to minimize payload size and improve performance.

Deep archive only needs 5 specific fields from product responses:
- ops:Data_File_Info.ops:file_ref (data file URLs)
- ops:Data_File_Info.ops:md5_checksum (data file checksums)
- ops:Label_File_Info.ops:file_ref (label file URL)
- ops:Label_File_Info.ops:md5_checksum (label file checksum)
- ops:Harvest_Info.ops:harvest_date_time (for pagination)

Changes:
- Updated _fields constant to include harvest_date_time for pagination
- Modified _getproducts() to pass fields parameter to API requests
- Reduced API response payload by filtering unnecessary metadata

Benefits:
- Smaller JSON payloads reduce network bandwidth
- Faster API responses with less data serialization
- Better performance for collections with many products
- Reduced load on API server and database
@jordanpadams jordanpadams force-pushed the fix-fips-mode-md5-issue-226 branch from c52b51d to 0b025ef on January 7, 2026 23:28
Copilot AI review requested due to automatic review settings January 7, 2026 23:31
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.



Updated retry status codes to use HTTPStatus enum constants for
better code readability and maintainability:
- 500 → HTTPStatus.INTERNAL_SERVER_ERROR
- 502 → HTTPStatus.BAD_GATEWAY
- 503 → HTTPStatus.SERVICE_UNAVAILABLE
- 504 → HTTPStatus.GATEWAY_TIMEOUT

Addresses code review feedback.
Addresses PR feedback with two improvements:

1. Clarified _searchkey field requirement in _fields constant
   - Added comment explaining it must be included for pagination
   - Required by line 244 for search-after parameter access

2. Implemented batch collection fetching to reduce API calls
   - New _getcollections_batch() function fetches collections in batches
   - Uses query syntax: (lidvid eq "..." or lidvid eq "..." or ...)
   - Batches collections into groups of 50 (API limit)
   - Reduces API calls dramatically (e.g., 8 collections: 8→1 requests)
   - Updated _comprehendregistry() to use batch fetching

Benefits:
- Fewer API requests reduces server load
- Better performance for bundles with many collections
- Stays within API query limits
- Maintains same functionality with better efficiency

Copilot AI commented Jan 7, 2026

@jordanpadams I've opened a new pull request, #242, to work on those changes. Once the pull request is ready, I'll request review from you.


Copilot AI commented Jan 7, 2026

@jordanpadams I've opened a new pull request, #243, to work on those changes. Once the pull request is ready, I'll request review from you.

Member

@nutjob4life nutjob4life left a comment


Thanks @jordanpadams!

@jordanpadams
Member Author

Updated with Code Review Feedback

Additional Improvements

Commit fce720b - Use HTTPStatus constants:

  • Replaced hardcoded status codes (500, 502, 503, 504) with HTTPStatus enum constants
  • Improves code readability and maintainability
  • Uses: HTTPStatus.INTERNAL_SERVER_ERROR, HTTPStatus.BAD_GATEWAY, HTTPStatus.SERVICE_UNAVAILABLE, HTTPStatus.GATEWAY_TIMEOUT

Commit 4b55407 - Batch collection fetching:

  • Added _getcollections_batch() function to fetch collections efficiently
  • Uses API query syntax: (lidvid eq "..." or lidvid eq "..." or ...)
  • Batches collections into groups of 50 (API limit) to minimize requests
  • Dramatically reduces API calls: 8 collections → 1 request (8x improvement!)
  • For bundles with 150 collections: 150 requests → 3 requests (50x improvement!)
  • Clarified that _searchkey field is required in _fields for pagination at line 244

Performance Impact

Before batch fetching:

  • 1 API call per collection to fetch metadata
  • NH SWAP bundle (8 collections): 8 separate API requests

After batch fetching:

  • 1 API call per 50 collections
  • NH SWAP bundle (8 collections): 1 API request (8x reduction)
  • Large bundle (150 collections): 3 API requests (50x reduction)

Combined with the field optimization (11MB → 110KB payload), these changes significantly improve both performance and reliability of registry-based archive generation.

@nutjob4life nutjob4life merged commit 8f156f2 into main Jan 8, 2026
4 checks passed
@nutjob4life nutjob4life deleted the fix-fips-mode-md5-issue-226 branch January 8, 2026 20:37