This is a pilot program exploring how to archive URL references found in CVE Records.
Note: This repository is in early development and is subject to change.
We are transitioning from Phase 1 to Phase 2 and would appreciate feedback.
- Prepare the live environment.
- Pilot in the live environment (target: ~2025-07-10).
- Refactor and simplify code (currently fragmented across iterations).
- Report findings to AWG/QWG.
- Authentication: This project runs in an isolated environment. API access is available only to trusted users, over SSH.
- Process & Controls: Archiving is manual and initiated by trusted users. Deletion or modification of archived assets is not currently supported.
The archiver includes:
- Two Node.js services:
  - scheduler: Fastify-based HTTP API that queues archive jobs.
  - engine: Executes archive jobs and manages asset relocation.
- Infrastructure components:
  - Amazon S3: Stores and delivers archived resources.
  - PostgreSQL: Tracks jobs, assets, and source domains.
- Development stack:
  - Visual Studio Code with DevContainers.
  - Docker Desktop for local environments, including:
    - MinIO (S3-compatible object storage).
    - PostgreSQL database.
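To make the division of labor between the two services concrete, the sketch below models it with an in-memory queue. It is illustrative only: the real services rely on PostgreSQL and S3, and the names `JobQueue`, `enqueue`, `claimNext`, and `complete`, as well as the job states, are assumptions rather than the project's actual schema or API.

```javascript
// Hypothetical in-memory stand-in for the job table (not the project's schema).
class JobQueue {
  constructor() {
    this.jobs = [];
    this.nextId = 1;
  }

  // Scheduler side: record a new archive job for a CVE and return it.
  enqueue(cve) {
    const job = { id: this.nextId++, cve, status: 'queued', assets: [] };
    this.jobs.push(job);
    return job;
  }

  // Engine side: take the oldest queued job, or return null if none is waiting.
  claimNext() {
    const job = this.jobs.find((j) => j.status === 'queued');
    if (!job) return null;
    job.status = 'running';
    return job;
  }

  // Engine side: record each archived URL with its source domain, then finish the job.
  complete(job, archivedUrls) {
    job.assets = archivedUrls.map((url) => ({
      url,
      domain: new URL(url).hostname,
    }));
    job.status = 'done';
    return job;
  }
}
```

In this model the scheduler only ever calls `enqueue`, while the engine loops over `claimNext` and `complete`, which mirrors the separation between queuing and execution described above.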
- General documentation: docs/
- Rationale and background: docs/rationale.md
- Set up a local, isolated foundation (database, S3).
- Use ArchiveBox to generate archives and metadata.
- Upload to a public S3 bucket.
- Establish basic workflows for submitting CVEs and reviewing operations.
- Deploy to a shared but isolated environment.
- Provide access to stakeholders as needed (no public access).
- Simulate job submissions over time with test plans.
- Evaluate results and iterate.
Reserved.
Contributions welcome!
See docs/overview.md to get started.
Development is containerized via DevContainers to ensure a consistent environment. Recommended setup:
- VSCode with the Dev Containers extension
- Docker Desktop
- Clone and open in VSCode.
- Open the Command Palette (Ctrl/Cmd + Shift + P) and run:
  > Dev Containers: Rebuild and Reopen in Container
- This will reopen the project inside the container environment.
- Let the Configuring... terminal run; it watches and rebuilds on changes.
- Use the integrated terminal:
  - Run both services: npm run dev
  - Run individually: npm run dev:scheduler or npm run dev:engine
Once running:
- Submit a job:
  curl --location 'http://localhost:8001/api/v1/jobs' --header 'Content-Type: application/json' --data '{ "cve": "CVE-2025-24070" }'
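The same request can also be issued from Node.js. The following is a minimal sketch, assuming Node.js 18 or later (for the global `fetch`) and the local endpoint shown above. The helper names `buildJobRequest` and `submitJob` and the CVE ID check are illustrative and not part of the project. The response format is not documented here, so the sketch returns the raw response body without interpreting it.

```javascript
// Endpoint taken from the curl example; adjust if your local port differs.
const JOBS_URL = 'http://localhost:8001/api/v1/jobs';

// CVE IDs have the form CVE-<4-digit year>-<4 or more digits>.
const CVE_ID_PATTERN = /^CVE-\d{4}-\d{4,}$/;

// Hypothetical helper: builds the POST request that the curl command sends.
function buildJobRequest(cve) {
  if (!CVE_ID_PATTERN.test(cve)) {
    throw new Error(`Not a valid CVE ID: ${cve}`);
  }
  return {
    url: JOBS_URL,
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ cve }),
    },
  };
}

// Hypothetical helper: sends the request and returns the raw response body.
async function submitJob(cve) {
  const { url, options } = buildJobRequest(cve);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Job submission failed with HTTP ${response.status}`);
  }
  return response.text();
}
```

With the services running, `submitJob('CVE-2025-24070').then(console.log)` would print whatever the scheduler returns. Validating the ID before sending is a design choice of this sketch, meant to catch typos locally instead of relying on the API to reject them.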