feat: add evaluation evidence manifest#891

Closed
luochen211 wants to merge 2 commits into santifer:main from luochen211:codex/evaluation-evidence-manifest

luochen211 (Contributor) commented Jun 10, 2026

Summary

  • Add evidence-manifest.mjs to validate lightweight reports/*.evidence.json files.
  • Treat missing manifests for legacy reports as warning-only while failing invalid manifest JSON/schema.
  • Document the manifest schema and update auto-pipeline/pipeline guidance to write manifests beside reports.
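
The warning-versus-failure split described above could look roughly like the sketch below. This is not the code from evidence-manifest.mjs; `classifyManifest` and `exitCodeFor` are hypothetical names, and the exit-code rule (non-zero only on errors) is an assumption, not something stated in this PR.

```javascript
// Hypothetical helper, not taken from evidence-manifest.mjs.
// rawText is the manifest file's contents, or null when no manifest exists.
function classifyManifest(rawText) {
  if (rawText === null) {
    // Legacy reports without a manifest only warn.
    return { status: "warning", reason: "manifest missing (legacy report)" };
  }
  let parsed;
  try {
    parsed = JSON.parse(rawText);
  } catch (err) {
    return { status: "error", reason: `invalid JSON: ${err.message}` };
  }
  if (parsed === null || typeof parsed !== "object" || Array.isArray(parsed)) {
    return { status: "error", reason: "manifest must be a JSON object" };
  }
  return { status: "ok", reason: "manifest parsed" };
}

// Assumption: the process fails only when at least one report is in error.
function exitCodeFor(results) {
  return results.some((r) => r.status === "error") ? 1 : 0;
}
```

Under these assumptions a repository full of legacy reports still exits with 0, while a single malformed manifest fails the run.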

Fixes #883

Tests

  • node --check evidence-manifest.mjs && node evidence-manifest.mjs --self-test
  • node evidence-manifest.mjs
  • node --check test-all.mjs && node test-all.mjs --quick — 167 passed, 0 failed, 7 existing README.ua personal-data warnings

Summary by CodeRabbit

  • New Features

    • Introduced evidence manifest validation command (npm run evidence) to verify lightweight JSON metadata files stored alongside evaluation reports, ensuring required fields are present and data integrity is maintained.
  • Documentation

    • Added comprehensive guide for evidence manifest validation, detailing field requirements and validation behavior for different report scenarios.

coderabbitai Bot commented Jun 10, 2026


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 62e5a6b1-6e97-41cd-a194-1e5c6fd83467

📥 Commits

Reviewing files that changed from the base of the PR and between cd8b068 and 2e7c3b1.

📒 Files selected for processing (2)
  • modes/auto-pipeline.md
  • modes/pipeline.md
📝 Walkthrough

This PR introduces a lightweight evidence manifest system for evaluation reports. A new Node.js CLI module validates JSON metadata files (stored alongside markdown reports) containing report identifiers, source information, fetch timestamps, liveness results, and JD hashes. The validation logic is documented, integrated into the npm script system, and included in pre-merge tests; pipeline modes are updated to specify where manifests should be saved.
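
As a rough illustration of "stored alongside markdown reports", a manifest path might be derived from a report path as sketched here. The mapping from `<name>.md` to `<name>.evidence.json` is an assumption based on the `reports/*.evidence.json` pattern in the summary, and `manifestPathFor` is a hypothetical name.

```javascript
// Assumed convention: reports/<name>.md -> reports/<name>.evidence.json
// (hypothetical helper; the real module may derive the path differently).
function manifestPathFor(reportPath) {
  const suffix = ".md";
  if (!reportPath.endsWith(suffix)) {
    throw new Error(`expected a .md report path, got: ${reportPath}`);
  }
  // Swap only the trailing extension so the manifest sits beside the report.
  return reportPath.slice(0, -suffix.length) + ".evidence.json";
}
```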

Changes

Evidence Manifest Validation and Integration

| Layer / File(s) | Summary |
| --- | --- |
| Evidence manifest validation module: evidence-manifest.mjs | Validates evidence manifest JSON files with schema enforcement for required fields (report_number, company, role, source, jd_text_hash, report_path), enumerated values (source_path, liveness_result), ISO-compatible timestamps, and pdf_path as string \| null. Discovers reports by pattern, checks for corresponding .evidence.json files, computes per-report status (error, warning, or ok), and includes self-test mode that fixtures a temp directory with valid and missing manifests. |
| Documentation and npm script integration: docs/SCRIPTS.md, package.json, test-all.mjs | Adds npm run evidence to the Quick Reference and documents manifest filename convention, required fields, warning behavior for legacy/missing manifests, self-test invocation, and exit codes. Registers the script in package.json and integrates evidence-manifest self-test into the pre-merge test suite. |
| Pipeline manifest generation guidance: modes/auto-pipeline.md, modes/pipeline.md | Updates auto-pipeline and pipeline mode documentation to specify that .evidence.json manifests must be saved alongside generated reports, including report metadata, source/liveness details, JD hash, and output file paths (or null when PDF is not produced). |
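
The field checks summarized in the table could be approximated as follows. The field names come from the summary above; everything else is an assumption for illustration: the `validateManifest` name, treating most required fields as non-empty strings, accepting any non-empty report_number, and passing the allowed enum values in from the caller (the full value lists are not given in this thread).

```javascript
// Sketch only; field names are from the review summary, type rules are assumed.
const REQUIRED_STRING_FIELDS = ["company", "role", "source", "jd_text_hash", "report_path"];

// enums maps a field name to its allowed values, e.g.
// { liveness_result: ["not_applicable", "unverified"] } (partial, illustrative).
function validateManifest(manifest, enums = {}) {
  const errors = [];
  const num = manifest.report_number;
  if (num === undefined || num === null || num === "") {
    errors.push("report_number is required");
  }
  for (const field of REQUIRED_STRING_FIELDS) {
    if (typeof manifest[field] !== "string" || manifest[field].trim() === "") {
      errors.push(`${field} must be a non-empty string`);
    }
  }
  // Date.parse is lenient, so this only rejects values that cannot be parsed at all.
  if (typeof manifest.fetched_at !== "string" || Number.isNaN(Date.parse(manifest.fetched_at))) {
    errors.push("fetched_at must be an ISO-compatible timestamp");
  }
  if (manifest.pdf_path !== null && typeof manifest.pdf_path !== "string") {
    errors.push("pdf_path must be a string or null");
  }
  for (const [field, allowed] of Object.entries(enums)) {
    if (!allowed.includes(manifest[field])) {
      errors.push(`${field} must be one of: ${allowed.join(", ")}`);
    }
  }
  return errors;
}
```

Returning a list of messages rather than throwing on the first problem lets a caller report every issue in a manifest in one pass.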

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested labels

🔧 scripts

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The pull request title 'feat: add evaluation evidence manifest' clearly and concisely summarizes the main change: adding an evidence manifest feature for evaluations. |
| Linked Issues check | ✅ Passed | The pull request comprehensively implements all coding-related objectives from issue #883: defines manifest schema with required fields, updates auto-pipeline and pipeline documentation to write manifests, adds validation logic with appropriate warning/error handling, and includes self-test coverage in test-all.mjs. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to the evidence manifest feature: new validation module, documentation updates, npm script addition, and test integration. No unrelated modifications detected. |


coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@modes/auto-pipeline.md`:
- Around line 28-32: Clarify the liveness_result semantics: update the guidance
to state that evidence-manifest.mjs validation expects liveness_result =
"not_applicable" for local file sources (those whose original URL/value starts
with "local:") and liveness_result = "unverified" for non-local inputs where
liveness checking was skipped; reference the validation logic in
evidence-manifest.mjs (lines ~50-52) and the manifest filename pattern in
modes/auto-pipeline.md when making the wording change.

In `@modes/pipeline.md`:
- Around line 13-14: The pipeline docs currently list only a subset of manifest
fields; update modes/pipeline.md to enumerate all required manifest properties
as defined in the manifest schema (evidence-manifest.mjs) so generated manifests
are valid: include report_number, company, role, source, fetched_at in addition
to source_path, liveness_result, jd_hash, report_path and pdf_path/null; mirror
the field names and expected formats used in evidence-manifest.mjs and align
wording with modes/auto-pipeline.md to avoid divergence.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 64165706-58f8-49a8-84b8-b3157456babd

📥 Commits

Reviewing files that changed from the base of the PR and between 214f5f8 and cd8b068.

📒 Files selected for processing (6)
  • docs/SCRIPTS.md
  • evidence-manifest.mjs
  • modes/auto-pipeline.md
  • modes/pipeline.md
  • package.json
  • test-all.mjs

Comment thread modes/auto-pipeline.md Outdated
Comment thread modes/pipeline.md Outdated
luochen211 (Contributor, Author) commented:

Fixed the valid CodeRabbit documentation feedback for evidence manifests. Changes:

  • Clarified liveness_result: not_applicable for local:jds/... sources, unverified when URL/manual liveness checking was skipped.
  • Expanded pipeline mode guidance to list all manifest fields required by evidence-manifest.mjs: report number, company, role, source/original URL, fetched timestamp, source path, liveness result, JD hash, report path, and PDF path/null.
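
The clarified liveness_result rule for the case where no liveness check ran can be written as a small function. This is a sketch of the rule as worded in this thread, not the validator's code; `skippedLivenessResult` is a hypothetical name, and the example source values are made up.

```javascript
// Value to record when liveness checking did not run (hypothetical helper).
// Sources starting with "local:" have nothing to check; other sources
// (URL or manual input) were simply left unverified.
function skippedLivenessResult(source) {
  return source.startsWith("local:") ? "not_applicable" : "unverified";
}

// Illustrative inputs only.
console.log(skippedLivenessResult("local:jds/acme.md"));
console.log(skippedLivenessResult("https://example.com/jobs/123"));
```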

Validation: node --check test-all.mjs; node test-all.mjs --quick (167 passed, 0 failed, existing README.ua warnings only).

santifer (Owner) commented:

Closing under the acceptance criterion explained in full on #890: core takes what the candidate uses; project-artifact tooling lives outside the core. For this one specifically: evidence manifests have no consumer in any current flow — schema-first artifacts should be designed against a real consumer when one exists. The companion-repo door from #890 applies to this one too. 🙏

santifer closed this Jun 11, 2026