feat: add evaluation report contract verifier #927
📝 Walkthrough
A new report validation CLI script,
Changes
Report Validation Script with Self-Test Integration
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~12 minutes
Possibly related issues
Suggested labels
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@verify-reports.mjs`:
- Around line 56-57: The Score parsing regex in verify-reports.mjs (the const
score assignment) only allows zero or one decimal place and thus rejects values
like 4.25/5; update the regex used in the score extraction (the expression
assigned to score) to allow up to two decimal places (e.g., make the fractional
part \.\d{1,2} optional) while keeping the existing alternatives (N/A, SKIP,
DUP) and case-insensitivity so reports like 4.25/5 are accepted.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 17785419-9a17-451e-967d-6dc8b791ed6f
📒 Files selected for processing (4)
docs/SCRIPTS.md
package.json
test-all.mjs
verify-reports.mjs
const score = content.match(/\*\*Score:\*\*\s*(\d(?:\.\d)?\/5|N\/A|SKIP|DUP)/i)?.[1];
if (!score) errors.push('missing or invalid **Score:** header');
🧹 Nitpick | 🔵 Trivial
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for scores with two decimal places in reports/
rg -n '\*\*Score:\*\*\s*\d\.\d{2}/5' --glob 'reports/*.md' -C1
Repository: santifer/career-ops
Length of output: 212
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# 1) Locate verify-reports.mjs to confirm directory structure / validation code context
echo "== verify-reports.mjs locations =="
rg -n --files-with-matches "verify-reports\.mjs" . 2>/dev/null || true
# fallback: find file directly
echo "== find verify-reports.mjs =="
fd -a "verify-reports.mjs" . 2>/dev/null || true
# 2) Locate "reports" directory/files
echo "== find reports/ directories =="
fd -t d -a "reports" . 2>/dev/null || true
echo "== list markdown under any reports-like directories (depth<=3) =="
fd -a -t f "*.md" . | rg -n "(/|^)reports(/|$)" || true
# 3) Search for Score headers with decimal formats in all markdown (limit to likely files)
echo "== Search for **Score:** with two decimals in any md/html/js templates =="
rg -n --glob '**/*.md' --glob '**/*.html' --glob '**/*.mjs' \
'\*\*Score:\*\*\s*\d+\.\d{2}/5' -S . || true
echo "== Search for **Score:** with 1 decimal =="
rg -n --glob '**/*.md' --glob '**/*.html' --glob '**/*.mjs' \
'\*\*Score:\*\*\s*\d+\.\d/5' -S . || true
echo "== Search for **Score:** with no decimals (e.g., 4/5) =="
rg -n --glob '**/*.md' --glob '**/*.html' --glob '**/*.mjs' \
'\*\*Score:\*\*\s*\d/5' -S . || true
Repository: santifer/career-ops
Length of output: 986
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== verify-reports.mjs regex area (lines ~50-80) =="
nl -ba verify-reports.mjs | sed -n '40,90p'
echo "== Search in reports/ for two-decimal scores like 4.25/5 =="
rg -n '\*\*Score:\*\*\s*\d+\.\d{2}/5' reports -S || true
echo "== Search in reports/ for one-decimal scores like 4.2/5 =="
rg -n '\*\*Score:\*\*\s*\d+\.\d/5' reports -S || true
echo "== Search in reports/ for integer scores like 4/5 =="
rg -n '\*\*Score:\*\*\s*\d/5' reports -S || true
Repository: santifer/career-ops
Length of output: 175
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== verify-reports.mjs total lines =="
wc -l verify-reports.mjs
echo "== verify-reports.mjs lines 45-75 =="
awk 'NR>=45 && NR<=75 {printf "%d:%s\n", NR, $0}' verify-reports.mjs
echo "== Search in reports/ for two-decimal scores like 4.25/5 =="
rg -n '\*\*Score:\*\*\s*\d+\.\d{2}/5' reports -S || true
echo "== Search in reports/ for one-decimal scores like 4.2/5 =="
rg -n '\*\*Score:\*\*\s*\d+\.\d/5' reports -S || true
echo "== Search in reports/ for integer scores like 4/5 =="
rg -n '\*\*Score:\*\*\s*\d/5' reports -S || true
Repository: santifer/career-ops
Length of output: 1591
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== reports/ markdown files (count + first 50) =="
fd -t f "*.md" reports . | head -n 50
echo "== reports/ md count =="
fd -t f "*.md" reports . | wc -l
echo "== Any **Score:** occurrences in reports/ =="
rg -n '\*\*Score:\*\*' reports -S || true
echo "== Any two-decimal scores in reports/ (4.25/5 etc) =="
rg -n '\*\*Score:\*\*\s*\d+\.\d{2}/5' reports -S || true
echo "== Any one-decimal scores in reports/ (4.2/5 etc) =="
rg -n '\*\*Score:\*\*\s*\d+\.\d/5' reports -S || true
echo "== Any integer scores in reports/ (4/5) =="
rg -n '\*\*Score:\*\*\s*\d/5' reports -S || true
Repository: santifer/career-ops
Length of output: 414
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== reports/ directory listing (top 3 levels) =="
find reports -maxdepth 3 -type d -print
echo "== reports/ md files count =="
find reports -type f -name '*.md' | wc -l
echo "== reports/ md files (first 50) =="
find reports -type f -name '*.md' | head -n 50
echo "== Any **Score:** occurrences in reports/ =="
rg -n '\*\*Score:\*\*' reports -S || true
echo "== Two-decimal scores (**Score:** 4.25/5) in reports/ =="
rg -n '\*\*Score:\*\*\s*\d+\.\d{2}/5' reports -S || true
echo "== One-decimal scores (**Score:** 4.2/5) in reports/ =="
rg -n '\*\*Score:\*\*\s*\d+\.\d/5' reports -S || true
echo "== Integer scores (**Score:** 4/5) in reports/ =="
rg -n '\*\*Score:\*\*\s*\d/5' reports -S || true
Repository: santifer/career-ops
Length of output: 394
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Any **Score:** values with two decimals across repo =="
rg -n '\*\*Score:\*\*\s*\d+\.\d{2}/5' -S . || true
echo "== Any **Score:** values with one decimal across repo =="
rg -n '\*\*Score:\*\*\s*\d+\.\d/5' -S . || true
echo "== Any **Score:** values with integer across repo =="
rg -n '\*\*Score:\*\*\s*\d/5' -S . || true
Repository: santifer/career-ops
Length of output: 380
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== verify-reports.mjs lines 90-150 =="
awk 'NR>=90 && NR<=150 {print NR ":" $0}' verify-reports.mjs
echo "== examples/sample-report.md =="
awk 'NR<=60 {print NR ":" $0}' examples/sample-report.md
Repository: santifer/career-ops
Length of output: 4755
Score regex only supports 0–1 decimal digit; no two-decimal scores found in current repo fixtures
verify-reports.mjs uses \d(?:\.\d)?\/5, which accepts 4/5 and 4.2/5 but rejects 4.25/5. Repo-wide search found only **Score:** 4.2/5 (and 4.1/5 in the self-test), with no existing **Score:** values containing two decimal places—so this is only a gap if 4.xx/5 is expected in real reports.
🛠️ Proposed fix to support two decimal places
- const score = content.match(/\*\*Score:\*\*\s*(\d(?:\.\d)?\/5|N\/A|SKIP|DUP)/i)?.[1];
+ const score = content.match(/\*\*Score:\*\*\s*(\d(?:\.\d{1,2})?\/5|N\/A|SKIP|DUP)/i)?.[1];
if (!score) errors.push('missing or invalid **Score:** header');
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
- const score = content.match(/\*\*Score:\*\*\s*(\d(?:\.\d)?\/5|N\/A|SKIP|DUP)/i)?.[1];
- if (!score) errors.push('missing or invalid **Score:** header');
+ const score = content.match(/\*\*Score:\*\*\s*(\d(?:\.\d{1,2})?\/5|N\/A|SKIP|DUP)/i)?.[1];
+ if (!score) errors.push('missing or invalid **Score:** header');
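To make the difference between the two patterns concrete, here is a minimal sketch that runs both regex literals from the proposed fix against a few sample headers. The helper `extractScore`, the constant names, and the sample strings are illustrative assumptions and do not appear in verify-reports.mjs; only the two regex literals are taken from the review.

```javascript
// Regex literal currently in verify-reports.mjs (per the review): 0 or 1 decimal digit.
const CURRENT_SCORE_RE = /\*\*Score:\*\*\s*(\d(?:\.\d)?\/5|N\/A|SKIP|DUP)/i;
// Regex literal from the proposed fix: 0, 1, or 2 decimal digits.
const PROPOSED_SCORE_RE = /\*\*Score:\*\*\s*(\d(?:\.\d{1,2})?\/5|N\/A|SKIP|DUP)/i;

// Hypothetical helper: returns the captured score text, or null when nothing matches.
function extractScore(content, pattern) {
  return content.match(pattern)?.[1] ?? null;
}

// Hypothetical sample headers, chosen to exercise each branch of the patterns.
const samples = [
  '**Score:** 4/5',
  '**Score:** 4.2/5',
  '**Score:** 4.25/5',
  '**Score:** 4.256/5',
  '**Score:** n/a',
];

for (const line of samples) {
  const current = extractScore(line, CURRENT_SCORE_RE);
  const proposed = extractScore(line, PROPOSED_SCORE_RE);
  console.log(`${line} -> current: ${current}, proposed: ${proposed}`);
}
```

With these inputs, `4.25/5` produces no match under the current expression and is captured by the proposed one, while a three-decimal value such as `4.256/5` is rejected by both.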
@luochen211, closing this one quickly: it's a re-submit of #892 (same
To be crystal clear about scope: this close is ONLY about the re-submit. #929 (the networkidle fix) and #928 (latex menu) are welcome and in my review queue right now; they're exactly the candidate-facing work the criterion exists to prioritize. And if you disagree with the criterion itself, the right venue is a reply on #890. I'll engage with arguments there, not with re-pushes. 🙏
Summary
Validation
Summary by CodeRabbit
New Features
verify:reports command to validate evaluation report markdown files, checking for required headers, accepted legitimacy tier values, proper score formats, PDF status, and Machine Summary sections.
Documentation
Tests
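
As a rough illustration of the kind of check the summary describes, the sketch below validates two of the listed items: the **Score:** header (using the regex and error message quoted in the review) and the presence of a Machine Summary section. The function name `validateReport`, the assumption that the section is a markdown heading, and the second error message are hypothetical. The legitimacy tier and PDF status checks are left out because their accepted values are not shown in this conversation.

```javascript
// Hypothetical sketch; not the actual contents of verify-reports.mjs.
function validateReport(content) {
  const errors = [];

  // Score check: regex literal and error message as quoted in the review.
  const score = content.match(/\*\*Score:\*\*\s*(\d(?:\.\d)?\/5|N\/A|SKIP|DUP)/i)?.[1];
  if (!score) errors.push('missing or invalid **Score:** header');

  // Assumption: the Machine Summary section is introduced by a markdown heading.
  if (!/^#+\s*Machine Summary\b/im.test(content)) {
    errors.push('missing Machine Summary section'); // hypothetical message
  }

  return errors;
}

// Hypothetical report bodies used only to exercise the sketch.
const goodReport = '**Score:** 4.2/5\n\n## Machine Summary\nkey: value\n';
const badReport = 'No score header here.\n';

console.log(validateReport(goodReport)); // empty array: both checks pass
console.log(validateReport(badReport)); // two error messages
```

Returning an array of messages rather than throwing on the first failure mirrors the `errors.push(...)` style visible in the reviewed snippet, so a single run can report every problem in a file.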