fix(merge-tracker): require company match for number-based dedup#913
fix(merge-tracker): require company match for number-based dedup#913darkpandawarrior wants to merge 1 commit into
Conversation
ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughmerge-tracker.mjs now requires normalized-company equality for report-number and entry-number matches; cross-company number collisions append as new entries with the next free tracker number. ChangesNumber collision safety in merge-tracker
Sequence Diagram(s)sequenceDiagram
participant AdditionTSV as Addition TSV
participant MergeTracker as merge-tracker.mjs
participant TrackerCSV as Tracker CSV
AdditionTSV->>MergeTracker: addition with reportNum, company, role, score
MergeTracker->>TrackerCSV: check reportNum + normalized company
alt reportNum match & same company
MergeTracker->>TrackerCSV: update matching tracker row
else
MergeTracker->>TrackerCSV: check entryNum + same company
alt entryNum match & same company
MergeTracker->>TrackerCSV: update matching tracker row
else
MergeTracker->>TrackerCSV: perform fuzzy company+role match
alt fuzzy match found
MergeTracker->>TrackerCSV: update matching tracker row
else
MergeTracker->>TrackerCSV: append as new row with next free number
end
end
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related issues
Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
***@***.***
…On Wed, 10 Jun 2026, 12:43 pm coderabbitai[bot], ***@***.***> wrote:
*coderabbitai[bot]* left a comment (santifer/career-ops#913)
<#913 (comment)>
Note
Currently processing new changes in this PR. This may take a few minutes,
please wait...
⚙️ Run configuration
*Configuration used*: Path: .coderabbit.yaml
*Review profile*: ASSERTIVE
*Plan*: Pro
*Run ID*: c23ad328-3a5f-44aa-ba62-19ac5ac123c4
📥 Commits
Reviewing files that changed from the base of the PR and between e2272dc
<e2272dc>
and b21178b
<b21178b>
.
📒 Files selected for processing (2)
- merge-tracker.mjs
- test-all.mjs
___________________________________________________________________________
< Two hard things: cache invalidation, naming things, and off-by-one jokes. >
---------------------------------------------------------------------------
\
\ \
\ /\
( )
.( o ).
✨ Finishing Touches 🧪 Generate unit tests (beta)
- Create PR with unit tests
------------------------------
Thanks for using CodeRabbit
<https://coderabbit.ai?utm_source=oss&utm_medium=github&utm_campaign=santifer/career-ops&utm_content=913>!
It's free for OSS, and your support helps us grow. If you like it, consider
giving us a shout-out.
❤️ Share
- X
<https://twitter.com/intent/tweet?text=I%20just%20used%20%40coderabbitai%20for%20my%20code%20review%2C%20and%20it%27s%20fantastic%21%20It%27s%20free%20for%20OSS%20and%20offers%20a%20free%20trial%20for%20the%20proprietary%20code.%20Check%20it%20out%3A&url=https%3A//coderabbit.ai>
- Mastodon
<https://mastodon.social/share?text=I%20just%20used%20%40coderabbitai%20for%20my%20code%20review%2C%20and%20it%27s%20fantastic%21%20It%27s%20free%20for%20OSS%20and%20offers%20a%20free%20trial%20for%20the%20proprietary%20code.%20Check%20it%20out%3A%20https%3A%2F%2Fcoderabbit.ai>
- Reddit
<https://www.reddit.com/submit?title=Great%20tool%20for%20code%20review%20-%20CodeRabbit&text=I%20just%20used%20CodeRabbit%20for%20my%20code%20review%2C%20and%20it%27s%20fantastic%21%20It%27s%20free%20for%20OSS%20and%20offers%20a%20free%20trial%20for%20proprietary%20code.%20Check%20it%20out%3A%20https%3A//coderabbit.ai>
- LinkedIn
<https://www.linkedin.com/sharing/share-offsite/?url=https%3A%2F%2Fcoderabbit.ai&mini=true&title=Great%20tool%20for%20code%20review%20-%20CodeRabbit&summary=I%20just%20used%20CodeRabbit%20for%20my%20code%20review%2C%20and%20it%27s%20fantastic%21%20It%27s%20free%20for%20OSS%20and%20offers%20a%20free%20trial%20for%20proprietary%20code>
Comment @coderabbitai help to get the list of available commands and
usage tips.
Tip
You can disable sequence diagrams in the walkthrough.
Disable the reviews.sequence_diagrams setting to disable sequence
diagrams in the walkthrough.
—
Reply to this email directly, view it on GitHub
<#913?email_source=notifications&email_token=B5KYPUOQTCF24STXAB5YUB347FCXTA5CNFSNUABFM5UWIORPF5TWS5BNNB2WEL2JONZXKZKDN5WW2ZLOOQXTINRWHE3TQMZVGMZKM4TFMFZW63VKON2WE43DOJUWEZLEUVSXMZLOOSWGM33PORSXEX3DNRUWG2Y#issuecomment-4669783532>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/B5KYPUKUTR4I6Q5LSXL5SHT47FCXTAVCNFSNUABGKJSXA33TNF2G64TZHMYTEMBRGQ3TMNJZGQ5US43TOVSTWNBWGMYDQNBWGAYTPILWAI>
.
You are receiving this because you are subscribed to this thread.Message
ID: ***@***.***>
|
|
Tip For best results, initiate chat on the files or code changes.
If you'd like me to trigger a fresh review of the current changes, just let me know. |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
merge-tracker.mjs (1)
389-416:⚠️ Potential issue | 🟠 Major | ⚡ Quick winBatch dedup state is stale after first mutation (can duplicate or overwrite with stale score).
existingAppsis used for duplicate detection, but it isn’t updated after add/update operations in the same run. A later TSV in the same batch can miss a just-added row (duplicate append) or compare against staleduplicate.score(incorrect re-eval decision).Proposed fix
@@ if (lineIdx >= 0) { const updatedLine = `| ${duplicate.num} | ${addition.date} | ${addition.company} | ${addition.role} | ${addition.score} | ${duplicate.status} | ${duplicate.pdf} | ${addition.report} | Re-eval ${addition.date} (${oldScore}→${newScore}). ${addition.notes} |`; appLines[lineIdx] = updatedLine; + // Keep in-memory dedup index in sync for later files in this same batch. + duplicate.raw = updatedLine; + duplicate.date = addition.date; + duplicate.company = addition.company; + duplicate.role = addition.role; + duplicate.score = addition.score; + duplicate.report = addition.report; updated++; } @@ const newLine = `| ${entryNum} | ${addition.date} | ${addition.company} | ${addition.role} | ${addition.score} | ${addition.status} | ${addition.pdf} | ${addition.report} | ${addition.notes} |`; newLines.push(newLine); + // Keep in-memory dedup index in sync for later files in this same batch. + existingApps.push({ + num: entryNum, + date: addition.date, + company: addition.company, + role: addition.role, + score: addition.score, + status: addition.status, + pdf: addition.pdf, + report: addition.report, + notes: addition.notes || '', + raw: newLine, + }); added++;🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@merge-tracker.mjs` around lines 389 - 416, The batch dedup state isn't updated after mutating rows so later additions in the same run use stale data; after you modify an existing row (inside the if (duplicate) branch where you set appLines[lineIdx] and increment updated) update the in-memory index/state (the same structure used to find duplicate) so duplicate.score (and any lookup by company/role) reflects the new score/date/report/notes; likewise after pushing newLine and incrementing added update that state (and maxNum if changed) so subsequent additions in this batch see the newly added entry instead of reusing the number or making wrong comparisons.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@merge-tracker.mjs`:
- Around line 389-416: The batch dedup state isn't updated after mutating rows
so later additions in the same run use stale data; after you modify an existing
row (inside the if (duplicate) branch where you set appLines[lineIdx] and
increment updated) update the in-memory index/state (the same structure used to
find duplicate) so duplicate.score (and any lookup by company/role) reflects the
new score/date/report/notes; likewise after pushing newLine and incrementing
added update that state (and maxNum if changed) so subsequent additions in this
batch see the newly added entry instead of reusing the number or making wrong
comparisons.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: c23ad328-3a5f-44aa-ba62-19ac5ac123c4
📒 Files selected for processing (2)
merge-tracker.mjstest-all.mjs
|
Nice work, @darkpandawarrior — and to be clear up front: the bug you found is real and still live on main. I reproduced it: #867 guarded the entry-number match, but the report-number match is still unguarded, so a TSV whose report number collides with an unrelated company's row link overwrites that row (company/role/notes destroyed). Your diagnosis of the two independent sequences (TSV = report sequence, tracker
Happy to merge once these are in — it closes a real data-loss hole that has now bitten three times (#634, #867/#912, this). |
b21178b to
8fbfe9f
Compare
|
Both points addressed in the rebased branch (now a single commit, 8fbfe9f, on top of Rebase: The delta is now just the report-number company guard + the new-entry numbering comment. I hoisted a single Tests pin the fix: The fixture row #1 now carries a real report link (
Also kept a positive control: a same-company, same-report-number re-eval with a higher score still updates in place through the now-guarded report-number path (no duplicate row).
🤖 Generated with Claude Code |
8fbfe9f to
f900d5a
Compare
The report-number duplicate check matched on the bare number, but TSV numbers carry the report sequence while the tracker # column is its own row sequence (santifer#867), so a TSV whose report number collides with the report number on an unrelated company's row was treated as a duplicate — and the update-in-place path overwrote that row's company, role, report link, and notes. Guard the report-number match with the same normalized-company check that santifer#867 added for the entry-number match, hoisting a single normCompany/sameCompany shared by all three duplicate checks. Colliding TSVs fall through to the new-entry path and are appended with the next free number. Adds a regression section to test-all.mjs: the collision scenario corrupts row santifer#1 on current main (verified) and leaves it byte-identical with the guard, plus a positive control proving same-company re-evals still update in place through the guarded path. Fixes santifer#912 Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
f900d5a to
93d94ed
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test-all.mjs`:
- Around line 1859-1863: The test currently only checks the single line found by
batchRunner.split('\n').find(l => l.includes('claude_args=(')) so it misses
flags that appear on subsequent lines of a multiline claude_args=( ...)
construction; change the check to locate the line index for the line containing
'claude_args=(' (use batchRunner.split('\n') to get an array), then collect and
join subsequent lines until the closing ')' is found (or until end), and search
that joined multiline block for '--strict-mcp-config' so the assertion on
claudeArgsLine/batchRunner becomes resilient to multiline argument formatting.
- Around line 1401-1450: The test lacks an explicit invariant that the existing
row `#2` (OtherCo) remains byte-identical across both merges; capture the original
row `#2` string (e.g. readFileSync(tracker) and extract the line containing
'OtherCo' into a const like originalRow2) before performing the merges, then
after each run(NODE, ['merge-tracker.mjs'], { env: mergeEnv }) compare the
current file's row for OtherCo to originalRow2 and call pass(...) or fail(...)
accordingly so the test asserts that the unrelated row remains unchanged; locate
the setup using oldcoRow, tracker, additionsDir and the two run(...) points to
insert these checks.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 645c9c3f-2ec0-4294-9646-fd469de6ab13
📒 Files selected for processing (2)
merge-tracker.mjstest-all.mjs
There was a problem hiding this comment.
Caution
Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test-all.mjs`:
- Around line 1859-1863: The test currently only checks the single line found by
batchRunner.split('\n').find(l => l.includes('claude_args=(')) so it misses
flags that appear on subsequent lines of a multiline claude_args=( ...)
construction; change the check to locate the line index for the line containing
'claude_args=(' (use batchRunner.split('\n') to get an array), then collect and
join subsequent lines until the closing ')' is found (or until end), and search
that joined multiline block for '--strict-mcp-config' so the assertion on
claudeArgsLine/batchRunner becomes resilient to multiline argument formatting.
- Around line 1401-1450: The test lacks an explicit invariant that the existing
row `#2` (OtherCo) remains byte-identical across both merges; capture the original
row `#2` string (e.g. readFileSync(tracker) and extract the line containing
'OtherCo' into a const like originalRow2) before performing the merges, then
after each run(NODE, ['merge-tracker.mjs'], { env: mergeEnv }) compare the
current file's row for OtherCo to originalRow2 and call pass(...) or fail(...)
accordingly so the test asserts that the unrelated row remains unchanged; locate
the setup using oldcoRow, tracker, additionsDir and the two run(...) points to
insert these checks.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 645c9c3f-2ec0-4294-9646-fd469de6ab13
📒 Files selected for processing (2)
merge-tracker.mjstest-all.mjs
🛑 Comments failed to post (2)
test-all.mjs (2)
1401-1450:
⚠️ Potential issue | 🟠 Major | ⚡ Quick winAdd an explicit invariant check for the preexisting row
#2across both merges.This regression currently proves row
#1stability, append-on-collision, and same-company in-place update, but it does not pin that the unrelated row| 2 | ... OtherCo ... |remains byte-identical. That leaves a hole in the intended#912coverage.Suggested patch
- writeFileSync(tracker, + const othercoRow = '| 2 | 2026-01-01 | OtherCo | Widget Engineer | N/A | Applied | ❌ | — | row two |'; + writeFileSync(tracker, '# Applications Tracker\n\n' + '| # | Date | Company | Role | Score | Status | PDF | Report | Notes |\n' + '|---|------|---------|------|-------|--------|-----|--------|-------|\n' + oldcoRow + '\n' + - '| 2 | 2026-01-01 | OtherCo | Widget Engineer | N/A | Applied | ❌ | — | row two |\n'); + othercoRow + '\n'); @@ if (after.includes(oldcoRow)) { pass('report-number collision: existing row `#1` (OldCo) left byte-identical'); } else { fail('report-number collision OVERWROTE row `#1` (`#912` regression — bare number match must not dedup)'); } + if (after.includes(othercoRow)) { + pass('unrelated row `#2` remains byte-identical after first merge'); + } else { + fail('row `#2` changed during first merge'); + } @@ if (after2.includes(oldcoRow)) { pass('row `#1` still intact after the same-company re-eval merge'); } else { fail('row `#1` was modified by the same-company re-eval merge'); } + if (after2.includes(othercoRow)) { + pass('unrelated row `#2` remains byte-identical after second merge'); + } else { + fail('row `#2` changed during second merge'); + }📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.const oldcoRow = '| 1 | 2026-01-09 | OldCo | Senior Widget Engineer | 4.1/5 | Interview | ❌ | [1](../reports/001-oldco-2026-01-09.md) | precious row-one notes |'; const othercoRow = '| 2 | 2026-01-01 | OtherCo | Widget Engineer | N/A | Applied | ❌ | — | row two |'; writeFileSync(tracker, '# Applications Tracker\n\n' + '| # | Date | Company | Role | Score | Status | PDF | Report | Notes |\n' + '|---|------|---------|------|-------|--------|-----|--------|-------|\n' + oldcoRow + '\n' + othercoRow + '\n'); for (const n of ['001-oldco-2026-01-09', '001-newco-2026-01-10', '001-newco-2026-01-11']) { writeFileSync(join(collTmp, 'reports', `${n}.md`), '# fixture\n'); } // Colliding TSV: report number 1 (= row `#1`'s report number), DIFFERENT company, // higher score than row `#1` so the update-in-place branch would fire if matched. writeFileSync(join(additionsDir, '001-newco.tsv'), '1\t2026-01-10\tNewCo\tPlatform Reliability Lead\tEvaluated\t4.7/5\t❌\t[1](reports/001-newco-2026-01-10.md)\tfresh eval\n'); run(NODE, ['merge-tracker.mjs'], { env: mergeEnv }); const after = readFileSync(tracker, 'utf-8'); if (after.includes(oldcoRow)) { pass('report-number collision: existing row `#1` (OldCo) left byte-identical'); } else { fail('report-number collision OVERWROTE row `#1` (`#912` regression — bare number match must not dedup)'); } if (after.includes(othercoRow)) { pass('unrelated row `#2` remains byte-identical after first merge'); } else { fail('row `#2` changed during first merge'); } const newcoRows = after.split('\n').filter(l => l.includes('NewCo')); if (newcoRows.length === 1 && /^\| 3 \| 2026-01-10 \| NewCo \| Platform Reliability Lead \|/.test(newcoRows[0])) { pass('colliding TSV appended as a new row with the next free number (`#3`)'); } else { fail(`colliding TSV not appended as `#3`: ${JSON.stringify(newcoRows)}`); } // Positive control: SAME company + same report number with a higher score // must still UPDATE in place via the (now company-guarded) report-number path. writeFileSync(join(additionsDir, '001-newco.tsv'), '1\t2026-01-11\tNewCo\tPlatform Reliability Lead\tEvaluated\t4.9/5\t❌\t[1](reports/001-newco-2026-01-11.md)\tre-evaluated\n'); run(NODE, ['merge-tracker.mjs'], { env: mergeEnv }); const after2 = readFileSync(tracker, 'utf-8'); const newcoRows2 = after2.split('\n').filter(l => l.includes('NewCo')); if (newcoRows2.length === 1 && newcoRows2[0].startsWith('| 3 |') && newcoRows2[0].includes('4.9/5')) { pass('same-company report-number match still updates in place (no duplicate row)'); } else { fail(`same-company re-eval did not update `#3` in place: ${JSON.stringify(newcoRows2)}`); } if (after2.includes(oldcoRow)) { pass('row `#1` still intact after the same-company re-eval merge'); } else { fail('row `#1` was modified by the same-company re-eval merge'); } if (after2.includes(othercoRow)) { pass('unrelated row `#2` remains byte-identical after second merge'); } else { fail('row `#2` changed during second merge'); }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test-all.mjs` around lines 1401 - 1450, The test lacks an explicit invariant that the existing row `#2` (OtherCo) remains byte-identical across both merges; capture the original row `#2` string (e.g. readFileSync(tracker) and extract the line containing 'OtherCo' into a const like originalRow2) before performing the merges, then after each run(NODE, ['merge-tracker.mjs'], { env: mergeEnv }) compare the current file's row for OtherCo to originalRow2 and call pass(...) or fail(...) accordingly so the test asserts that the unrelated row remains unchanged; locate the setup using oldcoRow, tracker, additionsDir and the two run(...) points to insert these checks.
1859-1863: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win
Make the MCP isolation assertion resilient to multiline argument construction.
The current check inspects only the single line containing
claude_args=(. If--strict-mcp-configis appended on another line, this test can fail spuriously.Suggested patch
- const claudeArgsLine = batchRunner - .split('\n') - .find(l => l.includes('claude_args=(')); - if (claudeArgsLine && claudeArgsLine.includes('--strict-mcp-config')) { + const hasStrictMcp = + /claude_args=\([\s\S]*?--strict-mcp-config/.test(batchRunner) || + /claude[^\n]*--strict-mcp-config/.test(batchRunner); + if (hasStrictMcp) { pass('batch workers spawn with --strict-mcp-config (no inherited MCP)'); } else { fail('batch-runner.sh worker spawn missing --strict-mcp-config (issue `#506` regression)'); }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test-all.mjs` around lines 1859 - 1863, The test currently only checks the single line found by batchRunner.split('\n').find(l => l.includes('claude_args=(')) so it misses flags that appear on subsequent lines of a multiline claude_args=( ...) construction; change the check to locate the line index for the line containing 'claude_args=(' (use batchRunner.split('\n') to get an array), then collect and join subsequent lines until the closing ')' is found (or until end), and search that joined multiline block for '--strict-mcp-config' so the assertion on claudeArgsLine/batchRunner becomes resilient to multiline argument formatting.
Fixes #912
Problem
merge-tracker.mjs's first two duplicate checks — report-number match and entry-number match — had no company guard. TSV numbers carry the report sequence while the tracker#column is its own row sequence, so a TSV numbered001colliding with tracker row#1(a completely different company) was treated as a duplicate, and the update-in-place path rewrote that row's company, role, report link, and notes. A first batch merge with TSVs 001–003 against a tracker holding manually-added rows #1–#3 corrupts all three.Fix
CAREER_OPS_ADDITIONSenv override for the additions dir, mirroring the existingCAREER_OPS_TRACKERtest hook, so the merge flow can be tested end-to-end against fixtures.Tests
New section in
test-all.mjs(4 checks, all fixture-based, no personal data):node test-all.mjs --quick: 156 passed, 0 failed (7 pre-existing warnings aboutREADME.ua.mdin the personal-data scan, unrelated).🤖 Generated with Claude Code
Summary by CodeRabbit
Bug Fixes
New Features
Tests