Problem
Bugbot's dedup module (bugbot/src/dedup.ts) has 3 layers:
- Fingerprint cache — previous bugbot scan results
- GH issue search — open issues mentioning the same file + title similarity
- GH PR search — open PRs touching the same file
Missing: recent git commit cross-reference. If a bug was fixed in a recent commit on origin/main before bugbot runs, bugbot still flags it, files a GH issue, and nightshift creates a worktree — all for already-fixed code.
Evidence
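On 2026-03-11, bugbot filed 6 issues for findings that were already resolved or invalid. Four had been fixed 8-9 days earlier, one was a false positive, and one flagged a function that already had tests: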
| Issue | Filed | Already Fixed | Fix Commit |
|---|---|---|---|
| #215 (dead getVariablesByForm) | Mar 11 | Mar 2 | e4e4e73 |
| #216 (dead resolveCourtName) | Mar 11 | Mar 2 (false positive: function is used) | n/a |
| #221 (formatNumber 999K boundary) | Mar 11 | Mar 2 | fb47934 |
| #222 (DEFAULT_CSV_PATH relative) | Mar 11 | Mar 2 | e4e4e73 |
| #223 (absoluteUrl empty string) | Mar 11 | Mar 3 | 7a10081 |
| #233 (untested parseBmfRecord) | Mar 11 | already tested | n/a |
All 6 resulted in empty nightshift worktrees (0 commits ahead of main) that sat around for 2+ weeks.
Proposed Fix
Add a Layer 0 in deduplicateFindings() that checks whether the finding's target code was modified in recent commits:
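import { execFileSync } from "node:child_process";

// Layer 0: Check if the flagged code was recently changed on origin/main.
// If the function/pattern was touched in the last N commits, skip it:
// the fix may already be on main.
function wasRecentlyModified(scanRoot: string, file: string, pattern: string, lookback = 50): boolean {
  try {
    // --max-count only caps how many matching commits are printed; it does not
    // limit how far back git searches. Bound the search with a revision range.
    const range = `origin/main~${lookback}..origin/main`;
    const log = execFileSync("git", [
      "log", "--oneline", "-S", pattern, range, "--", file
    ], { cwd: scanRoot }).toString().trim();
    return log.length > 0;
  } catch {
    // git failed (not a repo, or fewer than `lookback` commits on origin/main):
    // report "not modified" so the remaining dedup layers still run.
    return false;
  }
}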
The -S flag (pickaxe search) matches commits that change the number of occurrences of the string, i.e. commits that add or remove it. If the flagged function/pattern shows up in recent commit diffs, it's likely already addressed.
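How Layer 0 might slot in ahead of the existing layers can be sketched as below. This is a minimal sketch, not the actual deduplicateFindings() code: the `Finding` shape, the `applyLayer0` helper, and the `RecentChangeCheck` type are all hypothetical names introduced for illustration. The checker is passed in as a parameter so the filter can be exercised without a git repository.

```typescript
// Hypothetical shape of a finding; the real type in bugbot/src/dedup.ts may differ.
interface Finding {
  file: string;    // path relative to the scan root
  pattern: string; // identifier or snippet the finding points at
  title: string;
}

// Same parameter order as wasRecentlyModified(scanRoot, file, pattern, lookback).
type RecentChangeCheck = (
  scanRoot: string,
  file: string,
  pattern: string,
  lookback?: number
) => boolean;

interface Layer0Result {
  kept: Finding[];    // passed on to the fingerprint / issue / PR layers
  skipped: Finding[]; // likely already addressed on origin/main
}

// Partition findings by whether their pattern shows up in recent commit diffs.
function applyLayer0(
  findings: Finding[],
  scanRoot: string,
  check: RecentChangeCheck,
  lookback = 50
): Layer0Result {
  const kept: Finding[] = [];
  const skipped: Finding[] = [];
  for (const finding of findings) {
    const recentlyTouched = check(scanRoot, finding.file, finding.pattern, lookback);
    (recentlyTouched ? skipped : kept).push(finding);
  }
  return { kept, skipped };
}
```

In a real run, `check` would be `wasRecentlyModified`. Returning the skipped findings alongside the kept ones, rather than dropping them silently, would let bugbot log what Layer 0 suppressed and why.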
Trade-offs
- False negatives: A commit that touches the function but doesn't fix the bug would cause bugbot to skip it. Mitigated by keeping lookback small (50 commits ~ 2 weeks of active dev).
- Performance: One git log -S per finding. Should be fast on local repos. Can batch by file.
- Scope: Only useful for categories that flag specific code patterns (dead code, untested functions). Categories like "missing test coverage" need different dedup.
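The performance and scope points above could be handled together with a cached, category-gated check, roughly as follows. This is a sketch under assumptions: the category identifiers (`dead-code`, `untested-function`), the `CategorizedFinding` shape, and the helper names are hypothetical and not taken from bugbot. It caches per (file, pattern) pair rather than truly batching git calls by file, which is a simpler stand-in for the batching idea.

```typescript
// Hypothetical category identifiers; bugbot's real category names may differ.
const PATTERN_CATEGORIES = new Set<string>(["dead-code", "untested-function"]);

interface CategorizedFinding {
  file: string;
  pattern: string;
  category: string;
}

// A checker already bound to a scan root and lookback window.
type PatternCheck = (file: string, pattern: string) => boolean;

// Wrap a checker so each (file, pattern) pair triggers at most one git call per scan.
function memoizeCheck(check: PatternCheck): PatternCheck {
  const cache = new Map<string, boolean>();
  return (file, pattern) => {
    const key = `${file}\u0000${pattern}`; // NUL separator cannot appear in a path
    let result = cache.get(key);
    if (result === undefined) {
      result = check(file, pattern);
      cache.set(key, result);
    }
    return result;
  };
}

// Only pattern-based categories go through Layer 0; everything else is never skipped here.
function shouldSkipFinding(finding: CategorizedFinding, check: PatternCheck): boolean {
  if (!PATTERN_CATEGORIES.has(finding.category)) {
    return false;
  }
  return check(finding.file, finding.pattern);
}
```

Gating on category before calling the checker means findings such as "missing test coverage" never spawn a git process at all.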
Impact
Prevents phantom issues, prevents phantom worktrees, reduces nightshift noise and worktree sprawl.