Description
When a repository is removed from tracking (master_repositories.json), existing PR records in the database keep their old pr_state forever. For example, a PR that was OPEN when the repo was removed will still show as OPEN in the dashboard even after it's been merged on GitHub.
The same problem occurs when a PR is merged to a non-acceptable branch - should_skip_merged_pr() filters it out, so the old DB record is never updated.
This happens because the validator only updates PR state via UPSERT during evaluation. If a PR is skipped during fetching, no write happens and the old record stays untouched. There is no cleanup mechanism for these stale pull request records.
Note: repos marked inactive via inactive_at are not affected - PRs created before the inactive date still get processed and their state updates correctly.
Steps to Reproduce
- A miner has an open PR in a tracked repository, stored in DB as pr_state='OPEN'
- The repository is removed from master_repositories.json
- The PR gets merged on GitHub
- The validator skips the PR - no UPSERT happens
- The database record remains pr_state='OPEN'
Alternatively:
- A miner has an open PR stored in DB as pr_state='OPEN'
- The PR gets merged to a branch not in the acceptable branches list
- should_skip_merged_pr() returns True - no UPSERT happens
- The database record remains pr_state='OPEN'
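Both scenarios can be simulated without GitHub or the real validator. The sketch below is only an illustration: the pull_requests table layout, the evaluate() and stored_state() helpers, the org/repo name, and the 'MERGED' state value are assumptions, not the project's actual schema or code. It shows that an UPSERT which never runs leaves the old row in place.

```python
import sqlite3

# Hypothetical minimal schema; the real table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pull_requests ("
    " repo TEXT, number INTEGER, pr_state TEXT,"
    " PRIMARY KEY (repo, number))"
)

# Stand-in for the UPSERT: pr_state is overwritten on conflict.
UPSERT = (
    "INSERT INTO pull_requests (repo, number, pr_state) VALUES (?, ?, ?) "
    "ON CONFLICT (repo, number) DO UPDATE SET pr_state = excluded.pr_state"
)


def evaluate(conn, tracked_repos, github_prs):
    """Stand-in for one validator pass: only PRs in tracked repos are written."""
    for repo, number, state in github_prs:
        if repo not in tracked_repos:
            continue  # skipped during fetching, so no UPSERT happens
        conn.execute(UPSERT, (repo, number, state))


def stored_state(conn, repo, number):
    """Return the pr_state currently stored for one PR, or None if absent."""
    row = conn.execute(
        "SELECT pr_state FROM pull_requests WHERE repo = ? AND number = ?",
        (repo, number),
    ).fetchone()
    return row[0] if row else None


# Pass 1: the repo is tracked and the PR is open.
evaluate(conn, {"org/repo"}, [("org/repo", 1, "OPEN")])
# Pass 2: the repo was removed from tracking and the PR was merged on GitHub.
evaluate(conn, set(), [("org/repo", 1, "MERGED")])

print(stored_state(conn, "org/repo", 1))  # still OPEN
```

The non-acceptable-branch case behaves the same way: replace the tracked_repos check with any filter that skips the PR and the stored row is equally untouched.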
Expected Behavior
PR state in the database should reflect the actual state on GitHub, even when the repository is no longer tracked or the PR was merged to a non-acceptable branch.
Actual Behavior
The PR record stays in its last-known state indefinitely. The dashboard shows stale data (e.g., "OPEN" for a PR that was merged weeks ago).
Additional Context
The root cause is in load_miners_prs() - PRs that don't pass the filtering pipeline are simply skipped, so the existing DB record is never touched. The UPSERT in BULK_UPSERT_PULL_REQUESTS does update pr_state on conflict, but only if the PR reaches that stage. A periodic cleanup or reconciliation step is needed for PR records tied to removed repositories or non-acceptable branches.
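One possible shape for such a reconciliation step is sketched below. It is a proposal under stated assumptions, not existing code: reconcile_stale_prs() and the fetch_state callback are hypothetical names, the table layout is simplified, and 'MERGED' is an assumed state value. In practice fetch_state would wrap a GitHub API lookup; here it is a plain dictionary so the example runs offline.

```python
import sqlite3


def reconcile_stale_prs(conn, fetch_state):
    """Refresh pr_state for every record still stored as OPEN.

    fetch_state(repo, number) is a hypothetical callback that returns the
    PR's current state on GitHub, or None if it cannot be determined.
    Returns the number of rows that were updated.
    """
    rows = conn.execute(
        "SELECT repo, number FROM pull_requests WHERE pr_state = 'OPEN'"
    ).fetchall()
    updated = 0
    for repo, number in rows:
        current = fetch_state(repo, number)
        if current is None or current == "OPEN":
            continue  # unknown or unchanged: leave the record alone
        conn.execute(
            "UPDATE pull_requests SET pr_state = ? WHERE repo = ? AND number = ?",
            (current, repo, number),
        )
        updated += 1
    conn.commit()
    return updated


# Demo with an in-memory database and a fake GitHub lookup.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pull_requests ("
    " repo TEXT, number INTEGER, pr_state TEXT,"
    " PRIMARY KEY (repo, number))"
)
conn.executemany(
    "INSERT INTO pull_requests VALUES (?, ?, ?)",
    [("org/removed", 1, "OPEN"), ("org/tracked", 2, "OPEN")],
)
github = {("org/removed", 1): "MERGED", ("org/tracked", 2): "OPEN"}

updated = reconcile_stale_prs(conn, lambda repo, number: github.get((repo, number)))
print(updated)  # number of stale records that were refreshed
```

Restricting the pass to records stored as OPEN keeps the number of lookups bounded, since merged and closed PRs rarely change state again. Whether the step should also re-score or otherwise touch the refreshed records is left open here.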