Stale PR state in database when repository is removed from tracking #398

@bittoby

Description

When a repository is removed from tracking (deleted from master_repositories.json), existing PR records in the database keep their last-written pr_state indefinitely. For example, a PR that was OPEN when the repo was removed still shows as OPEN in the dashboard after it has been merged on GitHub.

The same problem occurs when a PR is merged into a branch that is not on the acceptable branches list: should_skip_merged_pr() filters it out, so the existing DB record is never updated.

This happens because the validator only writes PR state through the UPSERT that runs during evaluation. If a PR is skipped during fetching, no write happens and the old record stays untouched. There is no cleanup mechanism for these stale records.
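To make the mechanism concrete, here is a minimal in-memory sketch of the behavior described above. It is not the validator's actual code: `PullRequest`, `evaluate`, and the dict standing in for the database are hypothetical names used only for illustration, and the branch check is a simplified stand-in for should_skip_merged_pr().

```python
from dataclasses import dataclass


@dataclass
class PullRequest:
    number: int
    repo: str
    state: str  # "OPEN", "MERGED", or "CLOSED"
    base_branch: str = "main"


def evaluate(github_prs, tracked_repos, acceptable_branches, db):
    """One simplified validator pass: only PRs that survive filtering are written."""
    for pr in github_prs:
        if pr.repo not in tracked_repos:
            continue  # repo removed from tracking: skipped, no write
        if pr.state == "MERGED" and pr.base_branch not in acceptable_branches:
            continue  # stand-in for should_skip_merged_pr(): skipped, no write
        db[(pr.repo, pr.number)] = pr.state  # stand-in for the UPSERT


db = {}
pr = PullRequest(number=1, repo="org/project", state="OPEN")
evaluate([pr], {"org/project"}, {"main"}, db)  # first pass stores OPEN

pr.state = "MERGED"                  # the PR is merged on GitHub
evaluate([pr], set(), {"main"}, db)  # the repo is no longer tracked
print(db[("org/project", 1)])        # prints "OPEN": the record was never touched
```

The second pass never reaches the write, so nothing corrects the stored state. The same thing happens if the repo stays tracked but the merge targets a branch outside `acceptable_branches`.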

Note: repos marked inactive via inactive_at are not affected - PRs created before the inactive date still get processed and their state updates correctly.

Steps to Reproduce

  1. A miner has an open PR in a tracked repository, stored in DB as pr_state='OPEN'
  2. The repository is removed from master_repositories.json
  3. The PR gets merged on GitHub
  4. The validator skips the PR - no UPSERT happens
  5. The database record remains pr_state='OPEN'

Alternatively:

  1. A miner has an open PR stored in DB as pr_state='OPEN'
  2. The PR gets merged to a branch not in the acceptable branches list
  3. should_skip_merged_pr() returns True - no UPSERT happens
  4. The database record remains pr_state='OPEN'

Expected Behavior

PR state in the database should reflect the actual state on GitHub, even when the repository is no longer tracked or the PR was merged to a non-acceptable branch.

Actual Behavior

The PR record stays in its last-known state indefinitely. The dashboard shows stale data (e.g., "OPEN" for a PR that was merged weeks ago).

Additional Context

The root cause is in load_miners_prs() - PRs that don't pass the filtering pipeline are simply skipped, so the existing DB record is never touched. The UPSERT in BULK_UPSERT_PULL_REQUESTS does update pr_state on conflict, but only if the PR reaches that stage. A periodic cleanup or reconciliation step is needed for PR records tied to removed repositories or non-acceptable branches.
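One possible shape for such a reconciliation step is sketched below, using sqlite3 so it runs standalone. Everything here is an assumption for illustration: the `pull_requests` table and its `repo`, `number`, and `pr_state` columns, the `reconcile_stale_prs` function, and the `fetch_state` callable (a placeholder for whatever GitHub lookup the validator would use) are not taken from the codebase. As a simplification, it only rechecks rows stored as OPEN.

```python
import sqlite3


def reconcile_stale_prs(conn, evaluated_keys, fetch_state):
    """Refresh pr_state for stored OPEN PRs that the normal pass did not write.

    evaluated_keys: set of (repo, number) already handled by the UPSERT path.
    fetch_state: hypothetical callable (repo, number) -> current state on
                 GitHub, or None if the lookup failed.
    Returns the number of rows updated.
    """
    rows = conn.execute(
        "SELECT repo, number, pr_state FROM pull_requests WHERE pr_state = 'OPEN'"
    ).fetchall()
    updated = 0
    for repo, number, old_state in rows:
        if (repo, number) in evaluated_keys:
            continue  # already written by the regular evaluation
        new_state = fetch_state(repo, number)
        if new_state is None or new_state == old_state:
            continue  # lookup failed or nothing changed: leave the row alone
        conn.execute(
            "UPDATE pull_requests SET pr_state = ? WHERE repo = ? AND number = ?",
            (new_state, repo, number),
        )
        updated += 1
    conn.commit()
    return updated


# Usage with an in-memory database and a fake GitHub lookup.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pull_requests ("
    "repo TEXT, number INTEGER, pr_state TEXT, PRIMARY KEY (repo, number))"
)
conn.execute("INSERT INTO pull_requests VALUES ('org/project', 1, 'OPEN')")
github = {("org/project", 1): "MERGED"}
updated = reconcile_stale_prs(
    conn, set(), lambda repo, number: github.get((repo, number))
)
```

Running this after the regular evaluation would keep the two paths separate: the UPSERT handles PRs that pass filtering, and the reconciliation only touches records the filter skipped. It updates state only, so skipped PRs would still not be scored.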

Metadata

Labels: bug (Something isn't working)