@matthalp
Contributor

@matthalp matthalp commented Oct 15, 2025

Clear GitContext caches BEFORE calling owner.flush_caches() to ensure that when MacheteClient caches are cleared and potentially rebuilt, they use fresh GitContext data (especially reflogs) rather than stale cached data.

This fixes an issue where 'git machete traverse -F' would not detect merged PRs on the first run but would detect them on the second run, due to MacheteClient using stale reflog data from GitContext caches.

Add cache_fix_analysis.md documenting the root cause and solution.
@matthalp
Contributor Author

Git-Machete Cache Issue Analysis

Problem Description

When running git machete traverse -F after a PR has been merged remotely, the first run doesn't detect the merged PR, but the second run does. This happens due to caching inconsistencies between git-machete's internal cache layers.

Root Cause

Cache Architecture

Git-machete has two cache layers:

  1. GitContext caches (git_operations.py):

    • __commit_hash_by_revision_cached
    • __reflogs_cached
    • __local_branches_cached
    • __remote_branches_cached
    • etc.
  2. MacheteClient caches (client.py):

    • __branch_pairs_by_hash_in_reflog

The Issue

  1. git machete traverse -F calls git.fetch_remote() which flushes GitContext caches
  2. However, MacheteClient.__branch_pairs_by_hash_in_reflog is NOT cleared
  3. This cache is used by is_merged_to() for merge detection via reflog analysis
  4. Stale cache causes missed merge detection on first run
  5. Second run works because new CLI process creates fresh MacheteClient instance
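The failure mode described in the steps above can be illustrated with a toy model. This is only a sketch of the claimed mechanism, not git-machete code: `ToyGit`, `ToyClient`, and every method name below are hypothetical stand-ins, and whether git-machete actually behaves this way is not established by this example.

```python
class ToyGit:
    """Hypothetical stand-in for the lower cache layer (not git-machete code)."""

    def __init__(self, remote_reflog):
        self.remote_reflog = remote_reflog  # mutable list acting as "the remote"
        self._reflog_cached = None

    def reflog(self):
        # Load lazily and keep the result until the cache is flushed.
        if self._reflog_cached is None:
            self._reflog_cached = list(self.remote_reflog)
        return self._reflog_cached

    def fetch(self):
        # Flushes only this layer; the client layer is not notified.
        self._reflog_cached = None


class ToyClient:
    """Hypothetical stand-in for the upper cache layer."""

    def __init__(self, git):
        self.git = git
        self._hashes_cached = None

    def hashes_in_reflog(self):
        # Derived cache built from the lower layer's data.
        if self._hashes_cached is None:
            self._hashes_cached = set(self.git.reflog())
        return self._hashes_cached

    def flush_caches(self):
        self._hashes_cached = None


remote = ["a1"]
git = ToyGit(remote)
client = ToyClient(git)

client.hashes_in_reflog()            # warm both caches
remote.append("b2")                  # something changes remotely
git.fetch()                          # lower layer flushed, upper layer not
stale = set(client.hashes_in_reflog())

client.flush_caches()                # now flush the derived cache as well
fresh = set(client.hashes_in_reflog())
```

In this model the derived cache keeps serving the old data until it is flushed explicitly, which mirrors the "first run misses, second run works" symptom.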

Key Code Locations

Fetch with cache flush (client.py:687-692):

if opt_fetch:
    for rem in self.__git.get_remotes():
        print(f"Fetching {bold(rem)}...")
        self.__git.fetch_remote(rem)  # This flushes GitContext caches

MacheteClient cache not cleared (client.py:2031-2032):

def flush_caches(self) -> None:
    self.__branch_pairs_by_hash_in_reflog = None  # Only clears this cache

GitContext owner relationship (git_operations.py:202-204):

def flush_caches(self) -> None:
    if self.owner:  # This should be MacheteClient
        self.owner.flush_caches()  # But owner is not always set
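To show why the order of the two flushes could matter, here is a minimal sketch. It assumes the owner rebuilds its derived cache eagerly inside `flush_caches()`, which is an assumption made for illustration; the classes `Owner` and `Git` and the `own_caches_first` flag are hypothetical and do not come from git-machete.

```python
class Owner:
    """Hypothetical owner that rebuilds its derived cache eagerly on flush."""

    def __init__(self):
        self.git = None
        self.derived = None

    def flush_caches(self):
        # Assumption for illustration: the rebuild happens during the flush.
        self.derived = tuple(self.git.reflog())


class Git:
    """Hypothetical lower layer that forwards flushes to its owner."""

    def __init__(self, source, owner, own_caches_first):
        self.source = source
        self.owner = owner
        self.own_caches_first = own_caches_first
        self._cached = None

    def reflog(self):
        if self._cached is None:
            self._cached = list(self.source)
        return self._cached

    def flush_caches(self):
        if self.own_caches_first:
            self._cached = None          # clear own caches BEFORE the owner's
        if self.owner:
            self.owner.flush_caches()
        if not self.own_caches_first:
            self._cached = None          # clearing afterwards is too late here


def run(own_caches_first):
    source = ["a1"]
    owner = Owner()
    git = Git(source, owner, own_caches_first)
    owner.git = git
    git.reflog()                         # warm the lower-layer cache
    source.append("b2")                  # underlying data changes
    git.flush_caches()
    return owner.derived


rebuilt_after_own_flush = run(own_caches_first=True)
rebuilt_before_own_flush = run(own_caches_first=False)
```

If the owner only resets its cache and rebuilds lazily on the next read, both orders give the same result in this model, so the ordering matters only under the eager-rebuild assumption.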

Solutions

Immediate Workarounds

1. Double Traverse Pattern

# Run twice - second run will work correctly
git machete traverse -F
git machete traverse -F

2. Manual Cache Refresh

# Force cache refresh by checking status first
git machete status > /dev/null
git machete traverse -F

3. Explicit Fetch + Traverse

# Separate fetch from traverse
git fetch --prune --all
git machete traverse

Long-term Solutions

1. Enhanced Wrapper Script

Create a wrapper that ensures proper cache management.
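One possible shape for such a wrapper is sketched below. It simply runs the two commands from the "Explicit Fetch + Traverse" workaround in order and stops at the first failure. The function name `fetch_then_traverse` and the injectable `run` parameter are hypothetical, chosen so the sequence can be exercised without invoking git.

```python
import subprocess
from typing import Callable, List, Sequence

# Commands taken from the "Explicit Fetch + Traverse" workaround.
COMMANDS: List[List[str]] = [
    ["git", "fetch", "--prune", "--all"],
    ["git", "machete", "traverse"],
]


def fetch_then_traverse(
    run: Callable[[Sequence[str]], int] = subprocess.call,
) -> int:
    """Run each command in order; return the first non-zero exit code, else 0."""
    for cmd in COMMANDS:
        code = run(cmd)
        if code != 0:
            return code  # do not traverse if the fetch failed
    return 0
```

Calling `fetch_then_traverse()` with no arguments runs the real commands through `subprocess.call`; passing a stub as `run` lets the ordering be checked in isolation.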

2. Upstream Fix

The proper fix would be to ensure GitContext.owner is set to MacheteClient and that fetch_remote() properly triggers MacheteClient.flush_caches().

Testing the Issue

To reproduce:

  1. Have a PR merged remotely
  2. Run git machete traverse -F - won't detect merge
  3. Run again immediately - will detect merge
  4. The difference is the fresh MacheteClient instance

Verification

Check if the issue affects your setup:

# After a PR is merged remotely:
git machete status | grep -E "(merged|red)"  # Note any red branches
git machete traverse -F  # See if it offers to slide out merged branches
git machete traverse -F  # Second run should work if first didn't

@matthalp
Contributor Author

Hi @PawelLipski -- long time no talk! I've been running into this issue quite frequently and decided to do some AI-driven investigation.

@PawelLipski
Collaborator

Wow I wasn't aware of this issue, TBH I thought that the caching is perfect already 😅 thanks, I'll take a look + add some regression tests

@PawelLipski PawelLipski changed the base branch from master to develop October 15, 2025 15:32
@matthalp
Contributor Author

Thank you @PawelLipski!

There are only two hard things in Computer Science: cache invalidation and naming things.

@matthalp
Contributor Author

@PawelLipski I don't think the solution in here is correct (it's just moving code around). But the problem is real.

@PawelLipski
Collaborator

Hmm I'm trying to reproduce the error scenario now 🤔 I don't think the analysis provided by AI is correct: self.owner is always set on GitContext when an action requiring flush_caches is executed. Also, __branch_pairs_by_hash_in_reflog doesn't seem to be used in MacheteClient.is_merged_to (even transitively, via different methods), so it should not affect the correctness of status displayed in git machete traverse.

I've tried a local repro (git machete traverse --fetch when a branch has been squash-merged via GitHub), but cannot reproduce the issue. After doing a fetch, the caches were flushed (checked from logs), and the merged branch was recognized as such within the same traverse invocation.

Could you provide a screenshot of a traverse session that went differently than expected? There's a chance that the behavior is surprising but correct (maybe extra clarification or a change in logic will be needed somewhere) 🤔
