range-diff: add configurable memory limit for cost matrix #1958
Conversation
Thank you for this patch! It is reasonable; I just have one suggestion for how to improve it.
Please note that there has been a highly over-engineered attempt at addressing this problem before: https://lore.kernel.org/git/[email protected]/t/#me423268c4f14a0d37c0ac3e83dc7d5e9cea3661a. You probably want to mention this in the "cover letter" (i.e. in the initial PR comment that will be sent), even though that patch series' contributor seems to be AWOL for years already.
Thank you @dscho for the thoughtful review! I attempted to implement your suggestion of checking content size within read_patches(), but discovered an issue:
if (strbuf_read(&contents, cp.out, 0) < 0) { // line 87 - reads ALL of the `log` output
	error_errno(_("could not read `log` output"));
	...
}
// Only AFTER reading everything do we process line by line:
for (; size > 0; size -= len, line += len) {
	// checking limits here is too late - the memory is already consumed
}

For the test case with 256k commits, this means ~6GB is read into the contents strbuf before any limit can be checked. By the time we could check the content size or the commit count in the loop, the memory is already exhausted. To properly implement the early exit you suggested, we would need to:
Would you prefer:
I'll also reference the previous RFC attempt as you suggested.
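For context, this is roughly the kind of capped, chunked read such a refactor might use. It is a hypothetical sketch, not part of this PR; read_log_capped and max_bytes are illustrative names.

static int read_log_capped(struct strbuf *contents, int fd, size_t max_bytes)
{
	char buf[8192];
	ssize_t n;

	/* read in chunks so we can stop as soon as the byte budget is exceeded */
	while ((n = xread(fd, buf, sizeof(buf))) > 0) {
		if (contents->len + n > max_bytes)
			return error(_("`log` output exceeds the configured limit"));
		strbuf_add(contents, buf, n);
	}
	if (n < 0)
		return error_errno(_("could not read `log` output"));
	return 0;
}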
@pcasaretto wow, thorough work! Personally, I would prefer the streaming approach, but I could understand if it is unreasonable to ask for such a huge refactor just to get the bug fix in. Your choice!
Force-pushed from 1a92256 to daea1fe
After pairing with @thehcma, we've updated the approach to address the memory exhaustion issue more directly. Instead of pre-counting commits, we now check the actual memory requirements of the cost matrix just before allocation in get_correspondences().

This solution avoids the performance overhead of spawning additional processes while still preventing the crashes. Worth noting: the process still takes a while and uses around 10GB for the particular command that triggered the crash. As you noted, integrating this into read_patches() would be a much larger refactor. What do you think about this approach, particularly:
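A rough sketch of that pre-allocation check (identifier names such as max_memory are placeholders, not the exact patch):

size_t n = st_add(a->nr, b->nr);
/* st_add()/st_mult() are git's overflow-checked helpers, so n * n cannot silently wrap */
size_t needed = st_mult(st_mult(n, n), sizeof(int));

if (needed > max_memory) {
	struct strbuf req = STRBUF_INIT, lim = STRBUF_INIT;

	strbuf_humanise_bytes(&req, needed);
	strbuf_humanise_bytes(&lim, max_memory);
	error(_("cost matrix would require %s, limit is %s"), req.buf, lim.buf);
	strbuf_release(&req);
	strbuf_release(&lim);
	return -1;
}

strbuf_humanise_bytes() provides the human-readable size formatting mentioned above.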
Force-pushed from cd92fde to e308b55
Force-pushed from e308b55 to dc9c6a6
There are issues in commit dc9c6a6:
Update: 4GB was too much for 32-bit systems, so the limit is 2GB in those cases.
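A minimal illustration of that platform split (the macro name RANGE_DIFF_DEFAULT_MAX_MEMORY is hypothetical):

#include <stdint.h>

/* ((size_t)4 << 30) would wrap to 0 in a 32-bit size_t, hence the smaller 32-bit default */
#if SIZE_MAX > 0xffffffffUL
#define RANGE_DIFF_DEFAULT_MAX_MEMORY ((size_t)4 << 30) /* 4 GiB */
#else
#define RANGE_DIFF_DEFAULT_MAX_MEMORY ((size_t)2 << 30) /* 2 GiB */
#endif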
Force-pushed from dc9c6a6 to f6a1c6d
When comparing large commit ranges (e.g., 250,000+ commits), range-diff attempts to allocate an n×n cost matrix that can exhaust available memory. For example, with 256,784 commits (n = 513,568), the matrix would require approximately 1TB of memory (513,568² × 4 bytes), causing either immediate segmentation faults due to integer overflow or system hangs.

Add a memory limit check in get_correspondences() before allocating the cost matrix. This check computes the total size in bytes (n² × sizeof(int)) and compares it against a configurable maximum, preventing both excessive memory usage and integer overflow.

The limit is configurable via a new --max-memory option that accepts human-readable sizes (e.g., "1G", "500M"). The default is 4GB on 64-bit systems and 2GB on 32-bit systems. This allows comparing ranges of approximately 32,000 commits (about 23,000 under the 2GB limit) - generous for real-world use cases while preventing impractical operations.

When the limit is exceeded, range-diff now displays a clear error message showing both the requested memory size and the maximum allowed, formatted in human-readable units for better user experience.

Example usage:

	git range-diff --max-memory=1G branch1...branch2
	git range-diff --max-memory=500M base..topic1 base..topic2

This approach was chosen over alternatives:

- Pre-counting commits: would require spawning additional git processes and reading all commits twice
- Limiting by commit count: less precise than actual memory usage
- Streaming approach: would require significant refactoring of the current algorithm

This issue was previously discussed in:
https://lore.kernel.org/git/[email protected]/

Signed-off-by: pcasaretto <[email protected]>
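For the option parsing itself, git's existing OPT_MAGNITUDE helper already accepts human-readable suffixes such as "500M" and "1G", so a hypothetical wiring (variable name and help string are illustrative) could look like:

static unsigned long max_memory = (unsigned long)4 << 30; /* 4GB default on 64-bit; a 32-bit build would need a smaller constant */

struct option range_diff_options[] = {
	OPT_MAGNITUDE(0, "max-memory", &max_memory,
		      N_("maximum memory allowed for the cost matrix")),
	OPT_END(),
};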
Force-pushed from f6a1c6d to 90d0059
Problem Description

When git range-diff is given extremely large ranges, it can result in either an immediate crash (from integer overflow in the cost-matrix allocation) or memory exhaustion that hangs the system.

Reproduction Case

In Shopify's large monorepo, a range-diff command like this crashes after several minutes with a SIGBUS error.

Range statistics:

Stack Trace (Segmentation Fault)

Root Cause Analysis

The crash occurs in get_correspondences() at line 356.

Problems:
- n=256,784 fits in an int, but n*n overflows
- read_patches() takes minutes to process 256k commits before any error surfaces

Solution

This PR adds an early size check that runs before attempting to read patches:

Key Changes

- count_range_commits(): uses git rev-list --count to quickly count commits (see the sketch at the end of this description)
- show_range_diff(): checks the total commit count before expensive operations
- MAX_RANGE_DIFF_TOTAL_PATCHES = 10000 (allows a ~400MB matrix)

Performance Comparison

Before (current master): read_patches() runs for minutes, then crashes or exhausts memory.

After (this PR):

Why 10,000 Limit?

Testing

Tested with:

Defense in Depth

This PR keeps the existing check in get_correspondences() as a backup, providing two layers of protection.

Alternative Approaches Considered