Create an `AllocId` for `ConstValue::Slice`. #116707
@bors try @rust-timer queue |
Create an `AllocId` for `ConstValue::Slice`. r? `@ghost`
☀️ Try build successful - checks-actions |
How does this differ from the almost identical perf experiment I did a few weeks ago? Here are the perf results. |
Finished benchmarking commit (9ef21e1): comparison URL.
Overall result: ❌✅ regressions and improvements - ACTION NEEDED
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never
Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage): This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary size: This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Bootstrap: 628.692s -> 625.904s (-0.44%)
No real difference. I just couldn't find your version. @bors try @rust-timer queue |
Create an `AllocId` for `ConstValue::Slice`. r? `@ghost`
☀️ Try build successful - checks-actions |
Finished benchmarking commit (eafbd55): comparison URL.
Overall result: ❌✅ regressions and improvements - ACTION NEEDED
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never
Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage): This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary size: This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Bootstrap: 627.527s -> 627.366s (-0.03%)
@bors try @rust-timer queue |
Finished benchmarking commit (f8fc695): comparison URL.
Overall result: ❌✅ regressions and improvements - please read the text below
Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @bors rollup=never
Instruction count: Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage): Results (primary 0.3%, secondary -4.6%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Cycles: Results (primary -2.4%, secondary -9.2%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary size: Results (primary -0.1%, secondary 0.0%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 490.115s -> 464.348s (-5.26%)
rustbot has assigned @compiler-errors. Use |
Some changes occurred to MIR optimizations
cc @rust-lang/wg-mir-opt
Some changes occurred to the CTFE / Miri interpreter
cc @rust-lang/miri
Some changes occurred to the CTFE / Miri interpreter
cc @rust-lang/miri, @RalfJung, @oli-obk, @lcnr
This PR changes Stable MIR
cc @oli-obk, @celinval, @ouz-a
Some changes occurred in compiler/rustc_codegen_cranelift
cc @bjorn3
Some changes occurred in src/tools/clippy
cc @rust-lang/clippy
Some changes occurred in compiler/rustc_codegen_ssa
Some changes occurred to the CTFE machinery
What's the change that fixed the performance issues? |
No idea. And I'm not sure I want to bisect 2y of changes to find out. |
No, I meant 😆 what you changed in your PR. Oh wait, that perf run was 2 years old, ignore me.
r=me with miri tests blessed, too |
@@ -170,26 +171,29 @@ impl<'tcx> ConstValue<'tcx> {
  // Non-empty slice, must have memory. We know this is a relative pointer.
  let (inner_prov, offset) =
      ptr.into_pointer_or_addr().ok()?.prov_and_relative_offset();
- let data = tcx.global_alloc(inner_prov.alloc_id()).unwrap_memory();
- (data, offset.bytes(), offset.bytes() + len)
+ (inner_prov.alloc_id(), offset.bytes(), offset.bytes() + len)
- (inner_prov.alloc_id(), offset.bytes(), offset.bytes() + len)
+ (inner_prov.alloc_id(), offset.bytes(), len)
I think? You renamed the variable to `len` but didn't adjust the logic here, it seems.
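The review comment above hinges on whether the third tuple element is an exclusive end offset or a length. A minimal standalone sketch of the two conventions follows; `by_start_end` and `by_start_len` are hypothetical helpers written for illustration and are not rustc functions. It does not settle which convention the consumer of this tuple expects, only why mixing them up yields wrong bounds.

```rust
/// Hypothetical helper: interpret the pair as (start, end), end exclusive.
fn by_start_end(data: &[u8], start: usize, end: usize) -> Option<&[u8]> {
    data.get(start..end)
}

/// Hypothetical helper: interpret the pair as (start, len).
fn by_start_len(data: &[u8], start: usize, len: usize) -> Option<&[u8]> {
    // checked_add guards against overflow when computing the end offset.
    let end = start.checked_add(len)?;
    data.get(start..end)
}

fn main() {
    let data = b"internal error";
    let (offset, len) = (9, 5);

    // Both conventions describe the same bytes when used consistently.
    assert_eq!(by_start_end(data, offset, offset + len), Some(&b"error"[..]));
    assert_eq!(by_start_len(data, offset, len), Some(&b"error"[..]));

    // Passing an end offset where a length is expected goes out of bounds.
    assert_eq!(by_start_len(data, offset, offset + len), None);
}
```

With a zero offset the two conventions coincide, which is one way such a mismatch can go unnoticed in tests.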
0x00 │ 69 6e 74 65 72 6e 61 6c 20 65 72 72 6f 72 3a 20 │ internal error: 
0x10 │ 65 6e 74 65 72 65 64 20 75 6e 72 65 61 63 68 61 │ entered unreacha
0x20 │ 62 6c 65 20 63 6f 64 65                         │ ble code
}
Why do we print so many more allocation contents now?
For slices, we just printed the contents. This PR adds their alloc-id to the set of alloc-ids to dump. I find this more consistent, even if a bit verbose.
This is just fixed by accident, right? There's likely still tons of issues due to nonsensical bounds, we just don't have a reproducer any more...
I have a cleaner fix in #143271 for this ICE and another one. But I agree, this is an accidental fix so meh.
I'll need some help with the miri test. The failing test verifies that we create a new alloc-id for each allocated object in a function call. It used a string, but with this PR a string now has a single alloc-id ever. Are there other constructs that would recover the proper behavior? |
Cc @saethlin -- is the const alloc dedup machinery you added still needed with this PR? |
Calling We can still get different results across multiple crates, but that won't happen inside Miri. |
I don't have time to review this in depth right now, but the upshot is that programs of this form which repeatedly evaluate a constant should not have constantly-growing memory use when run in Miri:

fn main() {
    loop {
        helper();
    }
}
fn helper() {
    "ouch";
}

This is pasted straight from the PR #118336. To first order, if this program does not have ever-growing memory use in Miri with this PR I say it seems good. If we re-add the memory growth bug we'll just fix it again. It's not the end of the world, we got along just fine for years with the memory growth behavior.
yeah I think this PR properly fixes that memory growth, and we can now remove the stuff that #118336 added to work around the previous growth. @cjgillot so for this PR
☔ The latest upstream changes (presumably #143934) made this pull request unmergeable. Please resolve the merge conflicts. |
// The interpreter used to create a new AllocId every time it evaluates any const.
// This caused unbounded memory use in Miri.
// This test verifies that we only create a bounded amount of addresses for any given const.
// In practice, the interpreter always returns the same address, but we *do not guarantee* it.
- // In practice, the interpreter always returns the same address, but we *do not guarantee* it.
+ // In practice, the interpreter always returns the same address, but we *do not guarantee* that.
// The interpreter used to create a new AllocId every time it evaluates any const.
// This caused unbounded memory use in Miri.
// This test verifies that we only create a bounded amount of addresses for any given const.
// In practice, the interpreter always returns the same address, but we *do not guarantee* it.
//@compile-flags: -Zinline-mir=no

const EVALS: usize = 256;
We can make this test quite a bit faster now that it's not random any more.
- const EVALS: usize = 256;
+ const EVALS: usize = 64;
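For readers without the test file at hand, here is a rough sketch of the kind of check the test comments describe: evaluate a constant repeatedly, collect the resulting addresses, and assert that their number stays bounded. This is not the actual Miri test. `helper`, `distinct_addresses`, and the exact bound are assumptions made for illustration, and `EVALS` uses the value suggested in the review.

```rust
use std::collections::HashSet;

// Number of evaluations; value taken from the review suggestion above.
const EVALS: usize = 64;

// Hypothetical helper: returns the address of a string constant.
// #[inline(never)] keeps the constant behind a single function body.
#[inline(never)]
fn helper() -> *const u8 {
    "ouch".as_ptr()
}

// Counts how many distinct addresses the constant is observed at.
fn distinct_addresses() -> usize {
    let mut addrs = HashSet::new();
    for _ in 0..EVALS {
        addrs.insert(helper() as usize);
    }
    addrs.len()
}

fn main() {
    let n = distinct_addresses();
    // The property under test is boundedness, not one specific address.
    assert!(n >= 1);
    assert!(n < EVALS);
    println!("distinct addresses: {n}");
}
```

The assertions deliberately avoid requiring exactly one address, in line with the test comment that a stable address is observed in practice but not guaranteed.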
This PR modifies `ConstValue::Slice` to use an `AllocId` instead of directly manipulating the allocation. This was originally proposed by #115764 but was a perf regression.

Almost 2 years later, enough code has changed to make this a perf improvement: #116707 (comment)
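To make the idea concrete, here is a toy model of a slice constant that stores an identifier into a global table instead of holding the allocation itself. Everything in it (`ToyAllocId`, `ToyAllocTable`, `ToySliceConst`, `eval_str_const`) is hypothetical and much simpler than rustc's real types. Deduplicating by byte content is an assumption chosen only to illustrate the "a string now has a single alloc-id" behavior mentioned in the discussion, not a description of how rustc assigns ids.

```rust
use std::collections::HashMap;

/// Toy stand-in for an allocation identifier; NOT rustc's `AllocId`.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct ToyAllocId(usize);

/// Toy global table mapping ids to allocation bytes.
/// Deduplication by content is an illustrative assumption.
#[derive(Default)]
struct ToyAllocTable {
    allocs: Vec<Vec<u8>>,
    by_content: HashMap<Vec<u8>, ToyAllocId>,
}

impl ToyAllocTable {
    /// Returns the existing id for these bytes, or registers a new one.
    fn intern(&mut self, bytes: &[u8]) -> ToyAllocId {
        if let Some(&id) = self.by_content.get(bytes) {
            return id;
        }
        let id = ToyAllocId(self.allocs.len());
        self.allocs.push(bytes.to_vec());
        self.by_content.insert(bytes.to_vec(), id);
        id
    }

    /// Looks up the bytes behind an id.
    fn bytes(&self, id: ToyAllocId) -> &[u8] {
        &self.allocs[id.0]
    }
}

/// Toy slice constant: an id plus a length, rather than the data itself.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ToySliceConst {
    alloc_id: ToyAllocId,
    len: usize,
}

/// Hypothetical "evaluation" of a string constant.
fn eval_str_const(table: &mut ToyAllocTable, s: &str) -> ToySliceConst {
    ToySliceConst { alloc_id: table.intern(s.as_bytes()), len: s.len() }
}

fn main() {
    let mut table = ToyAllocTable::default();
    let a = eval_str_const(&mut table, "ouch");
    let b = eval_str_const(&mut table, "ouch");

    // Repeated evaluation yields the same id and no new allocation.
    assert_eq!(a, b);
    assert_eq!(table.allocs.len(), 1);
    assert_eq!(table.bytes(a.alloc_id), b"ouch");
}
```

In this model the constant value is small and cheap to copy, and anything that needs the bytes goes through the table.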