-
Notifications
You must be signed in to change notification settings - Fork 3.8k
GH-46777: [C++] Use SimplifyIsIn only when the value_set of the expression is lower than a threshold #46859
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zanmato1984 @pitrou how would you proceed about testing this? Should I just try to add a new benchmark that exercises this scenario? The current Python reproducer has validated the solution fixes the original reported problem and there are currently tests with less values but I haven't found tests with a high value_set.
Do we have any benchmarks for expression simplification already? Otherwise, we shouldn't bother adding any. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how would you proceed about testing this? Should I just try to add a new benchmark that exercises this scenario?
I think it's good to have both tests and benchmarks. I assume the test shouldn't be very heavy (50 elements value set), is it?
It would be nice to have this in 21.0. Do you want to update this PR @raulcd ? |
… expression is lower than 50
Sure, I am working on it at the moment, will try to push soon |
@@ -80,6 +80,16 @@ Expression add(Expression l, Expression r) { | |||
return call("add", {std::move(l), std::move(r)}); | |||
} | |||
|
|||
std::string make_range_json(int start, int end) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know our test utilities but couldn't find something like this, do we have any utility to do this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 except for two nits
Co-authored-by: Antoine Pitrou <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 0b34e6b. There were 9 benchmark results indicating a performance regression:
The full Conbench report has more details. It also includes information about 53 possible false positives for unstable benchmarks that are known to sometimes produce them. |
Rationale for this change
Using
SimplifyIsIn
when the value set is large has a substantial performance penalty.What changes are included in this PR?
Ensure we do not use the simplification when the value_set on the expression is higher than a threshold (50).
Are these changes tested?
I've tested locally that the reproducer goes back to pre change levels.
I have added a test for large sets and validate the expression is not being modified.
Are there any user-facing changes?
No