Fix next_up and next_down behavior for zero float values #16745
Thanks for taking a look at this. A cursory look suggests that when a strict inequality is being propagated, if the next value of the other side's lower bound is greater than the upper bound, the propagation result should be "infeasible". @berkaysynnada will help with this issue.
Currently, when computing the next representable float from +0.0 or -0.0, the behavior incorrectly skips directly to the smallest subnormal (±ε) instead of transitioning between -0.0 and +0.0. For example, next_down(+0.0) returns -ε, but we expect it to return -0.0. Similarly, next_up(-0.0) returns +ε, but we expect it to return +0.0. This causes intervals like [-0.0, -ε] instead of the expected [-0.0, -0.0]. In ScalarValue comparisons we already treat -0.0 and +0.0 as NOT equal, but the rounding logic was skipping over them and jumping directly to subnormals.
To fix this, I locally updated next_up and next_down to handle ±0.0 explicitly. In next_up, if the input is -0.0, it now returns +0.0 instead of +ε. In next_down, if the input is +0.0, it now returns -0.0 instead of -ε. All other cases remain as they were. This keeps the fix localized to the specific ±0.0 boundary without unnecessarily affecting the general behavior of the interval arithmetic logic.
pub fn next_up<F: FloatBits + Copy>(float: F) -> F {
let bits = float.to_bits();
if float.float_is_nan() || bits == F::infinity().to_bits() {
return float;
}
// Special case: -0.0 → +0.0
if bits == F::NEG_ZERO {
return F::from_bits(F::ZERO);
}
...
pub fn next_down<F: FloatBits + Copy>(float: F) -> F {
let bits = float.to_bits();
if float.float_is_nan() || bits == F::neg_infinity().to_bits() {
return float;
}
// Special case: +0.0 → -0.0
if bits == F::ZERO {
return F::from_bits(F::NEG_ZERO);
}
...
With these changes, the interval calculations now respect the special ±0.0 representations before moving into the subnormal range. This aligns the rounding behavior with how ScalarValue comparisons already work and avoids producing unexpected intervals.
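For readers who want to try the ±0.0 stepping described in this comment outside of DataFusion, here is a minimal stand-alone sketch for f64 only. It is an assumption-laden illustration, not the PR's code: the real functions are generic over the FloatBits trait and their elided bodies are not shown in this thread, so the non-zero branches below are written directly against the IEEE 754 bit layout.

```rust
/// Hypothetical f64-only sketch of the stepping rule described above.
/// Returns the next representable value above `x`, treating -0.0 and +0.0
/// as two distinct steps.
fn next_up(x: f64) -> f64 {
    let bits = x.to_bits();
    // NaN and +infinity have no successor.
    if x.is_nan() || bits == f64::INFINITY.to_bits() {
        return x;
    }
    // Special case: -0.0 steps to +0.0, not to the smallest subnormal.
    if bits == (-0.0f64).to_bits() {
        return 0.0;
    }
    let next = if bits == 0.0f64.to_bits() {
        1 // +0.0 steps to the smallest positive subnormal
    } else if x > 0.0 {
        bits + 1 // positive: a larger bit pattern is a larger value
    } else {
        bits - 1 // negative: a smaller magnitude is a larger value
    };
    f64::from_bits(next)
}

/// Mirror image of `next_up`: the next representable value below `x`.
fn next_down(x: f64) -> f64 {
    let bits = x.to_bits();
    // NaN and -infinity have no predecessor.
    if x.is_nan() || bits == f64::NEG_INFINITY.to_bits() {
        return x;
    }
    // Special case: +0.0 steps to -0.0, not to the smallest negative subnormal.
    if bits == 0.0f64.to_bits() {
        return -0.0;
    }
    let next = if bits == (-0.0f64).to_bits() {
        (1u64 << 63) | 1 // -0.0 steps to the smallest negative subnormal
    } else if x > 0.0 {
        bits - 1 // positive: a smaller bit pattern is a smaller value
    } else {
        bits + 1 // negative: a larger magnitude is a smaller value
    };
    f64::from_bits(next)
}
```

Comparing results through to_bits() rather than == matters when checking this, because -0.0 == 0.0 is true under ordinary float comparison.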
BTW, maybe we should modify how floating point ScalarValue's are compared (
Thank you for pointing that out, @berkaysynnada and @ozankabak! The fix in As for the
If this issue is not urgent for you, let's wait for a few days. This is not a trivial change, and we need to consider all consequences of the given decision. I need to do some reading and investigate how other engines and platforms behave.
Hi again @liamzwbao. We’ve discussed this with @ozankabak, and the actual fix should be on the
The SQL ordering of float values clearly distinguishes between 0 and -0 and is a total ordering.
The ScalarValue comparison must match that done in SQL.
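To make the distinction in this comment concrete, the sketch below contrasts ordinary IEEE 754 equality with a total ordering. It assumes the total ordering meant here behaves like IEEE 754 totalOrder, which Rust's standard library exposes as f64::total_cmp; whether that matches any particular SQL engine is exactly what the rest of this thread debates. The helper names are hypothetical.

```rust
use std::cmp::Ordering;

/// Ordinary IEEE 754 equality: -0.0 == +0.0, and NaN is not equal to itself.
fn ieee_equal(a: f64, b: f64) -> bool {
    a == b
}

/// Total ordering via the standard library: -0.0 sorts before +0.0,
/// and a NaN compares equal to an identical NaN.
fn total_order(a: f64, b: f64) -> Ordering {
    a.total_cmp(&b)
}

/// Sorts with the total ordering, so the two zeros land in a fixed order.
fn sort_total(mut values: Vec<f64>) -> Vec<f64> {
    values.sort_by(|a, b| a.total_cmp(b));
    values
}
```

Under these definitions, the two zeros are equal by value comparison but strictly ordered by the total ordering, which is the mismatch the interval rounding logic has to respect.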
What you give as an example is correct but incomplete. In SQL "ORDER BY" produces a total order and distinguishes So, for
Well explained, @berkaysynnada. It seems like the
You're absolutely right. The ORDER BY ordering and
Not sure whether you mean SQL spec, or what's implemented in DataFusion, or what DataFusion should be implementing? What's currently implemented seems to be this:
I mean SQL spec and what DataFusion should be implementing.
This is what's implemented in DataFusion, and it conflicts with the SQL spec.
I wish DataFusion followed the SQL spec in everything, but that's not the project design philosophy AFAICT. If we want to follow the SQL spec, you have my full support, but I'd encourage codifying it in the form of referenceable documentation. BTW, last time I checked, the SQL spec didn't know about NaN values, and I don't think it really distinguishes between positive and negative zeros.
Following the #13706 proposal, here is a behavior check with PostgreSQL
here is the output from Trino
Again, if we want to follow the SQL spec, you have my support, but it won't bring answers to what the actual equality, comparison and ordering semantics for float values should be. Following IEEE 754 makes sense to me. For that purpose we should be cross-checking with Trino (and not with PostgreSQL!). It really goes far beyond this single PR, so @berkaysynnada can you please put relevant wording about IEEE 754 in the docs somewhere, so that we set the direction once and then execute on it? I think this is important enough to go through the mailing list.
Thanks for your insights, @berkaysynnada @findepi! It seems we haven't reached a consensus yet. Should I proceed with the
Copying here for visibility - #13704 (comment)
Since we do not yet fully understand what transitioning to partial ordering will entail (and we may not even want to do it, in the end), I think the best path forward is to go back to @berkaysynnada's original suggestion, which was to fix cc @findepi He will respond shortly with our final suggestion.
@liamzwbao can you please cherry-pick this commit
Which issue does this PR close?
Rationale for this change
The root cause mentioned in the linked issue is an invalid interval produced by satisfy_greater. For instance, calling satisfy_greater([0.0, 0.0], [-0.0, any], true) yields [0.0, 0.0], [-0.0, -ε] instead of the expected [0.0, 0.0], [-0.0, -0.0].
What changes are included in this PR?
As @berkaysynnada pointed out, the correct fix is to update next_up and next_down in rounding.rs, ensuring that next_up(-0.0) returns 0.0 and next_down(0.0) returns -0.0.
Are these changes tested?
Yes
Are there any user-facing changes?
No
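As a rough illustration of why the interval quoted in the rationale counts as invalid, the sketch below checks the two candidate intervals with a total ordering in which -0.0 sorts before +0.0. The Interval struct here is hypothetical and far simpler than DataFusion's own interval type; it only assumes closed f64 bounds compared with f64::total_cmp.

```rust
use std::cmp::Ordering;

/// Hypothetical closed interval over f64, for illustration only.
#[derive(Debug, Clone, Copy)]
struct Interval {
    lower: f64,
    upper: f64,
}

impl Interval {
    /// An interval is well-formed when lower <= upper under the total
    /// ordering, where -0.0 is strictly less than +0.0.
    fn is_valid(&self) -> bool {
        self.lower.total_cmp(&self.upper) != Ordering::Greater
    }
}

/// The smallest positive subnormal f64, written ε in the discussion above.
fn smallest_subnormal() -> f64 {
    f64::from_bits(1)
}
```

With these definitions, an interval with lower bound -0.0 and upper bound -ε fails the check because -ε lies below -0.0, while an interval with both bounds at -0.0 passes.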