Commit 7199fee

Merge remote-tracking branch 'origin/main' into allow_large_results
2 parents 435c602 + b454256 commit 7199fee

14 files changed: +632 -102 lines changed

CHANGELOG.md

Lines changed: 35 additions & 0 deletions
@@ -4,6 +4,41 @@

 [1]: https://pypi.org/project/bigframes/#history

+## [2.16.0](https://github.com/googleapis/python-bigquery-dataframes/compare/v2.15.0...v2.16.0) (2025-08-20)
+
+
+### Features
+
+* Add `bigframes.pandas.options.display.precision` option ([#1979](https://github.com/googleapis/python-bigquery-dataframes/issues/1979)) ([15e6175](https://github.com/googleapis/python-bigquery-dataframes/commit/15e6175ec0aeb1b7b02d0bba9e8e1e018bd11c31))
+* Add level, inplace params to reset_index ([#1988](https://github.com/googleapis/python-bigquery-dataframes/issues/1988)) ([3446950](https://github.com/googleapis/python-bigquery-dataframes/commit/34469504b79a082d3380f9f25c597483aef2068a))
+* Add ML code samples from dbt blog post ([#1978](https://github.com/googleapis/python-bigquery-dataframes/issues/1978)) ([ebaa244](https://github.com/googleapis/python-bigquery-dataframes/commit/ebaa244a9eb7b87f7f9fd9c3bebe5c7db24cd013))
+* Add where, coalesce, fillna, casewhen, invert local impl ([#1976](https://github.com/googleapis/python-bigquery-dataframes/issues/1976)) ([f7f686c](https://github.com/googleapis/python-bigquery-dataframes/commit/f7f686cf85ab7e265d9c07ebc7f0cd59babc5357))
+* Adjust anywidget CSS to prevent overflow ([#1981](https://github.com/googleapis/python-bigquery-dataframes/issues/1981)) ([204f083](https://github.com/googleapis/python-bigquery-dataframes/commit/204f083a2f00fcc9fd1500dcd7a738eda3904d2f))
+* Format page number in table widget ([#1992](https://github.com/googleapis/python-bigquery-dataframes/issues/1992)) ([e83836e](https://github.com/googleapis/python-bigquery-dataframes/commit/e83836e8e1357f009f3f95666f1661bdbe0d3751))
+* Or, And, Xor can execute locally ([#1994](https://github.com/googleapis/python-bigquery-dataframes/issues/1994)) ([59c52a5](https://github.com/googleapis/python-bigquery-dataframes/commit/59c52a55ebea697855eb4c70529e226cc077141f))
+* Support callable bigframes function for dataframe where ([#1990](https://github.com/googleapis/python-bigquery-dataframes/issues/1990)) ([44c1ec4](https://github.com/googleapis/python-bigquery-dataframes/commit/44c1ec48cc4db1c4c9c15ec1fab43d4ef0758e56))
+* Support callable for series where method ([#2005](https://github.com/googleapis/python-bigquery-dataframes/issues/2005)) ([768b82a](https://github.com/googleapis/python-bigquery-dataframes/commit/768b82af96a5dd0c434edcb171036eb42cfb9b41))
+* When using `repr_mode = "anywidget"`, numeric values align right ([15e6175](https://github.com/googleapis/python-bigquery-dataframes/commit/15e6175ec0aeb1b7b02d0bba9e8e1e018bd11c31))
+
+
+### Bug Fixes
+
+* Address the packages issue for bigframes function ([#1991](https://github.com/googleapis/python-bigquery-dataframes/issues/1991)) ([68f1d22](https://github.com/googleapis/python-bigquery-dataframes/commit/68f1d22d5ed8457a5cabc7751ed1d178063dd63e))
+* Correct pypdf dependency specifier for remote PDF functions ([#1980](https://github.com/googleapis/python-bigquery-dataframes/issues/1980)) ([0bd5e1b](https://github.com/googleapis/python-bigquery-dataframes/commit/0bd5e1b3c004124d2100c3fbec2fbe1e965d1e96))
+* Enable default retries in calls to BQ Storage Read API ([#1985](https://github.com/googleapis/python-bigquery-dataframes/issues/1985)) ([f25d7bd](https://github.com/googleapis/python-bigquery-dataframes/commit/f25d7bd30800dffa65b6c31b0b7ac711a13d790f))
+* Fix the copyright year in dbt sample files ([#1996](https://github.com/googleapis/python-bigquery-dataframes/issues/1996)) ([fad5722](https://github.com/googleapis/python-bigquery-dataframes/commit/fad57223d129f0c95d0c6a066179bb66880edd06))
+
+
+### Performance Improvements
+
+* Faster session startup by deferring anon dataset fetch ([#1982](https://github.com/googleapis/python-bigquery-dataframes/issues/1982)) ([2720c4c](https://github.com/googleapis/python-bigquery-dataframes/commit/2720c4cf070bf57a0930d7623bfc41d89cc053ee))
+
+
+### Documentation
+
+* Add examples of running bigframes in kaggle ([#2002](https://github.com/googleapis/python-bigquery-dataframes/issues/2002)) ([7d89d76](https://github.com/googleapis/python-bigquery-dataframes/commit/7d89d76976595b75cb0105fbe7b4f7ca2fdf49f2))
+* Remove preview warning from partial ordering mode sample notebook ([#1986](https://github.com/googleapis/python-bigquery-dataframes/issues/1986)) ([132e0ed](https://github.com/googleapis/python-bigquery-dataframes/commit/132e0edfe9f96c15753649d77fcb6edd0b0708a3))
+
 ## [2.15.0](https://github.com/googleapis/python-bigquery-dataframes/compare/v2.14.0...v2.15.0) (2025-08-11)


bigframes/core/compile/sqlglot/expressions/binary_compiler.py

Lines changed: 97 additions & 41 deletions
@@ -14,11 +14,12 @@

 from __future__ import annotations

-import bigframes_vendored.constants as constants
+import bigframes_vendored.constants as bf_constants
 import sqlglot.expressions as sge

 from bigframes import dtypes
 from bigframes import operations as ops
+import bigframes.core.compile.sqlglot.expressions.constants as constants
 from bigframes.core.compile.sqlglot.expressions.op_registration import OpRegistration
 from bigframes.core.compile.sqlglot.expressions.typed_expr import TypedExpr

@@ -37,50 +38,65 @@ def _(op, left: TypedExpr, right: TypedExpr) -> sge.Expression:
         return sge.Concat(expressions=[left.expr, right.expr])

     if dtypes.is_numeric(left.dtype) and dtypes.is_numeric(right.dtype):
-        left_expr = left.expr
-        if left.dtype == dtypes.BOOL_DTYPE:
-            left_expr = sge.Cast(this=left_expr, to="INT64")
-        right_expr = right.expr
-        if right.dtype == dtypes.BOOL_DTYPE:
-            right_expr = sge.Cast(this=right_expr, to="INT64")
+        left_expr = _coerce_bool_to_int(left)
+        right_expr = _coerce_bool_to_int(right)
         return sge.Add(this=left_expr, expression=right_expr)

     if (
         dtypes.is_time_or_date_like(left.dtype)
         and right.dtype == dtypes.TIMEDELTA_DTYPE
     ):
-        left_expr = left.expr
-        if left.dtype == dtypes.DATE_DTYPE:
-            left_expr = sge.Cast(this=left_expr, to="DATETIME")
+        left_expr = _coerce_date_to_datetime(left)
         return sge.TimestampAdd(
             this=left_expr, expression=right.expr, unit=sge.Var(this="MICROSECOND")
         )
     if (
         dtypes.is_time_or_date_like(right.dtype)
         and left.dtype == dtypes.TIMEDELTA_DTYPE
     ):
-        right_expr = right.expr
-        if right.dtype == dtypes.DATE_DTYPE:
-            right_expr = sge.Cast(this=right_expr, to="DATETIME")
+        right_expr = _coerce_date_to_datetime(right)
         return sge.TimestampAdd(
             this=right_expr, expression=left.expr, unit=sge.Var(this="MICROSECOND")
         )
     if left.dtype == dtypes.TIMEDELTA_DTYPE and right.dtype == dtypes.TIMEDELTA_DTYPE:
         return sge.Add(this=left.expr, expression=right.expr)

     raise TypeError(
-        f"Cannot add type {left.dtype} and {right.dtype}. {constants.FEEDBACK_LINK}"
+        f"Cannot add type {left.dtype} and {right.dtype}. {bf_constants.FEEDBACK_LINK}"
     )


-@BINARY_OP_REGISTRATION.register(ops.div_op)
+@BINARY_OP_REGISTRATION.register(ops.eq_op)
+def _(op, left: TypedExpr, right: TypedExpr) -> sge.Expression:
+    left_expr = _coerce_bool_to_int(left)
+    right_expr = _coerce_bool_to_int(right)
+    return sge.EQ(this=left_expr, expression=right_expr)
+
+
+@BINARY_OP_REGISTRATION.register(ops.eq_null_match_op)
 def _(op, left: TypedExpr, right: TypedExpr) -> sge.Expression:
     left_expr = left.expr
-    if left.dtype == dtypes.BOOL_DTYPE:
-        left_expr = sge.Cast(this=left_expr, to="INT64")
+    if right.dtype != dtypes.BOOL_DTYPE:
+        left_expr = _coerce_bool_to_int(left)
+
     right_expr = right.expr
-    if right.dtype == dtypes.BOOL_DTYPE:
-        right_expr = sge.Cast(this=right_expr, to="INT64")
+    if left.dtype != dtypes.BOOL_DTYPE:
+        right_expr = _coerce_bool_to_int(right)
+
+    sentinel = sge.convert("$NULL_SENTINEL$")
+    left_coalesce = sge.Coalesce(
+        this=sge.Cast(this=left_expr, to="STRING"), expressions=[sentinel]
+    )
+    right_coalesce = sge.Coalesce(
+        this=sge.Cast(this=right_expr, to="STRING"), expressions=[sentinel]
+    )
+    return sge.EQ(this=left_coalesce, expression=right_coalesce)
+
+
+@BINARY_OP_REGISTRATION.register(ops.div_op)
+def _(op, left: TypedExpr, right: TypedExpr) -> sge.Expression:
+    left_expr = _coerce_bool_to_int(left)
+    right_expr = _coerce_bool_to_int(right)

     result = sge.func("IEEE_DIVIDE", left_expr, right_expr)
     if left.dtype == dtypes.TIMEDELTA_DTYPE and dtypes.is_numeric(right.dtype):
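
Review note on the `eq_null_match_op` handler in the hunk above: plain SQL equality yields NULL when either operand is NULL, so the handler casts both sides to STRING and substitutes a sentinel string before comparing. The following is a minimal pure-Python sketch of that idea, not the compiler code itself. It assumes `None` stands in for SQL NULL and uses Python's `str()` as a rough stand-in for `CAST(... AS STRING)`, whose formatting differs from BigQuery's; the function names `sql_eq` and `eq_null_match` are hypothetical.

```python
from typing import Any, Optional

# Same sentinel string as in the diff.
NULL_SENTINEL = "$NULL_SENTINEL$"


def sql_eq(left: Any, right: Any) -> Optional[bool]:
    """Model of plain SQL `=`: a NULL operand makes the result NULL (None)."""
    if left is None or right is None:
        return None
    return left == right


def eq_null_match(left: Any, right: Any) -> bool:
    """Model of COALESCE(CAST(x AS STRING), sentinel) = COALESCE(CAST(y AS STRING), sentinel)."""
    # str() is only an approximation of CAST(... AS STRING).
    left_key = NULL_SENTINEL if left is None else str(left)
    right_key = NULL_SENTINEL if right is None else str(right)
    return left_key == right_key


if __name__ == "__main__":
    print(sql_eq(None, None))         # NULL: plain equality cannot match two NULLs
    print(eq_null_match(None, None))  # the sentinel makes two NULLs compare equal
    print(eq_null_match(None, 1))     # NULL vs. a value still compares unequal
```

One consequence visible in the sketch: a real string value equal to `$NULL_SENTINEL$` would compare equal to NULL, so the approach relies on that string not occurring in the data.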
@@ -89,6 +105,39 @@ def _(op, left: TypedExpr, right: TypedExpr) -> sge.Expression:
     return result


+@BINARY_OP_REGISTRATION.register(ops.floordiv_op)
+def _(op, left: TypedExpr, right: TypedExpr) -> sge.Expression:
+    left_expr = _coerce_bool_to_int(left)
+    right_expr = _coerce_bool_to_int(right)
+
+    result: sge.Expression = sge.Cast(
+        this=sge.Floor(this=sge.func("IEEE_DIVIDE", left_expr, right_expr)), to="INT64"
+    )
+
+    # DIV(N, 0) will error in bigquery, but needs to return `0` for int, and
+    # `inf` for float in BQ so we short-circuit in this case.
+    # Multiplying left by zero propagates nulls.
+    zero_result = (
+        constants._INF
+        if (left.dtype == dtypes.FLOAT_DTYPE or right.dtype == dtypes.FLOAT_DTYPE)
+        else constants._ZERO
+    )
+    result = sge.Case(
+        ifs=[
+            sge.If(
+                this=sge.EQ(this=right_expr, expression=constants._ZERO),
+                true=zero_result * left_expr,
+            )
+        ],
+        default=result,
+    )
+
+    if dtypes.is_numeric(right.dtype) and left.dtype == dtypes.TIMEDELTA_DTYPE:
+        result = sge.Cast(this=sge.Floor(this=result), to="INT64")
+
+    return result
+
+
 @BINARY_OP_REGISTRATION.register(ops.ge_op)
 def _(op, left: TypedExpr, right: TypedExpr) -> sge.Expression:
     return sge.GTE(this=left.expr, expression=right.expr)
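
Review note on the `floordiv_op` handler in the hunk above: the `CASE` expression short-circuits a zero divisor and returns `zero_result * left` rather than a bare constant, which keeps NULL inputs NULL and gives the float result the sign of `left`. The sketch below is a pure-Python model of those semantics under stated assumptions, not the generated SQL: `None` stands in for NULL, Python `float` stands in for a FLOAT64 column, inputs are finite, and the name `floordiv_model` is hypothetical.

```python
import math
from typing import Optional, Union

Number = Union[int, float]


def floordiv_model(left: Optional[Number], right: Optional[Number]) -> Optional[Number]:
    """Model of the CASE WHEN right = 0 THEN zero_result * left ELSE floor(left / right) logic."""
    if left is None or right is None:
        return None  # a NULL operand yields NULL on either branch
    if right == 0:
        # Float operands get infinity, integer operands get zero.
        is_float = isinstance(left, float) or isinstance(right, float)
        zero_result = math.inf if is_float else 0
        # Multiplying by left carries its sign (and, in SQL, its NULL-ness).
        return zero_result * left
    return math.floor(left / right)


if __name__ == "__main__":
    for args in [(7, 2), (-7, 2), (5, 0), (5.0, 0), (-5.0, 0), (0.0, 0), (None, 0)]:
        print(args, "->", floordiv_model(*args))
```

In this model a float zero divided by zero becomes `inf * 0.0`, which is NaN, while an integer divided by zero becomes `0`.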
@@ -101,12 +150,8 @@ def _(op, left: TypedExpr, right: TypedExpr) -> sge.Expression:

 @BINARY_OP_REGISTRATION.register(ops.mul_op)
 def _(op, left: TypedExpr, right: TypedExpr) -> sge.Expression:
-    left_expr = left.expr
-    if left.dtype == dtypes.BOOL_DTYPE:
-        left_expr = sge.Cast(this=left_expr, to="INT64")
-    right_expr = right.expr
-    if right.dtype == dtypes.BOOL_DTYPE:
-        right_expr = sge.Cast(this=right_expr, to="INT64")
+    left_expr = _coerce_bool_to_int(left)
+    right_expr = _coerce_bool_to_int(right)

     result = sge.Mul(this=left_expr, expression=right_expr)

@@ -118,36 +163,33 @@ def _(op, left: TypedExpr, right: TypedExpr) -> sge.Expression:
     return result


+@BINARY_OP_REGISTRATION.register(ops.ne_op)
+def _(op, left: TypedExpr, right: TypedExpr) -> sge.Expression:
+    left_expr = _coerce_bool_to_int(left)
+    right_expr = _coerce_bool_to_int(right)
+    return sge.NEQ(this=left_expr, expression=right_expr)
+
+
 @BINARY_OP_REGISTRATION.register(ops.sub_op)
 def _(op, left: TypedExpr, right: TypedExpr) -> sge.Expression:
     if dtypes.is_numeric(left.dtype) and dtypes.is_numeric(right.dtype):
-        left_expr = left.expr
-        if left.dtype == dtypes.BOOL_DTYPE:
-            left_expr = sge.Cast(this=left_expr, to="INT64")
-        right_expr = right.expr
-        if right.dtype == dtypes.BOOL_DTYPE:
-            right_expr = sge.Cast(this=right_expr, to="INT64")
+        left_expr = _coerce_bool_to_int(left)
+        right_expr = _coerce_bool_to_int(right)
         return sge.Sub(this=left_expr, expression=right_expr)

     if (
         dtypes.is_time_or_date_like(left.dtype)
         and right.dtype == dtypes.TIMEDELTA_DTYPE
     ):
-        left_expr = left.expr
-        if left.dtype == dtypes.DATE_DTYPE:
-            left_expr = sge.Cast(this=left_expr, to="DATETIME")
+        left_expr = _coerce_date_to_datetime(left)
         return sge.TimestampSub(
             this=left_expr, expression=right.expr, unit=sge.Var(this="MICROSECOND")
         )
     if dtypes.is_time_or_date_like(left.dtype) and dtypes.is_time_or_date_like(
         right.dtype
     ):
-        left_expr = left.expr
-        if left.dtype == dtypes.DATE_DTYPE:
-            left_expr = sge.Cast(this=left_expr, to="DATETIME")
-        right_expr = right.expr
-        if right.dtype == dtypes.DATE_DTYPE:
-            right_expr = sge.Cast(this=right_expr, to="DATETIME")
+        left_expr = _coerce_date_to_datetime(left)
+        right_expr = _coerce_date_to_datetime(right)
         return sge.TimestampDiff(
             this=left_expr, expression=right_expr, unit=sge.Var(this="MICROSECOND")
         )
@@ -156,10 +198,24 @@ def _(op, left: TypedExpr, right: TypedExpr) -> sge.Expression:
         return sge.Sub(this=left.expr, expression=right.expr)

     raise TypeError(
-        f"Cannot subtract type {left.dtype} and {right.dtype}. {constants.FEEDBACK_LINK}"
+        f"Cannot subtract type {left.dtype} and {right.dtype}. {bf_constants.FEEDBACK_LINK}"
     )


 @BINARY_OP_REGISTRATION.register(ops.obj_make_ref_op)
 def _(op, left: TypedExpr, right: TypedExpr) -> sge.Expression:
     return sge.func("OBJ.MAKE_REF", left.expr, right.expr)
+
+
+def _coerce_bool_to_int(typed_expr: TypedExpr) -> sge.Expression:
+    """Coerce boolean expression to integer."""
+    if typed_expr.dtype == dtypes.BOOL_DTYPE:
+        return sge.Cast(this=typed_expr.expr, to="INT64")
+    return typed_expr.expr
+
+
+def _coerce_date_to_datetime(typed_expr: TypedExpr) -> sge.Expression:
+    """Coerce date expression to datetime."""
+    if typed_expr.dtype == dtypes.DATE_DTYPE:
+        return sge.Cast(this=typed_expr.expr, to="DATETIME")
+    return typed_expr.expr

bigframes/core/compile/sqlglot/expressions/constants.py

Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+# Copyright 2025 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import sqlglot.expressions as sge
+
+_ZERO = sge.Cast(this=sge.convert(0), to="INT64")
+_NAN = sge.Cast(this=sge.convert("NaN"), to="FLOAT64")
+_INF = sge.Cast(this=sge.convert("Infinity"), to="FLOAT64")
+_NEG_INF = sge.Cast(this=sge.convert("-Infinity"), to="FLOAT64")
+
+# Approximate highest number you can pass to the EXP function and get a valid FLOAT64 result.
+# FLOAT64 has 11 exponent bits, so the max value is about 2**(2**10).
+# ln(2**(2**10)) == (2**10)*ln(2) ~= 709.78, so EXP(x) for x>709.78 will overflow.
+_FLOAT64_EXP_BOUND = sge.convert(709.78)
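
Review note on `_FLOAT64_EXP_BOUND`: the arithmetic in the comment can be checked with the standard library alone, since Python's `float` is an IEEE 754 double like FLOAT64. This is a small sanity-check sketch, not part of the commit; the helper name `exp_overflows` is hypothetical.

```python
import math
import sys


def exp_overflows(x: float) -> bool:
    """Return True if math.exp(x) exceeds the double-precision range."""
    try:
        math.exp(x)
    except OverflowError:
        return True
    return False


# The estimate used in the comment: ln(2**(2**10)) == (2**10) * ln(2).
approx_bound = (2 ** 10) * math.log(2)
# The same bound computed from the largest finite double.
exact_bound = math.log(sys.float_info.max)

if __name__ == "__main__":
    print(f"approx bound: {approx_bound:.4f}")
    print(f"exact bound:  {exact_bound:.4f}")
    print("exp(709.78) overflows:", exp_overflows(709.78))
    print("exp(709.79) overflows:", exp_overflows(709.79))
```

Both bounds round to 709.78, so the constant sits just inside the representable range.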
