Add `pylibcudf.Scalar.from_py` for construction from Python strings, bool, int, float #17898

mroeschke · 2025-02-01T02:12:21Z

Description

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.

vyasr · 2025-02-08T01:50:51Z

python/pylibcudf/pylibcudf/scalar.pyx

+        if isinstance(py_val, py_bool):
+            dtype = DataType(type_id.BOOL8)
+            c_val = make_numeric_scalar(dtype.c_obj)
+            (<numeric_scalar[cbool]*>c_val.get()).set_value(py_val)
+        elif isinstance(py_val, int):
+            dtype = DataType(type_id.INT64)
+            c_val = make_numeric_scalar(dtype.c_obj)
+            (<numeric_scalar[int64_t]*>c_val.get()).set_value(py_val)
+        elif isinstance(py_val, float):
+            dtype = DataType(type_id.FLOAT64)
+            c_val = make_numeric_scalar(dtype.c_obj)
+            (<numeric_scalar[double]*>c_val.get()).set_value(py_val)
+        elif isinstance(py_val, str):
+            dtype = DataType(type_id.STRING)
+            c_val = make_string_scalar(py_val.encode())
+        else:
+            raise NotImplementedError(f"{type(py_val).__name__} is not supported.")


Can we rewrite this pattern with singledispatch? I anticipate this cascade eventually getting arbitrarily large, and we'll start paying the price of falling through earlier cases to get to later ones. Since each overload is a separate function, it also lets you cdef the c_val type appropriately in each case.

Good idea. How would you handle creating the final Scalar with multiple dispatched functions? I'm referring to this logic:

    cdef Scalar s = Scalar.__new__(Scalar)
    s.c_obj.swap(c_val)
    s._data_type = dtype
    return s

Oh, would this just go in a separate method so it can be used in each dispatched method?

E.g.:

    @staticmethod
    def _finalize_scalar(c_obj, dtype):
        cdef Scalar s = Scalar.__new__(Scalar)
        s.c_obj.swap(c_obj)
        s._data_type = dtype
        return s
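The idea discussed in this thread can be sketched in plain Python, as a rough illustration rather than the code that was merged. `FakeScalar`, the type-name strings, and the module-level `from_py` function are hypothetical stand-ins for the Cython `Scalar`, `DataType`, and classmethod. Each overload handles one Python type and hands its result to a shared finalizer. Note that `functools.singledispatch` picks the most specific registered type, so `bool` is handled separately from `int` regardless of registration order.

```python
from dataclasses import dataclass
from functools import singledispatch


@dataclass
class FakeScalar:
    """Hypothetical stand-in for pylibcudf.Scalar (not the real class)."""

    value: object
    type_name: str


def _finalize_scalar(value, type_name):
    # Shared tail step. In the PR this is where the c_obj swap and the
    # _data_type assignment would happen; here we only build a record.
    return FakeScalar(value, type_name)


@singledispatch
def from_py(py_val):
    # Fallback for any type without a registered overload.
    raise NotImplementedError(f"{type(py_val).__name__} is not supported.")


@from_py.register
def _(py_val: int):
    return _finalize_scalar(py_val, "INT64")


@from_py.register
def _(py_val: bool):
    # bool subclasses int, but dispatch prefers the more specific type.
    return _finalize_scalar(py_val, "BOOL8")


@from_py.register
def _(py_val: float):
    return _finalize_scalar(py_val, "FLOAT64")


@from_py.register
def _(py_val: str):
    # Mirrors py_val.encode() in the diff above.
    return _finalize_scalar(py_val.encode(), "STRING")
```

With this layout, adding a new supported type means registering one more function, and the lookup cost does not grow with the number of earlier branches in the way an `isinstance` cascade does.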

Matt711

Thanks, a couple of non-blocking questions.

Matt711 · 2025-02-08T01:52:44Z

python/pylibcudf/pylibcudf/scalar.pyx

+
+    @classmethod
+    def from_py(cls, py_val):
+        """Convert a Python standard library object to a Scalar.


Should we expand this docstring, given that this method is public?

Matt711 · 2025-02-08T02:03:25Z

python/pylibcudf/pylibcudf/scalar.pyx

+            dtype = DataType(type_id.STRING)
+            c_val = make_string_scalar(py_val.encode())
+        else:
+            raise NotImplementedError(f"{type(py_val).__name__} is not supported.")


I'm not sure if this is a nitpick (or incorrect), but should we distinguish between a type that can be converted to a scalar but is not implemented yet and a type that cannot be converted to a scalar at all? If it's the latter, then we should raise TypeError, right?

It is a nitpick, but not incorrect. You're right, that would be optimal. The difficulty is in separating the two. We could have this fallback case be a TypeError and then add another NotImplementedError case that checks isinstance on a bunch of types that we expect to support. It'll be very difficult to capture all of them exhaustively, though. We'll probably have some things falling through to TypeError that we actually want to support for a while if we do this.
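A minimal sketch of the two-tier fallback described above, in plain Python. The `_PLANNED_TYPES` tuple and the `check_supported` helper are hypothetical and chosen only for illustration; the PR does not say which types are planned. As noted in the reply, any such list will be incomplete, so some types that should eventually be supported would still surface as `TypeError`.

```python
import datetime
import decimal

# Types handled by from_py in this PR.
_SUPPORTED_TYPES = (bool, int, float, str)

# Hypothetical examples of types that might be supported later.
# This list is an assumption for illustration, not part of the PR.
_PLANNED_TYPES = (datetime.datetime, datetime.timedelta, decimal.Decimal)


def check_supported(py_val):
    """Raise the error that matches why py_val cannot be converted."""
    if isinstance(py_val, _SUPPORTED_TYPES):
        return
    if isinstance(py_val, _PLANNED_TYPES):
        # Convertible in principle, just not implemented yet.
        raise NotImplementedError(
            f"{type(py_val).__name__} is not supported yet."
        )
    # Everything else is treated as not convertible to a scalar.
    raise TypeError(
        f"Cannot convert {type(py_val).__name__} to a Scalar."
    )
```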

mroeschke added 2 commits January 31, 2025 17:41

Add Scalar.as_py for str, bool, int, float

fd18c1a

Remove unneeded code

273c601

mroeschke added the improvement, non-breaking, and pylibcudf labels Feb 1, 2025

mroeschke self-assigned this Feb 1, 2025

mroeschke requested a review from a team as a code owner February 1, 2025 02:12

mroeschke requested review from galipremsagar and Matt711 February 1, 2025 02:12

github-actions bot added the Python label Feb 1, 2025

mroeschke changed the title ~~Add pylibcudfScalar.from_py for construction from Python strings, bool, int, float~~ Add pylibcudf.Scalar.from_py for construction from Python strings, bool, int, float Feb 1, 2025

Merge branch 'branch-25.04' into plc/scalar/from_py

4815c65

github-actions bot assigned vyasr Feb 8, 2025

vyasr requested changes Feb 8, 2025


Matt711 approved these changes Feb 8, 2025

