[SPARK-53425][PYTHON][TESTS] Add more table argument tests for Arrow Python UDTFs #52170

allisonwang-db · 2025-08-28T21:45:09Z

What changes were proposed in this pull request?

This PR adds more tests covering the various forms of table argument support in Arrow Python UDTFs.
Writing these tests also exposed some existing issues that need to be fixed:

SPARK-53387: Support PARTITION BY clause with Python Arrow UDTF
SPARK-53426: Support named table argument with asTable() API

Why are the changes needed?

To improve test coverage

Does this PR introduce any user-facing change?

No

How was this patch tested?

Unit tests

Was this patch authored or co-authored using generative AI tooling?

Yes

allisonwang-db · 2025-08-28T22:00:45Z

python/pyspark/sql/tests/arrow/test_arrow_udtf.py

+        # TODO(SPARK-53426): Support named table argument with DataFrame API
+        # input_df = self.spark.range(3)  # [0, 1, 2]
+        # result_df = NamedArgsUDTF(table_data=input_df.asTable(), multiplier=lit(5))


cc @xinrong-meng @ueshin

Fix: #52171

Thank you @ueshin for the fix!

Thanks, the test passes now!

ueshin

Otherwise, LGTM.

ueshin · 2025-09-22T22:20:02Z

python/pyspark/sql/tests/arrow/test_arrow_udtf.py

+        result_df = self.spark.sql(
+            """
+            SELECT * FROM partition_sum_udtf(
+                TABLE(partition_test_data) PARTITION BY category
+            ) ORDER BY partition_key
+        """
+        )


This is also potentially flaky, in the same way as the tests in the previous PR. Use terminate to make it more stable?

Thanks for pointing this out. Fixed.
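For readers unfamiliar with the terminate pattern mentioned above, here is a minimal plain-Python sketch of the idea. It assumes the flakiness comes from eval being called more than once per partition (once per input batch), which the thread does not state explicitly. `PartitionSum` and `run_partitioned` are hypothetical stand-ins written for illustration; they are not the PySpark API and not the code from this PR.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Optional, Tuple

Row = Tuple[str, int]  # (category, value); hypothetical test rows


class PartitionSum:
    """Plain-Python stand-in for a UDTF handler (assumption: not the PySpark API)."""

    def __init__(self) -> None:
        self._key: Optional[str] = None
        self._total = 0

    def eval(self, batch: List[Row]) -> Iterator[Row]:
        # Accumulate only and emit nothing, since more batches may follow
        # for the same partition.
        for category, value in batch:
            self._key = category
            self._total += value
        return iter(())

    def terminate(self) -> Iterator[Row]:
        # Emit exactly one row per partition, regardless of how the input
        # was split into batches.
        if self._key is not None:
            yield (self._key, self._total)


def run_partitioned(rows: List[Row], batch_size: int) -> List[Row]:
    """Hypothetical driver: one handler per partition, fed in batches."""
    partitions: Dict[str, List[Row]] = defaultdict(list)
    for row in rows:
        partitions[row[0]].append(row)
    out: List[Row] = []
    for key in sorted(partitions):
        handler = PartitionSum()
        part = partitions[key]
        for i in range(0, len(part), batch_size):
            out.extend(handler.eval(part[i:i + batch_size]))
        out.extend(handler.terminate())
    return out


rows = [("A", 10), ("A", 20), ("B", 30), ("A", 5), ("B", 7)]
```

Because rows are emitted only in terminate, the output does not depend on the batch size, which is the property a deterministic assertion needs.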

ueshin · 2025-09-22T22:20:14Z

python/pyspark/sql/tests/arrow/test_arrow_udtf.py

+        result_df = self.spark.sql(
+            """
+            SELECT * FROM dept_status_count_udtf(
+                TABLE(SELECT * FROM employee_data)
+                PARTITION BY (department, status)
+            ) ORDER BY dept, status
+        """
+        )


allisonwang-db · 2025-09-24T00:14:51Z

Thanks! Merging to master

zhengruifeng · 2025-09-24T00:28:07Z

late LGTM

github-actions bot added SQL PYTHON labels Aug 28, 2025

allisonwang-db changed the title ~~[SPARK-53425][PYTHON][TESTS] Add more able argument tests for Arrow Python UDTFs~~ [SPARK-53425][PYTHON][TESTS] Add more table argument tests for Arrow Python UDTFs Aug 28, 2025

allisonwang-db requested review from ueshin and zhengruifeng August 28, 2025 22:00

allisonwang-db commented Aug 28, 2025


allisonwang-db force-pushed the spark-53425-tbl-arg-tests branch from 4b03475 to 709ab71 Compare September 12, 2025 22:55

xinrong-meng approved these changes Sep 12, 2025


allisonwang-db added 3 commits September 22, 2025 14:26

add tests

5ffe718

fix

2f6b58b

rebase

d13c7ce

allisonwang-db force-pushed the spark-53425-tbl-arg-tests branch from 709ab71 to d13c7ce Compare September 22, 2025 21:30

test

7c87286

ueshin approved these changes Sep 22, 2025


allisonwang-db added 2 commits September 23, 2025 11:59

Address review feedback: use terminate pattern for partition by tests

9e8f169

Fix flake8 linting issues

482e894

allisonwang-db closed this in fa9e787 Sep 24, 2025
