Skip to content

Conversation

liyang1206
Copy link
Contributor

What is the purpose of the change

Implementing FLINK-36377 for supporting ROW, ARRAY, MAP, and RAW types for LAST_VALUE aggregate function.

Brief change log

  • Modify the AggFunctionFactory to ensure that the creation of the LAST_VALUE function supports ROW, ARRAY, MAP, and RAW types.
  • Modify the LastValueAggFunction and LastValueWithRetractAggFunction to support copying of complex data type arguments and remove the original handling of StringData

Verifying this change

This change added tests and can be verified as follows:

  • Added test in AggregateITCase to validate ROW, MAP, ARRAY in LAST_VALUE
  • Added unit test for applying LAST_VALUE for ROW in LastValueAggFunctionWithOrderTest, LastValueAggFunctionWithoutOrderTest, LastValueWithRetractAggFunctionWithOrderTest and LastValueWithRetractAggFunctionWithoutOrderTest
  • Added unit test for applying LAST_VALUE for ROW and ARRAY in GroupAggregateHarnessTest

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): yes
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? yes
  • If yes, how is the feature documented? docs

@flinkbot
Copy link
Collaborator

flinkbot commented Aug 12, 2025

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

'age,
'slacker,
'manager,
'extra_row,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we can test deeper nesting under the row including a row with arrays of rows of maps. Also if Variants and structured data types can be nested we should include them in the test.

@github-actions github-actions bot added community-reviewed PR has been reviewed by the community. and removed community-reviewed PR has been reviewed by the community. labels Aug 13, 2025
@snuyanzin
Copy link
Contributor

@liyang1206 thank you for the contribution
can you please clarify why do we need serializer changes
and what do you mean by this

Implementing FLINK-36377 for supporting ROW, ARRAY, MAP, and RAW types for LAST_VALUE aggregate function.

I locally reverted all the non-test changes from this PR and all the tests continue passing...
can you please clarify if there is any non-working case?

@github-actions github-actions bot added community-reviewed PR has been reviewed by the community. and removed community-reviewed PR has been reviewed by the community. labels Aug 20, 2025
@github-actions github-actions bot added community-reviewed PR has been reviewed by the community. and removed community-reviewed PR has been reviewed by the community. labels Aug 23, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
community-reviewed PR has been reviewed by the community.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants