-
Notifications
You must be signed in to change notification settings - Fork 13.7k
[FLINK-37725] Makes Async calcs share correlate split rule with Python #26505
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
d002034
to
3539058
Compare
ec2d2e0
to
8489b71
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR refactors the correlate split rules for async calcs and Python calls by unifying them under a common RemoteCorrelateSplitRule, thereby addressing codegen errors when using AsyncCalc in correlate queries.
- Migrates Python-specific correlate splitting to a common rule with PythonRemoteCalcCallFinder
- Updates test cases and rule sets to support asynchronous scalar functions in correlate queries
- Introduces minor improvements to RemoteCalcCallFinder and AsyncCalcSplitRule implementations
Reviewed Changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.
Show a summary per file
File | Description |
---|---|
flink-table-planner/src/test/resources/org/apache/flink/table/planner/plan/rules/logical/AsyncCorrelateSplitRuleTest.xml | Added test cases verifying the correlate plan transformation with async calcs. |
flink-table-planner/src/test/java/org/apache/flink/table/planner/runtime/stream/table/AsyncCalcITCase.java | Introduced a new integration test for table functions with async calcs. |
Other test and rule files | Refactored and unified correlate split rules and updated the rule sets accordingly. |
PythonCorrelateSplitRule.java | Migrated legacy Python rule logic to leverage RemoteCorrelateSplitRule. |
RemoteCorrelateSplitRule.java, AsyncCalcSplitRule.java | Updated internal rule matching and equality logic to incorporate new RemoteCalcCallFinder implementations. |
return callFinder.isRemoteCall(rexNode) && callFinder.containsNonRemoteCall(rexNode) | ||
|| callFinder.isNonRemoteCall(rexNode) && callFinder.containsRemoteCall(rexNode) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Consider adding parentheses around the complex boolean condition involving callFinder checks to clarify operator precedence and improve readability.
return callFinder.isRemoteCall(rexNode) && callFinder.containsNonRemoteCall(rexNode) | |
|| callFinder.isNonRemoteCall(rexNode) && callFinder.containsRemoteCall(rexNode) | |
return (callFinder.isRemoteCall(rexNode) && callFinder.containsNonRemoteCall(rexNode)) | |
|| (callFinder.isNonRemoteCall(rexNode) && callFinder.containsRemoteCall(rexNode)) |
Copilot uses AI. Check for mistakes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good comment from copilot
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
...rc/main/java/org/apache/flink/table/planner/plan/rules/logical/PythonCorrelateSplitRule.java
Show resolved
Hide resolved
correlate.getRowType(), | ||
newCorrelate); | ||
|
||
call.transformTo(newTopCalc); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider overriding hashCode() in RemoteCorrelateSplitRule to maintain consistency with the custom equals() implementation and ensure correct behavior in hash-based collections.
Copilot uses AI. Check for mistakes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
||
RemoteCorrelateSplitRule(RemoteCalcCallFinder callFinder) { | ||
super( | ||
operand(FlinkLogicalCorrelate.class, any()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you're modifying the class, can you please replace the usage of deprecated operand
and any
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, converted to the new form of rule with the config.
} | ||
|
||
private ScalarFunctionSplitter createScalarFunctionSplitter( | ||
RexProgram program, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IDE tells me it's always null
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the parameter and put null below. I also put a comment about how the scan should not contain any local references that need resolving, in my understanding. I could potentially make the program optional in ScalarFunctionSplitter
and assert it exists before use when seeing a local reference, but that seems unnecessary.
} | ||
|
||
FlinkLogicalCorrelate newCorrelate; | ||
if (extractedRexNodes.size() > 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (extractedRexNodes.size() > 0) { | |
if (!extractedRexNodes.isEmpty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
||
private static final RemoteCalcCallFinder ASYNC_CALL_FINDER = new AsyncRemoteCalcCallFinder(); | ||
|
||
public static final RelOptRule CORRELATE_SPLIT = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public static final RelOptRule CORRELATE_SPLIT = | |
public static final RelOptRule INSTANCE = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
return callFinder.isRemoteCall(rexNode) && callFinder.containsNonRemoteCall(rexNode) | ||
|| callFinder.isNonRemoteCall(rexNode) && callFinder.containsRemoteCall(rexNode) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good comment from copilot
import org.junit.jupiter.api.Test; | ||
|
||
/** Test for {@link AsyncCorrelateSplitRule}. */ | ||
public class AsyncCorrelateSplitRuleTest extends TableTestBase { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please fix the compiler warnings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, done. I believe they are all fixed.
/** Test for {@link AsyncCorrelateSplitRule}. */ | ||
public class AsyncCorrelateSplitRuleTest extends TableTestBase { | ||
|
||
private TableTestUtil util = streamTestUtil(TableConfig.getDefault()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be final
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
||
@BeforeEach | ||
public void setup() { | ||
FlinkChainedProgram programs = new FlinkChainedProgram<BatchOptimizeContext>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Raw use of parameterized class 'FlinkChainedProgram'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
FlinkHepRuleSetProgramBuilder.newBuilder() | ||
.setHepRulesExecutionType(HEP_RULES_EXECUTION_TYPE.RULE_SEQUENCE()) | ||
.setHepMatchOrder(HepMatchOrder.BOTTOM_UP) | ||
.add(FlinkStreamRuleSets.LOGICAL_REWRITE()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You use a BatchOptimizeContext
context with Stream
rule set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I may have copied this part of the test -- my mistake. Changed to StreamOptimizeContext
.
8489b71
to
7203ae8
Compare
What is the purpose of the change
Moves
PythonCorrelateSplitRule
to be common asRemoteCorrelateSplitRule
so that we can move Async Scalars out of correlate queries. Today an attempt to use an AsyncCalc in a correlate results in a codegen error:Results in error:
Brief change log
(for example:)
Verifying this change
Please make sure both new and modified tests in this PR follow the conventions for tests defined in our code quality guide.
(Please pick either of the following options)
This change is a trivial rework / code cleanup without any test coverage.
(or)
This change is already covered by existing tests, such as (please describe tests).
(or)
This change added tests and can be verified as follows:
(example:)
Does this pull request potentially affect one of the following parts:
@Public(Evolving)
: (yes / no)Documentation