-
Notifications
You must be signed in to change notification settings - Fork 5.5k
Draft #26699
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Draft #26699
Conversation
Differential Revision: D87904577
Reviewer's GuideAdds support for an "empty connector" optimization path that allows connector-specific optimizers to run on plans composed only of non-connector leaf nodes (Values), gated by a new session property, and extends tests to validate the behavior across simple, union, and mixed plans. Class diagram for empty connector optimizer integrationclassDiagram
class ApplyConnectorOptimization {
- Supplier<Map<ConnectorId, Set<ConnectorPlanOptimizer>>> connectorOptimizersSupplier
- Map<ConnectorId, Set<ConnectorPlanOptimizer>> connectorOptimizers
- static ConnectorId EMPTY_CONNECTOR_ID
+ PlanOptimizerResult optimize(PlanNode plan, Session session, TypeProvider types, SymbolAllocator variableAllocator, PlanNodeIdAllocator idAllocator)
}
class ConnectorAccessInfo {
- Set<ConnectorId> reachableConnectors
- Set<Class<? extends PlanNode>> reachablePlanNodeTypes
+ boolean isClosure(ConnectorId connectorId, Session session, List<ConnectorId> supportedConnectorId)
}
class SystemSessionProperties {
<<final>>
+ static String ENABLE_EMPTY_CONNECTOR_OPTIMIZER
+ static boolean isEmptyConnectorOptimizerEnabled(Session session)
}
class Session {
+ Optional<String> getCatalog()
+ <T> T getSystemProperty(String key, Class<T> clazz)
+ ConnectorSession toConnectorSession(ConnectorId connectorId)
}
class ConnectorPlanOptimizer {
+ List<ConnectorId> getSupportedConnectorIds()
+ PlanNode optimize(PlanNode plan, ConnectorSession session, SymbolAllocator variableAllocator, PlanNodeIdAllocator idAllocator)
}
class ConnectorSession {
+ Optional<ConnectorId> getConnectorId()
}
ApplyConnectorOptimization ..> SystemSessionProperties : uses
ApplyConnectorOptimization ..> Session : uses
ApplyConnectorOptimization ..> ConnectorPlanOptimizer : invokes
ApplyConnectorOptimization ..> ConnectorSession : creates
ApplyConnectorOptimization o-- ConnectorAccessInfo : uses helper
ConnectorAccessInfo ..> SystemSessionProperties : calls isEmptyConnectorOptimizerEnabled
ConnectorAccessInfo ..> ConnectorId : evaluates
class ConnectorId {
+ ConnectorId(String id)
}
ApplyConnectorOptimization ..> ConnectorId : manages
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes - here's some feedback:
- The logic that handles the EMPTY_CONNECTOR_ID in
ApplyConnectorOptimization.optimizeandisClosure(e.g., repeatedconnectorIdSet.stream().allMatch(x -> x.equals(EMPTY_CONNECTOR_ID))/reachableConnectors.stream().allMatch(...)checks and special filtering forgetSupportedConnectorIds() == [EMPTY_CONNECTOR_ID]) is fairly intricate; consider extracting this into a dedicated helper to centralize the condition and avoid subtle divergence between the two code paths. - In the EMPTY_CONNECTOR handling branch of
optimize, you callconnectorOptimizers.get(queryConnectorId)twice andx.getSupportedConnectorIds()multiple times inside the stream; consider storing these in local variables to make the code clearer and avoid redundant lookups.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The logic that handles the EMPTY_CONNECTOR_ID in `ApplyConnectorOptimization.optimize` and `isClosure` (e.g., repeated `connectorIdSet.stream().allMatch(x -> x.equals(EMPTY_CONNECTOR_ID))` / `reachableConnectors.stream().allMatch(...)` checks and special filtering for `getSupportedConnectorIds() == [EMPTY_CONNECTOR_ID]`) is fairly intricate; consider extracting this into a dedicated helper to centralize the condition and avoid subtle divergence between the two code paths.
- In the EMPTY_CONNECTOR handling branch of `optimize`, you call `connectorOptimizers.get(queryConnectorId)` twice and `x.getSupportedConnectorIds()` multiple times inside the stream; consider storing these in local variables to make the code clearer and avoid redundant lookups.
## Individual Comments
### Comment 1
<location> `presto-main-base/src/main/java/com/facebook/presto/sql/planner/optimizations/ApplyConnectorOptimization.java:357-358` </location>
<code_context>
boolean isClosure(ConnectorId connectorId, Session session, List<ConnectorId> supportedConnectorId)
{
+ if (isEmptyConnectorOptimizerEnabled(session) && reachableConnectors.stream().allMatch(x -> x.equals(EMPTY_CONNECTOR_ID)) && supportedConnectorId.size() == 1 && supportedConnectorId.get(0).equals(EMPTY_CONNECTOR_ID)) {
+ return containsAll(CONNECTOR_ACCESSIBLE_PLAN_NODES, reachablePlanNodeTypes);
+ }
// check if all children can reach the only connector
</code_context>
<issue_to_address>
**issue:** `allMatch` over an empty set makes the special-case branch fire when `reachableConnectors` is empty.
`Stream#allMatch` returns `true` on an empty stream, so this condition is also true when `reachableConnectors` is empty. That means the "empty connector" special case will trigger even when there are no reachable connectors, as long as `supportedConnectorId` is `[EMPTY_CONNECTOR_ID]`.
If the intent is "all reachable connectors are EMPTY" rather than "there are no reachable connectors", consider guarding with a non-empty check, e.g.:
```java
!reachableConnectors.isEmpty()
&& reachableConnectors.stream().allMatch(EMPTY_CONNECTOR_ID::equals)
```
Otherwise, plans with no connectors may be treated as closures, altering optimization behavior unexpectedly.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
We have an internal connector optimizer, which we want to apply to queries which have values node only. Add a session property to control it.
Description
We have an internal connector optimizer, which we want to apply to queries which have values node only. Add a session property to control it.
Motivation and Context
as in description
Impact
as in description
Test Plan
Unit tests
Contributor checklist
Release Notes