@feilong-liu feilong-liu commented Nov 26, 2025

Description

We have an internal connector optimizer that we want to apply to queries whose plans contain only Values nodes. This PR adds a session property to control that behavior.

Motivation and Context

See the description above.

Impact

See the description above.

Test Plan

Unit tests

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.
  • If adding new dependencies, verified they have an OpenSSF Scorecard score of 5.0 or higher (or obtained explicit TSC approval for lower scores).

Release Notes

== NO RELEASE NOTE ==

Differential Revision: D87904577
@prestodb-ci prestodb-ci added the from:Meta PR from Meta label Nov 26, 2025
sourcery-ai bot commented Nov 26, 2025

Reviewer's Guide

Adds support for an "empty connector" optimization path that allows connector-specific optimizers to run on plans composed only of non-connector leaf nodes (Values), gated by a new session property, and extends tests to validate the behavior across simple, union, and mixed plans.
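
The gating described above can be sketched roughly as follows. This is a simplified illustration, not the code in `ApplyConnectorOptimization`: connector IDs are plain strings, `ToyOptimizer` and `selectOptimizers` are hypothetical names, and the string behind `EMPTY_CONNECTOR_ID` is a placeholder because the actual value is not shown on this page. The non-empty guard is also an assumption, added because `allMatch` is true on an empty stream.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.stream.Collectors;

// Simplified, hypothetical stand-ins; none of these are the real Presto classes.
final class EmptyConnectorSelectionSketch
{
    // Assumed placeholder; the actual string behind EMPTY_CONNECTOR_ID is not shown here.
    static final String EMPTY_CONNECTOR_ID = "$empty-connector-placeholder";

    // Toy optimizer: a name plus the connector IDs it declares support for.
    static final class ToyOptimizer
    {
        final String name;
        final List<String> supportedConnectorIds;

        ToyOptimizer(String name, List<String> supportedConnectorIds)
        {
            this.name = name;
            this.supportedConnectorIds = supportedConnectorIds;
        }
    }

    private EmptyConnectorSelectionSketch() {}

    // True only when the plan has at least one connector ID and all of them are the empty connector.
    static boolean planUsesOnlyEmptyConnector(Set<String> connectorIdsInPlan)
    {
        return !connectorIdsInPlan.isEmpty()
                && connectorIdsInPlan.stream().allMatch(EMPTY_CONNECTOR_ID::equals);
    }

    // True when the optimizer declares support for the empty connector ID and nothing else.
    static boolean supportsOnlyEmptyConnector(ToyOptimizer optimizer)
    {
        List<String> supported = optimizer.supportedConnectorIds;
        return supported.size() == 1 && EMPTY_CONNECTOR_ID.equals(supported.get(0));
    }

    // Looks up the optimizers registered for the query catalog's connector and keeps
    // only those that opted in to Values-only plans. Returns an empty list otherwise.
    static List<ToyOptimizer> selectOptimizers(
            boolean emptyConnectorOptimizerEnabled,
            Set<String> connectorIdsInPlan,
            Optional<String> queryCatalogConnectorId,
            Map<String, List<ToyOptimizer>> optimizersByConnector)
    {
        if (!emptyConnectorOptimizerEnabled
                || !planUsesOnlyEmptyConnector(connectorIdsInPlan)
                || !queryCatalogConnectorId.isPresent()) {
            return Collections.emptyList();
        }
        List<ToyOptimizer> candidates =
                optimizersByConnector.getOrDefault(queryCatalogConnectorId.get(), Collections.emptyList());
        return candidates.stream()
                .filter(EmptyConnectorSelectionSketch::supportsOnlyEmptyConnector)
                .collect(Collectors.toList());
    }

    public static void main(String[] args)
    {
        Map<String, List<ToyOptimizer>> registry = new HashMap<>();
        registry.put("hive", Arrays.asList(
                new ToyOptimizer("values-only", Collections.singletonList(EMPTY_CONNECTOR_ID)),
                new ToyOptimizer("hive-pushdown", Collections.singletonList("hive"))));

        List<ToyOptimizer> selected = selectOptimizers(
                true, Collections.singleton(EMPTY_CONNECTOR_ID), Optional.of("hive"), registry);
        for (ToyOptimizer optimizer : selected) {
            System.out.println(optimizer.name);
        }
    }
}
```

With the flag off, or with any non-empty connector in the plan, the sketch selects nothing, which matches the summary's statement that other plans are left alone.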

Class diagram for empty connector optimizer integration

classDiagram
    class ApplyConnectorOptimization {
        - Supplier<Map<ConnectorId, Set<ConnectorPlanOptimizer>>> connectorOptimizersSupplier
        - Map<ConnectorId, Set<ConnectorPlanOptimizer>> connectorOptimizers
        - static ConnectorId EMPTY_CONNECTOR_ID
        + PlanOptimizerResult optimize(PlanNode plan, Session session, TypeProvider types, SymbolAllocator variableAllocator, PlanNodeIdAllocator idAllocator)
    }

    class ConnectorAccessInfo {
        - Set<ConnectorId> reachableConnectors
        - Set<Class<? extends PlanNode>> reachablePlanNodeTypes
        + boolean isClosure(ConnectorId connectorId, Session session, List<ConnectorId> supportedConnectorId)
    }

    class SystemSessionProperties {
        <<final>>
        + static String ENABLE_EMPTY_CONNECTOR_OPTIMIZER
        + static boolean isEmptyConnectorOptimizerEnabled(Session session)
    }

    class Session {
        + Optional<String> getCatalog()
        + <T> T getSystemProperty(String key, Class<T> clazz)
        + ConnectorSession toConnectorSession(ConnectorId connectorId)
    }

    class ConnectorPlanOptimizer {
        + List<ConnectorId> getSupportedConnectorIds()
        + PlanNode optimize(PlanNode plan, ConnectorSession session, SymbolAllocator variableAllocator, PlanNodeIdAllocator idAllocator)
    }

    class ConnectorSession {
        + Optional<ConnectorId> getConnectorId()
    }

    ApplyConnectorOptimization ..> SystemSessionProperties : uses
    ApplyConnectorOptimization ..> Session : uses
    ApplyConnectorOptimization ..> ConnectorPlanOptimizer : invokes
    ApplyConnectorOptimization ..> ConnectorSession : creates
    ApplyConnectorOptimization o-- ConnectorAccessInfo : uses helper

    ConnectorAccessInfo ..> SystemSessionProperties : calls isEmptyConnectorOptimizerEnabled
    ConnectorAccessInfo ..> ConnectorId : evaluates

    class ConnectorId {
        + ConnectorId(String id)
    }

    ApplyConnectorOptimization ..> ConnectorId : manages

File-Level Changes

Change Details Files
Introduce a session-gated empty-connector optimization flow in ApplyConnectorOptimization so connectors can optimize plans with only Values nodes.
  • Rename the internal empty connector ID constant to a stable, explicit string name.
  • Collect all connector IDs in the plan once and reuse the built set during optimization iteration.
  • When the empty-connector optimizer is enabled and all connectors in the plan are the empty connector, remap to the query catalog’s connector and select only optimizers that declare support solely for the empty connector ID.
  • During optimizer execution for the empty connector, build the ConnectorSession using the query catalog’s connector ID so connector-specific session properties are visible.
  • Update the closure check in MaxClosurePlanExtractor to treat an empty-connector-only optimizer (supporting only the empty connector ID) as a closure when all reachable connectors are empty connectors and all connector-accessible node types are present.
presto-main-base/src/main/java/com/facebook/presto/sql/planner/optimizations/ApplyConnectorOptimization.java
Add a new system/session property to control whether the empty-connector optimizer path is enabled and expose a helper to check it.
  • Introduce ENABLE_EMPTY_CONNECTOR_OPTIMIZER constant and register it as a boolean system property with default false and description about optimizing queries with Values nodes.
  • Provide isEmptyConnectorOptimizerEnabled(Session) helper to read the new property.
presto-main-base/src/main/java/com/facebook/presto/SystemSessionProperties.java
Extend connector optimization tests to cover the empty-connector optimizer behavior and enable using a custom set of connector optimizers per test session.
  • Add testEmptyConnectorOptimization to validate that when only Values nodes are present and the empty-connector optimizer is enabled, a connector-level optimizer is applied to Values-based plans but not to plans that involve other connectors.
  • Introduce a two-argument optimize helper that accepts a map of connector IDs to ConnectorPlanOptimizers and a Session, wiring it through ApplyConnectorOptimization.
  • Implement createEmptyConnectorOptimizer utility that returns a ConnectorPlanOptimizer which wraps the given subplan in a TRUE filter and declares support for a specific empty connector ID matching the internal EMPTY_CONNECTOR_ID.
  • Verify behaviors on simple Values outputs, unions of Values, mixed union with TableScan from another connector, nested filters, and pure TableScan plans where the empty-connector optimizer must not alter the plan.
presto-main-base/src/test/java/com/facebook/presto/sql/planner/optimizations/TestConnectorOptimization.java
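
To make the test strategy above more concrete, here is a minimal sketch of a marker optimizer that wraps whatever it receives in a TRUE filter, so a test can tell whether it ran by inspecting the plan root. It uses a toy plan model, not Presto's `PlanNode`, `FilterNode`, or `ConnectorPlanOptimizer`; every type name below is hypothetical, and the `EMPTY_CONNECTOR_ID` string is a placeholder.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical toy plan model; the real test works on Presto PlanNode trees.
interface ToyPlanNode {}

final class ToyValuesNode implements ToyPlanNode {}

final class ToyFilterNode implements ToyPlanNode
{
    final ToyPlanNode source;
    final String predicate;

    ToyFilterNode(ToyPlanNode source, String predicate)
    {
        this.source = source;
        this.predicate = predicate;
    }
}

// Marker optimizer: it declares support for the empty connector ID only and
// wraps its input in a TRUE filter, leaving the wrapped subplan unchanged.
final class TrueFilterWrappingOptimizer
{
    // Assumed placeholder; the actual constant value is not shown in the source.
    static final String EMPTY_CONNECTOR_ID = "$empty-connector-placeholder";

    List<String> getSupportedConnectorIds()
    {
        return Collections.singletonList(EMPTY_CONNECTOR_ID);
    }

    ToyPlanNode optimize(ToyPlanNode plan)
    {
        return new ToyFilterNode(plan, "TRUE");
    }
}
```

A test built this way can assert that a Values-only plan comes back with a TRUE filter at its root, and that a plan touching another connector comes back unchanged.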

@feilong-liu feilong-liu changed the title Update connector optimizer mis: Update connector optimizer for query with values Nov 26, 2025
@sourcery-ai sourcery-ai bot left a comment

Hey there - I've reviewed your changes - here's some feedback:

  • The logic that handles the `EMPTY_CONNECTOR_ID` in `ApplyConnectorOptimization.optimize` and `isClosure` (e.g., repeated `connectorIdSet.stream().allMatch(x -> x.equals(EMPTY_CONNECTOR_ID))` / `reachableConnectors.stream().allMatch(...)` checks and special filtering for `getSupportedConnectorIds() == [EMPTY_CONNECTOR_ID]`) is fairly intricate; consider extracting this into a dedicated helper to centralize the condition and avoid subtle divergence between the two code paths.
  • In the `EMPTY_CONNECTOR` handling branch of `optimize`, you call `connectorOptimizers.get(queryConnectorId)` twice and `x.getSupportedConnectorIds()` multiple times inside the stream; consider storing these in local variables to make the code clearer and avoid redundant lookups.
Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location> `presto-main-base/src/main/java/com/facebook/presto/sql/planner/optimizations/ApplyConnectorOptimization.java:357-358` </location>
<code_context>

         boolean isClosure(ConnectorId connectorId, Session session, List<ConnectorId> supportedConnectorId)
         {
+            if (isEmptyConnectorOptimizerEnabled(session) && reachableConnectors.stream().allMatch(x -> x.equals(EMPTY_CONNECTOR_ID)) && supportedConnectorId.size() == 1 && supportedConnectorId.get(0).equals(EMPTY_CONNECTOR_ID)) {
+                return containsAll(CONNECTOR_ACCESSIBLE_PLAN_NODES, reachablePlanNodeTypes);
+            }
             // check if all children can reach the only connector
</code_context>

<issue_to_address>
**issue:** `allMatch` over an empty set makes the special-case branch fire when `reachableConnectors` is empty.

`Stream#allMatch` returns `true` on an empty stream, so this condition is also true when `reachableConnectors` is empty. That means the "empty connector" special case will trigger even when there are no reachable connectors, as long as `supportedConnectorId` is `[EMPTY_CONNECTOR_ID]`.

If the intent is "all reachable connectors are EMPTY" rather than "there are no reachable connectors", consider guarding with a non-empty check, e.g.:

```java
!reachableConnectors.isEmpty()
    && reachableConnectors.stream().allMatch(EMPTY_CONNECTOR_ID::equals)
```

Otherwise, plans with no connectors may be treated as closures, altering optimization behavior unexpectedly.
</issue_to_address>
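
The vacuous-truth behavior behind this issue can be checked in isolation. The sketch below uses plain strings instead of `ConnectorId`, and the `EMPTY_CONNECTOR_ID` value is a placeholder rather than the real constant; `unguarded` and `guarded` are hypothetical helper names.

```java
import java.util.Collections;
import java.util.Set;

final class AllMatchGuardSketch
{
    // Assumed placeholder; the actual constant value is not shown in the source.
    static final String EMPTY_CONNECTOR_ID = "$empty-connector-placeholder";

    private AllMatchGuardSketch() {}

    // Mirrors the check in the diff: true for an empty set as well.
    static boolean unguarded(Set<String> reachableConnectors)
    {
        return reachableConnectors.stream().allMatch(EMPTY_CONNECTOR_ID::equals);
    }

    // Adds the suggested non-empty guard: true only if at least one
    // connector is reachable and all of them are the empty connector.
    static boolean guarded(Set<String> reachableConnectors)
    {
        return !reachableConnectors.isEmpty()
                && reachableConnectors.stream().allMatch(EMPTY_CONNECTOR_ID::equals);
    }

    public static void main(String[] args)
    {
        Set<String> none = Collections.emptySet();
        Set<String> onlyEmpty = Collections.singleton(EMPTY_CONNECTOR_ID);

        System.out.println(unguarded(none));      // prints "true"
        System.out.println(guarded(none));        // prints "false"
        System.out.println(unguarded(onlyEmpty)); // prints "true"
        System.out.println(guarded(onlyEmpty));   // prints "true"
    }
}
```

Whether the empty set should count as a closure is a design decision for the author; the sketch only shows that the two forms disagree on that one input.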

@feilong-liu feilong-liu marked this pull request as draft November 26, 2025 01:54
@feilong-liu feilong-liu changed the title mis: Update connector optimizer for query with values Draft Nov 26, 2025