@sh-shamsan sh-shamsan commented Nov 20, 2025

Description

This bug fix normalizes table and column identifiers when the case-sensitive flag is disabled. As a result, mixed-case table and column names no longer cause query failures when using the Pinot connector in Presto.

Motivation and Context

Fixes queries that fail on mixed-case table or column names when the case-sensitive flag is disabled.

Impact

Test Plan

Tested using a Pinot connector instance from IBM QA. Also tested with the flag enabled to ensure behavior remains unchanged.

Flag disabled:

[Screenshots: 2025-11-20 at 2:06:20 PM and 2:07:08 PM]

Flag enabled:

[Screenshots: 2025-11-20 at 2:19:45 PM and 2:19:19 PM]

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.
  • If adding new dependencies, verified they have an OpenSSF Scorecard score of 5.0 or higher (or obtained explicit TSC approval for lower scores).

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

General Changes
* Fix table and column name normalization when the case-sensitive flag is disabled, preventing query failures with the Pinot connector.

@prestodb-ci prestodb-ci added the from:IBM PR from IBM label Nov 20, 2025
sourcery-ai bot commented Nov 20, 2025

Reviewer's Guide

Normalizes Pinot table and column identifier handling based on the connector’s case-sensitivity flag, ensuring mixed-case names work correctly when case-sensitive matching is disabled, and threads ConnectorSession and configuration through the relevant metadata and schema utilities.

Sequence diagram for Pinot table and column resolution with case-sensitivity flag

sequenceDiagram
    actor User
    participant PrestoEngine
    participant PinotMetadata
    participant PinotConnection
    participant PinotClusterInfoFetcher
    participant PinotColumnUtils

    User->>PrestoEngine: Submit query (tableName, columnNames)
    PrestoEngine->>PinotMetadata: getTableHandle(session, schemaTableName)
    PinotMetadata->>PinotMetadata: getPinotTableNameFromPrestoTableName(session, prestoTableName)
    PinotMetadata->>PinotConnection: getTableNames()
    PinotConnection-->>PinotMetadata: allTables
    PinotMetadata->>PinotMetadata: normalizedPrestoTableName = normalizeIdentifier(session, prestoTableName)
    loop for each pinotTableName in allTables
        PinotMetadata->>PinotMetadata: normalizedPinotTableName = normalizeIdentifier(session, pinotTableName)
        PinotMetadata-->>PinotMetadata: compare normalizedPrestoTableName and normalizedPinotTableName
    end
    PinotMetadata-->>PrestoEngine: PinotTableHandle(connectorId, schemaName, resolvedPinotTableName)

    PrestoEngine->>PinotMetadata: getColumnHandles(session, tableHandle)
    PinotMetadata->>PinotMetadata: getPinotTableNameFromPrestoTableName(session, pinotTableHandle.tableName)
    PinotMetadata->>PinotConnection: getTable(resolvedPinotTableName)
    PinotConnection-->>PinotMetadata: PinotTable

    PrestoEngine->>PinotConnection: load(resolvedPinotTableName)
    PinotConnection->>PinotClusterInfoFetcher: getTableSchema(resolvedPinotTableName)
    PinotClusterInfoFetcher-->>PinotConnection: Schema tablePinotSchema

    alt pinotConfig.isCaseSensitiveNameMatchingEnabled == true
        PinotConnection->>PinotColumnUtils: getPinotColumnsForPinotSchema(schema, inferDateType, inferTimestampType, nullHandlingEnabled, true)
        PinotColumnUtils-->>PinotConnection: PinotColumn list (original columnName)
    else pinotConfig.isCaseSensitiveNameMatchingEnabled == false
        PinotConnection->>PinotColumnUtils: getPinotColumnsForPinotSchema(schema, inferDateType, inferTimestampType, nullHandlingEnabled, false)
        PinotColumnUtils-->>PinotConnection: PinotColumn list (lowercased columnName)
    end

    PinotConnection-->>PrestoEngine: PinotColumn list
    PrestoEngine-->>User: Query planned and executed with normalized identifiers
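
The table-resolution loop in the diagram above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the connector's code: `TableNameResolver` is a hypothetical class, the `caseSensitive` field stands in for the connector's case-sensitivity flag, and the body of `normalizeIdentifier` (lowercasing with `Locale.ROOT` when the flag is off) is an assumption about what the real helper does. The exception type is also a placeholder.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for the table-name resolution shown in the sequence diagram.
final class TableNameResolver {
    private final boolean caseSensitive;

    TableNameResolver(boolean caseSensitive) {
        this.caseSensitive = caseSensitive;
    }

    // Assumed behavior: keep the identifier unchanged when case-sensitive matching
    // is enabled, otherwise lowercase it with a fixed locale.
    String normalizeIdentifier(String identifier) {
        return caseSensitive ? identifier : identifier.toLowerCase(Locale.ROOT);
    }

    // Returns the name exactly as Pinot spells it, so later calls use the original casing.
    String resolve(List<String> allTables, String prestoTableName) {
        String normalizedPrestoTableName = normalizeIdentifier(prestoTableName);
        for (String pinotTableName : allTables) {
            if (normalizeIdentifier(pinotTableName).equals(normalizedPrestoTableName)) {
                return pinotTableName; // first match wins
            }
        }
        // Placeholder exception; the real connector may report this differently.
        throw new IllegalArgumentException("Table not found: " + prestoTableName);
    }

    public static void main(String[] args) {
        List<String> allTables = List.of("airlineStats", "orders");
        // Flag disabled: a lowercased request still resolves to Pinot's spelling.
        System.out.println(new TableNameResolver(false).resolve(allTables, "airlinestats"));
        // Flag enabled: only an exact-case request matches.
        System.out.println(new TableNameResolver(true).resolve(allTables, "airlineStats"));
    }
}
```

Note that the method returns the Pinot-side spelling rather than the normalized one, which matches the diagram's `resolvedPinotTableName` being passed on to `getTable` and `load`.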

Updated class diagram for PinotMetadata, PinotConnection, and PinotColumnUtils

classDiagram
    class PinotMetadata {
        - String connectorId
        - PinotConnection pinotPrestoConnection
        + List~String~ listSchemaNames(ConnectorSession session)
        - String getPinotTableNameFromPrestoTableName(ConnectorSession session, String prestoTableName)
        + PinotTableHandle getTableHandle(ConnectorSession session, SchemaTableName tableName)
        + ConnectorTableMetadata getTableMetadata(ConnectorSession session, ConnectorTableHandle table)
        + Map~String, ColumnHandle~ getColumnHandles(ConnectorSession session, ConnectorTableHandle tableHandle)
        + Map~SchemaTableName, List~ColumnMetadata~~ listTableColumns(ConnectorSession session, SchemaTablePrefix prefix)
        - ConnectorTableMetadata getTableMetadata(ConnectorSession session, SchemaTableName tableName)
        - String normalizeIdentifier(ConnectorSession session, String identifier)
    }

    class PinotConnection {
        - PinotClusterInfoFetcher pinotClusterInfoFetcher
        - PinotConfig pinotConfig
        - Executor executor
        - boolean nullHandlingEnabled
        + List~String~ getTableNames()
        + PinotTable getTable(String tableName)
        + List~PinotColumn~ load(String tableName)
    }

    class PinotColumnUtils {
        <<utility>>
        - PinotColumnUtils()
        + List~PinotColumn~ getPinotColumnsForPinotSchema(Schema pinotTableSchema, boolean inferDateType, boolean inferTimestampType)
        + List~PinotColumn~ getPinotColumnsForPinotSchema(Schema pinotTableSchema, boolean inferDateType, boolean inferTimestampType, boolean nullHandlingEnabled, boolean isCaseSensitiveNameMatchingEnabled)
    }

    class PinotConfig {
        + boolean isInferDateTypeInSchema()
        + boolean isInferTimestampTypeInSchema()
        + boolean isCaseSensitiveNameMatchingEnabled()
    }

    class Schema {
        + List~String~ getColumnNames()
        + FieldSpec getFieldSpecFor(String columnName)
    }

    class PinotColumn {
        + String columnName
        + Type prestoType
        + boolean nullable
        + String comment
    }

    class PinotTable {
        + String tableName
    }

    class PinotClusterInfoFetcher {
        + Schema getTableSchema(String tableName)
    }

    class ConnectorSession
    class SchemaTableName {
        + String getSchemaName()
        + String getTableName()
    }

    class ConnectorTableHandle
    class PinotTableHandle {
        + String getConnectorId()
        + String getSchemaName()
        + String getTableName()
        + SchemaTableName toSchemaTableName()
    }

    class ColumnHandle
    class ColumnMetadata
    class SchemaTablePrefix
    class Type
    class FieldSpec
    class Executor

    PinotMetadata --> PinotConnection : uses
    PinotMetadata --> PinotTableHandle : creates
    PinotMetadata --> ConnectorSession : uses
    PinotMetadata --> SchemaTableName : uses
    PinotMetadata --> ConnectorTableMetadata : returns
    PinotMetadata --> ColumnHandle : returns
    PinotMetadata --> ColumnMetadata : returns

    PinotConnection --> PinotClusterInfoFetcher : uses
    PinotConnection --> PinotConfig : uses
    PinotConnection --> Executor : uses
    PinotConnection --> PinotColumn : returns
    PinotConnection --> PinotTable : returns

    PinotColumnUtils --> Schema : uses
    PinotColumnUtils --> PinotColumn : creates
    PinotColumnUtils --> Type : uses
    PinotColumnUtils --> FieldSpec : uses

    PinotConfig --> boolean

    Schema --> FieldSpec

    PinotTableHandle --|> ConnectorTableHandle

    ConnectorSession --> PinotConfig : resolves case sensitivity

File-Level Changes

Change: Normalize table name matching between Presto and Pinot based on ConnectorSession and case-sensitivity rules.
Details:
  • Updated getPinotTableNameFromPrestoTableName to accept ConnectorSession and normalize both Presto and Pinot table names via normalizeIdentifier before comparison.
  • Adjusted PinotMetadata callers (getTableHandle, getColumnHandles, listTableColumns, getTableMetadata) to pass ConnectorSession into table name resolution and metadata lookup.
Files:
  • presto-pinot-toolkit/src/main/java/com/facebook/presto/pinot/PinotMetadata.java

Change: Normalize Pinot column names according to the case-sensitivity configuration and propagate the new behavior through PinotConnection.
Details:
  • Extended getPinotColumnsForPinotSchema overload to accept an isCaseSensitiveNameMatchingEnabled flag and to lower-case column names when case-sensitive matching is disabled.
  • Updated the simpler getPinotColumnsForPinotSchema overload to call the extended version with default false values for null handling and case-sensitive matching.
  • Modified PinotConnection.load to pass the case-sensitivity flag from pinotConfig into getPinotColumnsForPinotSchema so column name normalization matches the connector configuration.
Files:
  • presto-pinot-toolkit/src/main/java/com/facebook/presto/pinot/PinotColumnUtils.java
  • presto-pinot-toolkit/src/main/java/com/facebook/presto/pinot/PinotConnection.java
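
The column-name change described above might look roughly like the sketch below. It is a simplification under stated assumptions: `ColumnNameSketch` and `columnNames` are hypothetical names, plain strings stand in for the Pinot `Schema` and `PinotColumn` types, and the `inferDateType` / `inferTimestampType` parameters are omitted. Only the overload delegation and the lowercasing rule are taken from the guide.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical simplification of the column-name handling; not the connector's actual code.
final class ColumnNameSketch {
    private ColumnNameSketch() {}

    // Simpler overload: delegates with false for null handling and case-sensitive matching.
    static List<String> columnNames(List<String> schemaColumnNames) {
        return columnNames(schemaColumnNames, false, false);
    }

    // Extended overload. nullHandlingEnabled is accepted only to mirror the described
    // parameter list; it has no effect on the names in this sketch.
    static List<String> columnNames(
            List<String> schemaColumnNames,
            boolean nullHandlingEnabled,
            boolean caseSensitiveNameMatchingEnabled) {
        List<String> result = new ArrayList<>();
        for (String columnName : schemaColumnNames) {
            result.add(caseSensitiveNameMatchingEnabled
                    ? columnName                              // keep Pinot's original casing
                    : columnName.toLowerCase(Locale.ROOT));   // lowercase when the flag is off
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> schemaColumns = List.of("FlightNum", "Carrier");
        System.out.println(columnNames(schemaColumns));              // flag off via the simpler overload
        System.out.println(columnNames(schemaColumns, false, true)); // flag on
    }
}
```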

@sh-shamsan sh-shamsan marked this pull request as ready for review November 26, 2025 15:10
@sh-shamsan sh-shamsan requested a review from a team as a code owner November 26, 2025 15:10
@prestodb-ci prestodb-ci requested review from a team, pdabre12 and pramodsatya and removed request for a team November 26, 2025 15:10
@sourcery-ai sourcery-ai bot left a comment

Hey there - I've reviewed your changes - here's some feedback:

  • Consider reusing the same normalization helper used for table names (e.g., normalizeIdentifier or a shared utility) in PinotColumnUtils instead of manually calling columnName.toLowerCase(Locale.ROOT) so that table and column name normalization stay consistent and driven by the same logic/configuration.
  • In getPinotTableNameFromPrestoTableName, you now normalize both the Presto and Pinot table names; if multiple Pinot tables differ only by case when case-insensitive matching is configured, you may want to explicitly define or document the tie-breaking behavior (e.g., which one is chosen) rather than implicitly returning the first match.
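
One possible way to make the tie-breaking in the second comment explicit is to collect every case-insensitive match and reject the lookup when more than one exists. The sketch below shows that policy only as an illustration; it is not what the PR implements (the comment indicates the first match is returned), and `AmbiguityCheck`, `resolveUnique`, and the exception types are hypothetical.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical policy sketch: fail loudly instead of silently picking the first match.
final class AmbiguityCheck {
    private AmbiguityCheck() {}

    static String resolveUnique(List<String> allTables, String requested) {
        String wanted = requested.toLowerCase(Locale.ROOT);
        // Gather all Pinot names that are equal to the request ignoring case.
        List<String> matches = allTables.stream()
                .filter(name -> name.toLowerCase(Locale.ROOT).equals(wanted))
                .collect(Collectors.toList());
        if (matches.isEmpty()) {
            throw new IllegalArgumentException("Table not found: " + requested);
        }
        if (matches.size() > 1) {
            throw new IllegalStateException(
                    "Ambiguous table name '" + requested + "', candidates: " + matches);
        }
        return matches.get(0);
    }

    public static void main(String[] args) {
        System.out.println(resolveUnique(List.of("airlineStats", "orders"), "AIRLINESTATS"));
        try {
            resolveUnique(List.of("Orders", "orders"), "orders");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Whether to reject, prefer an exact-case match, or keep the first match is a design choice for the maintainers; the point is only that the behavior becomes deterministic and documented.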