@Joe-Abraham Joe-Abraham commented Oct 6, 2025

Description

Fixes

Motivation and Context

Add support for custom schemas in native sidecar function registry

Impact

Enables functions to be registered in connector specific or custom namespaces.

Test Plan

Included with this PR.

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

== RELEASE NOTES ==
Prestissimo (Native Execution) Changes
* Add support for custom schemas in native sidecar function registry.

@prestodb-ci prestodb-ci added the from:IBM PR from IBM label Oct 6, 2025

sourcery-ai bot commented Oct 6, 2025

Reviewer's Guide

This PR enhances the native sidecar function registry to support catalog-scoped namespaces by extending the C++ metadata API and HTTP endpoints with an optional catalog parameter, updating the Java client to call the new endpoints, documenting the changes in the OpenAPI spec, integrating a new hive.initcap function variant with build adjustments, and adding comprehensive tests.

Sequence diagram for catalog-scoped function metadata retrieval via sidecar

sequenceDiagram
participant JavaClient as NativeFunctionDefinitionProvider (Java)
participant Sidecar as Presto Native Sidecar (C++)
participant Catalog as Function Registry
JavaClient->>Sidecar: GET /v1/functions/{catalog}
Sidecar->>Catalog: getFunctionsMetadata(catalog)
Catalog-->>Sidecar: Filtered function metadata (for catalog)
Sidecar-->>JavaClient: JSON response with catalog-scoped functions

ER diagram for UdfSignatureMap with catalog filtering

erDiagram
CATALOG ||--o{ FUNCTION : contains
FUNCTION {
  string name
  string schema
  string outputType
  string functionKind
  list paramTypes
  string docString
  string routineCharacteristics
}
CATALOG {
  string catalogName
}

Class diagram for Hive Initcap function registration

classDiagram
class InitCapFunction_T {
  +call(result: Varchar, input: Varchar)
  +callAscii(result: Varchar, input: Varchar)
  static is_default_ascii_behavior: bool
}
class HiveFunctionRegistration {
  +registerHiveNativeFunctions()
}
InitCapFunction_T <.. HiveFunctionRegistration : registers

File-Level Changes

Change: Introduce catalog-scoped native function metadata retrieval
Details:
  • Extend getFunctionsMetadata signature to accept an optional catalog argument
  • Add filtering logic to skip functions not matching the specified catalog
  • Add unit tests for catalog-filtered and non-existent catalog metadata retrieval
Files:
  • presto-native-execution/presto_cpp/main/functions/FunctionMetadata.h
  • presto-native-execution/presto_cpp/main/functions/FunctionMetadata.cpp
  • presto-native-execution/presto_cpp/main/functions/tests/FunctionMetadataTest.cpp

Change: Expose new /v1/functions/{catalog} sidecar endpoint and update client to use it
Details:
  • Document the new catalog path parameter in the OpenAPI spec
  • Register GET /v1/functions/{catalog} in PrestoServer to invoke catalog-scoped metadata
  • Inject and use catalogName in NativeFunctionDefinitionProvider to build catalog-specific URI
  • Update error handling to fail on filtered endpoint without fallback
Files:
  • presto-openapi/src/main/resources/rest_function_server.yaml
  • presto-native-execution/presto_cpp/main/PrestoServer.cpp
  • presto-native-sidecar-plugin/src/main/java/com/facebook/presto/sidecar/functionNamespace/NativeFunctionDefinitionProvider.java

Change: Register hive-specific initcap function under hive.default and integrate hive catalog support
Details:
  • Add InitcapFunction implementation and one-time HiveFunctionRegistration
  • Invoke hive native functions registration in PrestoServer when hive connector is present
  • Add CMake entries to build the hive functions library and include it in the main server
  • Add C++ tests for the new hive.default.initcap behavior
Files:
  • presto-native-execution/presto_cpp/main/connectors/hive/functions/InitcapFunction.h
  • presto-native-execution/presto_cpp/main/connectors/hive/functions/HiveFunctionRegistration.h
  • presto-native-execution/presto_cpp/main/connectors/hive/functions/HiveFunctionRegistration.cpp
  • presto-native-execution/presto_cpp/main/connectors/hive/functions/tests/InitcapTest.cpp
  • presto-native-execution/presto_cpp/main/connectors/hive/functions/CMakeLists.txt
  • presto-native-execution/presto_cpp/main/connectors/hive/CMakeLists.txt
  • presto-native-execution/presto_cpp/main/connectors/CMakeLists.txt
  • presto-native-execution/presto_cpp/main/CMakeLists.txt

Change: Enhance sidecar plugin test suite and utils for multi-catalog scenarios
Details:
  • Register both native and hive catalogs in NativeSidecarPluginQueryRunnerUtils
  • Add TestNativeSidecarCatalogFiltering to verify hive catalog initcap support
  • Add TestCatalogFilteredFunctionsWithoutSidecar to confirm failure cases when sidecar is disabled
Files:
  • presto-native-sidecar-plugin/src/test/java/com/facebook/presto/sidecar/NativeSidecarPluginQueryRunnerUtils.java
  • presto-native-sidecar-plugin/src/test/java/com/facebook/presto/sidecar/TestNativeSidecarCatalogFiltering.java
  • presto-native-sidecar-plugin/src/test/java/com/facebook/presto/sidecar/TestCatalogFilteredFunctionsWithoutSidecar.java
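
The optional catalog filter described in the first change can be sketched roughly as follows. This is a minimal illustration in C++17, not the PR's code: `FunctionEntry`, `filterByCatalog`, and `exampleUsage` are hypothetical names, and the real `getFunctionsMetadata` works on the Velox function registry and returns JSON rather than a vector.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified view of one registered function.
struct FunctionEntry {
  std::string catalog;
  std::string schema;
  std::string name;
};

// Returns the entries that belong to `catalog`, or every entry when no
// catalog is given (mirroring an "optional catalog argument").
std::vector<FunctionEntry> filterByCatalog(
    const std::vector<FunctionEntry>& entries,
    const std::optional<std::string>& catalog = std::nullopt) {
  std::vector<FunctionEntry> matches;
  for (const auto& entry : entries) {
    // Skip entries from other catalogs only when a filter was requested.
    if (catalog.has_value() && entry.catalog != catalog.value()) {
      continue;
    }
    matches.push_back(entry);
  }
  return matches;
}

// Small usage example with two illustrative registrations.
void exampleUsage() {
  const std::vector<FunctionEntry> entries = {
      {"presto", "default", "abs"},
      {"hive", "default", "initcap"},
  };
  assert(filterByCatalog(entries).size() == 2);
  assert(filterByCatalog(entries, std::string("hive")).size() == 1);
  assert(filterByCatalog(entries, std::string("nonexistent")).empty());
}
```

An unknown catalog simply yields an empty result in this sketch; whether that should surface as an error is a separate design decision for the endpoint.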


@Joe-Abraham Joe-Abraham changed the title from "feat(native): support custom schemas in native sidecar function registry" to "feat(native): Support custom schemas in native sidecar function registry" Oct 7, 2025
@Joe-Abraham
Contributor Author

@sourcery-ai review

@sourcery-ai sourcery-ai bot left a comment

Hey there - I've reviewed your changes - here's some feedback:

  • The new GET /v1/functions/{catalog} handler always returns 200 with null or empty JSON even for non‐existent catalogs—modify it to return a 404 when no functions are found to match the OpenAPI spec.
  • The /v1/functions/{catalog} path conflicts with the existing /v1/functions/{schema} route—consider renaming or ordering the handlers to avoid ambiguous routing.
  • The hard-coded blocklist in getFunctionsMetadata is scattered in the implementation—extract it into a shared constant or make it configurable to simplify future updates.
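
As a rough illustration of the first point, the mapping from a catalog-scoped result to a status code could look like the sketch below. It assumes C++17; `HttpResult` and `respondForCatalog` are hypothetical stand-ins rather than the server's actual HTTP classes, and each metadata value is assumed to be already-serialized JSON.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical response type; the real handler uses the server's HTTP classes.
struct HttpResult {
  int status;
  std::string body;
};

// Builds a response for catalog-scoped metadata: 404 when nothing matched,
// otherwise 200 with a JSON object keyed by function name.
HttpResult respondForCatalog(
    const std::string& catalog,
    const std::map<std::string, std::string>& metadataByFunction) {
  if (metadataByFunction.empty()) {
    return {404, "No functions found for catalog: " + catalog};
  }
  std::string body = "{";
  bool first = true;
  for (const auto& [name, metadata] : metadataByFunction) {
    if (!first) {
      body += ",";
    }
    // `metadata` is assumed to be a valid JSON fragment already.
    body += "\"" + name + "\":" + metadata;
    first = false;
  }
  body += "}";
  return {200, body};
}
```

Calling `respondForCatalog("nonexistent", {})` yields status 404 in this sketch, while a non-empty map yields 200 with the serialized object.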

## Individual Comments

### Comment 1
<location> `presto-native-execution/presto_cpp/main/functions/FunctionMetadata.cpp:341` </location>
<code_context>
+      continue;
+    }
+
+    const auto parts = getFunctionNameParts(name);
+    if (parts[0] != catalog) {
+      continue;
</code_context>

<issue_to_address>
**issue:** Potential out-of-bounds access in 'parts' array.

Validate that 'parts' contains at least three elements before accessing indices 1 and 2 to prevent undefined behavior.
</issue_to_address>
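
One way to make the size check explicit is to parse the qualified name into an optional value before using any component. The sketch below assumes C++17 and a strict `catalog.schema.function` format with exactly three non-empty parts; `QualifiedName` and `parseQualifiedName` are hypothetical names, and the real `getFunctionNameParts` helper may behave differently.

```cpp
#include <cassert>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical holder for the three components of a qualified function name.
struct QualifiedName {
  std::string catalog;
  std::string schema;
  std::string function;
};

// Splits `name` on '.' and returns the components only when there are
// exactly three non-empty parts; otherwise returns std::nullopt so callers
// never index past the end of the parts vector.
std::optional<QualifiedName> parseQualifiedName(const std::string& name) {
  std::vector<std::string> parts;
  std::stringstream stream(name);
  std::string part;
  while (std::getline(stream, part, '.')) {
    parts.push_back(part);
  }
  if (parts.size() != 3) {
    return std::nullopt;
  }
  for (const auto& component : parts) {
    if (component.empty()) {
      return std::nullopt;
    }
  }
  return QualifiedName{parts[0], parts[1], parts[2]};
}
```

A caller would then `continue` past any name for which `parseQualifiedName` returns `std::nullopt` instead of reading `parts[1]` and `parts[2]` unconditionally.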

### Comment 2
<location> `presto-native-execution/presto_cpp/main/connectors/PrestoToVeloxConnector.cpp:63-65` </location>
<code_context>
+
+  // Register hive-specific functions when hive catalog is detected.
+  // Delegate to generic Hive native function registrar which is idempotent.
+  if (connectorName ==
+          velox::connector::hive::HiveConnectorFactory::kHiveConnectorName ||
+      connectorName == std::string("hive-hadoop2")) {
+    hive::functions::registerHiveNativeFunctions();
+  }
</code_context>

<issue_to_address>
**suggestion:** Connector name comparison may be brittle.

Hardcoding connector names increases maintenance risk if new variants are added. Centralize connector name definitions or use a more robust matching approach.

Suggested implementation:

```cpp
  // Register hive-specific functions when hive catalog is detected.
  // Delegate to generic Hive native function registrar which is idempotent.
  if (isHiveConnector(connectorName)) {
    hive::functions::registerHiveNativeFunctions();
  }

```

```cpp
#include "velox/functions/FunctionRegistry.h"
#include <unordered_set>

```

```cpp
      connectorName);
  protocol::registerConnectorProtocol(
      connectorName, std::move(connectorProtocol));

// Centralized Hive connector name definitions.
namespace {
const std::unordered_set<std::string> kHiveConnectorNames = {
    velox::connector::hive::HiveConnectorFactory::kHiveConnectorName,
    "hive-hadoop2"
};

bool isHiveConnector(const std::string& connectorName) {
  return kHiveConnectorNames.count(connectorName) > 0;
}
} // namespace

```
</issue_to_address>

### Comment 3
<location> `presto-native-execution/presto_cpp/main/connectors/hive/functions/InitcapFunction.h:30` </location>
<code_context>
+struct InitCapFunction {
+  VELOX_DEFINE_FUNCTION_TYPES(T);
+
+  static constexpr bool is_default_ascii_behavior = true;
+
+  FOLLY_ALWAYS_INLINE void call(
</code_context>

<issue_to_address>
**nitpick:** Unused 'is_default_ascii_behavior' constant.

If this constant is not needed, please remove it. If it is reserved for future use or external access, add a comment explaining its purpose.
</issue_to_address>

### Comment 4
<location> `presto-native-execution/presto_cpp/main/functions/tests/FunctionMetadataTest.cpp:117` </location>
<code_context>
   testFunction("variance", "Variance.json", 5);
 }
+
+TEST_F(FunctionMetadataTest, GetFunctionsMetadataWithCatalog) {
+  // Test with the "presto" catalog that is registered in SetUpTestSuite
+  std::string catalog = "presto";
</code_context>

<issue_to_address>
**suggestion (testing):** Missing test coverage for custom schemas and non-default schemas.

Please add tests for functions registered under non-default schemas to ensure getFunctionsMetadata returns correct metadata for those cases.

Suggested implementation:

```cpp
TEST_F(FunctionMetadataTest, GetFunctionsMetadataWithCatalog) {
  // Test with the "presto" catalog that is registered in SetUpTestSuite
  std::string catalog = "presto";
  auto metadata = getFunctionsMetadata(catalog);

  // The result should be a JSON object with function names as keys
  ASSERT_TRUE(metadata.is_object());
  ASSERT_FALSE(metadata.empty());

  // Verify that common functions are present
  ASSERT_TRUE(metadata.contains("abs"));
  ASSERT_TRUE(metadata.contains("mod"));
}

// Register a function under a custom schema for testing purposes.
TEST_F(FunctionMetadataTest, GetFunctionsMetadataWithCustomSchema) {
  // Assume registerFunction is available for registering test functions.
  // Register a function "custom_func" under schema "custom_schema".
  std::string customCatalog = "presto";
  std::string customSchema = "custom_schema";
  std::string functionName = "custom_func";
  // The registration API may differ; adjust as needed for your codebase.
  registerFunction(customCatalog, customSchema, functionName, /*function implementation*/ nullptr);

  auto metadata = getFunctionsMetadata(customCatalog);

  // The result should include the custom function under the custom schema.
  ASSERT_TRUE(metadata.is_object());
  ASSERT_TRUE(metadata.contains(functionName));
  // Optionally, check that the schema is correct in the metadata.
  ASSERT_EQ(metadata[functionName]["schema"], customSchema);
}

// Register a function under another non-default schema for additional coverage.
TEST_F(FunctionMetadataTest, GetFunctionsMetadataWithNonDefaultSchema) {
  std::string catalog = "presto";
  std::string nonDefaultSchema = "analytics";
  std::string functionName = "analytics_func";
  registerFunction(catalog, nonDefaultSchema, functionName, /*function implementation*/ nullptr);

  auto metadata = getFunctionsMetadata(catalog);

  ASSERT_TRUE(metadata.is_object());
  ASSERT_TRUE(metadata.contains(functionName));
  ASSERT_EQ(metadata[functionName]["schema"], nonDefaultSchema);
}

```

- Ensure that the `registerFunction` API exists and is accessible in your test environment. If not, you may need to use the actual function registration mechanism used in your codebase.
- If the metadata structure differs (e.g., schema is not a direct property), adjust the assertions accordingly.
- If setup/teardown is required for custom functions, add appropriate cleanup code.
</issue_to_address>

### Comment 5
<location> `presto-native-execution/presto_cpp/main/functions/tests/FunctionMetadataTest.cpp:150` </location>
<code_context>
+  }
+}
+
+TEST_F(FunctionMetadataTest, GetFunctionsMetadataWithNonExistentCatalog) {
+  // Test with a catalog that doesn't exist
+  std::string catalog = "nonexistent";
</code_context>

<issue_to_address>
**suggestion (testing):** Consider adding a test for error conditions or malformed catalog names.

Please include test cases for malformed catalog names, such as empty strings or special characters, to verify robust error handling.
</issue_to_address>

### Comment 6
<location> `presto-native-execution/presto_cpp/main/functions/tests/FunctionMetadataTest.cpp:143` </location>
<code_context>
+      ASSERT_TRUE(signature.contains("functionKind"))
+          << "Function: " << it.key();
+
+      // Schema should be "default" since we registered with "presto.default."
+      // prefix
+      EXPECT_EQ(signature["schema"], "default") << "Function: " << it.key();
</code_context>

<issue_to_address>
**nitpick (testing):** Test assertions could be more robust for schema values.

Hardcoding 'default' for the schema may cause the test to fail if registration logic changes or custom schemas are used. Parameterize the expected schema or derive it from the registration logic to improve test resilience.
</issue_to_address>


@Joe-Abraham Joe-Abraham marked this pull request as ready for review October 8, 2025 11:54
@Joe-Abraham Joe-Abraham requested review from a team and pdabre12 as code owners October 8, 2025 11:54
@prestodb-ci prestodb-ci requested review from a team and sh-shamsan and removed request for a team October 8, 2025 11:54
@pdabre12 pdabre12 left a comment

Thanks @Joe-Abraham,
From a high level, the changes look good to me. Will do a more thorough analysis later.

Can you write a small RFC so we can get more feedback on the architecture?

@Joe-Abraham
Contributor Author

> Thanks @Joe-Abraham , From a high-level, the changes look good to me. Will do a more thorough analysis later.
>
> Can you write a small RFC so we can get more feedback on the architecture?

I have created the RFC - prestodb/rfcs#50

@aditi-pandit aditi-pandit left a comment

Thanks @Joe-Abraham

@aditi-pandit aditi-pandit left a comment

@Joe-Abraham: Please also add e2e tests for initcap (with and without the sidecar) in the presto-native-tests module.

@pdabre12 pdabre12 left a comment

LGTM % Aditi's comments.
Can we add some documentation for this feature?

@kevintang2022 Can you help take a look?

@Joe-Abraham Joe-Abraham force-pushed the hiveInitcap branch 2 times, most recently from 94e1906 to 2d7edd3 Compare October 24, 2025 06:19
@Joe-Abraham
Contributor Author

@sourcery-ai review

@sourcery-ai sourcery-ai bot left a comment

Hey there - I've reviewed your changes - here's some feedback:

  • The new /v1/functions/{catalog} endpoint always returns HTTP 200 even when no functions match, but the OpenAPI spec and tests expect a 404 for non‐existent catalogs—please align the implementation with the spec or adjust the spec/tests accordingly.
  • Verify that the regex route for /v1/functions/{catalog} cannot inadvertently shadow the existing schema‐ or function‐specific endpoints; consider more specific patterns or ordering to avoid path conflicts.
  • You may want to guard the new hive subdirectory inclusions in CMakeLists (and the hive native functions library) behind a feature flag to prevent always building hive support in non‐hive builds.
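
The shadowing concern in the second point can be demonstrated with two single-segment patterns. This is only a sketch: the patterns below are assumptions about how such routes might be written, not the server's actual registrations, and the `/v1/functions/catalog/...` prefix is a hypothetical way to disambiguate, not something the PR proposes.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Assumed single-segment patterns for the two routes; as written they are
// indistinguishable, so whichever handler is registered first would win.
const std::regex kCatalogRoute("/v1/functions/([^/]+)");
const std::regex kSchemaRoute("/v1/functions/([^/]+)");

// Hypothetical disambiguated pattern that uses a fixed literal segment.
const std::regex kPrefixedCatalogRoute("/v1/functions/catalog/([^/]+)");

// Returns true when the whole path matches the route pattern.
bool routeMatches(const std::regex& route, const std::string& path) {
  return std::regex_match(path, route);
}
```

With these patterns, `/v1/functions/hive` matches both `kCatalogRoute` and `kSchemaRoute`, whereas `/v1/functions/catalog/hive` matches only `kPrefixedCatalogRoute`.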

## Individual Comments

### Comment 1
<location> `presto-native-execution/presto_cpp/main/functions/FunctionMetadata.cpp:345` </location>
<code_context>
+      continue;
+    }
+    const auto schema = parts[1];
+    const auto function = parts[2];
+    j[function] = buildScalarMetadata(name, schema, entry.second);
+  }
</code_context>

<issue_to_address>
**issue (bug_risk):** Potential key collision in JSON output for functions with same name in different schemas.

Consider including the schema in the key or grouping functions by schema to prevent overwriting metadata for functions with the same name.
</issue_to_address>
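
The collision can be reproduced with a plain map standing in for the JSON object. In this sketch `SchemaFunction`, `keyByName`, and `keyBySchemaAndName` are hypothetical names, and the `schema.name` key is just one possible way to qualify entries.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical record for a function registered within one catalog.
struct SchemaFunction {
  std::string schema;
  std::string name;
  std::string metadata;
};

// Keys by the bare function name: a later entry with the same name
// silently overwrites an earlier one from a different schema.
std::map<std::string, std::string> keyByName(
    const std::vector<SchemaFunction>& functions) {
  std::map<std::string, std::string> result;
  for (const auto& function : functions) {
    result[function.name] = function.metadata;
  }
  return result;
}

// Keys by "schema.name" so same-named functions in different schemas coexist.
std::map<std::string, std::string> keyBySchemaAndName(
    const std::vector<SchemaFunction>& functions) {
  std::map<std::string, std::string> result;
  for (const auto& function : functions) {
    result[function.schema + "." + function.name] = function.metadata;
  }
  return result;
}
```

Given two `initcap` entries in schemas `default` and `custom`, `keyByName` keeps only the last one while `keyBySchemaAndName` keeps both.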

### Comment 2
<location> `presto-native-execution/presto_cpp/main/connectors/hive/functions/InitcapFunction.h:28` </location>
<code_context>
+struct InitCapFunction {
+  VELOX_DEFINE_FUNCTION_TYPES(T);
+
+  static constexpr bool is_default_ascii_behavior = true;
+
+  FOLLY_ALWAYS_INLINE void call(
</code_context>

<issue_to_address>
**nitpick:** Unused static constexpr member.

If 'is_default_ascii_behavior' is meant for future use or external access, please document its intent. Otherwise, consider removing it.
</issue_to_address>

### Comment 3
<location> `presto-native-execution/presto_cpp/main/functions/tests/FunctionMetadataTest.cpp:117-115` </location>
<code_context>
+TEST_F(FunctionMetadataTest, GetFunctionsMetadataWithCatalog) {
</code_context>

<issue_to_address>
**suggestion (testing):** Consider adding tests for catalogs with special characters and case sensitivity.

Adding tests for special characters and case sensitivity will help ensure the function correctly handles a wider range of catalog names.
</issue_to_address>

### Comment 4
<location> `presto-native-execution/presto_cpp/main/connectors/hive/functions/tests/InitcapTest.cpp:28-37` </location>
<code_context>
+TEST_F(InitcapTest, initcap) {
</code_context>

<issue_to_address>
**suggestion (testing):** Consider adding tests for strings with only whitespace and for very long strings.

Adding tests for strings with only whitespace and very long strings will help ensure the function handles these cases correctly and performs well.
</issue_to_address>
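
To show what such edge cases would exercise, here is a standalone ASCII-only sketch of initcap semantics. It assumes words are delimited by whitespace, with the first character of each word uppercased and the rest lowercased; the PR's `InitCapFunction` is not shown in this conversation and may use different rules, and `initcapAscii` is a hypothetical name.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// ASCII-only sketch: uppercases the first character of every
// whitespace-delimited word and lowercases the remaining characters.
std::string initcapAscii(const std::string& input) {
  std::string result;
  result.reserve(input.size());
  bool startOfWord = true;
  for (const char c : input) {
    // Cast to unsigned char before calling the <cctype> functions.
    const auto uc = static_cast<unsigned char>(c);
    if (std::isspace(uc)) {
      result.push_back(c);
      startOfWord = true;
    } else {
      result.push_back(static_cast<char>(
          startOfWord ? std::toupper(uc) : std::tolower(uc)));
      startOfWord = false;
    }
  }
  return result;
}
```

Under these assumptions a whitespace-only input comes back unchanged, and a very long single word has only its first character uppercased.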


@Joe-Abraham
Contributor Author

@sourcery-ai review

@sourcery-ai sourcery-ai bot left a comment

Hey there - I've reviewed your changes - here's some feedback:

  • The new /v1/functions/{catalog} route conflicts with the existing /v1/functions/{schema} endpoint—please rename or otherwise disambiguate these path parameters to avoid ambiguous routing.
  • The OpenAPI spec indicates a 404 for a non-existent catalog, but the handler always returns 200 (with null/empty JSON); please align the implementation with the spec or update the documentation.
  • getFunctionsMetadata uses only the unqualified function name as the JSON key, so functions with identical names across different schemas will overwrite one another; consider including schema (or full qualification) in the key or grouping by schema.

## Individual Comments

### Comment 1
<location> `presto-native-execution/presto_cpp/main/functions/FunctionMetadata.cpp:289` </location>
<code_context>

     const auto parts = getFunctionNameParts(name);
+    // Skip if catalog filter is specified and doesn't match
+    if (catalog.has_value() && parts[0] != catalog.value()) {
+      continue;
+    }
</code_context>

<issue_to_address>
**issue (bug_risk):** Handle case where getFunctionNameParts returns unexpected results.

Add a size check for 'parts' before accessing its elements to prevent out-of-bounds errors if the function name format is unexpected.
</issue_to_address>

### Comment 2
<location> `presto-native-execution/presto_cpp/main/functions/tests/FunctionMetadataTest.cpp:117-126` </location>
<code_context>
   testFunction("variance", "Variance.json", 5);
 }
+
+TEST_F(FunctionMetadataTest, GetFunctionsMetadataWithCatalog) {
+  // Test with the "presto" catalog that is registered in SetUpTestSuite
+  std::string catalog = "presto";
+  auto metadata = getFunctionsMetadata(catalog);
+
+  // The result should be a JSON object with function names as keys
+  ASSERT_TRUE(metadata.is_object());
+  ASSERT_FALSE(metadata.empty());
+
+  // Verify that common functions are present
+  ASSERT_TRUE(metadata.contains("abs"));
+  ASSERT_TRUE(metadata.contains("mod"));
+
+  // Each function should have an array of signatures
+  for (auto it = metadata.begin(); it != metadata.end(); ++it) {
+    ASSERT_TRUE(it.value().is_array()) << "Function: " << it.key();
</code_context>

<issue_to_address>
**suggestion (testing):** Consider adding tests for custom schemas beyond the default.

Please add test cases for additional schemas, such as 'hive.default', to verify correct metadata filtering and retrieval for non-default schemas.
</issue_to_address>

### Comment 3
<location> `presto-native-sidecar-plugin/src/test/java/com/facebook/presto/sidecar/TestNativeSidecarCustomNamespaces.java:66-67` </location>
<code_context>
+        return queryRunner;
+    }
+
+    @Test
+    public void testHiveInitcapFunctions()
+    {
+        assertQuery("SELECT hive.default.initcap(`Hello world`)", "SELECT('Hello World`)");
</code_context>

<issue_to_address>
**issue (testing):** Test assertions may not match expected output due to backtick usage.

Backticks are used for identifiers in Presto, not string literals. Please confirm the test is passing the correct string values and consider using single quotes. Also, check for mismatches due to trailing backticks in expected outputs.
</issue_to_address>


@prestodb-ci prestodb-ci requested review from auden-woolfson and removed request for a team October 24, 2025 15:52
@aditi-pandit aditi-pandit left a comment

Thanks @Joe-Abraham. This code is looking closer to done.

kevintang2022 previously approved these changes Oct 24, 2025
@aditi-pandit aditi-pandit left a comment

@Joe-Abraham: I have a review comment about the tests that needs to be addressed. After that, this PR looks good.

}

@Test
public void testInitcap()

All the tests in this class use initcap with constant values. Due to the constant-folding optimization, these will all be evaluated on the coordinator. You should invoke initcap with a column of values so that it is evaluated on the worker. Please add such tests.

@aditi-pandit aditi-pandit left a comment

Thanks @Joe-Abraham

@pdabre12 pdabre12 left a comment

Thanks @Joe-Abraham

@pdabre12
Contributor

@tdcmeehan Can you please take a look?

@steveburnett steveburnett left a comment

LGTM! (docs)

Pull branch, local doc build, looks good. Thanks!

@tdcmeehan tdcmeehan left a comment

This adds initcap for all Hive users, but shouldn't this be a custom function that's added for only users who need or want it?

void registerHiveFunctions() {
  // Register functions under the 'hive.default' namespace.
  facebook::presto::registerPrestoFunction<InitCapFunction, Varchar, Varchar>(
      "initcap", "hive.default");

This presumes that initcap is available on the Presto coordinator, but I don't think it's added by default. Wouldn't this need to be registered through registerExtensions/with a shared library?

@aditi-pandit aditi-pandit Oct 27, 2025

@tdcmeehan
That's a good point.

This code was previously hard-wired in the sidecar to always load HiveFunctions, so the worker registration was done this way. In our current state, if the Hive catalog is not added on the coordinator, its functions will not be queried.

Right now, native catalog support is not very clear. In my mind, we can register catalog functions on the worker along with their session properties if needed. The sidecar then needs to discover catalogs from the worker and populate its function namespaces. Right now, the polling is driven from the coordinator plugin setup, which doesn't seem quite right. Am I misunderstanding this?

@tdcmeehan tdcmeehan left a comment

The Hive connector itself isn't specified as a plugin either, so we can defer my comment to a followup.

@aditi-pandit aditi-pandit merged commit 6d3b641 into prestodb:master Oct 27, 2025
91 of 93 checks passed
@Joe-Abraham Joe-Abraham deleted the hiveInitcap branch October 28, 2025 08:43

Labels

from:IBM PR from IBM

Development

Successfully merging this pull request may close these issues.

Support custom schemas in native sidecar function registry

7 participants