Skip to content

Pull/3859 #5

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 16 commits into
base: main
Choose a base branch
from
Open

Pull/3859 #5

wants to merge 16 commits into from

Conversation

yuancu
Copy link
Owner

@yuancu yuancu commented Jul 16, 2025

Description

Filter script pushdown with RelJson serialization in Calcite

Mirror of sql/pull/3859 to let copilot generate reviews

qianheng-aws and others added 16 commits June 23, 2025 18:23
Signed-off-by: Heng Qian <[email protected]>
Signed-off-by: Songkan Tang <[email protected]>
Signed-off-by: Songkan Tang <[email protected]>
Signed-off-by: Songkan Tang <[email protected]>
@yuancu yuancu requested a review from Copilot July 16, 2025 04:57
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds filter script pushdown using Calcite’s RelJson serialization, replaces the legacy expression engine with a compounded script engine, and updates both core and integration tests to reflect the new opensearch_compounded_script language.

  • Introduce RelJsonSerializer and OpenSearchRelInputTranslator for serializing/deserializing RexNodes and field types
  • Replace ExpressionScriptEngine with CompoundedScriptEngine and wire it up in SQLPlugin
  • Enhance PredicateAnalyzer to emit ScriptQueryExpression and mark pushdown contexts as SCRIPT
  • Update integration tests and expected outputs to use opensearch_compounded_script and handle timestamp wildcards

Reviewed Changes

Copilot reviewed 34 out of 34 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
plugin/src/main/java/org/opensearch/sql/plugin/SQLPlugin.java Switch to CompoundedScriptEngine using Calcite
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/serialization/RelJsonSerializer.java New RelJson-based serializer for RexNode payloads
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/CompoundedScriptEngine.java Introduce new script engine delegating to V2 or Calcite
opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.java Emit ScriptQueryExpression when analyzer falls back
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java Classify filter pushdown as SCRIPT when needed
integ-test/.../expectedOutput/* Updated JSON expectations for opensearch_compounded_script
Comments suppressed due to low confidence (2)

opensearch/src/main/java/org/opensearch/sql/opensearch/storage/serialization/RelJsonSerializer.java:33

  • There are no unit tests for RelJsonSerializer. Add tests that serialize and then deserialize representative RexNode trees and field type maps to verify round-trip fidelity.
public class RelJsonSerializer {

benchmarks/src/jmh/java/org/opensearch/sql/expression/operator/predicate/MergeArrayAndObjectMapBenchmark.java:21

  • [nitpick] The method name testMerge no longer reflects the use of MergeRuleHelper. Consider renaming it to benchmarkMergeRuleHelper for clarity.
  public void testMerge() {

@@ -146,6 +146,6 @@ private ScriptQueryBuilder buildScriptQuery(FunctionExpression node) {
}
return new ScriptQueryBuilder(
new Script(
DEFAULT_SCRIPT_TYPE, EXPRESSION_LANG_NAME, serializer.serialize(node), emptyMap()));
DEFAULT_SCRIPT_TYPE, COMPOUNDED_LANG_NAME, serializer.serialize(node), emptyMap()));
Copy link
Preview

Copilot AI Jul 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pushed-down Script is created with an empty options map, so it defaults to the V2 engine instead of explicitly using Calcite. Include ENGINE_TYPE=CALCITE_ENGINE_TYPE in the options map to ensure the Calcite engine is used.

Suggested change
DEFAULT_SCRIPT_TYPE, COMPOUNDED_LANG_NAME, serializer.serialize(node), emptyMap()));
DEFAULT_SCRIPT_TYPE, COMPOUNDED_LANG_NAME, serializer.serialize(node),
ImmutableMap.of("ENGINE_TYPE", "CALCITE_ENGINE_TYPE")));

Copilot uses AI. Check for mistakes.


// Write bytes of all serializable contents
ByteArrayOutputStream output = new ByteArrayOutputStream();
ObjectOutputStream objectOutput = new ObjectOutputStream(output);
Copy link
Preview

Copilot AI Jul 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using Java ObjectOutputStream for envelope serialization can be slow and carries security risks. Consider switching to a pure JSON-based approach (e.g., Jackson) for the entire envelope and use try-with-resources to manage streams.

Copilot uses AI. Check for mistakes.

final RelDataTypeField field = rowType.getFieldList().get(input);
return rexBuilder.makeInputRef(field.getType(), input);
}
throw new RuntimeException("input field " + input + " is out of range");
Copy link
Preview

Copilot AI Jul 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Throwing a raw RuntimeException is not descriptive. Consider using IndexOutOfBoundsException or IllegalArgumentException to better convey the nature of the error.

Suggested change
throw new RuntimeException("input field " + input + " is out of range");
throw new IndexOutOfBoundsException("input field " + input + " is out of range");

Copilot uses AI. Check for mistakes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants