-
Notifications
You must be signed in to change notification settings - Fork 93
feat: handle parameterConsistency option in YAML extensions #624
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: handle parameterConsistency option in YAML extensions #624
Conversation
vbarua
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments. The big question I have is actually around what INCONSISTENT is supposed to mean. It's not super clear to me based on what's currently in the specification, and I'm not sure if the checks you've added in the FunctionConverter are correct. We can follow up with the upstream to clarify the intended behaviour for the Isthmus code.
The code you've added to the core to parse the parameterConsistency looks good to me, and stops substrait-java from blowing up if it encounters an extension with parameter consistency set as Ben pointed out in #622. If you make the core parsing changes their own PR, I'd be happy to merge those in.
| import org.junit.jupiter.api.Test; | ||
|
|
||
| /** Tests for variadic parameter consistency validation in FunctionConverter. */ | ||
| class VariadicParameterConsistencyTest extends PlanTestBase { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest renaming this to VariadicBehaviourTest for generality. You've already got some limited testing for min behaviour present, and this would be a good place to add more tests for this kind of stuff.
| } | ||
|
|
||
| @Test | ||
| public void testParameterConsistencyLoading() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a reasonable test, but I don't think it belongs in this test class, which from the comments is more specific to custom type handling in extensions.
A separate VariadicBehaviourTest class (or somesuch name) would be a better fit.
| // have the same type. | ||
| return wildcardToType.values().stream().allMatch(s -> s.size() == 1); | ||
| // When parameterConsistency is INCONSISTENT, wildcard types can differ. | ||
| // When parameterConsistency is CONSISTENT (or not variadic), wildcard types must be the same. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I understand it, the parameterConsistency behaviour only applies to the variadic arguments. It's not related to how we typecheck the enumerated wildcard types (i.e any1, any2, etc). From the (limited) documentation on it:
When the last argument of a function is variadic and declares a type parameter e.g. fn(A, B, C...), the C parameter can be marked as either consistent or inconsistent. If marked as consistent, the function can only be bound to arguments where all the C types are the same concrete type. If marked as inconsistent, each unique C can be bound to a different type within the constraints of what T allows.
CONSISTENT means that the types of all the variadic arguments have to be the same. INCONSISTENT means... I'm not sure tbh because in
each unique C can be bound to a different type within the constraints of what T allows.
I don't know what T is. This is something I can follow up with the community about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I realize that this is attempting to handle the second part of the ticket I linked (#622).
However, I don't believe this is the right place to handle it. Considering that parameterConsistency has meaning regardless of calcite, those checks should really be inside of core/ IMO.
|
|
||
| @Value.Default | ||
| default ParameterConsistency parameterConsistency() { | ||
| return ParameterConsistency.CONSISTENT; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a reasonble default, because I think it's what most people expect in practice and it's also the first value in the enumeration.
We can formalize this in the spec more concretely.
benbellick
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this. Can you make sure to link #622 in your PR description? Also, do you plan on implementing the second piece of that ticket? Which is validating that variadic rguments to functions are consistent with the parameterConsistency argument. Let me know if you need me to clarify anything.
| // have the same type. | ||
| return wildcardToType.values().stream().allMatch(s -> s.size() == 1); | ||
| // When parameterConsistency is INCONSISTENT, wildcard types can differ. | ||
| // When parameterConsistency is CONSISTENT (or not variadic), wildcard types must be the same. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I realize that this is attempting to handle the second part of the ticket I linked (#622).
However, I don't believe this is the right place to handle it. Considering that parameterConsistency has meaning regardless of calcite, those checks should really be inside of core/ IMO.
benbellick
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A lot of my comments are about relatively minor things. But the most important point is about the meaning of INCONSISTENT. I do think that the upstream could benefit from a bit more clarity there. Thanks for your work and let me know if anything I have said is unclear!
| SimpleExtension.VariadicBehavior variadicBehavior = variadic.get(); | ||
| if (variadicBehavior.parameterConsistency() | ||
| != SimpleExtension.VariadicBehavior.ParameterConsistency.CONSISTENT) { | ||
| // INCONSISTENT allows different types, so validation passes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So unfortunately, this is going to be a bit more complicated than it initially seemed. From the docs:
// Each argument can be any possible concrete type afforded by the bounds // of any parameter defined in the arguments specification.
Let me show you a (made-up) example:
urn: "urn:example:extension"
scalar_functions:
- name: "inconsistent_sum"
impls:
- args:
- value: "decimal<P,S>"
variadic:
min: 1
parameterConsistency: INCONSISTENT
return: "decimal<38,S>"
This means that the following are valid:
inconsistent_sum(decimal<10, 2>, decimal<10, 2>, decimal<10, 2>)inconsistent_sum(decimal<10, 2>, decimal<10, 2>)inconsistent_sum(decimal<15, 8>, decimal<11, 8>, decimal<119, 8>)
On the other hand, the following are all invalid:
inconsistent_sum(decimal<10, 2>, decimal<10, 3>, decimal<10, 4>)inconsistent_sum(decimal<10, 2>, i32)
The above example is showing that there is the implicit constraint across the variadic parameters that the scale S must be the same as the output (and thus must all be the same across the variadic parameter). On the other hand, the precision P is free to be anything. But all of the parameters do have to be decimal.
Not sure if that was instructive or not 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For readability, it may make sense to have two private methods that implement each behavior, then have the public method just be a switch on the parameter consistency options. Just an idea!
| // For CONSISTENT, all variadic arguments must have the same type (ignoring nullability) | ||
| int firstVariadicArgIdx = Math.max(variadicBehavior.getMin() - 1, 0); | ||
| for (int i = firstVariadicArgIdx; i < argumentTypes.size() - 1; i++) { | ||
| Type currentType = argumentTypes.get(i); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: could we just save the firstVariadicArg one time and then compare all of the rest of them to that single one (instead of comparing across the list pair wise)? I think this amounts to a more informative error message.
| Type nextType = argumentTypes.get(i + 1); | ||
| // Normalize both types to nullable for comparison (ignoring nullability) | ||
| if (!io.substrait.type.TypeCreator.asNullable(currentType) | ||
| .equals(io.substrait.type.TypeCreator.asNullable(nextType))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: IMO this is useful enough to warrant its own method. Could imagine something in core/src/main/java/io/substrait/type/Type.java like:
/**
* Compares this type with another type, ignoring nullability differences.
*
* @param other the type to compare with
* @return true if the types are equal when both are treated as nullable
*/
default boolean equalsIgnoringNullability(Type other) {
return TypeCreator.asNullable(this).equals(TypeCreator.asNullable(other));
}Thoughts?
| .options(java.util.Collections.emptyMap()) | ||
| .build(); | ||
|
|
||
| return Expression.ScalarFunctionInvocation.builder() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| } | ||
|
|
||
| @Test | ||
| void testConsistentVariadicWithSameTypes() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically every place here where you are using the builders directly, I think it would be better to use the ExpressionCreator helpers. E.g.
Expression.I64Literal.builder().value(1).build() -> ExpressionCreator.i8(false, 1)
|
|
||
| assertEquals( | ||
| SimpleExtension.VariadicBehavior.ParameterConsistency.CONSISTENT, | ||
| collection.scalarFunctions().get(0).variadic().get().parameterConsistency()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets be super thorough and also add a test to make sure that parameterConsistency: INCONSISTENT is also passed through.
|
|
||
| @Test | ||
| void testConsistentVariadicWithDifferentTypes() { | ||
| // Function: test_func(i64, i64...) with CONSISTENT parameterConsistency |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of these comments which explicitly show a text representation of the function being tested are good 👍
| .parameterConsistency(SimpleExtension.VariadicBehavior.ParameterConsistency.CONSISTENT) | ||
| .build(); | ||
|
|
||
| // Variadic arguments have different types - should fail |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But let's drop all of these comments which repeat the message in the assert.
This pull request introduces improved handling and validation of variadic parameter consistency for function signatures, particularly distinguishing between consistent and inconsistent parameter types in variadic arguments. The changes ensure that function argument matching logic respects the
parameterConsistencysetting and are thoroughly tested with new unit tests.Enhancements to variadic parameter consistency:
FunctionConverterso that, whenparameterConsistencyis set toCONSISTENT, all variadic arguments must have the same type (ignoring nullability), and when set toINCONSISTENT, variadic arguments can differ in type. This logic is now explicitly enforced in theinputTypesMatchDefinedArgumentsmethod. [1] [2] [3]parameterConsistencyin theSimpleExtension.VariadicBehaviorinterface, defaulting toCONSISTENTif not specified.FIxes: #622