- 
                Notifications
    You must be signed in to change notification settings 
- Fork 114
Support UDF in plan generator #3040
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
      
        
              This comment was marked as outdated.
        
        
      
    
  This comment was marked as outdated.
      
        
              This comment was marked as outdated.
        
        
      
    
  This comment was marked as outdated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am putting @alecgrieser on this review as well to make sure the proto layout and dependencies are what we want. I left a bunch of comments. There is some more work required but it's on the right track. I would recommend not ever removing -Werror (not even in your local build). It is useful. Also, please make sure the PR runs through the PRB.
        
          
                fdb-record-layer-core/src/main/java/com/apple/foundationdb/record/RecordMetaData.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
              
          
                ...rd-layer-core/src/main/java/com/apple/foundationdb/record/metadata/ScalarValuedFunction.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
              
          
                ...rd-layer-core/src/main/java/com/apple/foundationdb/record/metadata/ScalarValuedFunction.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
              
          
                ...ord-layer-core/src/main/java/com/apple/foundationdb/record/query/plan/cascades/Function.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
              
          
                ...ayer-core/src/main/java/com/apple/foundationdb/record/query/plan/cascades/MacroFunction.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
              
          
                ...in/java/com/apple/foundationdb/record/query/plan/cascades/KeyExpressionExpansionVisitor.java
          
            Show resolved
            Hide resolved
        
              
          
                ...rd-layer-core/src/main/java/com/apple/foundationdb/record/metadata/ScalarValuedFunction.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
              
          
                fdb-record-layer-core/src/main/java/com/apple/foundationdb/record/metadata/Key.java
          
            Show resolved
            Hide resolved
        
              
          
                fdb-record-layer-core/src/main/java/com/apple/foundationdb/record/RecordMetaData.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
      
      
        
              This comment was marked as outdated.
        
        
      
    
  This comment was marked as outdated.
| for (RecordMetaDataProto.PUserDefinedFunction function: metaDataProto.getUserDefinedFunctionsList()) { | ||
| UserDefinedFunction func = (UserDefinedFunction)PlanSerialization.dispatchFromProtoContainer(new PlanSerializationContext(DefaultPlanSerializationRegistry.INSTANCE, | ||
| PlanHashable.CURRENT_FOR_CONTINUATION), function); | ||
| UserDefinedFunction func = (UserDefinedFunction)PlanSerialization.dispatchFromProtoContainer(serializationContext, function); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
serializationContext stores record types that it has "seen" before, for record type that we've "seen", it'll be serialized simply with a reference_id, so we need the "seen" types to deserialize the proto as well.
| @Nonnull | ||
| public Optional<? extends CatalogedFunction> lookup(@Nonnull final String functionName, Expressions arguments) { | ||
| final var functionSupplier = functionsMap.get(functionName); | ||
| final var functionSupplier = isCaseSensitive ? functionsMap.get(functionName) : functionsMap.get(functionName.toUpperCase(Locale.ROOT)); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TemporaryFunctionTest.createTemporaryFunctionCaseSensitivityOption failed so I changed these 2 lines, confused why it worked before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please revert this line? it is actually incorrect to perform any normalization here. I think the tested you referred to should pass, but if it is not, we'll have to think about the root cause of the bug and solve it there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so we'll need to handle the case sensitivity somewhere, for example right now if I create a function with lower case, it'll be stored as upper case, but when I call the function with lower case, it'll only look for lower case, so it'll not be able to find the function. In the SqlFunctionCatalogImpl.lookUpBuiltInFunction, we do final var functionValidator = builtInSynonyms.get(name.toLowerCase(Locale.ROOT));. so that's why I think we should put something similar in the UserDefinedFunctionCatalog.lookup`. wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you revert all the changes in this file, and run the test TemporaryFunctionTest.createTemporaryFunctionCaseSensitivityOption it should pass if you're working on top of #3511.
Now, I see that the YAML test is failing, this is due to the way you're parsing the function name without consulting the case-sensitivity flag in the plan generator.
    @Nonnull
    @Override
    public Expression visitUserDefinedScalarFunctionCall(@Nonnull RelationalParser.UserDefinedScalarFunctionCallContext ctx) {
        final var functionName = ctx.userDefinedScalarFunctionName().getText(); <---------
        Expressions arguments = visitFunctionArgs(ctx.functionArgs());
        return getDelegate().resolveFunction(functionName, arguments.asList().toArray(new Expression[0]));
    }This is not correct, instead you should parse the function name with case-sensitivity checks, this is for example done in visitUid:
Lines 73 to 81 in 6c0597e
| @Nonnull | |
| @Override | |
| public Identifier visitUid(@Nonnull RelationalParser.UidContext uidContext) { | |
| if (uidContext.simpleId() != null) { | |
| return visitSimpleId(uidContext.simpleId()); | |
| } else { | |
| return Identifier.of(getDelegate().normalizeString(uidContext.getText())); | |
| } | |
| } | 
So, you can for example change the parser rules and make userDefinedScalarFunctionName delegate to uid:
userDefinedScalarFunctionName
    : uid
    ;
Then, in visitUserDefinedScalarFunctionCall make it delegate to to visitUid:
    @Nonnull
    @Override
    public Expression visitUserDefinedScalarFunctionCall(@Nonnull RelationalParser.UserDefinedScalarFunctionCallContext ctx) {
        final var functionName = visitUid(ctx.userDefinedScalarFunctionName().uid());
        Expressions arguments = visitFunctionArgs(ctx.functionArgs());
        return getDelegate().resolveFunction(functionName.getName(), arguments.asList().toArray(new Expression[0]));
    }That way, parsing the function name respects whatever case-sensitivity option set in the SQL session. The YAML test should pass.
(I haven't checked whether this is causing any ambiguity problems in the parser though, so you might want to double-check before refactoring according to the above).
| } | ||
| } | ||
|  | ||
| @Test | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
working on moving these tests to yaml test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be nice to make the YAML tests part of this PR as well :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, moved to yaml :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we keep the tests or remove them now that we have equivalent tests in YAML.
        
          
                fdb-record-layer-core/src/main/java/com/apple/foundationdb/record/RecordMetaDataBuilder.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
      | // during plan generation of the temporary function. The literals and combined with query literals and provided | ||
| // for the execution of a (cached) physical plan. | ||
| final var compiledFunction = recordLayerRoutine.getCompilableSqlFunctionSupplier().apply(caseSensitive); | ||
| final var compiledFunction = (CompiledSqlFunction)recordLayerRoutine.getUserDefinedFunctionSupplier().apply(caseSensitive); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nothing is preventing us from creating a temporary marco function, but if the user does so, I suppose the cast above will throw?
        
          
                ...core/src/main/java/com/apple/foundationdb/relational/recordlayer/query/SemanticAnalyzer.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
      | } | ||
|  | ||
| @Nonnull | ||
| public List<String> removePrefix(@Nonnull Identifier prefix) { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be removed once the already existing resolver is used to resolve the nested field which is the result of the macro function.
        
          
                ...re/src/test/java/com/apple/foundationdb/relational/recordlayer/query/AstNormalizerTests.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
      | } | ||
| } | ||
|  | ||
| @Test | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be nice to make the YAML tests part of this PR as well :-)
        
          
                ...ain/java/com/apple/foundationdb/relational/recordlayer/query/visitors/DelegatingVisitor.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
              
          
                ...ain/java/com/apple/foundationdb/relational/recordlayer/query/visitors/DelegatingVisitor.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
              
          
                ...ain/java/com/apple/foundationdb/relational/recordlayer/query/visitors/DelegatingVisitor.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
      There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
| @@ -1,5 +1,5 @@ | |||
| /* | |||
| * MacroFunction.java | |||
| * UserDefinedScalarFunction.java | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a scalar function:
CREATE FUNCTION FOO(x INT) AS
   SELECT COUNT(a)
   FROM T
   WHERE x = 10
but it's not what this PR is about. I don't like the new name and vote for a name that reflects that the function is
- scalar
- it also expands in place
- makes it clear it's not the kind of function I listed above.
| @@ -38,15 +38,15 @@ | |||
| import java.util.stream.Collectors; | |||
|  | |||
| /** | |||
| * MacroFunction that expands a body (referring to parameters) into a {@link Value} (through encapsulation) call site. | |||
| * UserDefinedScalarFunction that expands a body (referring to parameters) into a {@link Value} (through encapsulation) call site. | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are user-defined scalar functions that are not covered by this kind of function, i.e:
CREATE FUNCTION foo(p int) AS
  SELECT count(a)
  FROM T
  GROUP BY x
  WHERE x = p
These are regular SQL-bodied functions that do provably (or only at-runtime provably) return exactly one value.
I liked the name MacroFunction as it encapsulated the drop in/replace logic of these mini-functions. Can we rename it back? Or something else that clearly describes what this thing does?
| @@ -100,18 +100,18 @@ public static MacroFunction fromProto(@Nonnull final PlanSerializationContext se | |||
| * Deserializer. | |||
| */ | |||
| @AutoService(PlanDeserializer.class) | |||
| public static class Deserializer implements PlanDeserializer<PMacroFunctionValue, MacroFunction> { | |||
| public static class Deserializer implements PlanDeserializer<PUserDefinedScalarFunctionValue, UserDefinedScalarFunction> { | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we please refrain from using plan serializer logic here. Can you refactor this logic to express the dynamic dispatch in a common way that plan serialization and this logic + @hatyo 's view all use the same underlying logic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite understand here, isn't RawSqlFunction using the same logic? let's sync offline.
| } | ||
| PlanSerializationContext serializationContext = null; | ||
| for (RecordMetaDataProto.PUserDefinedFunction function: metaDataProto.getUserDefinedFunctionsList()) { | ||
| if (serializationContext == null) { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not needed in a refactored world. Strictly speaking, the plan serialization context is necessary state such that repeated sub plans/values/types do not get serialized repeatedly. It a little bit like dictionary encoding. I think this is not needed here.
| PNumericAggregationValue.PBitmapConstructAgg numeric_aggregation_value_bitmap_construct_agg = 45; | ||
| PQuantifiedRecordValue quantified_record_value = 46; | ||
| PMacroFunctionValue macro_function_value = 47; | ||
| PUserDefinedScalarFunctionValue macro_function_value = 47; | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PUserDefinedScalarFunctionValues counterpart is not a Value so this should be defined here. Also the name should be PUserDefinedScalarFunction because that's its counterpart. This file should not be touched by this PR at all I think, it  should all go into metadata.
| message PUserDefinedFunction { | ||
| oneof specific_function { | ||
| com.apple.foundationdb.record.planprotos.PMacroFunctionValue macro_function = 1; | ||
| com.apple.foundationdb.record.planprotos.PUserDefinedScalarFunctionValue user_defined_scalar_function = 1; | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move that message definition to this file, rename it to PUserDefinedScalarFunction, see below.
| @@ -0,0 +1,174 @@ | |||
| /* | |||
| * MacroFunctionTest.java | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| * MacroFunctionTest.java | |
| * UserDefinedScalaFunctionTest.java | 
|  | ||
| /** | ||
| * MacroFunction that expands a body (referring to parameters) into a {@link Value} (through encapsulation) call site. | ||
| * UserDefinedScalarFunction that expands a body (referring to parameters) into a {@link Value} (through encapsulation) call site. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| * UserDefinedScalarFunction that expands a body (referring to parameters) into a {@link Value} (through encapsulation) call site. | |
| * UserDefinedMacroFunction that expands a body (referring to parameters) into a {@link Value} (through encapsulation) call site. | 
| 📊 Metrics Diff Analysis ReportSummary
 ℹ️ About this analysisThis automated analysis compares query planner metrics between the base branch and this PR. It categorizes changes into: 
 The last category in particular may indicate planner regressions that should be investigated. New QueriesCount of new queries by file: 
 | 
This fixes #3675