Conversation

@pengpeng-lu
Contributor

@pengpeng-lu pengpeng-lu commented Jan 14, 2025

This fixes #3675

@foundationdb-ci

This comment was marked as outdated.

@foundationdb-ci

This comment was marked as outdated.

@pengpeng-lu pengpeng-lu requested a review from hatyo January 16, 2025 16:44
Contributor

@normen662 normen662 left a comment

I am putting @alecgrieser on this review as well to make sure the proto layout and dependencies are what we want. I left a bunch of comments. There is some more work required, but it's on the right track. I would recommend never removing -Werror (not even in your local build); it is useful. Also, please make sure the PR runs through the PRB.

@normen662 normen662 requested a review from alecgrieser February 3, 2025 09:22
@foundationdb-ci

This comment was marked as outdated.

@pengpeng-lu pengpeng-lu added enhancement New feature or request and removed Draft labels Mar 21, 2025
@pengpeng-lu pengpeng-lu marked this pull request as ready for review March 31, 2025 19:36
for (RecordMetaDataProto.PUserDefinedFunction function: metaDataProto.getUserDefinedFunctionsList()) {
UserDefinedFunction func = (UserDefinedFunction)PlanSerialization.dispatchFromProtoContainer(new PlanSerializationContext(DefaultPlanSerializationRegistry.INSTANCE,
PlanHashable.CURRENT_FOR_CONTINUATION), function);
UserDefinedFunction func = (UserDefinedFunction)PlanSerialization.dispatchFromProtoContainer(serializationContext, function);
Contributor Author

serializationContext stores the record types that it has "seen" before. A record type that has already been "seen" is serialized simply as a reference_id, so we need the same "seen" types to deserialize the proto as well.
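
For illustration only, the "seen" bookkeeping described above can be sketched as follows. This is a minimal sketch under assumptions, not the Record Layer implementation: `SeenTypesSketch`, `Writer`, `Reader`, and the `def:`/`ref:` token format are all hypothetical. It only shows why the reading side needs the same state as the writing side.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these names exist in the Record Layer.
final class SeenTypesSketch {
    // Writing side: emits the full definition once, then only a reference id.
    static final class Writer {
        private final Map<String, Integer> seen = new HashMap<>();

        String write(String typeDefinition) {
            Integer id = seen.get(typeDefinition);
            if (id != null) {
                return "ref:" + id; // already seen: reference only
            }
            int newId = seen.size();
            seen.put(typeDefinition, newId);
            return "def:" + newId + ":" + typeDefinition;
        }
    }

    // Reading side: must observe the definitions in order to resolve references.
    static final class Reader {
        private final Map<Integer, String> seen = new HashMap<>();

        String read(String token) {
            if (token.startsWith("ref:")) {
                int id = Integer.parseInt(token.substring(4));
                String definition = seen.get(id);
                if (definition == null) {
                    throw new IllegalStateException("unknown reference id " + id);
                }
                return definition;
            }
            // Token shape assumed here: def:<id>:<definition>
            int separator = token.indexOf(':', 4);
            int id = Integer.parseInt(token.substring(4, separator));
            String definition = token.substring(separator + 1);
            seen.put(id, definition);
            return definition;
        }
    }
}
```

In this sketch, a reader that has not processed the earlier definition cannot resolve a bare reference, which is the situation the comment above is guarding against.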

@Nonnull
public Optional<? extends CatalogedFunction> lookup(@Nonnull final String functionName, Expressions arguments) {
final var functionSupplier = functionsMap.get(functionName);
final var functionSupplier = isCaseSensitive ? functionsMap.get(functionName) : functionsMap.get(functionName.toUpperCase(Locale.ROOT));
Contributor Author

TemporaryFunctionTest.createTemporaryFunctionCaseSensitivityOption failed, so I changed these 2 lines. I'm confused about why it worked before.

Contributor

Can you please revert this line? It is actually incorrect to perform any normalization here. I think the test you referred to should pass, but if it does not, we'll have to find the root cause of the bug and solve it there.

Contributor Author

So we'll need to handle case sensitivity somewhere. For example, right now if I create a function with a lower-case name, it is stored in upper case, but when I call the function with the lower-case name, the lookup only searches for the lower-case name, so it cannot find the function. In SqlFunctionCatalogImpl.lookUpBuiltInFunction, we do final var functionValidator = builtInSynonyms.get(name.toLowerCase(Locale.ROOT));, so that's why I think we should put something similar in UserDefinedFunctionCatalog.lookup. WDYT?

Contributor

If you revert all the changes in this file, and run the test TemporaryFunctionTest.createTemporaryFunctionCaseSensitivityOption it should pass if you're working on top of #3511.

Now, I see that the YAML test is failing. This is due to the way you're parsing the function name without consulting the case-sensitivity flag in the plan generator.

    @Nonnull
    @Override
    public Expression visitUserDefinedScalarFunctionCall(@Nonnull RelationalParser.UserDefinedScalarFunctionCallContext ctx) {
        final var functionName = ctx.userDefinedScalarFunctionName().getText(); <---------
        Expressions arguments = visitFunctionArgs(ctx.functionArgs());
        return getDelegate().resolveFunction(functionName, arguments.asList().toArray(new Expression[0]));
    }

This is not correct. Instead, you should parse the function name with case-sensitivity checks. This is, for example, done in visitUid:

    @Nonnull
    @Override
    public Identifier visitUid(@Nonnull RelationalParser.UidContext uidContext) {
        if (uidContext.simpleId() != null) {
            return visitSimpleId(uidContext.simpleId());
        } else {
            return Identifier.of(getDelegate().normalizeString(uidContext.getText()));
        }
    }

So, you can for example change the parser rules and make userDefinedScalarFunctionName delegate to uid:

userDefinedScalarFunctionName
    : uid
    ;

Then, in visitUserDefinedScalarFunctionCall, make it delegate to visitUid:

    @Nonnull
    @Override
    public Expression visitUserDefinedScalarFunctionCall(@Nonnull RelationalParser.UserDefinedScalarFunctionCallContext ctx) {
        final var functionName = visitUid(ctx.userDefinedScalarFunctionName().uid());
        Expressions arguments = visitFunctionArgs(ctx.functionArgs());
        return getDelegate().resolveFunction(functionName.getName(), arguments.asList().toArray(new Expression[0]));
    }

That way, parsing the function name respects whatever case-sensitivity option is set in the SQL session. The YAML test should pass.

(I haven't checked whether this is causing any ambiguity problems in the parser though, so you might want to double-check before refactoring according to the above).
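
To illustrate the principle behind this suggestion, here is a minimal, self-contained sketch. It is not Record Layer code: `IdentifierNormalizationSketch` and its methods are hypothetical, and it assumes that a case-insensitive session upper-cases identifiers (quoted identifiers are ignored for brevity). The point is that the definition and the call site both normalize the name while parsing, so the catalog lookup itself can stay an exact match.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: names and behavior are assumptions for illustration.
final class IdentifierNormalizationSketch {
    private final boolean caseSensitive;
    private final Map<String, String> functions = new HashMap<>();

    IdentifierNormalizationSketch(boolean caseSensitive) {
        this.caseSensitive = caseSensitive;
    }

    // Parse-time normalization: the only place that consults the session option.
    String normalize(String rawIdentifier) {
        return caseSensitive ? rawIdentifier : rawIdentifier.toUpperCase(Locale.ROOT);
    }

    // Definition path: the name is normalized while parsing, then stored.
    void define(String rawName, String body) {
        functions.put(normalize(rawName), body);
    }

    // Call-site path: the name is normalized while parsing, then looked up.
    Optional<String> resolveCall(String rawName) {
        return lookup(normalize(rawName));
    }

    // Catalog lookup: exact match only, no case folding here.
    Optional<String> lookup(String normalizedName) {
        return Optional.ofNullable(functions.get(normalizedName));
    }
}
```

With `caseSensitive` set to false, a function defined as `myFunc` resolves from a call written as `MYFUNC` or `myfunc`; with it set to true, only the exact spelling resolves.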

}
}

@Test
Contributor Author

Working on moving these tests to YAML tests.

Contributor

would be nice to make the YAML tests part of this PR as well :-)

Contributor Author

yes, moved to yaml :)

Contributor

Should we keep the tests or remove them now that we have equivalent tests in YAML?

// during plan generation of the temporary function. The literals and combined with query literals and provided
// for the execution of a (cached) physical plan.
final var compiledFunction = recordLayerRoutine.getCompilableSqlFunctionSupplier().apply(caseSensitive);
final var compiledFunction = (CompiledSqlFunction)recordLayerRoutine.getUserDefinedFunctionSupplier().apply(caseSensitive);
Contributor

Nothing is preventing us from creating a temporary macro function, but if the user does so, I suppose the cast above will throw?
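
One way to make that failure explicit, sketched under assumptions: the types below are simplified stand-ins declared only for this example (they are not the real Record Layer classes of the same names), and `requireCompiled` is a hypothetical helper. The sketch replaces an unchecked cast with a type check that reports a descriptive error.

```java
// Hypothetical sketch: all types here are simplified stand-ins for illustration.
final class TemporaryFunctionCastSketch {
    interface UserDefinedFunction {
        String name();
    }

    static final class CompiledSqlFunction implements UserDefinedFunction {
        private final String name;

        CompiledSqlFunction(String name) {
            this.name = name;
        }

        @Override
        public String name() {
            return name;
        }
    }

    static final class MacroFunction implements UserDefinedFunction {
        private final String name;

        MacroFunction(String name) {
            this.name = name;
        }

        @Override
        public String name() {
            return name;
        }
    }

    // Checks the runtime type before narrowing, instead of relying on a
    // ClassCastException from a bare cast.
    static CompiledSqlFunction requireCompiled(UserDefinedFunction function) {
        if (!(function instanceof CompiledSqlFunction)) {
            throw new IllegalArgumentException(
                    "temporary function " + function.name() + " is not a compiled SQL function");
        }
        return (CompiledSqlFunction) function;
    }
}
```

Whether a temporary macro function should be rejected like this or supported is a design decision for the PR; the sketch only shows the guarded cast.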

}

@Nonnull
public List<String> removePrefix(@Nonnull Identifier prefix) {
Contributor

This can be removed once the already existing resolver is used to resolve the nested field which is the result of the macro function.

@pengpeng-lu pengpeng-lu requested a review from hatyo October 13, 2025 22:28
Contributor

@hatyo hatyo left a comment

LGTM

@pengpeng-lu pengpeng-lu requested review from normen662 and removed request for alecgrieser October 14, 2025 19:15
@@ -1,5 +1,5 @@
/*
* MacroFunction.java
* UserDefinedScalarFunction.java
Contributor

This is a scalar function:

CREATE FUNCTION FOO(x INT) AS
   SELECT COUNT(a)
   FROM T
   WHERE x = 10

but it's not what this PR is about. I don't like the new name and vote for a name that reflects that the function:

  1. is scalar
  2. expands in place
  3. is clearly not the kind of function I listed above.

@@ -38,15 +38,15 @@
import java.util.stream.Collectors;

/**
* MacroFunction that expands a body (referring to parameters) into a {@link Value} (through encapsulation) call site.
* UserDefinedScalarFunction that expands a body (referring to parameters) into a {@link Value} (through encapsulation) call site.
Contributor

There are user-defined scalar functions that are not covered by this kind of function, e.g.:

CREATE FUNCTION foo(p int) AS
  SELECT count(a)
  FROM T
  WHERE x = p
  GROUP BY x

These are regular SQL-bodied functions that do provably (or only at-runtime provably) return exactly one value.

I liked the name MacroFunction as it encapsulated the drop in/replace logic of these mini-functions. Can we rename it back? Or something else that clearly describes what this thing does?

@@ -100,18 +100,18 @@ public static MacroFunction fromProto(@Nonnull final PlanSerializationContext se
* Deserializer.
*/
@AutoService(PlanDeserializer.class)
public static class Deserializer implements PlanDeserializer<PMacroFunctionValue, MacroFunction> {
public static class Deserializer implements PlanDeserializer<PUserDefinedScalarFunctionValue, UserDefinedScalarFunction> {
Contributor

Can we please refrain from using plan serializer logic here? Can you refactor this so that the dynamic dispatch is expressed in a common way, with plan serialization, this logic, and @hatyo's view all using the same underlying logic?

Contributor Author

I don't quite understand here. Isn't RawSqlFunction using the same logic? Let's sync offline.

}
PlanSerializationContext serializationContext = null;
for (RecordMetaDataProto.PUserDefinedFunction function: metaDataProto.getUserDefinedFunctionsList()) {
if (serializationContext == null) {
Contributor

This is not needed in a refactored world. Strictly speaking, the plan serialization context is state that is needed so that repeated sub-plans/values/types do not get serialized repeatedly. It is a little bit like dictionary encoding. I think this is not needed here.

PNumericAggregationValue.PBitmapConstructAgg numeric_aggregation_value_bitmap_construct_agg = 45;
PQuantifiedRecordValue quantified_record_value = 46;
PMacroFunctionValue macro_function_value = 47;
PUserDefinedScalarFunctionValue macro_function_value = 47;
Contributor

PUserDefinedScalarFunctionValue's counterpart is not a Value, so this should not be defined here. Also, the name should be PUserDefinedScalarFunction because that's its counterpart. I think this file should not be touched by this PR at all; it should all go into metadata.

message PUserDefinedFunction {
oneof specific_function {
com.apple.foundationdb.record.planprotos.PMacroFunctionValue macro_function = 1;
com.apple.foundationdb.record.planprotos.PUserDefinedScalarFunctionValue user_defined_scalar_function = 1;
Contributor

Please move that message definition to this file, rename it to PUserDefinedScalarFunction, see below.

@@ -0,0 +1,174 @@
/*
* MacroFunctionTest.java
Contributor

Suggested change
* MacroFunctionTest.java
* UserDefinedScalarFunctionTest.java

@pengpeng-lu pengpeng-lu requested a review from normen662 October 26, 2025 18:20

/**
* MacroFunction that expands a body (referring to parameters) into a {@link Value} (through encapsulation) call site.
* UserDefinedScalarFunction that expands a body (referring to parameters) into a {@link Value} (through encapsulation) call site.
Contributor

Suggested change
* UserDefinedScalarFunction that expands a body (referring to parameters) into a {@link Value} (through encapsulation) call site.
* UserDefinedMacroFunction that expands a body (referring to parameters) into a {@link Value} (through encapsulation) call site.

@github-actions

📊 Metrics Diff Analysis Report

Summary

  • New queries: 13
  • Dropped queries: 0
  • Plan changed + metrics changed: 0
  • Plan unchanged + metrics changed: 0
ℹ️ About this analysis

This automated analysis compares query planner metrics between the base branch and this PR. It categorizes changes into:

  • New queries: Queries added in this PR
  • Dropped queries: Queries removed in this PR. These should be reviewed to ensure we are not losing coverage.
  • Plan changed + metrics changed: The query plan has changed along with planner metrics.
  • Metrics only changed: Same plan but different metrics

The last category in particular may indicate planner regressions that should be investigated.

New Queries

Count of new queries by file:

  • yaml-tests/src/test/resources/user-defined-macro-function-tests.metrics.yaml: 13

@pengpeng-lu pengpeng-lu merged commit f47d9a8 into FoundationDB:main Oct 29, 2025
9 checks passed

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Implement a dereferencing UDF function

4 participants