Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-36898][Table] Support SQL FLOOR and CEIL functions with NanoSecond and MicroSecond for TIMESTAMP_TLZ #25897

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

hanyuzheng7
Copy link
Contributor

@hanyuzheng7 hanyuzheng7 commented Jan 5, 2025

What is the purpose of the change

We need a fix for both MICROSECOND and NANOSECOND.

The following query doesn't work:

SELECT
  FLOOR(
      CAST(TIMESTAMP '2024-04-25 17:19:42.654' AS TIMESTAMP_LTZ(3))
  TO MICROSECOND) 
SELECT
  FLOOR(
      CAST(TIMESTAMP '2024-04-25 17:19:42.654' AS TIMESTAMP_LTZ(3))
  TO NANOSECOND) 

Stack trace for the first not working query from above:

Caused by: io.confluent.flink.table.utils.CleanedException: org.codehaus.commons.compiler.CompileException: Line 41, Column 69: No applicable constructor/method found for actual parameters "org.apache.flink.table.data.TimestampData, org.apache.flink.table.data.TimestampData"; candidates are: "public static long org.apache.flink.table.runtime.functions.SqlFunctionUtils.floor(long, long)", "public static float org.apache.flink.table.runtime.functions.SqlFunctionUtils.floor(float)", "public static org.apache.flink.table.data.DecimalData org.apache.flink.table.runtime.functions.SqlFunctionUtils.floor(org.apache.flink.table.data.DecimalData)", "public static int org.apache.flink.table.runtime.functions.SqlFunctionUtils.floor(int, int)", "public static double org.apache.flink.table.runtime.functions.SqlFunctionUtils.floor(double)"
at org.codehaus.janino.UnitCompiler.compileError(UnitCompiler.java:13080)
at org.codehaus.janino.UnitCompiler.findMostSpecificIInvocable(UnitCompiler.java:9646)
at org.codehaus.janino.UnitCompiler.findIMethod(UnitCompiler.java:9506)
at org.codehaus.janino.UnitCompiler.findIMethod(UnitCompiler.java:9422)
at org.codehaus.janino.UnitCompiler.compileGet2(UnitCompiler.java:5263)
... 

Brief change log

(for example:)

  • The TaskInfo is stored in the blob store on job creation time as a persistent artifact
  • Deployments RPC transmits only the blob storage reference
  • TaskManagers retrieve the TaskInfo from the blob cache

Verifying this change

  • Added unit tests in TimeFunctionsITCase

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (yes / no)
  • The serializers: (yes / no / don't know)
  • The runtime per-record code paths (performance sensitive): (yes / no / don't know)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
  • The S3 file system connector: (yes / no / don't know)

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

@hanyuzheng7 hanyuzheng7 marked this pull request as ready for review January 5, 2025 01:25
@flinkbot
Copy link
Collaborator

flinkbot commented Jan 5, 2025

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

@snuyanzin
Copy link
Contributor

@hanyuzheng7 thanks for the PR

would be also great if you fill the PR form

Comment on lines 626 to 628
"CAST(CEIL(CAST(f3 AS TIMESTAMP_LTZ(4)) TO MICROSECOND) AS STRING)",
LocalDateTime.of(2021, 9, 24, 9, 20, 50, 924_000_000)
.format(TIMESTAMP_FORMATTER),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we trying to check TO MICROSECOND with millis only?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @snuyanzin, I just updated the test and increase the precision to 9, so we can check TO MICROSECOND and TO NANOSECOND now.

@hanyuzheng7
Copy link
Contributor Author

hanyuzheng7 commented Jan 8, 2025

@hanyuzheng7 thanks for the PR

would be also great if you fill the PR form

Thank you, I forget added description, I will add it.

if terms.length + 1 == method.getParameterCount &&
method.getParameterTypes()(terms.length) == classOf[TimeZone] =>
MINUTE | SECOND | MILLISECOND | MICROSECOND | NANOSECOND
if terms.length + 2 == method.getParameterCount &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we can make the code more readable and not use terms.length + 2 or terms.length + 1).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @davidradl, I also feel the code here is not very grace, do you have some ideas to make it more decent.

@@ -446,7 +446,7 @@ private Stream<TestSetSpec> ceilTestCases() {
LocalDate.of(1990, 10, 14),
LocalDateTime.of(2020, 2, 29, 1, 56, 59, 987654321),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we stopped using this constant, right
then why do we need it here?

Copy link
Contributor Author

@hanyuzheng7 hanyuzheng7 Jan 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We not stop using this constant, other tests will use it. For TIMESTAMP_TLZ tests, we use f3 now, old tests will use f2.

@@ -629,8 +645,9 @@ private Stream<TestSetSpec> floorTestCases() {
// Fractional seconds are lost
LocalTime.of(11, 22, 33),
LocalDate.of(1990, 10, 14),
LocalDateTime.of(2020, 2, 29, 1, 56, 59, 987654321))
.andDataTypes(TIME(), DATE(), TIMESTAMP())
LocalDateTime.of(2020, 2, 29, 1, 56, 59, 987654321),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we stopped using this constant, right
then why do we need it here?

Copy link
Contributor Author

@hanyuzheng7 hanyuzheng7 Jan 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the test such as https://github.com/apache/flink/pull/25897/files#diff-7588925e682b7f5b4b2e41fdc4992ef0d9f246ce00696f783c36f993b8aeb7aeR776-L770 still use f2, the reason I use f3 for TIMESTAMP_TLZ tests is the precision increased now, we should set f3 precision to 9 here,
if all TIMESTAMP_TLZ tests use f3, I don't need change the tests which use f2 before.

Copy link
Contributor

@snuyanzin snuyanzin Jan 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and as a result we don't need a value under f2 at all right?

Copy link
Contributor Author

@hanyuzheng7 hanyuzheng7 Jan 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, as a result, we don't use f2 and value under f2 for CAST LocalDateTime to TIMESTAMP_LTZ tests. All of them use f3.
I just let all tests such as CAST(CEIL(CAST(f3 AS TIMESTAMP_LTZ(9)) TO SECOND) AS STRING) use f3

  .testResult(
                                $("f3").cast(TIMESTAMP_LTZ(9))
                                        .ceil(TimeIntervalUnit.SECOND)
                                        .cast(STRING()),
                                "CAST(CEIL(CAST(f3 AS TIMESTAMP_LTZ(9)) TO SECOND) AS STRING)",
                                LocalDateTime.of(2021, 9, 24, 9, 20, 51, 0)
                                        .format(TIMESTAMP_FORMATTER),
                                STRING().nullable())

But the tests such as CEIL(f2 TO DECADE)

 .testResult(
                                $("f2").ceil(TimeIntervalUnit.DECADE),
                                "CEIL(f2 TO DECADE)",
                                LocalDateTime.of(2030, 1, 1, 0, 0),
                                TIMESTAMP().nullable())

I do not change it, so we need f2, other tests still use it.

Tests using .format(TIMESTAMP_FORMATTER), will use f3 now. Because we change TIMESTAMP_FORMATTER from DateTimeFormatter.ofPattern("yyyy-MM-dd' 'HH:mm:ss.SSS");
to DateTimeFormatter.ofPattern("yyyy-MM-dd' 'HH:mm:ss.SSSSSSSSS"); and f3 precision is 9 to match the DateTimeFormatter.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok I see, thanks

Comment on lines +630 to +635
.testResult(
$("f3").cast(TIMESTAMP_LTZ(9))
.ceil(TimeIntervalUnit.NANOSECOND)
.cast(STRING()),
"CAST(CEIL(CAST(f3 AS TIMESTAMP_LTZ(9)) TO NANOSECOND) AS STRING)",
LocalDateTime.of(2021, 9, 24, 9, 20, 50, 924_325_471)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we introduce nanos in a separate var
can we have a test like this

CAST(CEIL(CAST(f2 AS TIMESTAMP_LTZ(9)) TO MICROSECOND) AS STRING)

where f2 is LocalDateTime.of(2021, 9, 24, 9, 20, 50, 999_999_471)
to be sure it works correctly

Copy link
Contributor Author

@hanyuzheng7 hanyuzheng7 Jan 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the issue is f2 use TIMESTAMP()

 .onFieldsWithData(
                                // https://issues.apache.org/jira/browse/FLINK-17224
                                // Fractional seconds are lost
                                LocalTime.of(11, 22, 33),
                                LocalDate.of(1990, 10, 14),
                                LocalDateTime.of(2020, 2, 29, 1, 56, 59, 987654321),
                                LocalDateTime.of(2021, 9, 24, 9, 20, 50, 924325471))
                        .andDataTypes(TIME(), DATE(), TIMESTAMP(), TIMESTAMP(9))

, which means the precision is 6. so the data deliver to TimestampData is
999_999_000. If we change the TIMESTAMP() for f2 to TIMESTAMP(9), all other tests will also change.
I think we can change f3 to LocalDateTime.of(2021, 9, 24, 9, 20, 50, 999_999_471)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we can change f3 to LocalDateTime.of(2021, 9, 24, 9, 20, 50, 999_999_471)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

f2 was just an example, it could other var name

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I will use LocalDateTime.of(2021, 9, 24, 9, 20, 50, 999_999_471) to tests new cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants