[FLINK-37914][table] Add built-in OBJECT_UPDATE function #26806

Open: wants to merge 2 commits into base: master

raminqaf
Contributor

What is the purpose of the change

This pull request implements the OBJECT_UPDATE built-in function as part of FLIP-520: Simplify StructuredType handling. The function allows users to update existing fields in structured objects by providing key-value pairs, enabling mutation operations on structured types in both SQL and Table API without requiring custom UDFs.

Brief change log

  • Added ObjectUpdateInputTypeStrategy for validating input arguments (structured object + key-value pairs)
  • Added ObjectUpdateTypeStrategy for inferring return types (same as input structured type)
  • Implemented ObjectUpdateFunction runtime function for performing field updates
  • Added OBJECT_UPDATE to BuiltInFunctionDefinitions with proper type inference strategies
  • Added Table API expression support via objectUpdate() method on expressions
  • Updated SQL and Table API documentation with examples and usage patterns
  • Added comprehensive validation for field names, types, and compatibility checking
  • Implemented proper error handling for non-existent fields and type mismatches
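
The argument checks listed above can be pictured with a small plain-Java model. This is only a sketch under stated assumptions, not the code in this PR: the class `ObjectUpdateArgsSketch`, its `validate` method, and the representation of a structured type as a map from field name to Java class are all hypothetical stand-ins for the real type strategies.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the argument checks described above; not Flink code.
final class ObjectUpdateArgsSketch {

    /**
     * Checks OBJECT_UPDATE-style key-value arguments against a structured type,
     * modeled here as an ordered map of field name to field class (an assumption).
     * Returns the input type unchanged, mirroring "same as input structured type".
     */
    static Map<String, Class<?>> validate(
            Map<String, Class<?>> structuredType, List<Object> keyValuePairs) {
        if (keyValuePairs.isEmpty() || keyValuePairs.size() % 2 != 0) {
            throw new IllegalArgumentException("Expected one or more key-value pairs.");
        }
        for (int i = 0; i < keyValuePairs.size(); i += 2) {
            Object key = keyValuePairs.get(i);
            Object value = keyValuePairs.get(i + 1);
            // instanceof is false for null, so a null key is rejected here as well.
            if (!(key instanceof String)) {
                throw new IllegalArgumentException(
                        "Field name at argument " + i + " must be a non-null string.");
            }
            Class<?> expected = structuredType.get(key);
            if (expected == null) {
                throw new IllegalArgumentException("Unknown field: " + key);
            }
            // Assumption: a null value is accepted for any field in this sketch.
            if (value != null && !expected.isInstance(value)) {
                throw new IllegalArgumentException("Type mismatch for field: " + key);
            }
        }
        return structuredType;
    }

    public static void main(String[] args) {
        Map<String, Class<?>> userType = new LinkedHashMap<>();
        userType.put("name", String.class);
        userType.put("age", Integer.class);
        // Valid call: the result type is the same structured type that went in.
        System.out.println(validate(userType, Arrays.asList("age", 26)).keySet());
    }
}
```

In this model the result type is simply the input type, so an update can never add, drop, or retype a field.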

Verifying this change

This change added tests and can be verified as follows:

  • Added unit tests in ObjectUpdateInputTypeStrategyTest for input validation scenarios
  • Added integration tests in StructuredFunctionsITCase for end-to-end SQL functionality

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): yes (new built-in function)
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): yes (new scalar function execution)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? yes
  • If yes, how is the feature documented? docs / JavaDocs (updated SQL functions documentation and comprehensive JavaDoc comments)

@flinkbot
Collaborator

flinkbot commented Jul 17, 2025

CI report:

Bot commands: the @flinkbot bot supports the following commands:
  • @flinkbot run azure: re-run the last Azure build

- sql: OBJECT_UPDATE(object, key, value [, key, value , ...])
  table: OBJECT.objectUpdate(key, value [, key, value , ...])
  description: |
    Updates existing fields in a structured object by providing key-value pairs.
Contributor

@davidradl davidradl Jul 17, 2025

I am curious:

  • I assume that arrays are not included in this.
  • Are there restrictions on the values, i.e. can they be maps, lists, objects, or arrays? If there are restrictions, we should document them. If there are none, we should include examples with more complex objects.
  • In the example, I assume "com.example.User" is a "path".
    • Could dots in the field name clash with the path?
    • Could you give an example, or a pointer to documentation, showing how to construct these paths? Examples around nested objects would be useful, including an array of objects where you want to update the value of the 3rd object.
  • Can we set null values?
  • If the field is defined as not nullable, do we raise an error when an attempt is made to put a null there?

Contributor Author

@raminqaf raminqaf Jul 18, 2025

  • Arrays should be supported. In the background, the OBJECT_UPDATE uses a structured type, and this test evaluates its functionality.
  • There should be no restrictions on the values. We only check if the type of the updated value matches the one in the structured type.
  • For now, we decided to skip any validation of the field name format. The user is allowed to write OBJECT_UPDATE(obj, "my.field.name", 14). It is up to the user to pass valid field names. For the first iteration, this validation was not necessary.
  • If I understand you correctly, you cannot update an ARRAY type with OBJECT_UPDATE. The OBJECT_UPDATE applies to STRUCTURED_TYPE.
  • Yes, values can be null.
  • Yes, validation errors will be thrown if the field name is null:
    • Here is the validation check, where we see if the field is a non-null String literal.
    • Tests backing this logic.
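
To make these answers concrete, here is a minimal plain-Java sketch of the update semantics as described in this reply. It is not the PR's `ObjectUpdateFunction`: the class `ObjectUpdateValuesSketch`, the map-based object model, the copy-on-update behavior, and the treatment of a dotted name as one flat field name (rather than a nested path) are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of the update semantics discussed above; not Flink code.
final class ObjectUpdateValuesSketch {

    /**
     * Returns a copy of the object with the given existing fields replaced.
     * The object is modeled as an ordered map of field name to value (an assumption).
     */
    static Map<String, Object> objectUpdate(Map<String, Object> object, Object... keyValuePairs) {
        if (keyValuePairs.length == 0 || keyValuePairs.length % 2 != 0) {
            throw new IllegalArgumentException("Expected one or more key-value pairs.");
        }
        // Assumption: the input object is left untouched and a copy is returned.
        Map<String, Object> updated = new LinkedHashMap<>(object);
        for (int i = 0; i < keyValuePairs.length; i += 2) {
            Object key = keyValuePairs[i];
            if (!(key instanceof String)) { // a null key fails this check too
                throw new IllegalArgumentException("Field name must be a non-null string.");
            }
            // Assumption: the name is looked up as-is, so "my.field.name" is a single
            // flat field name and is never split into a nested path.
            if (!updated.containsKey(key)) {
                throw new IllegalArgumentException("Unknown field: " + key);
            }
            updated.put((String) key, keyValuePairs[i + 1]); // the value may be null
        }
        return updated;
    }

    public static void main(String[] args) {
        Map<String, Object> user = new LinkedHashMap<>();
        user.put("name", "Bob");
        user.put("age", 25);
        System.out.println(objectUpdate(user, "age", null));
    }
}
```

Whether a null may be written into a field declared NOT NULL is not settled in this thread, so the sketch does not model field nullability.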

@github-actions github-actions bot added community-reviewed PR has been reviewed by the community. and removed community-reviewed PR has been reviewed by the community. labels Jul 17, 2025
@raminqaf raminqaf force-pushed the FLINK-37914 branch 4 times, most recently from 56a3f1e to e437f51 Compare July 18, 2025 08:19
@github-actions github-actions bot added community-reviewed PR has been reviewed by the community. and removed community-reviewed PR has been reviewed by the community. labels Jul 19, 2025
@github-actions github-actions bot added community-reviewed PR has been reviewed by the community. and removed community-reviewed PR has been reviewed by the community. labels Jul 22, 2025
*
* <pre>{@code
* // Create a structured object representing a user
* User userObject = objectOf("com.example.User", "name", "Bob", "age", 25);
Contributor

These examples are a bit misleading. They read as if User userObject = objectOf is Java code and the function returns a class instance User. The explanation in docs is better.

@github-actions github-actions bot added community-reviewed PR has been reviewed by the community. and removed community-reviewed PR has been reviewed by the community. labels Jul 22, 2025
@raminqaf raminqaf force-pushed the FLINK-37914 branch 2 times, most recently from a40c147 to 5a1cb17 Compare July 23, 2025 11:32
Labels
community-reviewed PR has been reviewed by the community.

4 participants