khaledh commented Oct 22, 2025

What is the purpose of the change

Flink currently uses protobuf-java 3.x, causing compatibility issues for applications requiring protobuf 4.x. This upgrade to protobuf 4.32.1 enables:

  • Applications to use protobuf 4.x features (e.g., Protobuf Editions)
  • Resolution of dependency conflicts and forward compatibility as the protobuf ecosystem moves toward 4.x

The parquet-protobuf integration required a compatibility patch (PatchedProtoWriteSupport) because upstream parquet-java 1.15.2 still uses protobuf 3.x APIs. This patch can be removed once parquet-java adds native protobuf 4.x support (apache/parquet-java#3175).

Brief change log

  • Upgrade protobuf-java from 3.x to 4.32.1
  • Add PatchedProtoWriteSupport to maintain compatibility with parquet-java 1.15.2 (which still uses protobuf 3.x APIs)
  • Replace enum-based syntax detection (removed in protobuf 4.x) with string-based detection
  • All changes are internal - no public API changes
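The string-based detection mentioned in the change log might look roughly like the sketch below. It is not the PR's actual code: the class name `ProtoSyntaxSketch`, the `Kind` enum, and both methods are hypothetical, and the sketch assumes the syntax string is read from the file's `FileDescriptorProto` (for example via `descriptor.getFile().toProto().getSyntax()`). The helper takes a plain `String` so that it compiles without protobuf on the classpath.

```java
/** Hypothetical helper illustrating string-based syntax detection; not taken from the PR. */
public final class ProtoSyntaxSketch {

    /** Syntax kinds this sketch distinguishes. */
    public enum Kind { PROTO2, PROTO3, EDITIONS, UNKNOWN }

    private ProtoSyntaxSketch() {}

    /**
     * Classifies the syntax string of a .proto file.
     *
     * Assumption: the caller obtained the string from the FileDescriptorProto,
     * e.g. descriptor.getFile().toProto().getSyntax().
     */
    public static Kind classify(String syntax) {
        // The syntax field is typically left unset for proto2 files,
        // so an empty (or null) value is treated as proto2 here.
        if (syntax == null || syntax.isEmpty()) {
            return Kind.PROTO2;
        }
        switch (syntax) {
            case "proto2":
                return Kind.PROTO2;
            case "proto3":
                return Kind.PROTO3;
            case "editions":
                return Kind.EDITIONS;
            default:
                return Kind.UNKNOWN;
        }
    }

    /** Convenience check for call sites that only need the proto3 distinction. */
    public static boolean isProto3(String syntax) {
        return classify(syntax) == Kind.PROTO3;
    }
}
```

Comparing strings avoids depending on the syntax enum that the change log says was removed in protobuf 4.x, at the cost of having to treat the empty string as proto2 explicitly.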

Verifying this change

This change added tests and can be verified as follows:

  • All existing tests pass (no regressions)
  • New test suite PatchedProtoWriteSupportTest with 6 tests:
    • 4 unit tests validating proto2/proto3 syntax detection with direct API
    • 2 integration tests validating production code path through ParquetProtoWriters
  • Tests confirm round-trip write/read integrity for both proto2 and proto3 messages

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): yes
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? no
  • If yes, how is the feature documented? not applicable

Collaborator

flinkbot commented Oct 22, 2025

CI report:

Bot commands

The @flinkbot bot supports the following commands:
  • @flinkbot run azure: re-run the last Azure build

<py4j.version>0.10.9.7</py4j.version>
<beam.version>2.54.0</beam.version>
-<protoc.version>3.21.7</protoc.version>
+<protoc.version>4.32.1</protoc.version>
Contributor

This upgrade sounds like a great idea. I think we should update the docs to draw users' attention to this new support.

Author

Good call. I'll update the relevant docs.

Author

Updated the protobuf format docs and added a section to the 2.1 release notes.

github-actions bot added the community-reviewed label Oct 24, 2025
vulnerability found in [CVE-2025-30065](https://nvd.nist.gov/vuln/detail/CVE-2025-30065).

#### Upgrade Protocol Buffers to 4.32.1
Author

Should this be moved to a new flink-2.2.md release notes file?

MartijnVisser requested a review from fapaul on October 26, 2025, 21:38