Conversation

@PingLiuPing
Contributor

Description

Refactor the IcebergPrestoToVeloxConnector class by extracting it from
PrestoToVeloxConnector.{h,cpp} into dedicated files. This improves code
organization and modularity for the Iceberg connector implementation and
makes Iceberg-specific functionality easier to maintain and extend.

Changes:

  • Create IcebergPrestoToVeloxConnector.h with class declaration
  • Create IcebergPrestoToVeloxConnector.cpp with implementation
  • Update CMakeLists.txt to include new source files
  • Update includes in Registration.cpp and test files

See the code review comment at #25389 (comment) for the motivation.

Motivation and Context

Impact

Test Plan

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== NO RELEASE NOTE ==

@PingLiuPing PingLiuPing requested review from a team as code owners October 6, 2025 10:24
@prestodb-ci prestodb-ci added the from:IBM PR from IBM label Oct 6, 2025
@prestodb-ci prestodb-ci requested review from a team, ShahimSharafudeen and pratyakshsharma and removed request for a team October 6, 2025 10:24
@sourcery-ai
Contributor

sourcery-ai bot commented Oct 6, 2025

Reviewer's Guide

This refactor isolates all Iceberg-specific connector logic into new IcebergPrestoToVeloxConnector source files, removing it from the generic connector implementation, and updates the interface and build configuration accordingly.

Class diagram for refactored connector classes

classDiagram
    class PrestoToVeloxConnector {
        <<abstract>>
        +~PrestoToVeloxConnector()
        +virtual toVeloxSplit(...)
        +virtual toVeloxColumnHandle(...)
        +virtual toVeloxTableHandle(...)
        +virtual createConnectorProtocol()
    }
    class HivePrestoToVeloxConnector {
        +toVeloxSplit(...)
        +toVeloxColumnHandle(...)
        +toVeloxTableHandle(...)
        +createConnectorProtocol()
    }
    class IcebergPrestoToVeloxConnector {
        +toVeloxSplit(...)
        +toVeloxColumnHandle(...)
        +toVeloxTableHandle(...)
        +createConnectorProtocol()
    }
    class TpchPrestoToVeloxConnector {
        +toVeloxSplit(...)
        +toVeloxColumnHandle(...)
        +toVeloxTableHandle(...)
        +createConnectorProtocol()
    }
    PrestoToVeloxConnector <|-- HivePrestoToVeloxConnector
    PrestoToVeloxConnector <|-- IcebergPrestoToVeloxConnector
    PrestoToVeloxConnector <|-- TpchPrestoToVeloxConnector
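
The hierarchy in the diagram can be sketched in plain C++ as follows. This is a minimal illustration, not the code from the PR: the method signatures are simplified to return strings (the real methods take Presto protocol types and return Velox objects), only two of the derived classes are shown, and the `splitLabels` helper is hypothetical.

```cpp
#include <memory>
#include <string>
#include <vector>

// Simplified stand-in for the abstract base class in the diagram.
// Assumption: the real methods take protocol arguments; here they take none.
class PrestoToVeloxConnector {
 public:
  virtual ~PrestoToVeloxConnector() = default;
  virtual std::string toVeloxSplit() const = 0;
  virtual std::string toVeloxTableHandle() const = 0;
};

class HivePrestoToVeloxConnector : public PrestoToVeloxConnector {
 public:
  std::string toVeloxSplit() const override { return "hive split"; }
  std::string toVeloxTableHandle() const override { return "hive table"; }
};

// After the refactor this class would live in its own header and source file.
class IcebergPrestoToVeloxConnector final : public PrestoToVeloxConnector {
 public:
  std::string toVeloxSplit() const override { return "iceberg split"; }
  std::string toVeloxTableHandle() const override { return "iceberg table"; }
};

// Hypothetical helper: callers only see the base interface, so moving a
// derived class to another file does not change how it is used.
std::vector<std::string> splitLabels() {
  std::vector<std::unique_ptr<PrestoToVeloxConnector>> connectors;
  connectors.push_back(std::make_unique<HivePrestoToVeloxConnector>());
  connectors.push_back(std::make_unique<IcebergPrestoToVeloxConnector>());

  std::vector<std::string> labels;
  for (const auto& connector : connectors) {
    labels.push_back(connector->toVeloxSplit());
  }
  return labels;
}
```

Because every caller goes through the base-class interface, relocating one derived class is a source-layout change only.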

File-Level Changes

Change: Extract Iceberg connector into standalone module
Details:
  • Create IcebergPrestoToVeloxConnector.h with the class declaration
  • Implement toVeloxSplit, toVeloxColumnHandle, toVeloxTableHandle, and createConnectorProtocol in IcebergPrestoToVeloxConnector.cpp
  • Remove all Iceberg-specific code from PrestoToVeloxConnector.cpp
Files:
  • presto-native-execution/presto_cpp/main/connectors/IcebergPrestoToVeloxConnector.h
  • presto-native-execution/presto_cpp/main/connectors/IcebergPrestoToVeloxConnector.cpp
  • presto-native-execution/presto_cpp/main/connectors/PrestoToVeloxConnector.cpp

Change: Clean up and extend PrestoToVeloxConnector interface
Details:
  • Add forward declarations for stringToType, toRequiredSubfields, toHiveColumnType, toHiveTableHandle
  • Remove duplicate helper functions and Iceberg split/column conversions
Files:
  • presto-native-execution/presto_cpp/main/connectors/PrestoToVeloxConnector.h
  • presto-native-execution/presto_cpp/main/types/PrestoToVeloxQueryPlan.cpp

Change: Include new Iceberg sources in build
Details:
  • Add IcebergPrestoToVeloxConnector.cpp to the connectors library target
Files:
  • presto-native-execution/presto_cpp/main/connectors/CMakeLists.txt

Change: Register Iceberg connector and update tests
Details:
  • Include IcebergPrestoToVeloxConnector.h in Registration.cpp
  • Add IcebergPrestoToVeloxConnector include to the connector test
Files:
  • presto-native-execution/presto_cpp/main/connectors/Registration.cpp
  • presto-native-execution/presto_cpp/main/types/tests/PrestoToVeloxConnectorTest.cpp
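
A minimal sketch of what registering a connector under a name involves, assuming a simple name-keyed registry. `ConnectorRegistry`, `add`, `get`, and the string-only constructor are hypothetical names chosen for illustration; they are not the API used in Registration.cpp.

```cpp
#include <memory>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>

// Simplified stand-in for the base class: it only carries a connector name.
class PrestoToVeloxConnector {
 public:
  explicit PrestoToVeloxConnector(std::string connectorName)
      : connectorName_(std::move(connectorName)) {}
  virtual ~PrestoToVeloxConnector() = default;

  const std::string& connectorName() const { return connectorName_; }

 private:
  std::string connectorName_;
};

class IcebergPrestoToVeloxConnector final : public PrestoToVeloxConnector {
 public:
  using PrestoToVeloxConnector::PrestoToVeloxConnector;
};

// Hypothetical registry keyed by connector name.
class ConnectorRegistry {
 public:
  // Takes ownership of the connector; rejects duplicate names.
  void add(std::unique_ptr<PrestoToVeloxConnector> connector) {
    const std::string name = connector->connectorName();
    const bool inserted =
        connectors_.emplace(name, std::move(connector)).second;
    if (!inserted) {
      throw std::invalid_argument("Connector already registered: " + name);
    }
  }

  // Throws std::out_of_range if nothing was registered under the name.
  const PrestoToVeloxConnector& get(const std::string& name) const {
    const auto it = connectors_.find(name);
    if (it == connectors_.end()) {
      throw std::out_of_range("Connector not registered: " + name);
    }
    return *it->second;
  }

 private:
  std::unordered_map<std::string, std::unique_ptr<PrestoToVeloxConnector>>
      connectors_;
};
```

In a scheme like this, a lookup fails only at runtime when the registration call is missing or uses a different name, which is why the registration site needs to include the new header and construct the class explicitly.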


Contributor

@sourcery-ai sourcery-ai bot left a comment

Hey there - I've reviewed your changes - here's some feedback:

  • Double‐check that the free helper functions you added in PrestoToVeloxConnector.h (stringToType, toRequiredSubfields, toHiveColumnType, toHiveTableHandle) are declared and defined in the same namespace so you don’t end up with linker mismatches.
  • Ensure you actually register the new IcebergPrestoToVeloxConnector in Registration.cpp (with the correct connector name) so that the Iceberg connector is available at runtime.
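
The namespace concern in the first point can be illustrated with a small sketch. The namespace `demo::connectors` and the function `columnTypeName` are hypothetical and stand in for the helpers declared in PrestoToVeloxConnector.h. Defining the function with its fully qualified name means a namespace or signature mismatch is rejected by the compiler instead of surfacing later as an undefined-symbol error at link time.

```cpp
#include <string>

// Declaration, as it would appear in a header (hypothetical names).
namespace demo {
namespace connectors {
std::string columnTypeName(int kind);
} // namespace connectors
} // namespace demo

// Definition, as it would appear in the matching source file.
// A qualified definition must match an existing declaration in that
// namespace; a typo in the namespace or parameter list fails to compile.
std::string demo::connectors::columnTypeName(int kind) {
  switch (kind) {
    case 0:
      return "regular";
    case 1:
      return "partition_key";
    default:
      return "unknown";
  }
}
```

Had the definition instead been wrapped in a different namespace block, it would have compiled as a separate, unrelated function, and callers of the declared one would fail to link.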
### Comment 1
<location> `presto-native-execution/presto_cpp/main/connectors/IcebergPrestoToVeloxConnector.cpp:25-33` </location>
<code_context>
+
+namespace {
+
+velox::connector::hive::iceberg::FileContent toVeloxFileContent(
+    const presto::protocol::iceberg::FileContent content) {
+  if (content == protocol::iceberg::FileContent::DATA) {
+    return velox::connector::hive::iceberg::FileContent::kData;
+  } else if (content == protocol::iceberg::FileContent::POSITION_DELETES) {
+    return velox::connector::hive::iceberg::FileContent::kPositionalDeletes;
+  }
+  VELOX_UNSUPPORTED("Unsupported file content: {}", fmt::underlying(content));
+}
+
</code_context>

<issue_to_address>
**suggestion:** Consider handling additional Iceberg file content types for future extensibility.

If new file content types are added to Iceberg, this function will fail. Please document this limitation or update the code to handle future types gracefully.

```suggestion
/*
 * NOTE: This function currently only handles DATA and POSITION_DELETES file content types.
 * If new file content types are added to Iceberg, this function must be updated to handle them.
 * Otherwise, it will throw an unsupported error for unknown types.
 */
velox::connector::hive::iceberg::FileContent toVeloxFileContent(
    const presto::protocol::iceberg::FileContent content) {
  if (content == protocol::iceberg::FileContent::DATA) {
    return velox::connector::hive::iceberg::FileContent::kData;
  } else if (content == protocol::iceberg::FileContent::POSITION_DELETES) {
    return velox::connector::hive::iceberg::FileContent::kPositionalDeletes;
  }
  // Future extensibility: handle new file content types here.
  VELOX_UNSUPPORTED(
      "Unsupported file content: {}. Please update toVeloxFileContent to handle new types if added to Iceberg.",
      fmt::underlying(content));
}
```
</issue_to_address>
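
As an alternative to documenting the limitation in a comment, a `switch` without a `default` branch lets compilers that enable `-Wswitch` flag any enumerator that is added later but not handled. The sketch below is self-contained and rests on stated assumptions: the two enums are stand-ins for the protocol and Velox types, `std::runtime_error` stands in for `VELOX_UNSUPPORTED`, and the `EQUALITY_DELETES` enumerator is included only to exercise the unsupported path.

```cpp
#include <stdexcept>
#include <string>

// Stand-in for presto::protocol::iceberg::FileContent (assumed enumerators).
namespace protocol_sketch {
enum class FileContent { DATA, POSITION_DELETES, EQUALITY_DELETES };
} // namespace protocol_sketch

// Stand-in for velox::connector::hive::iceberg::FileContent.
namespace velox_sketch {
enum class FileContent { kData, kPositionalDeletes };
} // namespace velox_sketch

velox_sketch::FileContent toVeloxFileContent(
    protocol_sketch::FileContent content) {
  // No default branch on purpose: adding an enumerator without a case here
  // produces a -Wswitch warning on compilers that enable it.
  switch (content) {
    case protocol_sketch::FileContent::DATA:
      return velox_sketch::FileContent::kData;
    case protocol_sketch::FileContent::POSITION_DELETES:
      return velox_sketch::FileContent::kPositionalDeletes;
    case protocol_sketch::FileContent::EQUALITY_DELETES:
      break; // Not mapped in this sketch; falls through to the error below.
  }
  throw std::runtime_error(
      "Unsupported file content: " +
      std::to_string(static_cast<int>(content)));
}
```

The runtime behavior matches the original function: mapped values are converted, and anything else raises an error that names the numeric value.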


@PingLiuPing PingLiuPing force-pushed the lp_refactor_prestissimo_iceberg branch from 30a9a8c to 0f739c6 Compare October 6, 2025 17:39
@PingLiuPing
Contributor Author

@aditi-pandit @yingsu00 Would you please take a look at this refactor PR? Once it is merged, I can submit another PR for basic Iceberg insertion. Since the basic insertion PR in Velox has already been merged, we can merge the code in Presto now. Thanks.

@PingLiuPing PingLiuPing requested a review from yingsu00 October 8, 2025 12:36
Contributor

@tdcmeehan tdcmeehan left a comment

This seems like a straightforward refactoring, but I would appreciate other reviewers chiming in as well.

Member

@hantangwangd hantangwangd left a comment

Thanks for this refactoring @PingLiuPing

Contributor

@aditi-pandit aditi-pandit left a comment

Thanks @PingLiuPing

std::vector<velox::common::Subfield> toRequiredSubfields(
const protocol::List<protocol::Subfield>& subfields);

velox::connector::hive::HiveColumnHandle::ColumnType toHiveColumnType(
Contributor

@PingLiuPing: Maybe we should move the entire HivePrestoToVeloxConnector into a separate file as well, so that these methods are not exposed from this file. But you can do that as a second cleanup.

Contributor Author

Thanks @aditi-pandit, I will do that later.

@aditi-pandit aditi-pandit merged commit c829a7d into prestodb:master Oct 9, 2025
80 of 82 checks passed
imsayari404 pushed a commit to imsayari404/presto that referenced this pull request Oct 13, 2025
…e file (prestodb#26237)

aditi-pandit pushed a commit that referenced this pull request Oct 23, 2025
)

## Description

This is a follow up PR of
#26237 (comment)

This is a straightforward refactor. 

