-
Notifications
You must be signed in to change notification settings - Fork 5.5k
refactor(native): Separate IcebergPrestoToVeloxConnector to standalone file #26237
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor(native): Separate IcebergPrestoToVeloxConnector to standalone file #26237
Conversation
Reviewer's GuideThis refactor isolates all Iceberg-specific connector logic into new IcebergPrestoToVeloxConnector source files, removing it from the generic connector implementation, and updates the interface and build configuration accordingly. Class diagram for refactored connector classesclassDiagram
class PrestoToVeloxConnector {
<<abstract>>
+~PrestoToVeloxConnector()
+virtual toVeloxSplit(...)
+virtual toVeloxColumnHandle(...)
+virtual toVeloxTableHandle(...)
+virtual createConnectorProtocol()
}
class HivePrestoToVeloxConnector {
+toVeloxSplit(...)
+toVeloxColumnHandle(...)
+toVeloxTableHandle(...)
+createConnectorProtocol()
}
class IcebergPrestoToVeloxConnector {
+toVeloxSplit(...)
+toVeloxColumnHandle(...)
+toVeloxTableHandle(...)
+createConnectorProtocol()
}
class TpchPrestoToVeloxConnector {
+toVeloxSplit(...)
+toVeloxColumnHandle(...)
+toVeloxTableHandle(...)
+createConnectorProtocol()
}
PrestoToVeloxConnector <|-- HivePrestoToVeloxConnector
PrestoToVeloxConnector <|-- IcebergPrestoToVeloxConnector
PrestoToVeloxConnector <|-- TpchPrestoToVeloxConnector
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes - here's some feedback:
- Double‐check that the free helper functions you added in PrestoToVeloxConnector.h (stringToType, toRequiredSubfields, toHiveColumnType, toHiveTableHandle) are declared and defined in the same namespace so you don’t end up with linker mismatches.
- Ensure you actually register the new IcebergPrestoToVeloxConnector in Registration.cpp (with the correct connector name) so that the Iceberg connector is available at runtime.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Double‐check that the free helper functions you added in PrestoToVeloxConnector.h (stringToType, toRequiredSubfields, toHiveColumnType, toHiveTableHandle) are declared and defined in the same namespace so you don’t end up with linker mismatches.
- Ensure you actually register the new IcebergPrestoToVeloxConnector in Registration.cpp (with the correct connector name) so that the Iceberg connector is available at runtime.
## Individual Comments
### Comment 1
<location> `presto-native-execution/presto_cpp/main/connectors/IcebergPrestoToVeloxConnector.cpp:25-33` </location>
<code_context>
+
+namespace {
+
+velox::connector::hive::iceberg::FileContent toVeloxFileContent(
+ const presto::protocol::iceberg::FileContent content) {
+ if (content == protocol::iceberg::FileContent::DATA) {
+ return velox::connector::hive::iceberg::FileContent::kData;
+ } else if (content == protocol::iceberg::FileContent::POSITION_DELETES) {
+ return velox::connector::hive::iceberg::FileContent::kPositionalDeletes;
+ }
+ VELOX_UNSUPPORTED("Unsupported file content: {}", fmt::underlying(content));
+}
+
</code_context>
<issue_to_address>
**suggestion:** Consider handling additional Iceberg file content types for future extensibility.
If new file content types are added to Iceberg, this function will fail. Please document this limitation or update the code to handle future types gracefully.
```suggestion
/*
* NOTE: This function currently only handles DATA and POSITION_DELETES file content types.
* If new file content types are added to Iceberg, this function must be updated to handle them.
* Otherwise, it will throw an unsupported error for unknown types.
*/
velox::connector::hive::iceberg::FileContent toVeloxFileContent(
const presto::protocol::iceberg::FileContent content) {
if (content == protocol::iceberg::FileContent::DATA) {
return velox::connector::hive::iceberg::FileContent::kData;
} else if (content == protocol::iceberg::FileContent::POSITION_DELETES) {
return velox::connector::hive::iceberg::FileContent::kPositionalDeletes;
}
// Future extensibility: handle new file content types here.
VELOX_UNSUPPORTED(
"Unsupported file content: {}. Please update toVeloxFileContent to handle new types if added to Iceberg.",
fmt::underlying(content));
}
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| velox::connector::hive::iceberg::FileContent toVeloxFileContent( | ||
| const presto::protocol::iceberg::FileContent content) { | ||
| if (content == protocol::iceberg::FileContent::DATA) { | ||
| return velox::connector::hive::iceberg::FileContent::kData; | ||
| } else if (content == protocol::iceberg::FileContent::POSITION_DELETES) { | ||
| return velox::connector::hive::iceberg::FileContent::kPositionalDeletes; | ||
| } | ||
| VELOX_UNSUPPORTED("Unsupported file content: {}", fmt::underlying(content)); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: Consider handling additional Iceberg file content types for future extensibility.
If new file content types are added to Iceberg, this function will fail. Please document this limitation or update the code to handle future types gracefully.
| velox::connector::hive::iceberg::FileContent toVeloxFileContent( | |
| const presto::protocol::iceberg::FileContent content) { | |
| if (content == protocol::iceberg::FileContent::DATA) { | |
| return velox::connector::hive::iceberg::FileContent::kData; | |
| } else if (content == protocol::iceberg::FileContent::POSITION_DELETES) { | |
| return velox::connector::hive::iceberg::FileContent::kPositionalDeletes; | |
| } | |
| VELOX_UNSUPPORTED("Unsupported file content: {}", fmt::underlying(content)); | |
| } | |
| /* | |
| * NOTE: This function currently only handles DATA and POSITION_DELETES file content types. | |
| * If new file content types are added to Iceberg, this function must be updated to handle them. | |
| * Otherwise, it will throw an unsupported error for unknown types. | |
| */ | |
| velox::connector::hive::iceberg::FileContent toVeloxFileContent( | |
| const presto::protocol::iceberg::FileContent content) { | |
| if (content == protocol::iceberg::FileContent::DATA) { | |
| return velox::connector::hive::iceberg::FileContent::kData; | |
| } else if (content == protocol::iceberg::FileContent::POSITION_DELETES) { | |
| return velox::connector::hive::iceberg::FileContent::kPositionalDeletes; | |
| } | |
| // Future extensibility: handle new file content types here. | |
| VELOX_UNSUPPORTED( | |
| "Unsupported file content: {}. Please update toVeloxFileContent to handle new types if added to Iceberg.", | |
| fmt::underlying(content)); | |
| } |
30a9a8c to
0f739c6
Compare
|
@aditi-pandit @yingsu00 Would you please help to have a look at this refactor PR, once this get merged I can submit another PR for the basic iceberg insertion. And since the basic insertion PR in velox has been merged we can merge code in Presto now. Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a straightforward refactoring, but would appreciate other reviewers to chime in as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this refactoring @PingLiuPing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @PingLiuPing
| std::vector<velox::common::Subfield> toRequiredSubfields( | ||
| const protocol::List<protocol::Subfield>& subfields); | ||
|
|
||
| velox::connector::hive::HiveColumnHandle::ColumnType toHiveColumnType( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@PingLiuPing : Maybe we should move the entire PrestoToVeloxHiveConnector in a separate file as well so that these methods are not exposed from this file. But you can do that as a second cleanup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @aditi-pandit I will do that later.
…e file (prestodb#26237) ## Description Refactor the IcebergPrestoToVeloxConnector class by extracting it from PrestoToVeloxConnector.{h,cpp} into dedicated files. This improves code organization and modularity for the Iceberg connector implementation. Easier to maintain and extend Iceberg-specific functionality. ``` == NO RELEASE NOTE == ```
) ## Description <!---Describe your changes in detail--> This is a follow up PR of #26237 (comment) This is a straightforward refactor. ## Motivation and Context <!---Why is this change required? What problem does it solve?--> <!---If it fixes an open issue, please link to the issue here.--> ## Impact <!---Describe any public API or user-facing feature change or any performance impact--> ## Test Plan <!---Please fill in how you tested your change--> ## Contributor checklist - [ ] Please make sure your submission complies with our [contributing guide](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md), in particular [code style](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md#code-style) and [commit standards](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md#commit-standards). - [ ] PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced. - [ ] Documented new properties (with its default value), SQL syntax, functions, or other functionality. - [ ] If release notes are required, they follow the [release notes guidelines](https://github.com/prestodb/presto/wiki/Release-Notes-Guidelines). - [ ] Adequate tests were added if applicable. - [ ] CI passed. - [ ] If adding new dependencies, verified they have an [OpenSSF Scorecard](https://securityscorecards.dev/#the-checks) score of 5.0 or higher (or obtained explicit TSC approval for lower scores). ## Release Notes Please follow [release notes guidelines](https://github.com/prestodb/presto/wiki/Release-Notes-Guidelines) and fill in the release notes below. ``` == NO RELEASE NOTE == ```
Description
Refactor the IcebergPrestoToVeloxConnector class by extracting it from
PrestoToVeloxConnector.{h,cpp} into dedicated files. This improves code
organization and modularity for the Iceberg connector implementation.
Easier to maintain and extend Iceberg-specific functionality.
Changes:
See code review comments #25389 (comment) for motivation.
Motivation and Context
Impact
Test Plan
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.