Skip to content

[native] Add CLP connector #4

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

[native] Add CLP connector #4

wants to merge 1 commit into from

Conversation

wraymo
Copy link
Member

@wraymo wraymo commented Mar 25, 2025

Description

This PR is a follow-up to #2 and adds Prestissimo support for the CLP connector. The native connector is now the default option and is responsible for deserializing classes from the Java coordinator, which are then used by Velox for query execution.

Motivation and Context

Check the RFC

Impact

This connector is independent from others and will not affect the existing functionalities.

Test Plan

We have performed end-to-end testing.

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

General Changes
Prestissimo (Native Execution)
* Add support for CLP connectors :pr:`4`

If release note is NOT required, use:

== NO RELEASE NOTE ==

Summary by CodeRabbit

  • New Features
    • Introduced support for a new CLP connector to enhance integration between Presto and Velox, improving data conversion and interoperability.
  • Chores
    • Updated build and testing configurations for improved stability and performance.
    • Enhanced protocol handling with robust JSON and protobuf integration.
  • Subproject Update
    • Upgraded underlying dependencies to leverage the latest improvements in external components.

Copy link

coderabbitai bot commented Mar 25, 2025

Walkthrough

The changes update build configurations and introduce a new CLP connector. The modifications in CMake files add the "-no-pie" option and update library dependencies. In the server code, the CLP connector is registered and the connector factory is conditionally added. New files, including headers, source files, JSON schemas, Mustache templates, and YAML configurations, provide CLP protocol definitions and serialization support. Additionally, special include files for serialization of CLP components and protocol integration files are added. Finally, the subproject commit for Velox has been updated.

Changes

Files Change Summary
presto-native-execution/.../main/CMakeLists.txt, .../main/tests/CMakeLists.txt Updated linking options with -no-pie and replaced/added library dependencies (switch from velox_tpch_connector to velox_clp_connector) for server and test targets.
presto-native-execution/.../main/PrestoServer.cpp Registered the new ClpPrestoToVeloxConnector in the server’s run method and updated connector factory registration logic.
presto-native-execution/.../main/types/PrestoToVeloxConnector.cpp, PrestoToVeloxConnector.h Added new class ClpPrestoToVeloxConnector with methods to convert Presto connector types to Velox equivalents.
presto-native-execution/.../main/types/tests/CMakeLists.txt Added dependency on velox_clp_connector in multiple test targets.
presto-native-execution/.../presto_protocol/Makefile Expanded build steps to generate CLP connector C++ source/header files, JSON, and protobuf files.
presto-native-execution/.../presto_protocol/connector/clp/* (e.g. .h, mustache, .cpp, .json, .yml) Introduced various new files for CLP protocol definitions, type aliases, JSON serialization/deserialization, and template generation.
presto-native-execution/.../presto_protocol/connector/clp/special/* Added special include files for ClpColumnHandle and ClpTransactionHandle with JSON serialization implementations.
presto-native-execution/.../presto_protocol/presto_protocol.cpp, presto_protocol.h Added new include directives to integrate CLP connector protocol files.
presto-native-execution/velox Updated subproject commit identifier.

Sequence Diagram(s)

sequenceDiagram
    participant Server as PrestoServer
    participant CLP as ClpPrestoToVeloxConnector
    participant Registry as ConnectorFactoryRegistry

    Server->>CLP: Instantiate new CLP connector
    Server->>Registry: Check if CLP factory exists
    alt Factory not registered
        Server->>Registry: Register ClpConnectorFactory
    else Factory already registered
        Registry-->>Server: Factory exists
    end
Loading
✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai plan to trigger planning for file edits and PR creation.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (14)
presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.json (2)

32-37: Consistent naming for boolean types.
The field_type is declared as boolean, while field_text is bool. For clarity and maintainability, consider unifying the naming in generated code to consistently use bool. This helps avoid confusion for future contributors.


44-114: Add usage examples or documentation blocks.
ClpSplit, ClpTableHandle, and ClpTableLayoutHandle are defined clearly, but some developers may benefit from usage examples or inline documentation describing expected values, especially for fields like query that can be optional.

presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol-json-cpp.mustache (2)

73-93: Consider switching to a faster lookup for enum serialization.
The current approach uses a linear search through a static array. Although this is typically acceptable for small enumerations, consider using a hash-based map if enums grow significantly, to provide faster lookups for serialization and deserialization.


107-142: Raise a more descriptive error for abstract mismatches.
When an unrecognised or invalid _type is encountered during from_json, a fairly generic error is thrown. Providing more descriptive guidance or referencing valid _type values could aid debugging for future integrations.

presto-native-execution/presto_cpp/presto_protocol/connector/clp/ClpConnectorProtocol.h (1)

23-27: Document plans for unimplemented functionality.

Many template parameters are set to NotImplemented, indicating features that aren't currently implemented. Consider adding documentation about which features are planned for future implementation versus those that will remain unimplemented.

presto-native-execution/presto_cpp/main/tests/CMakeLists.txt (1)

26-26: Consider the security implications of using -no-pie.

The -no-pie flag disables Position Independent Executable, which is a security feature that helps mitigate certain types of attacks. While this may be necessary for linking compatibility, it should be documented why this option is required.

+# Disable PIE to address linking issues with the CLP connector components
 target_link_options(presto_server_test PRIVATE "-no-pie")
presto-native-execution/presto_cpp/main/CMakeLists.txt (1)

102-102: Consider the security implications of using -no-pie.

Similar to the test configuration, disabling Position Independent Executable (PIE) might expose the application to certain security vulnerabilities. If possible, consider addressing the underlying linking issues that necessitate this flag.

+# Disable PIE to address linking issues with the CLP connector components
 target_link_options(presto_server PRIVATE "-no-pie")
presto-native-execution/presto_cpp/presto_protocol/connector/clp/special/ClpColumnHandle.hpp.inc (1)

18-33: Be cautious with dynamic_cast in operator< method.

The operator< implementation uses dynamic_cast without checking if the cast is successful. This could potentially throw a std::bad_cast exception if an object of a different type is compared.

- bool operator<(const ColumnHandle& o) const override {
-   return columnName < dynamic_cast<const ClpColumnHandle&>(o).columnName;
- }
+ bool operator<(const ColumnHandle& o) const override {
+   const auto* other = dynamic_cast<const ClpColumnHandle*>(&o);
+   if (false == other) {
+     throw std::runtime_error("Cannot compare with a non-ClpColumnHandle");
+   }
+   return columnName < other->columnName;
+ }

Additionally, note that the constructor ClpColumnHandle() noexcept is declared but not defined in this file. Ensure it's implemented elsewhere.

presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol-json-hpp.mustache (1)

25-34: Ensure comprehensive testing for JSON round-trip
The to_json/from_json implementations for ClpTransactionHandle properly push and read array elements, ensuring consistent ordering. Consider adding unit tests (or verifying auto-generated tests) to confirm that both functions work correctly under various scenarios.

presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.cpp (2)

25-34: Array-based JSON for ClpTransactionHandle
Storing _type and instance fields in a JSON array is consistent with the usage in from_json. This approach is valid, though a JSON object structure might be more descriptive. If you anticipate additional fields in the future, consider switching to an object-based schema for clarity.


110-156: Proper handling of table and layout handles
The ClpTableHandle and ClpTableLayoutHandle handle standard fields (schemaTableName, query) thoroughly. Confirm that these are tested in any integration tests that rely on JSON-based configuration or dynamic handle retrieval.

presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.h (3)

32-45: Duplicate License Header Detected

A second license header appears starting at line 32. Please verify if this repetition is intentional; if not, consider consolidating the license headers to avoid redundancy.


59-61: Dynamic Cast in Operator<

The overridden operator< employs a dynamic_cast to convert the base reference to a ClpColumnHandle. Ensure that the compared objects are always of the correct type so that the cast does not fail at runtime. It may be advisable to add runtime type verification or consider a safer casting approach if feasible.


66-76: ClpSplit Structure Review

The declaration of ClpSplit, including its members and the noexcept constructor, is clear. The use of std::shared_ptr<String> for the query field is notable; please verify that this design choice is intentional and offers a benefit over a simple String member.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d2ac312 and b183dc9.

📒 Files selected for processing (20)
  • presto-native-execution/presto_cpp/main/CMakeLists.txt (1 hunks)
  • presto-native-execution/presto_cpp/main/PrestoServer.cpp (3 hunks)
  • presto-native-execution/presto_cpp/main/tests/CMakeLists.txt (2 hunks)
  • presto-native-execution/presto_cpp/main/types/PrestoToVeloxConnector.cpp (2 hunks)
  • presto-native-execution/presto_cpp/main/types/PrestoToVeloxConnector.h (1 hunks)
  • presto-native-execution/presto_cpp/main/types/tests/CMakeLists.txt (3 hunks)
  • presto-native-execution/presto_cpp/presto_protocol/Makefile (1 hunks)
  • presto-native-execution/presto_cpp/presto_protocol/connector/clp/ClpConnectorProtocol.h (1 hunks)
  • presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol-json-cpp.mustache (1 hunks)
  • presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol-json-hpp.mustache (1 hunks)
  • presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.cpp (1 hunks)
  • presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.h (1 hunks)
  • presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.json (1 hunks)
  • presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.yml (1 hunks)
  • presto-native-execution/presto_cpp/presto_protocol/connector/clp/special/ClpColumnHandle.hpp.inc (1 hunks)
  • presto-native-execution/presto_cpp/presto_protocol/connector/clp/special/ClpTransactionHandle.cpp.inc (1 hunks)
  • presto-native-execution/presto_cpp/presto_protocol/connector/clp/special/ClpTransactionHandle.hpp.inc (1 hunks)
  • presto-native-execution/presto_cpp/presto_protocol/presto_protocol.cpp (1 hunks)
  • presto-native-execution/presto_cpp/presto_protocol/presto_protocol.h (1 hunks)
  • presto-native-execution/velox (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}`: - Prefer `false == ` rather than `!`.

**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

  • presto-native-execution/presto_cpp/presto_protocol/presto_protocol.cpp
  • presto-native-execution/presto_cpp/main/types/PrestoToVeloxConnector.cpp
  • presto-native-execution/presto_cpp/main/PrestoServer.cpp
  • presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.cpp
🧬 Code Definitions (3)
presto-native-execution/presto_cpp/main/types/PrestoToVeloxConnector.h (1)
presto-native-execution/presto_cpp/main/SystemConnector.h (3)
  • catalogId (185-188)
  • column (190-192)
  • tableHandle (194-201)
presto-native-execution/presto_cpp/main/types/PrestoToVeloxConnector.cpp (3)
presto-native-execution/presto_cpp/main/types/PrestoToVeloxSplit.cpp (2)
  • toVeloxSplit (22-47)
  • toVeloxSplit (22-23)
presto-native-execution/presto_cpp/main/types/PrestoToVeloxConnector.h (15)
  • catalogId (46-49)
  • catalogId (117-120)
  • catalogId (169-172)
  • catalogId (196-199)
  • catalogId (223-226)
  • column (52-54)
  • column (122-124)
  • column (174-176)
  • column (201-203)
  • column (228-230)
  • tableHandle (57-64)
  • tableHandle (126-133)
  • tableHandle (178-185)
  • tableHandle (205-212)
  • tableHandle (232-239)
presto-native-execution/presto_cpp/main/SystemConnector.h (3)
  • catalogId (185-188)
  • column (190-192)
  • tableHandle (194-201)
presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.cpp (1)
presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.h (4)
  • ClpColumnHandle (57-57)
  • ClpSplit (72-72)
  • ClpTableHandle (81-81)
  • ClpTableLayoutHandle (91-91)
🔇 Additional comments (38)
presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.json (2)

3-4: Confirm the generation workflow.
These lines state that the file is generated and should not be edited manually. Please confirm that there is a secure and reliable pipeline for regenerating this file, to avoid inadvertent manual modifications that could be overwritten.

Would you like assistance in validating the generator script or verifying its usage via a build command or documentation lookup?


7-7: Caution with dynamic_cast in operator<.
Within the hinc block, operator<(...) relies on dynamic_cast to downcast the base class reference. This may produce unexpected behaviour if the cast fails, potentially throwing std::bad_cast at runtime. Ensure that all instances passed to operator< are indeed ClpColumnHandle.

Would you like me to generate a script to locate and verify all usage sites of ClpColumnHandle::operator< to confirm proper usage?

presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol-json-cpp.mustache (1)

46-71: Ensure _type usage matches the schema.
When assigning p._type inside the constructors and from JSON, confirm that _type aligns with the expected class or subclass key. Mismatched or missing _type assignments can lead to parsing or type lookup errors in the rest of the pipeline.

presto-native-execution/velox (1)

1-1: Subproject Commit Update Verification

The subproject commit has been updated to 16932ecfc32b94734640e44d03cbb1e5361c50c6. Please ensure that this update is fully compatible with the CLP connector native code integration and does not introduce any regressions or compatibility issues.

presto-native-execution/presto_cpp/presto_protocol/presto_protocol.h (2)

19-19: Include for new CLP connector protocol added successfully.

The include directive for the CLP protocol header file follows the same pattern as the other connector inclusions and is properly placed.


17-19: Consider the implications of adding to a deprecated file.

This file is marked as deprecated at line 17, yet new functionality is being added. While this approach may be necessary for transitional purposes, ensure there's a plan for eventually migrating this functionality when the file is removed.

presto-native-execution/presto_cpp/presto_protocol/presto_protocol.cpp (2)

18-18: Include for new CLP connector implementation added successfully.

The include statement follows the established pattern used for other connector implementations in this file.


16-18: Consider the implications of adding to a deprecated file.

Similar to the header file, this implementation file is marked as deprecated, yet new functionality is being added. Ensure there's a migration plan for when these files are eventually removed.

presto-native-execution/presto_cpp/presto_protocol/connector/clp/special/ClpTransactionHandle.cpp.inc (3)

15-17: Good documentation of special case handling.

The comment clearly explains why this type requires special handling (Java counterpart is an enum), which is helpful for future maintenance.


20-24: JSON serialization implementation looks correct.

The serialization method properly captures the structure as a JSON array containing the _type and instance values.


26-29: JSON deserialization implementation looks correct.

The deserialization method properly extracts values from the JSON array and assigns them to the appropriate struct members.

presto-native-execution/presto_cpp/presto_protocol/connector/clp/ClpConnectorProtocol.h (1)

19-28: Type alias defined with appropriate template parameters.

The ClpConnectorProtocol type is properly defined as an alias for ConnectorProtocolTemplate with specific CLP connector types.

presto-native-execution/presto_cpp/main/types/tests/CMakeLists.txt (1)

29-29: These changes integrate the CLP connector into the test framework.

The addition of velox_clp_connector to all three test targets is consistent and ensures that the CLP connector functionality is properly tested across different components. This is a well-structured approach to extending the testing framework to include the new connector.

Also applies to: 66-66, 98-98

presto-native-execution/presto_cpp/main/tests/CMakeLists.txt (1)

48-48: Consistent integration of the CLP connector in the test suite.

Adding velox_clp_connector to the test executable ensures that the tests can properly exercise the CLP connector functionality, maintaining consistency with the main executable changes.

presto-native-execution/presto_cpp/main/CMakeLists.txt (1)

109-109:

❓ Verification inconclusive

CLP connector integration looks appropriate.

The addition of velox_clp_connector to the link libraries aligns with the integration goals of this PR. However, note that the existing TODO comment suggests there are unresolved linking issues that might need to be addressed in a future PR.


🏁 Script executed:

#!/bin/bash
# Verify that no references to TpchTableHandle remain unresolved in the codebase
# after replacing velox_tpch_connector with velox_clp_connector

# First, search for any imports or uses of TpchTableHandle in source files
echo "Searching for TpchTableHandle references in the codebase:"
rg "TpchTableHandle" --type cpp --type hpp

# Then check if there are any residual TPCH connector initializations that 
# might need to be updated to use CLP connector instead
echo "Searching for TPCH connector registrations that might need updating:"
rg "registerConnector.*(\"tpch\")" --type cpp

Length of output: 388


Attention: Manual Verification on Unresolved References Required

The CLP connector integration appears correct, and the linking adjustment at line 109 in presto-native-execution/presto_cpp/main/CMakeLists.txt properly includes velox_clp_connector. However, the automated search for any remaining TpchTableHandle references did not yield a conclusive output due to an error with the file type parameter (i.e. "unrecognized file type: hpp"). This means that while the current integration meets the PR's goals, you should manually verify that:

  • No unresolved TpchTableHandle references remain in the codebase.
  • The existing TODO-comment regarding linking issues is revisited in future updates.
presto-native-execution/presto_cpp/presto_protocol/connector/clp/special/ClpTransactionHandle.hpp.inc (1)

1-28:

❓ Verification inconclusive

The CLP transaction handle implementation looks good, but ensure complete implementation.

This header file properly defines the CLP transaction handle structure with appropriate JSON serialization declarations. The comment about this corresponding to a Java enum is useful documentation.

A few considerations:

  1. Make sure the to_json and from_json functions are properly implemented in a corresponding cpp file
  2. Since this struct corresponds to a Java enum, ensure that the serialization/deserialization correctly handles this difference in representation between languages

Otherwise, the code structure is clean and follows the project's patterns.


🏁 Script executed:

#!/bin/bash
# Verify that the to_json and from_json functions for ClpTransactionHandle are 
# implemented in a corresponding cpp file

echo "Searching for ClpTransactionHandle to_json/from_json implementations:"
rg "void to_json\(json& j, const ClpTransactionHandle& p\)" --type cpp

echo "Checking for associated CPP files that might contain the implementation:"
fd -e cpp --exec grep -l "ClpTransactionHandle" {} \;

Length of output: 809


CLP Transaction Handle – Verify JSON Deserialization

  • The to_json function is implemented in presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.cpp as expected.
  • Please verify that the corresponding from_json function is also implemented, ensuring it correctly handles the transformation, particularly given that this struct corresponds to a Java enum.
  • Confirm that both serialization and deserialization account for any differences in enum representation between C++ and Java.

Overall, the structure and documentation look clean; just ensure that the complete JSON conversion (both to_json and from_json) is properly addressed.

presto-native-execution/presto_cpp/main/PrestoServer.cpp (3)

51-51: Include for the CLP connector has been properly added.

The include for the CLP connector header follows the existing pattern in the file.


275-276: New CLP connector registration correctly implemented.

The registration of ClpPrestoToVeloxConnector follows the same pattern as the other connector registrations (Hive, Iceberg, Tpch) in this file.


1189-1193: Proper conditional registration of the CLP connector factory.

This code correctly checks if the ClpConnectorFactory is already registered before registering it, following the pattern used for other connector factories. The implementation ensures that the factory is registered only once.

presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.yml (1)

13-39: The CLP connector protocol definition looks good.

The YAML configuration correctly defines the abstract classes and their CLP-specific subclasses, following the same pattern as other connectors in the system. The class hierarchy is properly specified with appropriate superclasses.

presto-native-execution/presto_cpp/presto_protocol/Makefile (3)

48-53: Build steps for CLP connector correctly added.

The additions follow the same pattern used for other connectors (Hive, Iceberg, Tpch) and properly generate the required files for the CLP connector using the same tools and process.


60-60: JSON generation for CLP connector properly added.

The command to generate the JSON file for the CLP connector follows the same pattern as the other connectors.


67-67: Protobuf generation for CLP connector correctly added.

The command to generate the protobuf file for the CLP connector follows the same pattern as the other connectors.

presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol-json-hpp.mustache (2)

1-13: Licence header is clearly defined
The Apache licence header is properly included.


36-69: Template expansions appear correct
These template expansions for structs and enums look coherent and aligned with the rest of the code. Ensure any optional fields or inherited classes are properly handled in to_json/from_json if you introduce them later.

presto-native-execution/presto_cpp/main/types/PrestoToVeloxConnector.h (1)

217-243: New CLP connector class is well-structured
The ClpPrestoToVeloxConnector class integrates seamlessly with the existing framework. Its constructor and final methods match the base class interface and follow the established pattern from other connectors (e.g., TpchPrestoToVeloxConnector, HivePrestoToVeloxConnector). Ensure you add proper coverage in unit tests or integration tests to validate the CLP connector functionality (split creation, table handle generation, etc.).

presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.cpp (3)

1-19: Generated file structure verified
This file is auto-generated, and it properly includes licensing and disclaimers. No issues found with the present formatting or versioning statements.


37-73: ClpColumnHandle covers all needed fields
Including properties like columnName, originalColumnName, columnType and nullable properly in JSON ensures robust serialization. Continue maintaining the same field names on both sides of to_json and from_json to avoid schema mismatches.


76-108: ClpSplit design aligns with the pattern
The _type field is correctly assigned "clp", consistent with the rest of the code. Deserializing archivePath and query from JSON objects ensures these fields are properly mapped. Keep an eye on edge cases such as empty strings for archivePath.

presto-native-execution/presto_cpp/main/types/PrestoToVeloxConnector.cpp (3)

16-16: Including ClpConnectorProtocol
The explicit inclusion of "ClpConnectorProtocol.h" is correct for instantiating the CLP-specific protocol in the createConnectorProtocol implementation.


22-24: CLP connector headers integrated
The new includes for ClpColumnHandle.h, ClpConnectorSplit.h, and ClpTableHandle.h look consistent, matching the usage in the updated methods below.


1560-1610: CLP connector methods

  1. toVeloxSplit: Properly casts to ClpSplit and creates a ClpConnectorSplit. Consider verifying that an empty archivePath or query is acceptable to the downstream logic.
  2. toVeloxColumnHandle: Uses ClpColumnHandle, consistent with the pattern in other connectors. The dynamic_cast check ensures correct type usage.
  3. toVeloxTableHandle: Extracts the ClpTableLayoutHandle and returns a ClpTableHandle. Consider addressing optional or missing fields if your future protocols evolve.
  4. createConnectorProtocol: Returns ClpConnectorProtocol to register with the dispatcher.

Overall, the new methods are coherent with the rest of the system.

presto-native-execution/presto_cpp/presto_protocol/connector/clp/presto_protocol_clp.h (6)

1-2: Auto-Generated File Notice

These lines indicate that the file is auto-generated. Manual changes are discouraged.


25-31: ClpTransactionHandle Structure Review

The declaration for ClpTransactionHandle is straightforward and conforms to the existing style. Its member initialization is clear and appropriate.


50-56: ClpColumnHandle Member Declaration – Type Query

The members are correctly declared; however, the type boolean used for the nullable member (line 55) is non-standard in C++. Please verify whether boolean is a defined type alias in the project context. If not, it may be preferable to use the standard bool.


63-64: JSON Function Declarations for ClpColumnHandle

The to_json and from_json function declarations follow the nlohmann::json pattern appropriately. Confirm that the corresponding implementations handle all member variables correctly.


77-85: ClpTableHandle Declaration Review

The ClpTableHandle structure is concise and its member and constructor are declared correctly. The JSON function declarations align with the pattern used elsewhere.


86-95: ClpTableLayoutHandle Structure Review

The ClpTableLayoutHandle is well-defined, with its composite ClpTableHandle and the use of a shared pointer for query fitting the intended design. The JSON serialization function declarations are consistent with other structures.

Comment on lines +29 to +33
void to_json(json& j, const ClpTransactionHandle& p) {
j = json::array();
j.push_back(p._type);
j.push_back(p.instance);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Validate JSON array length before indexing.
to_json(json& j, const ClpTransactionHandle& p) treats the JSON as an array and pushes back two elements. Conversely, from_json accesses [0] and [1] directly. Ensure the length is at least two before indexing to avoid out-of-range errors or unclear exception messages.

@wraymo wraymo changed the title Add CLP connector native code [native] Add CLP connector Mar 25, 2025
Comment on lines +48 to +53
# build clp connector related structs
echo "// DO NOT EDIT : This file is generated by chevron" > connector/clp/presto_protocol_clp.cpp
chevron -d connector/clp/presto_protocol_clp.json connector/clp/presto_protocol-json-cpp.mustache >> connector/clp/presto_protocol_clp.cpp
echo "// DO NOT EDIT : This file is generated by chevron" > connector/clp/presto_protocol_clp.h
chevron -d connector/clp/presto_protocol_clp.json connector/clp/presto_protocol-json-hpp.mustache >> connector/clp/presto_protocol_clp.h
clang-format -style=file -i connector/clp/presto_protocol_clp.h connector/clp/presto_protocol_clp.cpp
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use tab instead of spaces, otherwise the Makefile cannot run

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants