Updating Python SDK with new Rust core version #19

Merged
elenagaljak-db merged 9 commits into main from elenagaljak-db_wrapper
Feb 12, 2026

@elenagaljak-db elenagaljak-db commented Feb 4, 2026

What changes are proposed in this pull request?

WHAT:

  • Migrated the Python SDK from a pure Python implementation to a Rust-backed implementation using PyO3 bindings
  • Created Rust wrapper modules (sync_wrapper.rs, async_wrapper.rs, common.rs, auth.rs) that expose the
    core Rust SDK functionality to Python
  • Implemented Python-compatible types: TableProperties, StreamConfigurationOptions, RecordType,
    AckCallback, HeadersProvider
  • Added support for both sync and async Python APIs through separate wrapper modules
  • Maintained backwards compatibility with the existing Python API surface

WHY:
The previous Python SDK duplicated logic from the Rust SDK, creating maintenance overhead and potential
inconsistencies. By wrapping the Rust SDK directly, we get:

  1. Single source of truth: All core logic lives in the Rust SDK, eliminating duplication
  2. Performance improvements: Native Rust implementation provides better throughput and lower latency
  3. Type safety: Rust's type system catches errors at compile time
  4. Easier maintenance: Python SDK automatically inherits bug fixes and features from Rust SDK
  5. Consistent behavior: Sync and async Python APIs use the same underlying Rust implementation

The Python API remains unchanged for end users, ensuring a smooth migration path.

Local benchmark results:

ingest_record_offset

Record Size    Throughput
     20 B       0.35 MB/s
    220 B       3.86 MB/s
    750 B      13.18 MB/s
  10000 B     188.41 MB/s

ingest_record_no_wait

Record Size    Throughput
     20 B       7.55 MB/s
    220 B      74.28 MB/s
    750 B     194.56 MB/s
  10000 B     382.67 MB/s

Pure Python:

Record Size    Throughput
     20 B       0.20 MB/s
    220 B       2.13 MB/s
    750 B       7.13 MB/s
  10000 B      83.06 MB/s
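For anyone who wants to compare the numbers above directly, the following sketch computes per-size speedup ratios and converts MB/s into records per second. The throughput values are copied from the tables; the helper names are made up for this example, and the conversion assumes 1 MB = 1,000,000 bytes, which the benchmark output does not state. The tables also do not say which pure-Python call was measured, so the ratios are indicative only.

```python
from typing import Dict

# Throughput in MB/s keyed by record size in bytes, copied from the tables above.
RUST_OFFSET: Dict[int, float] = {20: 0.35, 220: 3.86, 750: 13.18, 10000: 188.41}
RUST_NO_WAIT: Dict[int, float] = {20: 7.55, 220: 74.28, 750: 194.56, 10000: 382.67}
PURE_PYTHON: Dict[int, float] = {20: 0.20, 220: 2.13, 750: 7.13, 10000: 83.06}


def records_per_second(mb_per_s: float, record_size: int) -> float:
    """Convert MB/s to records/s, assuming 1 MB = 1_000_000 bytes."""
    return mb_per_s * 1_000_000 / record_size


def speedup(new: Dict[int, float], baseline: Dict[int, float]) -> Dict[int, float]:
    """Ratio of new to baseline throughput for each record size."""
    return {size: round(new[size] / baseline[size], 2) for size in baseline}


if __name__ == "__main__":
    for label, results in (("offset", RUST_OFFSET), ("no_wait", RUST_NO_WAIT)):
        for size, ratio in speedup(results, PURE_PYTHON).items():
            rate = records_per_second(results[size], size)
            print(f"{label:8s} {size:6d} B  {ratio:6.2f}x  {rate:12.0f} records/s")
```

The ratios shrink as records grow in the no-wait case, which is consistent with per-record overhead dominating at small sizes.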

How is this tested?

  • 99 smoke tests pass, verifying API surface compatibility
  • Proto generation tests verify descriptor handling
  • Tested both sync and async SDK APIs
  • Verified backwards compatibility with deprecated ingest_record method
  • Confirmed TableProperties accepts protobuf descriptor objects as before
  • Integration testing is covered by the Rust SDK's comprehensive test suite
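As a rough picture of what an API-surface and deprecation smoke test can look like, here is a self-contained sketch. `Stream` is a hypothetical stand-in, not the SDK class, and the choice to have `ingest_record` delegate to `ingest_record_offset` is an assumption made for illustration; the actual smoke tests in this PR may be structured differently.

```python
import warnings
from typing import Iterable, List


class Stream:
    """Hypothetical stand-in for an SDK stream class."""

    def __init__(self) -> None:
        self._records: List[bytes] = []

    def ingest_record_offset(self, payload: bytes) -> int:
        self._records.append(payload)
        return len(self._records) - 1

    def ingest_record(self, payload: bytes) -> int:
        """Deprecated entry point kept so existing callers keep working."""
        warnings.warn(
            "ingest_record is deprecated",  # assumed message text
            DeprecationWarning,
            stacklevel=2,
        )
        # Assumption: the deprecated method forwards to the newer one.
        return self.ingest_record_offset(payload)


def missing_api(cls: type, expected: Iterable[str]) -> List[str]:
    """Return the expected method names that cls does not expose as callables."""
    return sorted(n for n in expected if not callable(getattr(cls, n, None)))


def test_api_surface() -> None:
    assert missing_api(Stream, ["ingest_record", "ingest_record_offset"]) == []


def test_deprecated_ingest_record_still_works() -> None:
    stream = Stream()
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        offset = stream.ingest_record(b"payload")
    assert offset == 0
    assert any(issubclass(w.category, DeprecationWarning) for w in caught)
```

Both functions follow the pytest naming convention, so they would be collected as-is if dropped into a test module.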

@elenagaljak-db elenagaljak-db requested review from a team and teodordelibasic-db February 4, 2026 15:31

@teodordelibasic-db teodordelibasic-db left a comment

Looks quite good, I think we should just mention a few breaking changes.

Signed-off-by: elenagaljak-db <elena.galjak@databricks.com>

@teodordelibasic-db teodordelibasic-db left a comment

Looks great!

@elenagaljak-db elenagaljak-db added this pull request to the merge queue Feb 12, 2026
Merged via the queue into main with commit 02dd522 Feb 12, 2026
10 checks passed
@elenagaljak-db elenagaljak-db deleted the elenagaljak-db_wrapper branch February 12, 2026 10:09