
Appender rework #295

Merged
merged 1 commit into from
Jul 8, 2025
Conversation

staticlibs
Collaborator

@staticlibs staticlibs commented Jun 29, 2025

This change implements access to the Appender interface from Java with the following features:

  • the C API is used to access the native Appender
  • the necessary C API calls are exposed to Java through JNI wrappers that are as thin as possible: Java calls mirror the corresponding C API calls one to one
  • the data chunk interface of the Appender API is used: vector data is exposed as a direct ByteBuffer, and all appended primitive values are written to this buffer from Java without going through JNI and the C API (which is still used for some types, with calls like duckdb_vector_assign_string_element_len)
  • the Java-side Appender/DataChunk/Vector data structures closely follow the Go Appender implementation (including nested array initialization, etc.)
  • Java Appender usage is thread-safe with respect to concurrent Appender or Connection closure; append() calls themselves remain non-thread-safe (to minimize overhead), but their usage cannot cause a JNI-side crash
  • the Java API of the new Appender is modeled after the java.lang.StringBuilder class and is intended to be used with method chaining
  • support for primitive arrays (one- and two-dimensional)
  • support for NULL and DEFAULT values
  • type checks between Java types and DB column types are enforced
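
The direct ByteBuffer point in the list above can be illustrated with a minimal sketch. It is not the driver's code: the class `IntVectorSketch` and its methods are hypothetical, and the buffer is allocated on the Java side here, whereas in the driver it would presumably wrap memory owned by the native data chunk. The sketch only shows how fixed-width values can be written at row offsets in native byte order without a JNI call per value.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical stand-in for a fixed-width INTEGER column vector.
final class IntVectorSketch {
    private final ByteBuffer data;
    private final int capacity;

    IntVectorSketch(int capacity) {
        this.capacity = capacity;
        // Assumption: allocated in Java for illustration only; a real vector
        // would expose native memory through a direct buffer instead.
        this.data = ByteBuffer.allocateDirect(capacity * Integer.BYTES)
                              .order(ByteOrder.nativeOrder());
    }

    // Writes the value at the byte offset of the given row (absolute put).
    void set(int row, int value) {
        checkRow(row);
        data.putInt(row * Integer.BYTES, value);
    }

    // Reads the value back from the same offset.
    int get(int row) {
        checkRow(row);
        return data.getInt(row * Integer.BYTES);
    }

    boolean isDirect() {
        return data.isDirect();
    }

    private void checkRow(int row) {
        if (row < 0 || row >= capacity) {
            throw new IndexOutOfBoundsException("row " + row + " is outside 0.." + (capacity - 1));
        }
    }

    public static void main(String[] args) {
        IntVectorSketch vector = new IntVectorSketch(4);
        vector.set(0, 42);
        vector.set(3, -7);
        System.out.println(vector.get(0) + " " + vector.get(3));
    }
}
```

Native byte order matters in such a design because the native side reads the same memory directly; variable-length types such as strings do not fit this fixed-offset scheme, which is consistent with the description's note that some types still go through C API calls.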

Note: the previous version of the Appender (which internally creates Value instances and appends them one by one) is still available for compatibility. It can be created using the Connection#createSingleValueAppender method. It is marked as 'deprecated' and is intended to be removed in future versions.

Testing: the existing Appender test suite is extended and adapted to the new Appender API.

Fixes: #84, #100, #110, #139, #157, #163, #219, #249

@staticlibs staticlibs marked this pull request as draft June 29, 2025 23:04
@staticlibs staticlibs force-pushed the appender_datachunk branch from a02716b to 64386b7 Compare July 3, 2025 15:56
@arouel

arouel commented Jul 7, 2025

@staticlibs would it be feasible to use the Foreign Function and Memory (FFM) API instead of the Java Native Interface (JNI)? With jextract you could easily generate the mapping to the C API. I bring this up because in a future JDK release interoperation with native code will be disallowed by default. The FFM API is the preferred alternative to JNI.

@staticlibs
Collaborator Author

@arouel

The main target for the JDBC driver is Java 8. Access to the DuckDB engine through FFM can be introduced in the future as an optional alternative, but JNI remains the main approach for the foreseeable future. The upcoming restrictions on native code are also likely to be applied (by the JDK) to both JNI and FFM in the same way, because they have almost the same problems with app "integrity". Both JNI and FFM are likely to use the same flags/manifests to disable the added restrictions.


@arouel arouel left a comment

Is the plan to keep the method names of the new DuckDBAppender as they were before? If possible, I would prefer that we change the naming pattern so that it is clear which database type is being appended to, e.g. appendBlob(byte[] values), appendString(byte[] value), appendTimestamp(Instant value) etc.

If we keep the naming pattern as is, the confusion about which database type a call maps to remains.

@arouel

arouel commented Jul 7, 2025

> @arouel
>
> The main target for the JDBC driver is Java 8. Access to DuckDB engine through FFM can be introduced in future as an optional alternative, but JNI remains the main approach for any foreseeable future. And upcoming restrictions to native code are likely to be applied (by the JDK) to both JNI and FFM the same way (because they have almost the same problems with app "integrity"). And both JNI and FFM are likely to use the same flags/manifests to disable the restrictions that are added.

OK, that makes sense given that you target Java 8 specifically.

@staticlibs
Collaborator Author

@arouel

> Is it the plan to keep the current method names

The new Appender follows the StringBuilder approach, with an overloaded append() method. When the same Java type can be used for different DB types, special names such as appendDecimal() are used.
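
The naming approach described in this reply can be sketched with a small self-contained class. This is an illustration only: `RowAppenderSketch`, its `ColumnType` enum, and all parameter types are hypothetical and are not the driver's API. It shows overloaded `append()` methods that return `this` for chaining, a distinctly named `appendDecimal()` for a case where the Java type (`long` here, as an assumption) would otherwise be ambiguous, and a type check against the declared column type.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical chained appender; it only records rows in memory.
final class RowAppenderSketch {
    enum ColumnType { BIGINT, VARCHAR, DECIMAL }

    private final ColumnType[] columns;
    private final List<Object[]> rows = new ArrayList<>();
    private Object[] current; // null while no row is open
    private int col;

    RowAppenderSketch(ColumnType... columns) {
        this.columns = columns.clone();
    }

    RowAppenderSketch beginRow() {
        if (current != null) throw new IllegalStateException("row already open");
        current = new Object[columns.length];
        col = 0;
        return this;
    }

    // Overloads pick the column type from the Java argument type.
    RowAppenderSketch append(long value) { return put(ColumnType.BIGINT, value); }
    RowAppenderSketch append(String value) { return put(ColumnType.VARCHAR, value); }

    // Same Java type as append(long), so a distinct name selects DECIMAL.
    RowAppenderSketch appendDecimal(long unscaledValue) { return put(ColumnType.DECIMAL, unscaledValue); }

    RowAppenderSketch endRow() {
        if (current == null) throw new IllegalStateException("no open row");
        if (col != columns.length) throw new IllegalStateException("row is incomplete");
        rows.add(current);
        current = null;
        return this;
    }

    int rowCount() { return rows.size(); }

    // Enforces that the appended value matches the next column's type.
    private RowAppenderSketch put(ColumnType expected, Object value) {
        if (current == null) throw new IllegalStateException("no open row");
        if (col >= columns.length) throw new IllegalStateException("too many values in row");
        if (columns[col] != expected) {
            throw new IllegalArgumentException(
                "column " + col + " is " + columns[col] + ", got " + expected);
        }
        current[col++] = value;
        return this;
    }

    public static void main(String[] args) {
        RowAppenderSketch appender = new RowAppenderSketch(
            ColumnType.BIGINT, ColumnType.VARCHAR, ColumnType.DECIMAL);
        appender.beginRow().append(1L).append("foo").appendDecimal(12345L).endRow();
        System.out.println(appender.rowCount());
    }
}
```

With overloads alone, a call such as `append(12345L)` could not distinguish a BIGINT column from a DECIMAL one, which is the situation the reply says special names are reserved for.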

@staticlibs staticlibs force-pushed the appender_datachunk branch from 64386b7 to 2329252 Compare July 8, 2025 15:50
@staticlibs staticlibs force-pushed the appender_datachunk branch from 2329252 to ac024e5 Compare July 8, 2025 18:39
@staticlibs staticlibs marked this pull request as ready for review July 8, 2025 18:42
@staticlibs staticlibs closed this Jul 8, 2025
@staticlibs staticlibs reopened this Jul 8, 2025
@staticlibs staticlibs merged commit 935f58b into duckdb:main Jul 8, 2025
10 checks passed
@staticlibs staticlibs deleted the appender_datachunk branch July 8, 2025 20:23
staticlibs added a commit to staticlibs/duckdb-java that referenced this pull request Jul 8, 2025
This is a backport of the PR duckdb#295 to `v1.3-ossivalis` stable branch.

@staticlibs staticlibs mentioned this pull request Jul 8, 2025
staticlibs added a commit that referenced this pull request Jul 8, 2025
This is a backport of the PR #295 to `v1.3-ossivalis` stable branch.

Development

Successfully merging this pull request may close these issues.

[Appender] Add AppendDefault
2 participants