Skip to content

ffi: expose array validity, allow creating primitive arrays#7148

Open
myrrc wants to merge 1 commit intodevelopfrom
myrrc/primitive-ffi
Open

ffi: expose array validity, allow creating primitive arrays#7148
myrrc wants to merge 1 commit intodevelopfrom
myrrc/primitive-ffi

Conversation

@myrrc
Copy link
Contributor

@myrrc myrrc commented Mar 24, 2026

  • Allow querying array validity from FFI.
  • Allow creating null and primitive vx_arrays from FFI.
  • Expose utility functions to query dtype and ptype from array
    directly instead of array -> dtype -> variant [-> ptype] chain
  • Breaking: rename vx_array_element_is_null from vx_array_is_null

@myrrc myrrc force-pushed the myrrc/primitive-ffi branch from b7d2f79 to 516884b Compare March 24, 2026 17:18
@myrrc myrrc requested review from 0ax1 and joseph-isaacs March 24, 2026 17:18
@myrrc myrrc added changelog/feature A new feature changelog/break A breaking API change labels Mar 24, 2026
@myrrc myrrc force-pushed the myrrc/primitive-ffi branch from 516884b to 430ec1e Compare March 24, 2026 17:20
@myrrc myrrc removed the changelog/feature A new feature label Mar 24, 2026
Signed-off-by: Mikhail Kot <mikhail@spiraldb.com>
@myrrc myrrc force-pushed the myrrc/primitive-ffi branch from 430ec1e to a3c15ad Compare March 24, 2026 17:23
@codspeed-hq
Copy link

codspeed-hq bot commented Mar 24, 2026

Merging this PR will degrade performance by 15.51%

❌ 3 regressed benchmarks
✅ 1098 untouched benchmarks
⏩ 1522 skipped benchmarks1

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

Mode Benchmark BASE HEAD Efficiency
Simulation bitwise_not_vortex_buffer_mut[1024] 477.2 ns 535.6 ns -10.89%
Simulation bitwise_not_vortex_buffer_mut[128] 317.8 ns 376.1 ns -15.51%
Simulation map_each[BufferMut<i32>, 128] 770.6 ns 858.1 ns -10.2%

Comparing myrrc/primitive-ffi (a3c15ad) with develop (0ea8973)

Open in CodSpeed

Footnotes

  1. 1522 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@myrrc myrrc enabled auto-merge (squash) March 24, 2026 17:28
);

#[unsafe(no_mangle)]
pub unsafe extern "C-unwind" fn vx_array_is_nullable(array: *const vx_array) -> bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wasn't introduced as a pattern in this PR, but we should consider documenting in the code what C-unwind implies / allows for (compared to extern "C"), explaining how the unwinding can be leveraged.

*/
void vx_array_free(const vx_array *ptr);

bool vx_array_is_nullable(const vx_array *array);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All public C API should have proper docs.


bool vx_array_is_nullable(const vx_array *array);

bool vx_array_is(const vx_array *array, vx_dtype_variant variant);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vx_array_has_dtype?

* validity can't be NULL.
*/
const vx_array *vx_array_new_primitive(vx_ptype ptype,
const void *ptr,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ptr parameter name should be more descriptive. Shall we have in the fn docs describing what each param is for, here and elsewhere?

vx_error
);

pub(crate) fn write_error(error_out: *mut *mut vx_error, error_string: &str) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess part of the contract here is that error_out is not set elsewhere before, otherwise the previous error is leaked.

*/
bool vx_array_element_is_null(const vx_array *array, size_t index, vx_error **error);

uint32_t vx_array_null_count(const vx_array *array, vx_error **error_out);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not introduced in this PR but this should be size_t.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And shall we consider matching the Rust name vx_array_null_count => ..invalid_count.

@@ -0,0 +1,24 @@
// SPDX-License-Identifier: Apache-2.0
// SPDX-FileCopyrightText: Copyright the Vortex contributors
#include <catch2/catch_test_macros.hpp>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing a pragma once.

return {vx_string_ptr(msg), vx_string_len(msg)};
}

inline std::string_view to_string_view(vx_error *err) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should they have different fn name prefixes? They kinda do different things no?

) -> *const vx_array {
let slice = unsafe { std::slice::from_raw_parts(ptr, len) };
let buffer = Buffer::copy_from(slice);
let array = PrimitiveArray::new(buffer, validity.into());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In impl From<&vx_validity> for Validity, we clone the arc validity::Array(vx_array::as_ref(value.array).clone()) (aka atomic ref counted pointer).

Saying we don't take ownership of validity as mentioned in the new_primitive docs.

Copy link
Contributor

@0ax1 0ax1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to double check the consistency of size types for indices:

vx_array_get_field` | `uint32_t` |
| `vx_array_slice` | `uint32_t` |
| `vx_array_get_u8` etc. | `uint32_t` |
| `vx_array_null_count` return | `uint32_t

should prob all be size_t.

@myrrc myrrc removed the request for review from joseph-isaacs March 26, 2026 10:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

changelog/break A breaking API change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants