Problem statement
The biggest hurdle with WebAssembly in the browser is that multiple Wasm modules can't share the same memory space. This means that having e.g. parquet-wasm and geoarrow-wasm as two separate NPM modules is annoying! You have to use parquet-wasm to load Parquet into Arrow in Wasm... but then copy the data to JS, and then copy it into the next Wasm module to do more processing with it! This is slow, memory intensive, and not user friendly.
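To make the cost concrete, here is a minimal plain-Rust model of that hand-off. It is only a sketch: `WasmModule`, `write_from_js`, `read_to_js` and `hand_off` are hypothetical names invented for illustration, and a `Vec<u8>` stands in for each module's private linear memory. Nothing here is real Wasm or the API of either library.

```rust
/// Hypothetical stand-in for one Wasm module and its private linear memory.
struct WasmModule {
    memory: Vec<u8>,
}

impl WasmModule {
    fn new() -> Self {
        WasmModule { memory: Vec::new() }
    }

    /// Copy bytes from a "JS"-owned buffer into this module's memory.
    /// Returns the offset at which the bytes were written.
    fn write_from_js(&mut self, bytes: &[u8]) -> usize {
        let offset = self.memory.len();
        self.memory.extend_from_slice(bytes);
        offset
    }

    /// Copy bytes out of this module's memory into a new "JS"-owned buffer.
    fn read_to_js(&self, offset: usize, len: usize) -> Vec<u8> {
        self.memory[offset..offset + len].to_vec()
    }
}

/// Move `len` bytes from `src` to `dst` the only way two separate modules can:
/// through an intermediate buffer on the JS side.
/// Returns (offset in `dst`, total number of bytes copied).
fn hand_off(src: &WasmModule, offset: usize, len: usize, dst: &mut WasmModule) -> (usize, usize) {
    let js_buffer = src.read_to_js(offset, len); // copy 1: Wasm -> JS
    let dst_offset = dst.write_from_js(&js_buffer); // copy 2: JS -> Wasm
    (dst_offset, 2 * len)
}

fn main() {
    let mut parquet = WasmModule::new(); // plays the role of parquet-wasm
    let mut geoarrow = WasmModule::new(); // plays the role of geoarrow-wasm

    let data = [1u8, 2, 3, 4];
    let src_offset = parquet.write_from_js(&data);
    let (dst_offset, copied) = hand_off(&parquet, src_offset, data.len(), &mut geoarrow);

    assert_eq!(geoarrow.read_to_js(dst_offset, data.len()), data);
    println!("copied {} bytes to move {} bytes", copied, data.len());
}
```

In this model every hand-off copies the data twice and briefly holds three copies at once, which is the overhead that compiling both libraries into a single module would avoid.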
Solution
In https://github.com/domoritz/arrow-wasm, Dominik's goal appeared to be to see if Arrow in rust/wasm would be faster than Arrow in JS. But since working with raw buffers is pretty fast in JS, it's not surprising that Wasm overhead would outweigh any other speedups.
I think the potential of arrow-wasm instead is in being a foundational library for other wasm-bindgen libraries.
So I see various potential libraries:
- parquet-wasm: depends on arrow-wasm directly; used by consumers who only want to parse Parquet and get it into JS. PR is here: Depend on arrow-wasm parquet-wasm#292
- geoparquet-wasm: depends on arrow-wasm and geoarrow-rs; used by consumers who only want to parse GeoParquet to GeoArrow and get it into JS. PR is here: Use arrow-wasm geoparquet-wasm#6
- geoarrow-wasm-slim: depends on arrow-wasm, doesn't include geoparquet io to keep a small bundle size. Used by consumers who somehow otherwise have their data as geoarrow in the browser.
- geoarrow-wasm-full: includes extras like re-exporting geoparquet-wasm. Used by consumers who want to fetch GeoParquet to GeoArrow but also do some geospatial operations.
Other libraries for other formats might make sense to add in the future, e.g. geoarrow-flatgeobuf, which uses Rust to parse FlatGeobuf into GeoArrow.
Drawbacks
- It would be nice if it were possible to add wasm-bindgen methods onto existing structs from another crate, but that doesn't appear to be possible 🥲. This means that any crates other than arrow-wasm may only use structs defined in arrow-wasm in a functional manner.
- There will be a proliferation of feature flags. E.g. geoarrow-wasm might have feature flags for each compression in parquet-wasm?
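As a rough sketch of what "in a functional manner" could look like, the plain-Rust example below uses two modules as stand-ins for two crates. The `Table` struct, its methods, and `column_bounds` are all hypothetical and are not the actual arrow-wasm or geoarrow-wasm API. The underlying restriction is presumably Rust's rule that an inherent `impl` block must live in the crate that defines the type, so a downstream crate has to expose free functions that take the struct as an argument.

```rust
/// Stand-in for the arrow-wasm crate (hypothetical API, for illustration only).
mod arrow_wasm {
    pub struct Table {
        columns: Vec<Vec<f64>>,
    }

    impl Table {
        pub fn new(columns: Vec<Vec<f64>>) -> Self {
            Table { columns }
        }

        pub fn num_columns(&self) -> usize {
            self.columns.len()
        }

        pub fn column(&self, i: usize) -> Option<&[f64]> {
            self.columns.get(i).map(|c| c.as_slice())
        }
    }
}

/// Stand-in for a downstream crate such as geoarrow-wasm.
mod geoarrow_wasm {
    use super::arrow_wasm::Table;

    // If this were a separate crate, writing `impl Table { ... }` here would be
    // rejected by the compiler, so the operation is a free function that
    // takes the table by reference instead of a method on it.
    pub fn column_bounds(table: &Table, i: usize) -> Option<(f64, f64)> {
        let col = table.column(i)?;
        let first = *col.first()?;
        Some(
            col.iter()
                .fold((first, first), |(lo, hi), &v| (lo.min(v), hi.max(v))),
        )
    }
}

fn main() {
    let table = arrow_wasm::Table::new(vec![vec![3.0, -1.0, 2.5]]);
    assert_eq!(table.num_columns(), 1);

    // Called as `column_bounds(&table, 0)` rather than `table.column_bounds(0)`.
    let bounds = geoarrow_wasm::column_bounds(&table, 0);
    assert_eq!(bounds, Some((-1.0, 3.0)));
}
```

On the JS side this would surface as a module-level function that accepts a table object, rather than a method on the table class.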
cc @H-Plus-Time