Skip to content

feat(rust/sedona-raster-functions): add RS_SetSRID/RS_SetCRS with batch-local cache refactoring#630

Merged
paleolimbot merged 6 commits intoapache:mainfrom
Kontinuation:rs-set-srid-crs
Feb 18, 2026
Merged

feat(rust/sedona-raster-functions): add RS_SetSRID/RS_SetCRS with batch-local cache refactoring#630
paleolimbot merged 6 commits intoapache:mainfrom
Kontinuation:rs-set-srid-crs

Conversation

@Kontinuation
Copy link
Member

Summary

  • Add RS_SetSRID and RS_SetCRS raster functions that set the CRS on raster values, with null-input semantics: null SRID/CRS input nulls the entire raster, while SRID=0/CRS="0" clears the CRS but preserves the raster.
  • Introduce CachedSRIDToCrs and CachedCrsNormalization batch-local cache structs in sedona-schema/src/crs.rs, shared by both ST_ (geometry) and RS_ (raster) setter functions.
  • Refactor ST_SetSRID/ST_SetCRS to use the new shared cache types instead of inline caching.

Details

RS_SetSRID / RS_SetCRS

  • Kernel structs carry engine: Option<Arc<dyn CrsEngine + Send + Sync>> matching the ST_ pattern
  • Both _with_engine_udf(engine) and _udf() (delegates with None) constructors
  • extract_input_nulls() captures the null buffer from the original input argument before any conversion, then replace_raster_crs() merges input nulls into the struct's null buffer via NullBuffer::union()
  • 17 tests including 4 null-input tests covering scalar and array cases

Batch-local caches

  • CachedSRIDToCrs: caches SRID integer → CRS string conversion (e.g. 4326 → "OGC:CRS84")
  • CachedCrsNormalization: caches CRS string → abbreviated authority:code form via deserialization
  • Both use #[derive(Default)] and are used identically by ST_ and RS_ functions

…SRID/CRS is null

Extract input nulls from the original SRID/CRS argument before conversion
and merge them into the raster struct's null buffer. This distinguishes
actual null inputs (which null the raster) from SRID=0/CRS="0" (which
clear the CRS but preserve the raster).
…nstead of Cow

The Cow<'static, str> was always used as Cow::Owned, never Borrowed.
Plain String removes the enum discriminant overhead and simplifies inserts.
srid_to_crs_columnar and normalize_crs_columnar now return StringViewArray
directly instead of ColumnarValue, removing redundant Scalar/Array
re-wrapping and the cast_to(Utf8View) in replace_raster_crs. Broadcasting
for scalar CRS applied to array raster is handled explicitly.
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request adds RS_SetSRID and RS_SetCRS functions for setting the CRS on raster values, while refactoring CRS caching logic into shared batch-local cache structs that are used by both ST_ (geometry) and RS_ (raster) functions.

Changes:

  • Adds RS_SetSRID and RS_SetCRS functions with null-input semantics: null SRID/CRS input nulls the entire raster, while SRID=0/CRS="0" clears the CRS but preserves the raster
  • Introduces shared CachedSRIDToCrs and CachedCrsNormalization cache structs in sedona-schema/src/crs.rs for batch-local CRS conversion caching
  • Refactors ST_SetSRID/ST_SetCRS to use the new shared cache types, replacing inline caching with HashMap-based structures

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
rust/sedona-schema/src/crs.rs Adds CachedSRIDToCrs and CachedCrsNormalization batch-local cache structs; simplifies CachedCrsToSRIDMapping by removing Cow wrapper and using Default derive
rust/sedona-raster-functions/src/rs_setsrid.rs Implements RS_SetSRID and RS_SetCRS functions with 17 tests covering null handling, CRS normalization, and metadata preservation
rust/sedona-raster-functions/src/register.rs Registers the new RS_SetSRID and RS_SetCRS functions
rust/sedona-raster-functions/src/lib.rs Adds rs_setsrid module declaration
rust/sedona-raster-functions/benches/native-raster-functions.rs Adds benchmarks for rs_setcrs and rs_setsrid functions
rust/sedona-functions/src/st_setsrid.rs Refactors to use shared CachedSRIDToCrs and CachedCrsNormalization instead of inline HashSet/HashMap caching

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@Kontinuation Kontinuation marked this pull request as ready for review February 17, 2026 15:27
Copy link
Member

@paleolimbot paleolimbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

Arc::new(StringViewArray::new_null(len))
} else {
let value = array.value(0);
Arc::new(std::iter::repeat_n(Some(value), len).collect::<StringViewArray>())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is an optimization for string views with respect to appending repeated values:

https://github.com/apache/datafusion/blob/6798dff0d7e1c52d6bea74b141d4df6ad9cc1516/datafusion/common/src/scalar/mod.rs#L3120-L3127

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method optimized for inserting repeated values (try_append_value_n) was introduced in arrow-rs 57.2.0, we need to upgrade to a later version of DataFusion and arrow-rs to use them. See apache/datafusion#19797

I would like to stick to the current method since it is the way DataFusion 51 uses to append repeated values: https://github.com/apache/datafusion/blob/51.0.0/datafusion/common/src/scalar/mod.rs#L2960C13-L2964

Co-authored-by: Dewey Dunnington <dewey@dunnington.ca>
@paleolimbot paleolimbot changed the title feat(raster): add RS_SetSRID/RS_SetCRS with batch-local cache refactoring feat(rust/sedona-raster-functions): add RS_SetSRID/RS_SetCRS with batch-local cache refactoring Feb 18, 2026
@paleolimbot paleolimbot merged commit 527c1aa into apache:main Feb 18, 2026
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants

Comments