
SpatialData loader and writer #17

Open
enric-bazz wants to merge 11 commits into dpeerlab:release/v2-stable from enric-bazz:sd-reader


@enric-bazz enric-bazz commented Feb 27, 2026

Change 1 – ISTDataModule._load_from_spatialdata

  1. Fixed missing/non-visible imports.
  2. Quality filtering is no longer conditional on self.min_qv. Non-gene transcripts are now always removed, and QV filtering is skipped when it is not required or not possible. The appropriate filter is retrieved automatically via SpatialDataQualityFilter based on the platform.
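
The behavior described in item 2 can be sketched as follows. This is a minimal illustration, not the code in this PR: the Transcript record and filter_transcripts function are hypothetical names, and the real filter is obtained through SpatialDataQualityFilter.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Transcript:
    """Hypothetical transcript record used only for this illustration."""
    gene: str
    qv: Optional[float]  # None when the platform provides no quality value
    is_gene: bool        # False for control probes and other non-gene targets


def filter_transcripts(
    transcripts: List[Transcript], min_qv: Optional[float] = None
) -> List[Transcript]:
    """Always drop non-gene transcripts; apply the QV threshold only if possible."""
    kept = []
    for t in transcripts:
        # Non-gene removal no longer depends on min_qv being set.
        if not t.is_gene:
            continue
        # QV filtering is skipped when no threshold or no QV value is available.
        if min_qv is not None and t.qv is not None and t.qv < min_qv:
            continue
        kept.append(t)
    return kept
```

With min_qv left unset, only the non-gene removal runs; passing a threshold adds the QV check on top of it.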

Change 2 – SpatialDataLoader

The class now implements preprocessing steps that were previously missing. These steps are normally performed by the technology-specific "Technology"Preprocessor class returned by get_preprocessor() in ISTDataModule.load().

The reason is that get_preprocessor() infers the platform from the input structure. This information is lost when using SpatialData input, making the preprocessor unusable directly. The missing steps are now implemented based on the technology detected.
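
One way to picture "steps implemented based on the technology detected" is a small registry keyed by platform name. The sketch below is an assumption about the general shape only: PREPROCESS_STEPS, preprocess, and the column names are placeholders, not the SpatialDataLoader implementation or any platform's actual schema.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
Step = Callable[[List[Record]], List[Record]]


def rename_columns(mapping: Dict[str, str]) -> Step:
    """Build a step that renames record keys according to `mapping`."""
    def step(records: List[Record]) -> List[Record]:
        return [{mapping.get(k, k): v for k, v in r.items()} for r in records]
    return step


# Hypothetical registry: platform name -> ordered preprocessing steps.
# The column names are placeholders chosen for this example.
PREPROCESS_STEPS: Dict[str, List[Step]] = {
    "xenium": [rename_columns({"raw_gene_col": "gene"})],
    "other_platform": [rename_columns({"target_col": "gene"})],
}


def preprocess(records: List[Record], platform: str) -> List[Record]:
    """Apply the steps registered for the detected platform."""
    try:
        steps = PREPROCESS_STEPS[platform.lower()]
    except KeyError:
        raise ValueError(
            f"No preprocessing steps registered for platform {platform!r}"
        ) from None
    for step in steps:
        records = step(records)
    return records
```

Failing loudly on an unknown platform keeps an unrecognized dataset from silently skipping the steps the preprocessor would otherwise have run.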


Change 3 – AnnDataWriter / SpatialDataWriter.write()

Calls to write() in cli/main.py do not pass overwrite explicitly. Previously, writing failed by default if the file already existed.

If overwrite is not explicitly passed, its value is now determined automatically, so that existing files are overwritten. A warning may be added in the future.
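
A minimal sketch of this default, assuming overwrite is an optional boolean where None means "not passed". The resolve_overwrite and write_text helpers are hypothetical and stand in for the writers' actual write() methods.

```python
from pathlib import Path
from typing import Optional, Union


def resolve_overwrite(path: Union[str, Path], overwrite: Optional[bool] = None) -> bool:
    """Return the effective overwrite flag.

    When the caller does not pass a value (None), an existing target
    is overwritten; an explicit True/False is respected as given.
    """
    if overwrite is None:
        return Path(path).exists()
    return overwrite


def write_text(path: Union[str, Path], data: str, overwrite: Optional[bool] = None) -> None:
    """Hypothetical writer showing where the resolved flag is applied."""
    path = Path(path)
    if path.exists() and not resolve_overwrite(path, overwrite):
        raise FileExistsError(f"{path} already exists and overwrite=False")
    path.write_text(data)
```

Callers that want the old failure mode can still pass overwrite=False explicitly.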


Change 4 – SpatialDataWriter

Updated element names in the SpatialData object and API calls to be compatible with newer versions.


Change 5 – XeniumTranscriptField and XeniumQualityFilter classes

  1. null_cell_id
    Now represented as a tuple: ('UNASSIGNED', '-1'). Older Xenium versions (e.g., 1.0.0) use -1 as an integer.
    Values are cast to strings for consistent comparison with the tuple. All cell IDs must be aligned to strings elsewhere in the code. Consider whether compatibility with older versions is required.

  2. is_gene field
    Added to detect gene transcripts in newer Xenium data and enable filtering of transcripts without relying on specific name patterns.
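
The string cast for null_cell_id can be illustrated as below. The tuple value comes from this description; the is_unassigned helper is a hypothetical name used only for the example.

```python
# Null cell IDs as described above: newer data uses 'UNASSIGNED',
# older Xenium versions use the integer -1.
NULL_CELL_IDS = ("UNASSIGNED", "-1")


def is_unassigned(cell_id: object) -> bool:
    """Compare after casting to str so integer and string IDs behave alike."""
    return str(cell_id) in NULL_CELL_IDS
```

Note that a float such as -1.0 casts to '-1.0' and would not match, which is one reason cell IDs need to be aligned to strings consistently before this comparison.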

Change 6 – find_mutually_exclusive_genes

Fixed filtering behavior: filtering now ensures that cell types do not share ME (mutually exclusive) genes, which prevents same-gene pairs from emerging in the MECR computation.
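
To show why shared genes lead to same-gene pairs, here is a small sketch under one assumed rule: any gene listed for more than one cell type is dropped. The function names and the rule itself are illustrative assumptions; the actual logic lives in find_mutually_exclusive_genes and may differ.

```python
from collections import Counter
from itertools import combinations, product
from typing import Dict, List, Tuple


def drop_shared_genes(markers: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Keep only genes that are listed for exactly one cell type."""
    counts = Counter(g for genes in markers.values() for g in set(genes))
    return {
        cell_type: sorted(g for g in set(genes) if counts[g] == 1)
        for cell_type, genes in markers.items()
    }


def cross_type_pairs(markers: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Enumerate gene pairs drawn from two different cell types."""
    pairs: List[Tuple[str, str]] = []
    for type_a, type_b in combinations(sorted(markers), 2):
        pairs.extend(product(markers[type_a], markers[type_b]))
    return pairs
```

If two cell types both list the same gene, cross_type_pairs on the unfiltered input yields a pair of that gene with itself; after drop_shared_genes, every pair consists of two distinct genes.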
