SpatialData loader and writer #17
Open
enric-bazz wants to merge 11 commits into dpeerlab:release/v2-stable from
Change 1 – `ISTDataModule._load_from_spatialdata`

This change concerns `self.min_qv`. It ensures that non-gene transcripts are removed and that QV filtering is skipped when it is not required or not possible. The appropriate filter is retrieved automatically via `SpatialDataQualityFilter` based on the platform.

Change 2 – `SpatialDataLoader`

The class now implements preprocessing steps that were previously missing. These steps are normally performed by the `"Technology"Preprocessor` class returned by `get_preprocessor()` in `ISTDataModule.load()`.

The reason is that `get_preprocessor()` infers the platform from the input structure. This information is lost with `SpatialData` input, so the preprocessor cannot be used directly. The missing steps are now implemented based on the detected technology.

Change 3 – `AnnDataWriter`/`SpatialDataWriter.write()`

Calls to `write()` in `cli/main.py` do not explicitly handle overwriting. By default, writing would fail if the file already existed.

If `overwrite` is not explicitly passed, its value is now determined automatically, and existing files are overwritten. A warning may be added in the future.

Change 4 – `SpatialDataWriter`

Updated element names in the `SpatialData` object and API calls to be compatible with newer versions.

Change 5 – `XeniumTranscriptField` and `XeniumQualityFilter` classes

`null_cell_id`

Now represented as a tuple: `('UNASSIGNED', '-1')`. Older Xenium versions (e.g., 1.0.0) use `-1` as an integer. Values are cast to strings for consistent comparison with the tuple. All cell IDs must be aligned to strings elsewhere in the code. Consider whether compatibility with older versions is required.

`is_gene` field

Added to detect gene columns in newer Xenium data and to filter transcripts without relying on specific name patterns.
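
To make the two Change 5 points concrete, here is a minimal sketch of the idea, not the code in this PR. The helper names (`is_unassigned`, `keep_gene_transcripts`), the row layout, and the example values are hypothetical; only the tuple `('UNASSIGNED', '-1')` and the `is_gene` field name come from the description above.

```python
# Hypothetical helpers for illustration; not the package's actual classes.
NULL_CELL_IDS = ("UNASSIGNED", "-1")  # tuple form described in Change 5


def is_unassigned(cell_id) -> bool:
    """Cast to str first so an integer -1 (older Xenium) matches '-1'."""
    return str(cell_id) in NULL_CELL_IDS


def keep_gene_transcripts(rows):
    """Keep rows flagged by an `is_gene` field (assumed to be boolean)."""
    return [row for row in rows if row.get("is_gene", False)]


# Illustrative rows; the feature names and cell IDs are made up.
transcripts = [
    {"feature_name": "CD3E", "cell_id": "aaaabbbb-1", "is_gene": True},
    {"feature_name": "NegControlProbe_0001", "cell_id": "UNASSIGNED", "is_gene": False},
    {"feature_name": "EPCAM", "cell_id": -1, "is_gene": True},
]

genes = keep_gene_transcripts(transcripts)
assigned = [row for row in genes if not is_unassigned(row["cell_id"])]
```

The string cast is what lets a single comparison cover both the newer `'UNASSIGNED'` marker and the older integer `-1`, which is why the rest of the code also needs string cell IDs.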
Change 6 – `find_mutually_exclusive_genes`

Fixed filtering behavior: filtering now ensures that cell types do not share ME genes, which prevents same-gene pairs from emerging in the MECR computation.
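
The effect of the Change 6 fix can be illustrated with a small sketch. This is an assumption about the shape of the problem, not the implementation of `find_mutually_exclusive_genes`: the functions `drop_shared_genes` and `gene_pairs`, the dictionary layout, and the marker lists are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def drop_shared_genes(markers):
    """Remove any gene that is listed for more than one cell type."""
    counts = Counter(g for genes in markers.values() for g in set(genes))
    return {
        cell_type: sorted(g for g in set(genes) if counts[g] == 1)
        for cell_type, genes in markers.items()
    }


def gene_pairs(markers):
    """Build cross-cell-type gene pairs, as a stand-in for the MECR input."""
    pairs = []
    for (_, genes_a), (_, genes_b) in combinations(markers.items(), 2):
        pairs.extend((a, b) for a in genes_a for b in genes_b)
    return pairs


# Illustrative marker lists; PTPRC is deliberately shared by two cell types.
markers = {
    "T cell": ["CD3E", "PTPRC"],
    "B cell": ["MS4A1", "PTPRC"],
    "Epithelial": ["EPCAM"],
}

unfiltered_pairs = gene_pairs(markers)  # contains ("PTPRC", "PTPRC")
filtered = drop_shared_genes(markers)
filtered_pairs = gene_pairs(filtered)   # no pair repeats the same gene
```

Without the filter, a gene shared by two cell types is paired with itself; removing shared genes first rules that out by construction.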