Zarr+obstore directory listing truncates names (likely str.lstrip bug), breaks AnnData/Zarr reads on Azure #3639

@antoinegaston

Zarr version

v3.1.5

Numcodecs version

v0.16.5

Python Version

3.14

Operating System

Mac

Installation

using uv

Description

When using obstore (Azure) through Zarr’s zarr.storage.ObjectStore, listing “directories” returns truncated child names (e.g. "resolution" becomes "olution", "names" becomes "ames", "logfoldchanges" becomes "gfoldchanges"). This causes Zarr to misinterpret valid keys as “not part of a Zarr hierarchy” and can ultimately lead to KeyError / FileNotFoundError when libraries like anndata traverse groups.

Steps to reproduce

# /// script
# requires-python = ">=3.10"
# dependencies = [
#   "zarr@git+https://github.com/zarr-developers/zarr-python.git@main",
# ]
# ///
#
# This script automatically imports the development branch of zarr to check for issues

import asyncio
import zarr.storage._obstore as zarr_obstore

async def fake_list_result():
    prefix = "uns/leiden_res_0_25/params"
    return {
        "common_prefixes": [
            f"{prefix}/resolution",
            f"{prefix}/random_state",
        ],
        "objects": [],
    }

async def main():
    items = [x async for x in zarr_obstore._transform_list_dir(
        fake_list_result(),
        "uns/leiden_res_0_25/params",
    )]
    print(items)

asyncio.run(main())

This minimal repro script prints ['olution', 'om_state'] instead of the expected ['resolution', 'random_state'].
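The truncation is consistent with how str.lstrip behaves: its argument is treated as a set of characters to strip, not as a literal prefix. The following is a standalone sketch with no zarr or obstore dependency, reusing the paths from the repro; the variable names are illustrative only.

```python
prefix = "uns/leiden_res_0_25/params"
paths = [f"{prefix}/resolution", f"{prefix}/random_state"]

# lstrip removes every leading character that appears anywhere in `prefix`,
# so it continues past the prefix and also eats "/res" and "/rand".
truncated = [p.lstrip(prefix) for p in paths]
print(truncated)  # ['olution', 'om_state']
```

Stripping stops at the first character that is not in the prefix string, here the "o" in both names.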

Workaround / fix

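A one-line fix in the adapter is enough: replace obj.lstrip(prefix) with obj.removeprefix(prefix) (and similarly for the object-path handling). str.lstrip treats its argument as a set of characters to strip rather than as a literal prefix, which is why leading characters of the child name are removed as well.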
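A rough sketch of the suggested change, written outside of zarr. The helper name child_name is hypothetical and not part of zarr's API, and the trailing lstrip("/") is an assumption that the separator left behind by removeprefix should be dropped.

```python
def child_name(path: str, prefix: str) -> str:
    """Return the part of `path` below `prefix` (hypothetical helper)."""
    # removeprefix (Python 3.9+) drops `prefix` only as a literal leading
    # substring; the remaining "/" separator is then stripped on its own.
    return path.removeprefix(prefix).lstrip("/")


prefix = "uns/leiden_res_0_25/params"
names = [child_name(f"{prefix}/{n}", prefix) for n in ("resolution", "random_state")]
print(names)  # ['resolution', 'random_state']
```

If the path does not start with the prefix, removeprefix leaves it unchanged apart from any leading slash.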

Labels: bug (potential issues with the zarr-python library)
