consolidate_metadata(zarr.Group) writes a corrupted .zmetadata #2206

Open
@ivirshup

Zarr version

'2.18.3'

Numcodecs version

'0.13.0'

Python Version

3.11

Operating System

Linux

Installation

pip

Description

While zarr.consolidate_metadata arguably should raise an error when passed a group, it definitely shouldn't write a corrupt .zmetadata. It should probably detect that it was given a group and raise a TypeError without writing anything.
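To illustrate the kind of check suggested above, here is a minimal sketch of a guard that rejects Group-like or Array-like objects before anything is written. The helper name `require_store` is hypothetical and not part of zarr. The duck-typing heuristic is an assumption: zarr v2 Group and Array objects expose both `.store` and `.attrs`, while plain stores do not expose `.attrs`. An actual fix inside zarr would more likely use an explicit isinstance check.

```python
def require_store(obj):
    """Return `obj` unchanged if it looks like a store, else raise TypeError.

    Hypothetical helper, not part of zarr. Heuristic (an assumption): zarr v2
    Group and Array objects expose both `.store` and `.attrs`, whereas plain
    stores such as dict or DirectoryStore do not expose `.attrs`.
    """
    if hasattr(obj, "store") and hasattr(obj, "attrs"):
        # Fail before any write happens, so no stray .zmetadata is created.
        raise TypeError(
            f"expected a store, got {type(obj).__name__}; "
            "pass its .store attribute instead"
        )
    return obj


# Intended usage (requires zarr v2, shown as a comment only):
#   zarr.consolidate_metadata(require_store(z))        # raises TypeError for a Group
#   zarr.consolidate_metadata(require_store(z.store))  # passes the store through
```

Until such a check exists, passing `z.store` rather than `z` avoids the problem on the caller's side.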

Steps to reproduce

import zarr

# Create group
z = zarr.open("tmp.zarr", "w")
zarr.consolidate_metadata(z)  # This errors (as expected)
Full traceback
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[1], line 6
      4 z = zarr.open("tmp.zarr", "w")
      5 # z.create_group("a-group")  # Some metadata ot 
----> 6 zarr.consolidate_metadata(z)

File ~/miniforge3/envs/cellxgene-census-dev-new/lib/python3.11/site-packages/zarr/convenience.py:1296, in consolidate_metadata(store, metadata_key, path)
   1291 out = {
   1292     "zarr_consolidated_format": 1,
   1293     "metadata": {key: json_loads(store[key]) for key in store if is_zarr_key(key)},
   1294 }
   1295 store[metadata_key] = json_dumps(out)
-> 1296 return open_consolidated(store, metadata_key=metadata_key, path=path)

File ~/miniforge3/envs/cellxgene-census-dev-new/lib/python3.11/site-packages/zarr/convenience.py:1360, in open_consolidated(store, metadata_key, mode, **kwargs)
   1357         metadata_key = "meta/root/consolidated/" + metadata_key
   1359 # setup metadata store
-> 1360 meta_store = ConsolidatedStoreClass(store, metadata_key=metadata_key)
   1362 # pass through
   1363 chunk_store = kwargs.pop("chunk_store", None) or store

File ~/miniforge3/envs/cellxgene-census-dev-new/lib/python3.11/site-packages/zarr/storage.py:3046, in ConsolidatedMetadataStore.__init__(self, store, metadata_key)
   3043 self.store = Store._ensure_store(store)
   3045 # retrieve consolidated metadata
-> 3046 meta = json_loads(self.store[metadata_key])
   3048 # check format of consolidated metadata
   3049 consolidated_format = meta.get("zarr_consolidated_format", None)

File ~/miniforge3/envs/cellxgene-census-dev-new/lib/python3.11/site-packages/zarr/util.py:76, in json_loads(s)
     74 def json_loads(s: Union[bytes, str]) -> Dict[str, Any]:
     75     """Read JSON in a consistent way."""
---> 76     return json.loads(ensure_text(s, "utf-8"))

File ~/miniforge3/envs/cellxgene-census-dev-new/lib/python3.11/site-packages/numcodecs/compat.py:176, in ensure_text(s, encoding)
    174 def ensure_text(s, encoding="utf-8"):
    175     if not isinstance(s, str):
--> 176         s = ensure_contiguous_ndarray(s)
    177         s = codecs.decode(s, encoding)
    178     return s

File ~/miniforge3/envs/cellxgene-census-dev-new/lib/python3.11/site-packages/numcodecs/compat.py:153, in ensure_contiguous_ndarray(buf, max_buffer_size, flatten)
    124 def ensure_contiguous_ndarray(buf, max_buffer_size=None, flatten=True) -> np.array:
    125     """Convenience function to coerce `buf` to a numpy array, if it is not already a
    126     numpy array. Also ensures that the returned value exports fully contiguous memory,
    127     and supports the new-style buffer interface. If the optional max_buffer_size is
   (...)
    149     return a view on memory exported by `buf`.
    150     """
    152     return ensure_ndarray(
--> 153         ensure_contiguous_ndarray_like(buf, max_buffer_size=max_buffer_size, flatten=flatten)
    154     )

File ~/miniforge3/envs/cellxgene-census-dev-new/lib/python3.11/site-packages/numcodecs/compat.py:98, in ensure_contiguous_ndarray_like(buf, max_buffer_size, flatten)
     70 def ensure_contiguous_ndarray_like(buf, max_buffer_size=None, flatten=True) -> NDArrayLike:
     71     """Convenience function to coerce `buf` to ndarray-like array.
     72     Also ensures that the returned value exports fully contiguous memory,
     73     and supports the new-style buffer interface. If the optional max_buffer_size is
   (...)
     96     return a view on memory exported by `buf`.
     97     """
---> 98     arr = ensure_ndarray_like(buf)
    100     # check for object arrays, these are just memory pointers, actual memory holding
    101     # item data is scattered elsewhere
    102     if arr.dtype == object:

File ~/miniforge3/envs/cellxgene-census-dev-new/lib/python3.11/site-packages/numcodecs/compat.py:42, in ensure_ndarray_like(buf)
     38     raise TypeError("array.array with char or unicode type is not supported")
     39 else:
     40     # N.B., first take a memoryview to make sure that we subsequently create a
     41     # numpy array from a memory buffer with no copy
---> 42     mem = memoryview(buf)
     43     # instantiate array from memoryview, ensures no copy
     44     buf = np.array(mem, copy=False)

TypeError: memoryview: a bytes-like object is required, not 'Array'

We can see that a .zmetadata file got written despite the failure:

!ls -a tmp.zarr/
.  ..  .zgroup	.zmetadata

And if we open the store the .zmetadata shows up as an array:

z2 = zarr.open("tmp.zarr")
z2.visititems(print)
.zmetadata <zarr.core.Array '/.zmetadata' () |S57>

Additional output

cc: @ilan-gold @ebezzi

Labels: V2 (affects the v2 branch), bug (potential issues with the zarr-python library)
