Zarr version
v3.0.8
Numcodecs version
v0.16.1
Python Version
3.13.1
Operating System
Mac
Installation
using uv
Description
There seems to be an issue with the zarr + obstore integration that results in a FileNotFoundError when writing data into a newly created, empty array if some of the chunks being written are empty (all zeros).
Curiously, the issue only surfaces when using a GCSStore or a LocalStore; with S3 it seems to work as expected.
import numpy as np
import zarr
from obstore.store import LocalStore
from zarr.storage import ObjectStore
zarr_store = ObjectStore(LocalStore("test_zarr_store")) # issue also comes up with GCSStore
arr = zarr.create_array(zarr_store, name="arr", shape=(5, 128, 128), dtype=np.uint16, chunks=(1, 32, 32))
# this will fail with: FileNotFoundError: Object at location debug/zarr_issue/arr/c/0/1/2 not found
arr[0, :, :] = np.zeros((128, 128), dtype=np.uint16)
Full traceback (FileNotFoundError):
Traceback (most recent call last):
File "/Users/lukasbindreiter/Documents/tilebox/playground/zarr/gcs_issue.py", line 20, in <module>
arr[0, :, :] = np.zeros((128, 128), dtype=np.uint16)
~~~^^^^^^^^^
File "/Users/lukasbindreiter/Library/Caches/uv/environments-v2/repr-6df0040f128f3d90/lib/python3.13/site-packages/zarr/core/array.py", line 2553, in __setitem__
self.set_orthogonal_selection(pure_selection, value, fields=fields)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/lukasbindreiter/Library/Caches/uv/environments-v2/repr-6df0040f128f3d90/lib/python3.13/site-packages/zarr/_compat.py", line 43, in inner_f
return f(*args, **kwargs)
File "/Users/lukasbindreiter/Library/Caches/uv/environments-v2/repr-6df0040f128f3d90/lib/python3.13/site-packages/zarr/core/array.py", line 3009, in set_orthogonal_selection
return sync(
self._async_array._set_selection(indexer, value, fields=fields, prototype=prototype)
)
File "/Users/lukasbindreiter/Library/Caches/uv/environments-v2/repr-6df0040f128f3d90/lib/python3.13/site-packages/zarr/core/sync.py", line 163, in sync
raise return_result
File "/Users/lukasbindreiter/Library/Caches/uv/environments-v2/repr-6df0040f128f3d90/lib/python3.13/site-packages/zarr/core/sync.py", line 119, in _runner
return await coro
^^^^^^^^^^
File "/Users/lukasbindreiter/Library/Caches/uv/environments-v2/repr-6df0040f128f3d90/lib/python3.13/site-packages/zarr/core/array.py", line 1446, in _set_selection
await self.codec_pipeline.write(
...<12 lines>...
)
File "/Users/lukasbindreiter/Library/Caches/uv/environments-v2/repr-6df0040f128f3d90/lib/python3.13/site-packages/zarr/core/codec_pipeline.py", line 481, in write
await concurrent_map(
...<6 lines>...
)
File "/Users/lukasbindreiter/Library/Caches/uv/environments-v2/repr-6df0040f128f3d90/lib/python3.13/site-packages/zarr/core/common.py", line 76, in concurrent_map
return await asyncio.gather(*[asyncio.ensure_future(run(item)) for item in items])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/lukasbindreiter/Library/Caches/uv/environments-v2/repr-6df0040f128f3d90/lib/python3.13/site-packages/zarr/core/common.py", line 74, in run
return await func(*item)
^^^^^^^^^^^^^^^^^
File "/Users/lukasbindreiter/Library/Caches/uv/environments-v2/repr-6df0040f128f3d90/lib/python3.13/site-packages/zarr/core/codec_pipeline.py", line 431, in write_batch
await concurrent_map(
...<8 lines>...
)
File "/Users/lukasbindreiter/Library/Caches/uv/environments-v2/repr-6df0040f128f3d90/lib/python3.13/site-packages/zarr/core/common.py", line 76, in concurrent_map
return await asyncio.gather(*[asyncio.ensure_future(run(item)) for item in items])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/lukasbindreiter/Library/Caches/uv/environments-v2/repr-6df0040f128f3d90/lib/python3.13/site-packages/zarr/core/common.py", line 74, in run
return await func(*item)
^^^^^^^^^^^^^^^^^
File "/Users/lukasbindreiter/Library/Caches/uv/environments-v2/repr-6df0040f128f3d90/lib/python3.13/site-packages/zarr/core/codec_pipeline.py", line 427, in _write_key
await byte_setter.delete()
File "/Users/lukasbindreiter/Library/Caches/uv/environments-v2/repr-6df0040f128f3d90/lib/python3.13/site-packages/zarr/storage/_common.py", line 165, in delete
await self.store.delete(self.path)
File "/Users/lukasbindreiter/Library/Caches/uv/environments-v2/repr-6df0040f128f3d90/lib/python3.13/site-packages/zarr/storage/_obstore.py", line 184, in delete
await obs.delete_async(self.store, key)
FileNotFoundError: Object at location debug/zarr_issue/arr/c/0/1/2 not found: Error performing DELETE https://storage.googleapis.com/workflow%2Dcache%2D15c9850/debug%2Fzarr%5Fissue%2Farr%2Fc%2F0%2F1%2F2 in 72.532333ms - Server returned non-2xx status code: 404 Not Found: <?xml version='1.0' encoding='UTF-8'?><Error><Code>NoSuchKey</Code><Message>The specified key does not exist.</Message><Details>No such object: workflow-cache-15c9850/debug/zarr_issue/arr/c/0/1/2</Details></Error>
Debug source:
NotFound {
path: "debug/zarr_issue/arr/c/0/1/2",
source: RetryError {
method: DELETE,
uri: Some(
https://storage.googleapis.com/workflow%2Dcache%2D15c9850/debug%2Fzarr%5Fissue%2Farr%2Fc%2F0%2F1%2F2,
),
retries: 0,
max_retries: 10,
elapsed: 72.532333ms,
retry_timeout: 180s,
inner: Status {
status: 404,
body: Some(
"<?xml version='1.0' encoding='UTF-8'?><Error><Code>NoSuchKey</Code><Message>The specified key does not exist.</Message><Details>No such object: workflow-cache-15c9850/debug/zarr_issue/arr/c/0/1/2</Details></Error>",
),
},
},
}
The issue only surfaces when one or more of the chunks contain only zeros; if I fill the array with random values, it works:
# fails with FileNotFoundError
arr[0, :, :] = np.zeros((128, 128), dtype=np.uint16)
# works
arr[0, :, :] = np.random.randint(0, 100, size=(128, 128), dtype=np.uint16)
It makes sense to me that zarr would delete chunks here instead of writing chunks that contain only zeros. However, it seems that it should use a delete_if_exists operation instead of a plain delete, right?
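To illustrate what I mean by delete_if_exists, here is a minimal, stdlib-only sketch. It is not the zarr or obstore implementation: StrictStore, delete_if_exists and write_chunk are hypothetical names made up for this example. The assumption is that the write path removes chunks equal to the fill value, and that the backing store raises FileNotFoundError when asked to delete a key that was never written, as in the traceback above.

```python
import asyncio


class StrictStore:
    """Hypothetical in-memory store whose delete() raises on a missing key."""

    def __init__(self) -> None:
        self._data: dict[str, bytes] = {}

    async def set(self, key: str, value: bytes) -> None:
        self._data[key] = value

    async def exists(self, key: str) -> bool:
        return key in self._data

    async def delete(self, key: str) -> None:
        # Strict semantics: deleting a key that does not exist is an error.
        if key not in self._data:
            raise FileNotFoundError(f"Object at location {key} not found")
        del self._data[key]


async def delete_if_exists(store: StrictStore, key: str) -> bool:
    """Delete key and treat "not found" as success.

    Returns True if an object was removed, False if there was nothing to remove.
    """
    try:
        await store.delete(key)
    except FileNotFoundError:
        return False
    return True


async def write_chunk(
    store: StrictStore, key: str, chunk: bytes, fill_byte: int = 0
) -> None:
    """Toy write path: store non-empty chunks, drop chunks equal to the fill value."""
    if all(b == fill_byte for b in chunk):
        # The chunk may never have been written, so the delete must be idempotent.
        await delete_if_exists(store, key)
    else:
        await store.set(key, chunk)


async def demo() -> tuple[bool, bool]:
    store = StrictStore()
    await write_chunk(store, "arr/c/0/1/2", bytes(8))          # all zeros, key never existed
    await write_chunk(store, "arr/c/0/0/0", bytes([1, 2, 3]))  # non-empty chunk
    return await store.exists("arr/c/0/1/2"), await store.exists("arr/c/0/0/0")


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With a plain store.delete(key) in write_chunk, the first call in demo() would raise FileNotFoundError, which matches the failure reported here. Swallowing only the "not found" case keeps other errors (permissions, network) visible.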
Steps to reproduce
# /// script
# requires-python = ">=3.11"
# dependencies = [
# "zarr@git+https://github.com/zarr-developers/zarr-python.git@main",
# "obstore==0.6.0",
# ]
# ///
#
# This script automatically imports the development branch of zarr to check for issues
import shutil
from pathlib import Path
import numpy as np
import zarr
from obstore.store import LocalStore
from zarr.storage import ObjectStore
store_location = Path("test_zarr_store")
if store_location.exists():
    # allow running the script multiple times by deleting the previous run
    shutil.rmtree(store_location)
store_location.mkdir()
zarr_store = ObjectStore(LocalStore(store_location))
arr = zarr.create_array(
    zarr_store, name="arr", shape=(5, 128, 128), dtype=np.uint16, chunks=(1, 32, 32)
)
# this will fail with: FileNotFoundError: Object at location test_zarr_store/arr/c/0/1/0 not found
arr[0, :, :] = np.zeros((128, 128), dtype=np.uint16)
# however, if we don't use zeros but random values it works
# arr[0, :, :] = np.random.randint(0, 100, size=(128, 128), dtype=np.uint16)
Additional output
No response