Metadata comparison fails for NaN fill_values #2929

Open
TomNicholas opened this issue Mar 24, 2025 · 0 comments
Labels
bug Potential issues with the zarr-python library

Zarr version

main

Numcodecs version

n/a

Python Version

3.12

Operating System

linux

Installation

pip editable

Description

Two Metadata objects with identical attributes compare as not equal if both have NaN as their fill_value. The dataclass-generated __eq__ descends through the fields until it reaches a value such as np.float32(nan), and NaN never compares equal to itself:

In [4]: bool(np.float32('nan') == np.float32('nan'))
Out[4]: False

(See https://stackoverflow.com/a/10059796 for why numpy NaNs behave like this.)
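To show that this is a property of the generated __eq__ rather than anything specific to zarr, here is a minimal sketch using only the standard library. ToyMetadata is a hypothetical stand-in (not zarr's ArrayV3Metadata), and plain Python floats stand in for np.float32; the NaN comparison behaves the same way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyMetadata:
    # Hypothetical stand-in for a metadata dataclass; not zarr's ArrayV3Metadata.
    shape: tuple[int, ...]
    fill_value: float


# Two separately constructed NaN fill values, as in the report.
m1 = ToyMetadata(shape=(2,), fill_value=float("nan"))
m2 = ToyMetadata(shape=(2,), fill_value=float("nan"))

# NaN is never equal to NaN, so the field comparison fails...
nan_fields_equal = m1.fill_value == m2.fill_value
# ...and the auto-generated __eq__ inherits that result.
nan_metadata_equal = m1 == m2
# With an ordinary fill value the generated __eq__ works as expected.
zero_metadata_equal = ToyMetadata((2,), 0.0) == ToyMetadata((2,), 0.0)

print(nan_fields_equal, nan_metadata_equal, zero_metadata_equal)
```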

The fix is to compare two Metadata objects with dedicated code that treats NaN fill values as equal, rather than relying on the __eq__ method that Python's dataclasses generate automatically.
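One possible shape for such a dedicated comparison is sketched below. This is an illustration under stated assumptions, not the fix adopted in zarr-python: NanAwareToyMetadata and values_equal are hypothetical names, and the helper only handles built-in Python floats. A real implementation would also have to cover NumPy scalars such as np.float32 (which is not a subclass of float), complex values, and nested fields.

```python
import math
from dataclasses import dataclass, fields
from typing import Any


def values_equal(a: Any, b: Any) -> bool:
    """Hypothetical helper: like ==, but two float NaNs count as equal."""
    if isinstance(a, float) and isinstance(b, float):
        if math.isnan(a) and math.isnan(b):
            return True
    return bool(a == b)


@dataclass(eq=False)  # eq=False: do not generate __eq__, we supply our own
class NanAwareToyMetadata:
    # Hypothetical stand-in for a metadata dataclass; not zarr's ArrayV3Metadata.
    shape: tuple[int, ...]
    fill_value: float

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, NanAwareToyMetadata):
            return NotImplemented
        # Compare field by field with the NaN-aware helper.
        return all(
            values_equal(getattr(self, f.name), getattr(other, f.name))
            for f in fields(self)
        )


a = NanAwareToyMetadata(shape=(2,), fill_value=float("nan"))
b = NanAwareToyMetadata(shape=(2,), fill_value=float("nan"))
print(a == b)
```

Comparing field by field keeps the check in one place, so any other value with unusual equality semantics could be handled in the same helper.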

xref zarr-developers/VirtualiZarr#501

Steps to reproduce

In [12]: metadata1 = ArrayV3Metadata(
    ...:     shape=(2,),
    ...:     data_type=np.float32,
    ...:     chunk_grid={
    ...:         "name": "regular",
    ...:         "configuration": {"chunk_shape": (2,)},
    ...:     },
    ...:     chunk_key_encoding={"name": "default"},
    ...:     fill_value=np.float32('nan'),
    ...:     codecs=({'name': 'bytes', 'configuration': {'endian': 'little'}},),
    ...:     attributes={},
    ...:     dimension_names=None,
    ...:     storage_transformers=None,
    ...: )

In [13]: metadata2 = ArrayV3Metadata(
    ...:     shape=(2,),
    ...:     data_type=np.float32,
    ...:     chunk_grid={
    ...:         "name": "regular",
    ...:         "configuration": {"chunk_shape": (2,)},
    ...:     },
    ...:     chunk_key_encoding={"name": "default"},
    ...:     fill_value=np.float32('nan'),
    ...:     codecs=({'name': 'bytes', 'configuration': {'endian': 'little'}},),
    ...:     attributes={},
    ...:     dimension_names=None,
    ...:     storage_transformers=None,
    ...: )

In [14]: bool(metadata1 == metadata2)
Out[14]: False

Additional output

No response
