Metadata comparison fails for NaN fill_values #2929

Open
@TomNicholas

Zarr version

main

Numcodecs version

n/a

Python Version

3.12

Operating System

linux

Installation

pip editable

Description

Two Metadata objects with identical attributes compare as not equal if both have NaN as their fill_value. This happens because the __eq__ check recurses into the fields until it reaches e.g. the np.float32(nan) value, and NaN never compares equal to itself:

In [4]: bool(np.float32('nan') == np.float32('nan'))
Out[4]: False

(See https://stackoverflow.com/a/10059796 for why numpy NaNs behave like this.)
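
To make the NaN behaviour concrete, here is a minimal sketch of a NaN-aware scalar comparison. The helper name `scalars_equal` is hypothetical and not part of zarr or numpy; it only wraps `np.array_equal(..., equal_nan=True)` and is meant for numeric values.

```python
import numpy as np


def scalars_equal(a, b):
    """Compare two numeric scalars, treating NaN as equal to NaN.

    Hypothetical helper for illustration only; not a zarr or numpy API.
    """
    return bool(np.array_equal(a, b, equal_nan=True))


nan_a = np.float32("nan")
nan_b = np.float32("nan")

# Plain == follows IEEE 754: NaN is not equal to anything, itself included.
plain_result = bool(nan_a == nan_b)

# The NaN-aware comparison treats the two NaNs as matching.
nan_aware_result = scalars_equal(nan_a, nan_b)
```

Here `plain_result` is the comparison that trips up the generated `__eq__`, while `nan_aware_result` is the behaviour a metadata comparison presumably wants for fill values.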

The fix is to compare two Metadata objects with dedicated __eq__ code rather than trusting the __eq__ method that Python's dataclasses generate automatically to handle NaN correctly.
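
One possible shape for such a dedicated comparison is sketched below on a toy dataclass. `ToyMetadata`, `PlainMetadata`, `_is_nan` and `_values_equal` are hypothetical names used only for illustration; this is not zarr's implementation, and it only handles real floating-point NaNs at the top level of each field.

```python
from dataclasses import dataclass, fields

import numpy as np


def _is_nan(x):
    # Only real floating-point scalars are checked (assumption for this sketch).
    return isinstance(x, (float, np.floating)) and bool(np.isnan(x))


def _values_equal(a, b):
    # Two NaNs count as equal; everything else uses ordinary ==.
    if _is_nan(a) and _is_nan(b):
        return True
    return bool(a == b)


@dataclass
class PlainMetadata:
    # Relies on the auto-generated __eq__, so distinct NaN objects differ.
    shape: tuple
    fill_value: object


@dataclass(eq=False)
class ToyMetadata:
    # eq=False stops dataclasses from generating __eq__; we supply our own.
    shape: tuple
    fill_value: object

    def __eq__(self, other):
        if not isinstance(other, ToyMetadata):
            return NotImplemented
        return all(
            _values_equal(getattr(self, f.name), getattr(other, f.name))
            for f in fields(self)
        )


plain_equal = PlainMetadata((2,), np.float32("nan")) == PlainMetadata((2,), np.float32("nan"))
toy_equal = ToyMetadata((2,), np.float32("nan")) == ToyMetadata((2,), np.float32("nan"))
```

Note that defining `__eq__` without `__hash__` makes `ToyMetadata` unhashable, which a real fix would need to consider if the metadata classes are expected to be hashable.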

xref zarr-developers/VirtualiZarr#501

Steps to reproduce

(The session below assumes numpy has been imported as np and ArrayV3Metadata has been imported from zarr, e.g. from zarr.core.metadata.v3.)

In [12]: metadata1 = ArrayV3Metadata(
    ...:     shape=(2,),
    ...:     data_type=np.float32,
    ...:     chunk_grid={
    ...:         "name": "regular",
    ...:         "configuration": {"chunk_shape": (2,)},
    ...:     },
    ...:     chunk_key_encoding={"name": "default"},
    ...:     fill_value=np.float32('nan'),
    ...:     codecs=({'name': 'bytes', 'configuration': {'endian': 'little'}},),
    ...:     attributes={},
    ...:     dimension_names=None,
    ...:     storage_transformers=None,
    ...: )

In [13]: metadata2 = ArrayV3Metadata(
    ...:     shape=(2,),
    ...:     data_type=np.float32,
    ...:     chunk_grid={
    ...:         "name": "regular",
    ...:         "configuration": {"chunk_shape": (2,)},
    ...:     },
    ...:     chunk_key_encoding={"name": "default"},
    ...:     fill_value=np.float32('nan'),
    ...:     codecs=({'name': 'bytes', 'configuration': {'endian': 'little'}},),
    ...:     attributes={},
    ...:     dimension_names=None,
    ...:     storage_transformers=None,
    ...: )

In [14]: bool(metadata1 == metadata2)
Out[14]: False

Additional output

No response

Labels

bug (Potential issues with the zarr-python library)
