Allow concatenation of string coordinates of differing width (#6676) #7125

Open
gaoflow wants to merge 1 commit into SciTools:main from gaoflow:fix/concat-string-coord-widths


gaoflow commented Jun 1, 2026

🚀 Pull Request

Description

Closes #6676.

Two cubes whose string coordinate differs only in width (and hence in dtype, e.g. <U1 vs <U5) could not be concatenated:

import iris.coords
import iris.cube

# First cube: every string point is one character wide (dtype <U1).
cube_a = iris.cube.Cube(
    [0, 1], long_name="test",
    dim_coords_and_dims=[(iris.coords.DimCoord([0, 1], long_name="dim"), 0)],
    aux_coords_and_dims=[(iris.coords.AuxCoord(["1", "2"], long_name="example"), 0)],
)
# Second cube: the longest string point is five characters wide (dtype <U5).
cube_b = iris.cube.Cube(
    [10, 12, 13], long_name="test",
    dim_coords_and_dims=[(iris.coords.DimCoord([10, 11, 12], long_name="dim"), 0)],
    aux_coords_and_dims=[(iris.coords.AuxCoord(["1", "123", "12345"], long_name="example"), 0)],
)
# Before this change, the call below raises ConcatenateError:
iris.cube.CubeList([cube_a, cube_b]).concatenate_cube()

ConcatenateError: failed to concatenate into a single cube.
  Auxiliary coordinates metadata differ: example != example

Cause

When building the coordinate signature, _CoordMetaData records each coordinate's points_dtype/bounds_dtype, and __eq__ compares them exactly. For string coordinates, <U1 and <U5 are different dtypes, so the metadata was reported as differing even though the coordinates are otherwise compatible.
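
The dtype mismatch itself can be reproduced with numpy alone. The following is a minimal sketch that does not touch Iris internals; the variable names are illustrative and are not taken from the Iris source:

```python
import numpy as np

# numpy sizes a unicode array to its longest element, so these two
# arrays end up with different fixed-width string dtypes.
narrow = np.array(["1", "2"])
wide = np.array(["1", "123", "12345"])

# An exact dtype comparison treats the two arrays as incompatible...
dtypes_equal = narrow.dtype == wide.dtype

# ...even though both dtypes share the same kind ("U" for unicode).
kinds_equal = narrow.dtype.kind == wide.dtype.kind

print(narrow.dtype, wide.dtype, dtypes_equal, kinds_equal)
```

Here `dtypes_equal` is false while `kinds_equal` is true, which is the distinction the exact comparison in the coordinate signature missed.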

Fix

When comparing coordinate signatures, collapse string dtypes (kind "U"/"S") to their kind, so that differing widths no longer block concatenation. numpy promotes the points to a common width when they are joined, so the result coordinate gets the wider dtype (<U5 above). Genuine dtype-kind differences (e.g. string vs integer) are unaffected and still rejected.
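
The idea can be sketched outside Iris as follows. `comparable_dtype` is a hypothetical helper written for illustration only; it is not the function added by this PR and the actual change inside `_CoordMetaData` may be structured differently:

```python
import numpy as np


def comparable_dtype(dtype):
    """Hypothetical helper: return a key used to compare dtypes.

    String dtypes (kind "U" or "S") collapse to their kind, so width is
    ignored. Every other dtype keeps its full type string.
    """
    dtype = np.dtype(dtype)
    if dtype.kind in ("U", "S"):
        return dtype.kind
    return dtype.str


# Differing string widths now compare equal.
same_kind = comparable_dtype("<U1") == comparable_dtype("<U5")

# A genuine kind difference (string vs integer) still compares unequal.
str_vs_int = comparable_dtype("<U1") == comparable_dtype("int64")

# numpy promotes to the wider string dtype when the points are joined.
joined = np.concatenate([np.array(["1", "2"]), np.array(["1", "123", "12345"])])
print(same_kind, str_vs_int, joined.dtype)
```

In this sketch unicode ("U") and bytes ("S") dtypes still compare unequal to each other, since only the width is discarded.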

Verification

  • New TestStringAuxCoordWidths:
    • test_different_widths — the two cubes above now concatenate; the result coordinate has dtype <U5 and the expected points. Fails on main, passes here.
    • test_different_dtype_kind_still_rejected — string-vs-integer coordinates still raise ConcatenateError.
  • Full tests/unit/concatenate/, tests/integration/concatenate/ and tests/test_concatenate.py suites pass (283 + 54 tests).
  • ruff check / ruff format clean.

Concatenation compared coordinate dtypes exactly, so two cubes whose
string coordinate differed only in width (e.g. '<U1' vs '<U5') were
reported as having differing metadata and could not be concatenated.
When comparing the coordinate signatures, collapse string dtypes to
their kind so differing widths no longer block concatenation; numpy
promotes the points to a common width when they are joined. Genuine
dtype-kind differences (e.g. string vs integer) are still rejected.

Fixes SciTools#6676.

Development

Successfully merging this pull request may close these issues.

Concatenation doesn't support string aux-coords with different widths