Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix CI test failures #393

Merged
merged 5 commits into from
Mar 22, 2025
Merged

Fix CI test failures #393

merged 5 commits into from
Mar 22, 2025

Conversation

m-albert
Copy link
Collaborator

@m-albert m-albert commented Feb 24, 2025

With this PR I started out just wanting to ensure compatibility with numpy 2 as reported in #392, however different issues came up in the tests (and if unaddressed they make the CI fail), so I decided to address them together in the same PR.

Summary:

  1. Use np.ptp instead of the method to ensure compatibility with numpy 2 as reported in NumPy 2 removes ptp method (use function instead) #392.

  2. Replace the deprecated tifffile.TiffWriter.save by tifffile.TiffWriter.save, see here.

  3. dask-image currently doesn't pass tests with dask>=2025.1.0 because of New Dask Arrow-based strings cause test failures #335 which came back because of changes in dask.dataframe (more details here). This PR applies the proposed workaround. An alternative would be to pin dask<2025.1.0, but probably a very limiting option.

  4. (edit) Stop using Mambaforge in the CI. According to https://github.com/conda-forge/miniforge mambaforge is deprecated since 2024 and installers have been retired in 2025.

@jakirkham @GenevieveBuckley

@m-albert
Copy link
Collaborator Author

The CI fails in trying to get mamba from https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-Linux-x86_64.sh.

The error might be related to conda-incubator/setup-miniconda#392.

@m-albert
Copy link
Collaborator Author

Okay switching from Mambaforge to Miniforge solved the issue.

Copy link
Member

@jakirkham jakirkham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Marvin! 🙏

Had a couple comments below

Comment on lines 70 to +75
if isinstance(df1, Delayed):
df1 = dd.from_delayed(df1, meta=meta)
with dask_config.set({'dataframe.convert-string': False}):
df1 = dd.from_delayed(df1, meta=meta)
if isinstance(df2, Delayed):
df2 = dd.from_delayed(df2, meta=meta)
with dask_config.set({'dataframe.convert-string': False}):
df2 = dd.from_delayed(df2, meta=meta)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this enough given compute happens later?

Do we have a way to test this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this enough given compute happens later?

Yes, consider this code snippet:

import dask.dataframe as dd
import dask.array as da
from dask import delayed
import pandas as pd
import dask.config as dask_config

x = dd.from_delayed(delayed(pd.Series([slice(2)])))

with dask_config.set({'dataframe.convert-string': False}):
    x_cm = dd.from_delayed(delayed(pd.Series([slice(2)])))

print('x', x)
print('x_cm', x_cm)

print('x computed', x.compute())
print('x_cm computed', x_cm.compute())

outputs

x Dask Series Structure:
npartitions=1
    string
       ...
Dask Name: to_string_dtype, 3 expressions
Expr=ArrowStringConversion(frame=FromDelayed(5621810))
x_cm Dask Series Structure:
npartitions=1
    object
       ...
Dask Name: fromdelayed, 2 expressions
Expr=FromDelayed(79a507f)
x computed 0    slice(None, 2, None)
dtype: string
x_cm computed 0    slice(None, 2, None)
dtype: object

Also he tests do perform a compute and fail without the context manager in place, so it should be covered:

computed_result = result.compute()
assert isinstance(computed_result, pd.DataFrame)
expected = pd.DataFrame.from_dict(
{0: {111: slice(1, 3), 222: slice(3, 4), 333: slice(0, 2)},
1: {111: slice(0, 2), 222: slice(3, 8), 333: slice(7, 10)}}
)
assert computed_result.equals(expected)

@m-albert
Copy link
Collaborator Author

Thanks for the review @jakirkham! I addressed your points.

@jakirkham
Copy link
Member

(edit) Stop using Mambaforge in the CI. According to https://github.com/conda-forge/miniforge mambaforge is deprecated since 2024 and installers have been retired in 2025

Just a clarification point, they became the same installer. IOW Mambaforge and Miniforge include all the same things. Hence deprecating to clean up the namespace

@jakirkham
Copy link
Member

Thanks Marvin! 🙏

LGTM. Let's see what Genevieve thinks 🙂

@m-albert
Copy link
Collaborator Author

Sorry for the slightly complicated PR here! It's a bit annoying that several issues are addressed at the same time, but individually they make the CI fail so I thought I'd address them together.

Happy to split out individual points if you think that makes things easier @jakirkham @GenevieveBuckley.

It'll be good to get back to green CI one way or another, also with regard to #384.

@jakirkham
Copy link
Member

Let's go ahead and merge. The changes here look good and having working CI would be a great help for other work

We can follow up on any feedback in a new PR

@jakirkham jakirkham merged commit d528071 into dask:main Mar 22, 2025
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants