Fix CI test failures #393
The CI fails when trying to download Mambaforge from https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-Linux-x86_64.sh. The error might be related to conda-incubator/setup-miniconda#392.
Okay, switching from Mambaforge to Miniforge solved the issue.
Thanks Marvin! 🙏
Had a couple comments below
  if isinstance(df1, Delayed):
-     df1 = dd.from_delayed(df1, meta=meta)
+     with dask_config.set({'dataframe.convert-string': False}):
+         df1 = dd.from_delayed(df1, meta=meta)
  if isinstance(df2, Delayed):
-     df2 = dd.from_delayed(df2, meta=meta)
+     with dask_config.set({'dataframe.convert-string': False}):
+         df2 = dd.from_delayed(df2, meta=meta)
Is this enough given compute happens later?
Do we have a way to test this?
Is this enough given compute happens later?
Yes, consider this code snippet:
import dask.dataframe as dd
import dask.array as da
from dask import delayed
import pandas as pd
import dask.config as dask_config
x = dd.from_delayed(delayed(pd.Series([slice(2)])))
with dask_config.set({'dataframe.convert-string': False}):
    x_cm = dd.from_delayed(delayed(pd.Series([slice(2)])))
print('x', x)
print('x_cm', x_cm)
print('x computed', x.compute())
print('x_cm computed', x_cm.compute())
outputs
x Dask Series Structure:
npartitions=1
string
...
Dask Name: to_string_dtype, 3 expressions
Expr=ArrowStringConversion(frame=FromDelayed(5621810))
x_cm Dask Series Structure:
npartitions=1
object
...
Dask Name: fromdelayed, 2 expressions
Expr=FromDelayed(79a507f)
x computed 0 slice(None, 2, None)
dtype: string
x_cm computed 0 slice(None, 2, None)
dtype: object
Also, the tests do perform a compute and fail without the context manager in place, so it should be covered:
dask-image/tests/test_dask_image/test_ndmeasure/test_find_objects.py
Lines 61 to 67 in cb1360a
computed_result = result.compute()
assert isinstance(computed_result, pd.DataFrame)
expected = pd.DataFrame.from_dict(
    {0: {111: slice(1, 3), 222: slice(3, 4), 333: slice(0, 2)},
     1: {111: slice(0, 2), 222: slice(3, 8), 333: slice(7, 10)}}
)
assert computed_result.equals(expected)
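For reference, a minimal sketch of how the context manager scopes the setting. It only uses dask.config (no dataframe is built), and the variable names are made up for illustration. The override is active only inside the with block, which is why the from_delayed call has to sit inside it, while the later compute does not.

```python
import dask.config as dask_config

KEY = "dataframe.convert-string"

# Value before entering the block (may be None if nothing was configured).
before = dask_config.get(KEY, default=None)

with dask_config.set({KEY: False}):
    # The override is visible only while the block is active.
    inside = dask_config.get(KEY, default=None)

# On exit, the previous value is restored.
after = dask_config.get(KEY, default=None)

print("before:", before)
print("inside:", inside)
print("after:", after)
```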
Co-authored-by: jakirkham <[email protected]>
Thanks for the review @jakirkham! I addressed your points.
Just a clarification point, they became the same installer. IOW Mambaforge and Miniforge include all the same things. Hence deprecating to clean up the namespace.
Thanks Marvin! 🙏 LGTM. Let's see what Genevieve thinks 🙂
Sorry for the slightly complicated PR here! It's a bit annoying that several issues are addressed at the same time, but individually they make the CI fail, so I thought I'd address them together. Happy to split out individual points if you think that makes things easier @jakirkham @GenevieveBuckley. It'll be good to get back to green CI one way or another, also with regard to #384.
Let's go ahead and merge. The changes here look good and having working CI would be a great help for other work. We can follow up on any feedback in a new PR.
With this PR I started out just wanting to ensure compatibility with numpy 2 as reported in #392, however different issues came up in the tests (and if unaddressed they make the CI fail), so I decided to address them together in the same PR.
Summary:
- Use np.ptp instead of the method to ensure compatibility with numpy 2, as reported in "NumPy 2 removes ptp method (use function instead)" #392.
- Replace the deprecated tifffile.TiffWriter.save by tifffile.TiffWriter.write, see here.
- dask-image currently doesn't pass tests with dask>=2025.1.0 because of "New Dask Arrow-based strings cause test failures" #335, which came back because of changes in dask.dataframe (more details here). This PR applies the proposed workaround. An alternative would be to pin dask<2025.1.0, but that is probably a very limiting option.
- (edit) Stop using Mambaforge in the CI. According to https://github.com/conda-forge/miniforge, Mambaforge has been deprecated since 2024 and its installers were retired in 2025.
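To illustrate the first point, here is a small sketch of the function form with a made-up array (not code from dask-image). np.ptp works on both NumPy 1.x and 2.x, whereas the ndarray method was removed in NumPy 2.

```python
import numpy as np

# Made-up example data, purely for illustration.
arr = np.array([[1, 5, 2], [8, 3, 7]])

# Function form: peak-to-peak (max - min) over the whole array.
value_range = np.ptp(arr)

# The function also accepts an axis, like the old method did.
row_ranges = np.ptp(arr, axis=1)

print(value_range)
print(row_ranges.tolist())
```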
@jakirkham @GenevieveBuckley