Contributor

@mr-c mr-c commented Nov 19, 2025

  • Python version testing & support: drop 3.9, add 3.14
  • mass reformat & upgrade to Python 3.10+ syntax
  • sprinkle in some structural pattern matching (new Python 3.10 syntax)

Changelog Entry

To be copied to the draft changelog by merger:

  • PR submitter writes their recommendation for a changelog entry here

Reviewer Checklist

  • Make sure it is coming from issues/XXXX-fix-the-thing in the Toil repo, or from an external repo.
    • If it is coming from an external repo, make sure to pull it in for CI with:
      contrib/admin/test-pr otheruser theirbranchname issues/XXXX-fix-the-thing
      
    • If there is no associated issue, create one.
  • Read through the code changes. Make sure that it doesn't have:
    • Addition of trailing whitespace.
    • New variable or member names in camelCase that want to be in snake_case.
    • New functions without type hints.
    • New functions or classes without informative docstrings.
    • Changes to semantics not reflected in the relevant docstrings.
    • New or changed command line options for Toil workflows that are not reflected in docs/running/{cliOptions,cwl,wdl}.rst
    • New features without tests.
  • Comment on the lines of code where problems exist with a review comment. You can shift-click the line numbers in the diff to select multiple lines.
  • Finish the review with an overall description of your opinion.

Merger Checklist

  • Make sure the PR passed tests, including the GitLab tests, for the most recent commit in its branch.
  • Make sure the PR has been reviewed. If not, review it. If it has been reviewed and any requested changes seem to have been addressed, proceed.
  • Merge with the GitHub "Squash and merge" feature.
    • If there are multiple authors' commits, add Co-authored-by to give credit to all contributing authors.
  • Copy its recommended changelog entry to the Draft Changelog.
  • Append the issue number in parentheses to the changelog entry.

@mr-c mr-c force-pushed the py_drop3.9_add3.14 branch from af670a0 to d092e76 November 20, 2025 13:32
Member

@adamnovak adamnovak left a comment

This looks pretty good, but I'm overall negative on the auto-walrus-ification.

Most of the places where the tool adds a walrus get better, but some get worse. I think that means we can't run the tool in the same code-translation pass we use when upgrading Python versions; we need to treat its output as suggestions.

Some places where it kicks in are probably places that should eventually be refactored so you no longer need the offending variable, though probably not in this PR.

Comment on lines 1314 to 1318
  """
- maxJobDuration = self.config.maxJobDuration
  jobsToKill = []
  if (
-     maxJobDuration < 10000000
- ):  # We won't bother doing anything if rescue time > 16 weeks.
+     maxJobDuration := self.config.maxJobDuration
+ ) < 10000000:  # We won't bother doing anything if rescue time > 16 weeks.
Member

@adamnovak adamnovak Nov 20, 2025

I'm not sure that, just because maxJobDuration happens not to be used outside the conditional, this is clearer code. We don't need a variable for this at all because it's not slow to compute. We're using one to reduce the number of moving parts we need to account for at a time, by grabbing the concept by a short name and keeping self and config out of our minds later.

But when we try to do that in the same statement as the comparison, we end up with one statement that has 7 moving parts instead of two statements with 5 and 3, and 7 moving parts is starting to bump up against working memory, especially if you also have to worry about what's in the body and why.

Plus, the pragmatics of the walrus I think are something along the lines of "I need to generate this thing and work on it", and that's not what's happening here. The condition body isn't about processing maxJobDuration; we're using the condition to bail out early on the processing of something else. Probably what we really want is:

maxJobDuration = self.config.maxJobDuration
if maxJobDuration >= 10000000:
    # We won't bother doing anything if rescue time > 16 weeks.
    return
... other code using maxJobDuration ...

Then we can save a level of indentation. And maybe that could use a walrus (are we meant to use it when we want to define-and-validate for later code?).

It looks like auto-walrus doesn't have any notion of a complexity limit besides line length, and if we're also wrapping our lines properly I'm not sure it can hit a sensible line length limit. So I'm not sure we want to use it and take all its suggestions whenever we bump Python versions.
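
As a rough illustration of the two shapes being compared, here is a self-contained sketch. It is not Toil's actual code: `Config`, `LIMIT`, the `run_times` mapping, and both function names are made up for the example, and only the `maxJobDuration` attribute and the 10000000 threshold come from the snippet under review.

```python
from dataclasses import dataclass


@dataclass
class Config:
    """Hypothetical stand-in for the real config object."""

    maxJobDuration: float


LIMIT = 10_000_000  # seconds; a bit over 16 weeks


def overlong_walrus(config: Config, run_times: dict[str, float]) -> list[str]:
    """Walrus form: fetch, name, and compare in a single statement."""
    jobs_to_kill = []
    if (max_job_duration := config.maxJobDuration) < LIMIT:
        for job_id, seconds in run_times.items():
            if seconds > max_job_duration:
                jobs_to_kill.append(job_id)
    return jobs_to_kill


def overlong_early_return(config: Config, run_times: dict[str, float]) -> list[str]:
    """Early-return form: name the value, bail out, then work unindented."""
    max_job_duration = config.maxJobDuration
    if max_job_duration >= LIMIT:
        # Nothing to do if the rescue time is longer than about 16 weeks.
        return []
    return [
        job_id
        for job_id, seconds in run_times.items()
        if seconds > max_job_duration
    ]
```

Both functions return the same result for the same inputs; the difference is only in how much a reader has to hold in mind at the `if`, and in how deeply the main work is indented.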

Contributor Author

I can drop the auto-walrus changes, if you'd like.

Comment on lines 457 to 458
localID = self.handleLocalJob(command, jobNode)
if localID is not None:
Member

Another place where I'm not sure it makes pragmatic sense to use the walrus is in cases where the assignment isn't meant to populate the variable but is primarily there for its side effects. Here, handleLocalJob is meant to do a bunch of work, and incidentally we need to save its return value so we can tell if it actually did the work, and so we can pass it along.

I'm not sure this makes more sense when the job handling happens as a side effect of the conditional.

The right design here is probably to move the responsibility for assessing whether a job actually is local out of handleLocalJob(), maybe into a separate helper method. Then handleLocalJob()'s return value can just get unconditionally returned in the cases where we need to call it, and we won't need to store any condition predicates to variables at all.
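
A minimal sketch of what that split might look like, under heavy assumptions: the class below is a toy, not Toil's batch system, and `is_local_job`, `handle_local_job`, `issue_remote_job`, `issue_batch_job`, and the `"local:"` prefix rule are all hypothetical names and behavior chosen only to show the shape.

```python
class BatchSystemSketch:
    """Toy stand-in used only to illustrate the proposed refactor."""

    def __init__(self) -> None:
        self._next_id = 0
        self.ran_locally: list[int] = []

    def _new_id(self) -> int:
        self._next_id += 1
        return self._next_id

    def is_local_job(self, command: str) -> bool:
        """Hypothetical predicate: decide locality without doing any work."""
        return command.startswith("local:")

    def handle_local_job(self, command: str) -> int:
        """Always does the work and always returns a job ID (never None)."""
        job_id = self._new_id()
        self.ran_locally.append(job_id)
        return job_id

    def issue_remote_job(self, command: str) -> int:
        """Placeholder for whatever the non-local path does."""
        return self._new_id()

    def issue_batch_job(self, command: str) -> int:
        # The predicate is a pure question; the side effects happen only in
        # the branch that asked for them, so no result needs to be stored
        # just to test it against None.
        if self.is_local_job(command):
            return self.handle_local_job(command)
        return self.issue_remote_job(command)
```

With this shape there is no `localID` variable left for a walrus to absorb, which is the point of the suggestion.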

@mr-c mr-c force-pushed the py_drop3.9_add3.14 branch 3 times, most recently from d0b3640 to c0596ea November 22, 2025 10:53
@mr-c mr-c force-pushed the py_drop3.9_add3.14 branch from c0596ea to e968716 November 22, 2025 12:04
Contributor Author

mr-c commented Nov 22, 2025

@adamnovak Can you build and upload a new version of quay.io/vgteam/dind using Ubuntu Jammy 22.04? The deadsnakes PPA dropped support for Ubuntu Focal 20.04 on October 1st.

PR to update dind: vgteam/dind#1

And then quay.io/ucsc_cgl/toil_ci_prebake can be built & published so we can run the Python 3.14 testing.

# from a checkout of https://github.com/DataBiosphere/toil/tree/py_drop3.9_add3.14
cd contrib/toil-ci-prebake
docker build . -t quay.io/ucsc_cgl/toil_ci_prebake:latest
docker push quay.io/ucsc_cgl/toil_ci_prebake:latest 

Member

@adamnovak adamnovak left a comment

I've updated the CI images and written the CI bypass commit out of this PR.

I think this is good to merge now. We probably wanted some of those walruses, but we don't need them.

@mr-c mr-c marked this pull request as ready for review November 27, 2025 14:18
Contributor Author

mr-c commented Nov 27, 2025

When the tests pass, I'm going to force this as separate commits, so that just the mass reformatting commit ID can be noted in .git-blame-ignore-revs

@adamnovak
Member

Looks like one of the WDL tests failed (src/toil/test/wdl/wdltoil_test.py::TestWDL::test_giraffe). Probably this didn't cause it; it might be a flaky test or it might be that I touched the workflow it is trying to run and somehow broke the Toil test.

@adamnovak
Member

There's also a failing WDL conformance test in src/toil/test/wdl/wdltoil_test.py::TestWDLConformance::test_conformance_tests_development, here and in #5413

84: FAILED: Test that sibling directories are kept in the same directory when downloaded, and siblings of their files are in the right place
Iteration: 1
REASON: 'outputs' section expected 6 results (['wf.result1', 'wf.result2', 'wf.result3', 'wf.result4', 'wf.result5', 'wf.result6']), got 0 instead ([]) with exit code 128

Probably I managed to merge that without it actually working right, somehow, and I need to add that WDL conformance test to the ones to skip because they aren't implemented.
