toil-cwl-runner complains Docker is not installed when it is installed but dockerImageId is not set #5195

Open
peterg1t opened this issue Jan 4, 2025 · 4 comments

@peterg1t

peterg1t commented Jan 4, 2025

I installed toil[cwl] on a new ARM Mac laptop. I have a simple job that zips a file using a container image that includes a zip utility.

cwltool --debug workflows/pipeline-simple_workflow.cwl workflows/inputs.yml

However, I ran into the following error:

Traceback (most recent call last):
  File "/Users/pemartin/Scripts/CWL-training/src/env/lib/python3.12/site-packages/cwltool/job.py", line 759, in run
    self.get_from_requirements(
  File "/Users/pemartin/Scripts/CWL-training/src/env/lib/python3.12/site-packages/cwltool/docker.py", line 208, in get_from_requirements
    if self.get_image(cast(dict[str, str], r), pull_image, force_pull, tmp_outdir_prefix):
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/pemartin/Scripts/CWL-training/src/env/lib/python3.12/site-packages/cwltool/docker.py", line 114, in get_image
    if docker_requirement["dockerImageId"] in _IMAGES:
       ~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "/Users/pemartin/Scripts/CWL-training/src/env/lib/python3.12/site-packages/ruamel/yaml/comments.py", line 853, in __getitem__
    return ordereddict.__getitem__(self, key)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyError: 'dockerImageId'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/pemartin/Scripts/CWL-training/src/env/lib/python3.12/site-packages/cwltool/main.py", line 1313, in main
    (out, status) = real_executor(
                    ^^^^^^^^^^^^^^
  File "/Users/pemartin/Scripts/CWL-training/src/env/lib/python3.12/site-packages/cwltool/executors.py", line 53, in __call__
    return self.execute(process, job_order_object, runtime_context, logger)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/pemartin/Scripts/CWL-training/src/env/lib/python3.12/site-packages/cwltool/executors.py", line 136, in execute
    self.run_jobs(process, job_order_object, logger, runtime_context)
  File "/Users/pemartin/Scripts/CWL-training/src/env/lib/python3.12/site-packages/cwltool/executors.py", line 246, in run_jobs
    job.run(runtime_context)
  File "/Users/pemartin/Scripts/CWL-training/src/env/lib/python3.12/site-packages/cwltool/job.py", line 800, in run
    raise UnsupportedRequirement(
cwltool.errors.UnsupportedRequirement: Docker is required to run this tool: 'dockerImageId'
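
For readers puzzling over the final message: the trailing 'dockerImageId' is simply the text of the original KeyError, appended to a generic "Docker is required" message. The sketch below reproduces that pattern in isolation. It is not cwltool's actual code; the class and functions are hypothetical stand-ins, and the shape of the handler is an assumption inferred from the traceback.

```python
class UnsupportedRequirement(Exception):
    """Hypothetical stand-in for cwltool.errors.UnsupportedRequirement."""


def get_image(docker_requirement: dict) -> bool:
    # Direct indexing, as in the traceback: raises KeyError when the
    # 'dockerImageId' key is absent from the requirement.
    return docker_requirement["dockerImageId"] in {"known-image"}


def run(docker_requirement: dict) -> None:
    try:
        get_image(docker_requirement)
    except Exception as err:
        # Assumed handler shape: every failure is reported as a Docker
        # problem, with the text of the underlying error appended.
        raise UnsupportedRequirement(
            f"Docker is required to run this tool: {err}"
        ) from err


def demo() -> str:
    """Return the message produced for a requirement with only dockerLoad."""
    try:
        run({"dockerLoad": "/path/to/image.tar"})  # placeholder path
    except UnsupportedRequirement as exc:
        return str(exc)
    return ""
```

Calling demo() yields the same wording as the last line of the traceback, even though Docker is never consulted in the sketch.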

I have Docker installed on the laptop, along with the following tool versions:
cwltool 3.1.20240112164112
toil-cwl-runner --version 6.1.0

I wonder what I might be doing wrong in my configuration. Docker is available on the system, and I can run docker load -i image without any issues.

Thank you very much in advance for your help.

Pedro

┆Issue is synchronized with this Jira Story
┆Issue Number: TOIL-1694

@stxue1
Contributor

stxue1 commented Jan 15, 2025

Hey @peterg1t, thanks for the issue!

At a glance, it seems as if dockerImageId is missing from the CWL, though I'm not sure why the behavior of cwltool and toil-cwl-runner would differ. Do you happen to have a CWL example that reproduces this issue?

Additionally, does using Toil 7.0.0 or installing from source help at all? It has been a while since 6.1.0 was released.

@peterg1t
Author

Hi @stxue1
I'm using 6.1.0 because of issue 5174. These are the files that produced the error:

cwlVersion: v1.2
class: Workflow

# requirements:  # none here but here we add plugins for more advanced features
requirements:
  ScatterFeatureRequirement: {}
  #SubworkflowFeatureRequirement: {}
  #StepInputExpressionRequirement: {}
  #InlineJavascriptRequirement: {}


inputs:
  input_file: File

steps:
  zip_file:
    run: tasks/zip_file.cwl
    in:
      input_unzipped: input_file
    out: [output_zipped]

outputs:
  output_zipped:
    type: File
    outputSource: zip_file/output_zipped

#!/usr/bin/env cwl-runner
cwlVersion: v1.2
class: CommandLineTool

baseCommand:
  # - bgzip
  - gzip
  - -c

requirements:
  InlineJavascriptRequirement: {}
  DockerRequirement:
    dockerLoad: '/Users/pemartin/Projects/BIOINFORMATICS_CWL/bioinformatics-tools/perl-local.tar'
  InitialWorkDirRequirement:
    listing:
      - $(inputs.input_unzipped)

inputs:
  input_unzipped:
    type: File
    inputBinding:
      position: 1
      prefix: -f

outputs:
  output_zipped:
    type: stdout

stdout: $(inputs.input_unzipped.nameroot).gz

Please feel free to point out if I made any mistakes.

Best,

Pedro

@adamnovak
Member

I think your CWL task is slightly wrong according to the DockerRequirement spec. It says the optional dockerImageId field:

May be skipped if dockerPull is specified, in which case the dockerPull image id must be used.

To me, that indicates that if you don't set dockerPull, you must set dockerImageId.

I think the error handling here interprets anything that goes wrong while trying to start the container as Docker not being installed, which is wrong. But it is also cwltool error handling code that we imported. Maybe we can add a Toil-level check that dockerImageId is set, so that the error message is accurate.
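
A minimal sketch of what such a check could look like, assuming the reading of the spec given above (dockerImageId is required unless dockerPull is set). The function name and error wording are hypothetical; nothing here is existing Toil or cwltool API.

```python
def check_docker_requirement(req: dict) -> None:
    """Raise a descriptive error when the image cannot be identified.

    Hypothetical helper: assumes dockerImageId may only be omitted
    when dockerPull is present.
    """
    if "dockerImageId" in req or "dockerPull" in req:
        return
    # List the docker* fields that were given, to help the user see
    # what is missing (for example, only dockerLoad).
    present = ", ".join(sorted(k for k in req if k.startswith("docker"))) or "none"
    raise ValueError(
        "DockerRequirement needs dockerImageId unless dockerPull is given "
        f"(fields present: {present})"
    )
```

Run before the container is started, a check like this would report the missing field directly instead of suggesting that Docker is not installed.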

I'm not sure why cwltool puts up with this form. Maybe it helpfully fills in dockerImageId automatically when it does the load. I think we run a separate script over the workflow to pre-load all the containers, so by the time we go to run the tool we no longer know what name the tar file was loaded under.
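
One way the loaded name could be recovered, sketched under assumptions: docker load normally prints lines such as "Loaded image: name:tag" or "Loaded image ID: sha256:..." for each image in the archive, so a pre-load step could capture that output and parse it. The helper name is hypothetical, and whether Toil's pre-load script can capture this output is not something this thread establishes.

```python
import re

# Matches the per-image lines typically printed by `docker load`.
_LOADED = re.compile(r"^Loaded image(?: ID)?: (\S+)$", re.MULTILINE)


def loaded_image_names(docker_load_output: str) -> list[str]:
    """Return the image names or IDs reported in `docker load` output."""
    return _LOADED.findall(docker_load_output)
```

The first returned name is a candidate value for dockerImageId in the tool's DockerRequirement.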

@mr-c am I reading the spec right here?

@adamnovak changed the title from "Docker not found but it is present in the system." to "toil-cwl-runner complains Docker is not installed when it is installed but dockerImageId is not set" on Jan 15, 2025

@peterg1t
Author

Hi all,
In CWL, dockerLoad is used to load an image with docker load. In my case I have a tar file, and the path to that Docker image file is what I pass to dockerLoad. Can dockerLoad and dockerImageId be used interchangeably? Why would dockerLoad fail if Docker is present on the system and the path is correct? Is dockerLoad deprecated in favour of dockerImageId?

Best,

Pedro
