custom script - parallelrunstep not working #118

michalmar · 2021-02-11T12:32:53Z

When I try to run the example Custom_Script/02_CustomScript_Training_Pipeline.ipynb, I cannot create a ParallelRunConfig:

parallel_run_config = ParallelRunConfig(
    source_directory='./scripts',
    entry_script='train.py',
    mini_batch_size="1",
    run_invocation_timeout=timeout,
    error_threshold=10,
    output_action="append_row",
    environment=train_env,
    process_count_per_node=processes_per_node,
    compute_target=compute,
    node_count=node_count)

It gives this error:

in /anaconda/envs/azureml_py36/lib/python3.6/site-packages/azureml/pipeline/steps/parallel_run_config.py
...
TypeError: __init__() got an unexpected keyword argument 'allowed_failed_count'

I have updated to the latest SDK (pipeline packages):

azureml-pipeline-core==1.22.0
azureml-pipeline-steps==1.22.0

When I downgrade to 1.20.0, it works:

azureml-pipeline-core==1.20.0
azureml-pipeline-steps==1.20.0

So the workaround is:

!pip install --upgrade azureml-pipeline-steps==1.20.0


dkmiller · 2021-02-11T16:43:20Z

The official docs for ParallelRunConfig still show that keyword argument: azureml.pipeline.steps.ParallelRunConfig.

I wonder if you're using the stale one — azureml.contrib.pipeline.steps.parallel_run_config.ParallelRunConfig?

michalmar · 2021-02-11T20:40:35Z

@dkmiller how can I check whether I am using the stale one?

dkmiller · 2021-02-11T20:57:51Z

Look in your script to see where you are importing ParallelRunConfig from.

Also, I suggest you pull the latest version of this repo.
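
Besides reading the import line, you can ask Python directly where the class comes from and which azureml packages are installed in the kernel. The following is only a sketch: the helper names `describe_origin` and `installed_azureml_packages` are made up for illustration, and `importlib.metadata` assumes Python 3.8 or later (it is not available in a Python 3.6 environment like the one in the traceback).

```python
import inspect
from importlib import metadata  # standard library from Python 3.8 onward


def describe_origin(cls):
    """Return the module name, source file, and __init__ parameter names of a class."""
    module = inspect.getmodule(cls)
    return {
        "module": cls.__module__,
        "file": getattr(module, "__file__", None),
        "init_params": list(inspect.signature(cls.__init__).parameters),
    }


def installed_azureml_packages():
    """Map each installed distribution whose name starts with 'azureml' to its version."""
    found = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name") if dist.metadata else None
        if name and name.lower().startswith("azureml"):
            found[name] = dist.version
    return found


try:
    from azureml.pipeline.steps import ParallelRunConfig
except ImportError:
    print("azureml.pipeline.steps is not importable from this kernel")
else:
    origin = describe_origin(ParallelRunConfig)
    print("module:", origin["module"])
    print("file:", origin["file"])
    print("accepts allowed_failed_count:", "allowed_failed_count" in origin["init_params"])

# Mixed versions across azureml-* packages are worth ruling out.
for name, version in sorted(installed_azureml_packages().items()):
    print(name, version)
```

If the printed module or file path is not the one you expect, or the azureml-* packages report different versions, that is a hint the environment is inconsistent.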

michalmar · 2021-02-12T12:08:40Z

I am using the official one, not the stale one:
from azureml.pipeline.steps import ParallelRunConfig

I cloned the repo a couple of days ago, so I am not sure where the problem comes from.

dkmiller · 2021-02-12T16:51:59Z

I could not reproduce this. Try this "clean" Dockerfile:

FROM python:3.8

RUN pip install azureml-pipeline-steps==1.22.0 azureml-pipeline-core==1.22.0

RUN python -c "from azureml.pipeline.steps import ParallelRunConfig; cfg = ParallelRunConfig(allowed_failed_count=1,entry_script='hi.py',environment='foo',error_threshold=1,output_action='append_row',compute_target='cluster',node_count=1)"

Docker build fails with:

ValueError: Parameter environment must be an instance of azureml.core.Environment. The actual value is foo.

which means that there is no problem with the keyword allowed_failed_count. I'd suggest re-creating your Python environment.
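
The reasoning relies on how Python handles calls: keyword names are matched against the function signature before any code in the body runs, so an unknown keyword raises TypeError before validation inside the body can raise ValueError. Here is a toy illustration using a hypothetical `ToyConfig` class as a stand-in (it is not the SDK's code, and its validation rule is invented):

```python
class ToyConfig:
    """Hypothetical stand-in for a config class; not the azureml implementation."""

    def __init__(self, environment, allowed_failed_count=None):
        # Validation in the body only runs after all arguments have been bound.
        if not isinstance(environment, dict):
            raise ValueError(
                f"Parameter environment must be a dict. The actual value is {environment}."
            )
        self.environment = environment
        self.allowed_failed_count = allowed_failed_count


def error_type(**kwargs):
    """Return the name of the exception raised by ToyConfig(**kwargs), or None."""
    try:
        ToyConfig(**kwargs)
    except (TypeError, ValueError) as exc:
        return type(exc).__name__
    return None


# Known keyword, bad value: binding succeeds, validation fails.
print(error_type(environment="foo", allowed_failed_count=1))  # ValueError

# Unknown keyword: the call fails before the body runs.
print(error_type(environment="foo", unknown_keyword=1))  # TypeError
```

Reaching the ValueError therefore shows the keyword itself was accepted.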

michalmar · 2021-02-17T10:52:59Z

@dkmiller I am running on AML CI. Should I create a new conda env?

dkmiller · 2021-02-17T16:45:49Z

Yes, I'd suggest creating a new Conda environment from scratch. Follow this article to expose that Conda environment as a Jupyter kernel: https://medium.com/@nrk25693/how-to-add-your-conda-environment-to-your-jupyter-notebook-in-just-4-steps-abeab8b8d084 .
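
Once the new kernel is selected in the notebook, a quick sanity check is to confirm that the notebook really runs on the new environment's interpreter. A small sketch (the `kernel_info` helper name is made up here):

```python
import sys


def kernel_info():
    """Report which Python interpreter and environment prefix this process uses."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
    }


for key, value in kernel_info().items():
    print(key, value)
```

If the executable path still points at the old environment (for example the azureml_py36 path from the traceback), the notebook is not using the new kernel yet.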

cartacioS assigned cartacioS and SKrupa Jan 18, 2022
