+++++++++++++++++++++++++
Local Pipeline Execution
+++++++++++++++++++++++++

Your pipeline can be executed locally to facilitate development and troubleshooting. Each pipeline step is executed in its own local container.

-------------
Prerequisites
-------------

1. :doc:`Install ADS CLI<../../quickstart>`
2. :doc:`Build Development Container Image<./jobs_container_image>` and :doc:`install a conda environment<./condapack>`

------------
Restrictions
------------

Your pipeline steps are subject to the :doc:`same restrictions as local jobs<./local_jobs>`.

They are also subject to these additional restrictions:

 - Pipeline steps must be of kind ``customScript``.
 - Custom container images are not yet supported. You must use the development container image with a conda environment.

---------------------------------------
Configuring Local Pipeline Orchestrator
---------------------------------------

Use ``ads opctl configure``. Refer to the ``local_backend.ini`` description in the configuration :doc:`instructions<../configure>`.

Most importantly, ``max_parallel_containers`` controls how many pipeline steps may be executed in parallel on your machine. Your pipeline DAG may allow multiple steps to be executed in parallel,
but your local machine may not have enough CPU cores or memory to effectively run them all simultaneously.
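
For example, to cap local execution at two concurrent step containers, the relevant entry in ``local_backend.ini`` might look like the following sketch (the value shown is only an illustration; the file itself is generated by ``ads opctl configure``):

.. code-block:: ini

  ; Maximum number of pipeline step containers run at the same time
  max_parallel_containers = 2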

---------------------
Running your Pipeline
---------------------

Local pipeline execution requires you to define your pipeline in a YAML file. Refer to the YAML examples :doc:`here<../../../pipeline/examples>`.

Then, invoke the following command to run your pipeline:

.. code-block:: shell

  ads opctl run --backend local --file my_pipeline.yaml --source-folder /path/to/my/pipeline/step/files

Parameter explanation:
 - ``--backend local``: Run the pipeline locally using Docker containers.
 - ``--file my_pipeline.yaml``: The YAML file defining your pipeline.
 - ``--source-folder /path/to/my/pipeline/step/files``: The local directory containing the files used by your pipeline steps. This directory is mounted into the container as a volume.
   Defaults to the current working directory if no value is provided.
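
Because ``--source-folder`` defaults to the current working directory, you can omit it when you launch the pipeline from the directory containing your step files. For example:

.. code-block:: shell

  # my_pipeline.yaml is assumed to live in this directory as well
  cd /path/to/my/pipeline/step/files
  ads opctl run --backend local --file my_pipeline.yaml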

Source folder and relative paths
================================
If your pipeline step runtimes are of type ``script`` or ``notebook``, the paths in your YAML file must be relative to the ``--source-folder``.

Pipeline steps using a runtime of type ``python`` can define their own working directory, which is mounted into the step's container instead.
| 57 | + |
| 58 | +For example, suppose your yaml file looked like this: |
| 59 | + |
| 60 | +.. code-block:: yaml |
| 61 | +
|
| 62 | + kind: pipeline |
| 63 | + spec: |
| 64 | + displayName: example |
| 65 | + dag: |
| 66 | + - (step_1, step_2) >> step_3 |
| 67 | + stepDetails: |
| 68 | + - kind: customScript |
| 69 | + spec: |
| 70 | + description: A step running a notebook |
| 71 | + name: step_1 |
| 72 | + runtime: |
| 73 | + kind: runtime |
| 74 | + spec: |
| 75 | + conda: |
| 76 | + slug: myconda_p38_cpu_v1 |
| 77 | + type: service |
| 78 | + notebookEncoding: utf-8 |
| 79 | + notebookPathURI: step_1_files/my-notebook.ipynb |
| 80 | + type: notebook |
| 81 | + - kind: customScript |
| 82 | + spec: |
| 83 | + description: A step running a shell script |
| 84 | + name: step_2 |
| 85 | + runtime: |
| 86 | + kind: runtime |
| 87 | + spec: |
| 88 | + conda: |
| 89 | + slug: myconda_p38_cpu_v1 |
| 90 | + type: service |
| 91 | + scriptPathURI: step_2_files/my-script.sh |
| 92 | + type: script |
| 93 | + - kind: customScript |
| 94 | + spec: |
| 95 | + description: A step running a python script |
| 96 | + name: step_3 |
| 97 | + runtime: |
| 98 | + kind: runtime |
| 99 | + spec: |
| 100 | + conda: |
| 101 | + slug: myconda_p38_cpu_v1 |
| 102 | + type: service |
| 103 | + workingDir: /step_3/custom/working/dir |
| 104 | + scriptPathURI: my-python.py |
| 105 | + type: python |
| 106 | + type: pipeline |
| 107 | +
|

And suppose the pipeline is executed locally with the following command:

.. code-block:: shell

  ads opctl run --backend local --file my_pipeline.yaml --source-folder /my/files

``step_1`` uses a ``notebook`` runtime. The container for ``step_1`` will mount the ``/my/files`` directory. The ``/my/files/step_1_files/my-notebook.ipynb`` notebook file
will be converted into a Python script and executed in the container.

``step_2`` uses a ``script`` runtime. The container for ``step_2`` will mount the ``/my/files`` directory. The ``/my/files/step_2_files/my-script.sh`` shell script will
be executed in the container.

``step_3`` uses a ``python`` runtime. Instead of mounting the ``/my/files`` directory specified by ``--source-folder``, the ``/step_3/custom/working/dir`` directory will be mounted into the
container. The ``/step_3/custom/working/dir/my-python.py`` script will be executed in the container.
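
Putting this together, the files on the local machine would be laid out as follows (a sketch assembled from the paths above):

.. code-block:: text

  /my/files/                    <- --source-folder, mounted for step_1 and step_2
  ├── step_1_files/
  │   └── my-notebook.ipynb
  └── step_2_files/
      └── my-script.sh

  /step_3/custom/working/dir/   <- workingDir, mounted for step_3 instead
  └── my-python.py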

Viewing container output and orchestration messages
===================================================
When a container is running, you can use the ``docker logs`` command to view its output. See https://docs.docker.com/engine/reference/commandline/logs/
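
For example, to find a step's container and follow its output as it runs (the container ID below is a placeholder):

.. code-block:: shell

  # List running containers to find the one executing your step
  docker ps
  # Stream that container's stdout/stderr (-f follows the log)
  docker logs -f <container-id>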

Alternatively, you can use the ``--debug`` parameter to print each container's stdout/stderr messages to your shell. Note that Python buffers output by default, so you may see output written
to the shell in bursts. If you want to see output displayed in real time for a particular step, specify a non-zero value for the ``PYTHONUNBUFFERED`` environment variable in your step's runtime
specification. For example:

.. code-block:: yaml

  - kind: customScript
    spec:
      description: A step running a shell script
      name: step_1
      runtime:
        kind: runtime
        spec:
          conda:
            slug: myconda_p38_cpu_v1
            type: service
          scriptPathURI: my-script.sh
          env:
            PYTHONUNBUFFERED: 1
        type: script

Pipeline steps can run in parallel, so you may want each step to prefix its log output to make it easy to tell which lines come from which step. A sketch of one way to do this follows.
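
For a ``script`` runtime, one lightweight approach is to pipe the step's commands through ``sed`` (the step name and command below are placeholders):

.. code-block:: shell

  #!/bin/bash
  # Prefix every output line with the step name so that interleaved
  # --debug output from parallel steps is easy to attribute.
  ./do-step-work 2>&1 | sed "s/^/[step_1] /"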

When the ``--debug`` parameter is specified, the CLI will also output pipeline orchestration messages. These include messages about which steps are being started and a summary of each
step's result when the pipeline finishes execution.
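
For example, to run the earlier pipeline with orchestration messages and container output echoed to your shell:

.. code-block:: shell

  ads opctl run --backend local --file my_pipeline.yaml --source-folder /my/files --debug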