@@ -18,8 +18,6 @@ nextflow.py
18
18
.. |license| image:: https://img.shields.io/pypi/l/nextflowpy.svg?color=blue
19
19
:target: https://github.com/goodwright/nextflow.py/blob/master/LICENSE
20
20
21
- **IMPORTANT: The name of the package on PyPI has now changed from `nextflow` to `nextflowpy`.**
22
-
23
21
nextflow.py is a Python wrapper around the Nextflow pipeline framework. It lets
24
22
you run Nextflow pipelines from Python code.
25
23
@@ -41,11 +39,6 @@ nextflow.py can be installed using pip::
41
39
42
40
$ pip install nextflowpy
43
41
44
- If you get permission errors, try using ``sudo``::
45
-
46
- $ sudo pip install nextflowpy
47
-
48
-
49
42
Development
50
43
~~~~~~~~~~~
51
44
@@ -81,45 +74,94 @@ You can opt to only run unit tests or integration tests::
81
74
Overview
82
75
--------
83
76
84
- The starting point for any nextflow.py pipeline is the ``Pipeline``
85
- object. This is initialised with a path to the file in question, and,
86
- optionally, the location of an accompanying config file:
87
-
88
- >>> pipeline1 = nextflow.Pipeline("pipelines/my-pipeline.nf")
89
- >>> pipeline2 = nextflow.Pipeline("main.nf", config="nextflow.config")
90
-
91
77
Running
92
78
~~~~~~~
93
79
94
- To actually execute the pipeline, the ``run`` method is used:
80
+ To run a pipeline, the ``run`` function is used. The only required
81
+ parameter is the path to the pipeline file:
95
82
83
+ >>> pipeline = nextflow.Pipeline("pipelines/my-pipeline.nf")
96
84
>>> execution = pipeline.run()
97
85
98
86
This will return an ``Execution`` object, which represents the pipeline
99
- execution that just took place. You can customise the execution with various
100
- options:
87
+ execution that just took place (see below for details on this object). You can
88
+ customise the execution with various options:
89
+
90
+ >>> execution = pipeline.run(location="./rundir", params={"param1": "123"}, profiles=["docker", "test"], version="22.0.1", configs=["env.config"])
101
91
102
- >>> execution = pipeline.run(location="./rundir", params={"param1": "123"}, profile=["docker", "test"], version="22.0.1", config=["env.config"])
92
+ * ``location`` - The location to run the pipeline from, which by default is just
93
+ the current working directory.
94
+
95
+ * ``params`` - A dictionary of parameters to pass to the pipeline as command line arguments.
96
+ In the above example, this would run the pipeline with ``--param1=123``.
97
+
98
+ * ``profiles`` - A list of Nextflow profiles to use when running the pipeline.
99
+ These are defined in the ``nextflow.config`` file, and can be used to
100
+ configure things like the executor to use, or the container engine to use.
101
+ In the above example, this would run the pipeline with ``-profile docker,test``.
102
+
103
+ * ``version`` - The version of Nextflow to use when running the pipeline. By
104
+ default, the version of Nextflow installed on the system is used, but this
105
+ can be overridden with this parameter.
106
+
107
+ * ``configs`` - A list of config files to use when running the pipeline. These
108
+ are merged with the config files specified in the pipeline itself, and can
109
+ be used to override any of the settings in the pipeline config.
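
As a rough illustration of how the options above could translate into a command line, the sketch below assembles a command string from the same arguments. The ``build_command`` helper is hypothetical and is not nextflow.py's actual implementation; it assumes Nextflow's ``-profile`` and ``-c`` options and the ``NXF_VER`` environment variable for selecting a version.

```python
import shlex


def build_command(path, params=None, profiles=None, version=None, configs=None):
    """Assemble a `nextflow run` command string (hypothetical helper, illustration only)."""
    parts = []
    if version:
        # Assumption: the Nextflow launcher reads NXF_VER to pick a version.
        parts.append(f"NXF_VER={shlex.quote(version)}")
    parts.append("nextflow")
    for config in configs or []:
        # -c adds an extra config file on top of the pipeline's own config.
        parts.extend(["-c", shlex.quote(config)])
    parts.extend(["run", shlex.quote(path)])
    for name, value in (params or {}).items():
        # Pipeline parameters are passed with a double dash.
        parts.append(f"--{name}={shlex.quote(str(value))}")
    if profiles:
        # Multiple profiles are joined with commas.
        parts.extend(["-profile", ",".join(profiles)])
    return " ".join(parts)


command = build_command(
    "my-pipeline.nf",
    params={"param1": "123"},
    profiles=["docker", "test"],
    version="22.0.1",
    configs=["env.config"],
)
print(command)
```

With these inputs the sketch prints ``NXF_VER=22.0.1 nextflow -c env.config run my-pipeline.nf --param1=123 -profile docker,test``. The exact string nextflow.py generates may differ.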
110
+
111
+ Custom Runners
112
+ ~~~~~~~~~~~~~~
103
113
104
- This sets the execution to take place in a different location, passes
105
- ``--param1=123`` as a command line argument when the pipeline is run, uses the
106
- Nextflow profiles 'docker' and 'test', runs with Nextflow version 22.0.1
107
- (regardless of what version of Nextflow is installed), and passes in an extra
108
- config file to use on the run.
114
+ When you run a pipeline with nextflow.py, it will generate the command string
115
+ that you would use at the command line if you were running the pipeline
116
+ manually. This will be some variant of ``nextflow run some-pipeline.nf``, and
117
+ will include any parameters, profiles, versions, and config files that you
118
+ passed in.
119
+
120
+ By default, nextflow.py will then run this command using the standard Python
121
+ ``subprocess`` module. However, you can customise this behaviour by passing in
122
+ a custom 'runner' function. This is a function which takes the command string
123
+ and submits the job in some other way. For example, you could use a custom
124
+ runner to submit the job to a cluster, or to a cloud platform.
125
+
126
+ This runner function is passed to the ``run`` function as the
127
+ ``runner`` parameter:
128
+
129
+ >>> execution = nextflow.run("my-pipeline.nf", runner=my_custom_runner)
130
+
131
+ Once the run command string has been passed to the runner, nextflow.py will
132
+ wait for the pipeline to complete by watching the execution directory, and then
133
+ return the ``Execution `` object as normal.
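
A minimal sketch of what such a runner might look like is given below. The name ``my_custom_runner`` matches the example above, but the body is an assumption: it simply hands the command string to a shell in the background, where a real runner might submit it to a cluster scheduler or a cloud service instead. Whether nextflow.py uses the runner's return value is not stated here, so the sketch returns the process handle only for convenience.

```python
import subprocess


def my_custom_runner(command):
    """Hypothetical runner: receive the command string and launch it without waiting."""
    # Assumption: the command string is meant to be interpreted by a shell,
    # just as if it had been typed at the command line.
    return subprocess.Popen(
        command,
        shell=True,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
```

Because the runner returns immediately, the caller is free to watch the execution directory for progress while the job runs elsewhere.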
134
+
135
+ Polling
136
+ ~~~~~~~
137
+
138
+ The function described above will run the pipeline and wait while it does, with
139
+ the completed ``Execution`` being returned only at the end.
140
+
141
+ An alternate method is to use ``run_and_poll``, which returns an
142
+ ``Execution`` object every few seconds representing the state of the
143
+ pipeline execution at that moment in time, as a generator::
144
+
145
+     for execution in pipeline.run_and_poll(sleep=2, location="./rundir", params={"param1": "123"}):
146
+         print("Processing intermediate execution")
147
+
148
+ By default, an ``Execution`` will be returned every second, but you can
149
+ adjust this as required with the ``sleep`` parameter. This is useful if you want
150
+ to get information about the progress of the pipeline execution as it proceeds.
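
The polling pattern itself can be sketched with a plain generator. The example below is illustrative only and does not reproduce nextflow.py's internals: ``run_and_poll_sketch`` is a hypothetical name, and it yields small dictionaries rather than ``Execution`` objects.

```python
import subprocess
import time


def run_and_poll_sketch(command, sleep=1):
    """Yield a snapshot of a running command every `sleep` seconds (illustrative only)."""
    process = subprocess.Popen(command, shell=True)
    while process.poll() is None:
        # Still running: report an intermediate state, then wait before checking again.
        yield {"finished": False, "return_code": None}
        time.sleep(sleep)
    # Finished: report the final state exactly once.
    yield {"finished": True, "return_code": process.returncode}


for state in run_and_poll_sketch("sleep 0.3", sleep=0.1):
    print(state)
```

The last value yielded always describes the completed run, so a caller that only needs the final result can keep the last item from the loop.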
109
151
110
152
Executions
111
- ##########
153
+ ~~~~~~~~~~
112
154
113
- An ``Execution`` represents a single execution of a
114
- ``Pipeline``. It has properties for:
155
+ An ``Execution`` represents a single execution of a pipeline. It has
156
+ properties for:
115
157
116
- * ``id`` - The unique ID of that run, generated by Nextflow.
158
+ * ``identifier`` - The unique ID of that run, generated by Nextflow.
117
159
118
- * ``started`` - When the pipeline ran (as a UNIX timestamp).
160
+ * ``started`` - When the pipeline ran (as a Python datetime).
119
161
120
- * ``started_dt`` - When the pipeline ran (as a Python datetime).
162
+ * ``finished`` - When the pipeline completed (as a Python datetime).
121
163
122
- * ``duration`` - how long the execution took in seconds.
164
+ * ``duration`` - how long the pipeline ran for (if finished).
123
165
124
166
* ``status`` - the status Nextflow reports on completion.
125
167
@@ -131,17 +173,17 @@ An ``Execution`` represents a single execution of a
131
173
132
174
* ``log`` - the full text of the log file produced.
133
175
134
- * ``returncode`` - the exit code of the run - usually 0 or 1.
176
+ * ``return_code`` - the exit code of the run - usually 0 or 1.
135
177
136
- * ``pipeline`` - the ``Pipeline`` that created the execution.
178
+ * ``path`` - the path to the execution directory.
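
The relationship between ``started``, ``finished`` and ``duration`` can be sketched as follows. ``ExecutionTimes`` is a hypothetical stand-in, not the library's class, and returning a ``timedelta`` (or ``None`` while the run is unfinished) is an assumption about how "if finished" might be handled.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class ExecutionTimes:
    """Hypothetical stand-in holding only the two timestamps."""

    started: datetime
    finished: Optional[datetime] = None

    @property
    def duration(self) -> Optional[timedelta]:
        # No duration can be reported while the run is still in progress.
        if self.finished is None:
            return None
        return self.finished - self.started
```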
137
179
138
180
It also has a ``process_executions`` property, which is a list of
139
181
``ProcessExecution`` objects. Nextflow processes data by chaining
140
182
together isolated 'processes', and each of these has a
141
183
``ProcessExecution`` object representing its execution. These have the
142
184
following properties:
143
185
144
- * ``hash`` - The unique ID generated by Nextflow, of the form ``xx/xxxxxx``.
186
+ * ``identifier`` - The unique ID generated by Nextflow, of the form ``xx/xxxxxx``.
145
187
146
188
* ``process`` - The name of the process that spawned the process execution.
147
189
@@ -153,13 +195,19 @@ following properties:
153
195
154
196
* ``stderr`` - the stderr of the process execution.
155
197
156
- * ``started`` - When the process execution ran (as a UNIX timestamp).
198
+ * ``started`` - When the process execution ran (as a Python datetime).
157
199
158
- * ``started_dt`` - When the process execution ran (as a Python datetime).
200
+ * ``finished`` - When the process execution completed (as a Python datetime).
159
201
160
202
* ``duration`` - how long the process execution took in seconds.
161
203
162
- * ``returncode`` - the exit code of the process execution - usually 0 or 1.
204
+ * ``return_code`` - the exit code of the process execution - usually 0 or 1.
205
+
206
+ * ``path`` - the local path to the process execution directory.
207
+
208
+ * ``full_path`` - the absolute path to the process execution directory.
209
+
210
+ * ``bash`` - the bash file contents generated for the process execution.
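
To make the link between ``identifier``, ``path`` and ``bash`` more concrete, the sketch below looks up a task directory from a short ``xx/xxxxxx`` identifier and reads its script. It assumes Nextflow's usual layout, in which each task lives under ``work/`` in a directory named after its full hash and the generated script is saved as ``.command.sh``. The helper names are hypothetical, and nextflow.py may locate these files differently.

```python
from pathlib import Path


def find_process_directory(execution_path, identifier):
    """Return the work directory whose hash starts with `identifier`, or None."""
    # Assumption: tasks are stored as work/<2 chars>/<rest of hash>, and the
    # short identifier is a prefix of that relative path.
    matches = sorted(Path(execution_path, "work").glob(f"{identifier}*"))
    return matches[0] if matches else None


def read_bash(process_directory):
    """Return the contents of the task script (.command.sh), or '' if it is missing."""
    script = Path(process_directory, ".command.sh")
    return script.read_text() if script.exists() else ""
```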
163
211
164
212
Process executions can have various files passed to them, and will create files
165
213
during their execution too. These can be obtained as follows:
@@ -175,37 +223,18 @@ during their execution too. These can be obtained as follows:
175
223
distinguish these once execution is complete, so nextflow.py reports all
176
224
output files, not just those which are 'published'.
177
225
178
- Polling
179
- ~~~~~~~
180
-
181
- The method described above will run the pipeline and wait while it does, with
182
- the completed ``Execution`` being returned only at the end.
183
-
184
- An alternate method is to use ``run_and_poll``, which returns an
185
- ``Execution`` object every few seconds representing the state of the
186
- pipeline execution at that moment in time, as a generator::
187
-
188
-     for execution in pipeline.run_and_poll(sleep=2, location="./rundir", params={"param1": "123"}, profile=["docker", "test"], version="22.0.1"):
189
-         print("Processing intermediate execution")
190
-
191
- By default, an ``Execution`` will be returned every 5 seconds, but you
192
- can adjust this as required with the ``sleep`` parameter. This is useful if you
193
- want to get information about the progress of the pipeline execution as it
194
- proceeds.
226
+ Changelog
227
+ ---------
195
228
196
- Direct Running
197
- ~~~~~~~~~~~~~~
229
+ Release 0.6.0
230
+ ~~~~~~~~~~~~~
198
231
199
- If you just want to run a single pipeline without initialising a
200
- ``Pipeline`` object first, you can ``run`` or
201
- ``run_and_poll`` directly, without needing to create a
202
- ``Pipeline``:
232
+ `24th May 2023`
203
233
204
- >>> import nextflow
205
- >>> execution = nextflow.run(path="pipeline.nf", config=["settings.config"], params={"param1": "123"})
234
+ * Added ability to use custom runners for starting jobs.
235
+ * Removed the ``Pipeline`` class.
236
+ * Overhauled architecture.
206
237
207
- Changelog
208
- ---------
209
238
210
239
Release 0.5.0
211
240
~~~~~~~~~~~~~