Commit 22dcbc3

Update README
1 parent 29b7517 commit 22dcbc3

2 files changed: 94 additions and 65 deletions

README.rst

Lines changed: 92 additions & 63 deletions
@@ -18,8 +18,6 @@ nextflow.py
 .. |license| image:: https://img.shields.io/pypi/l/nextflowpy.svg?color=blue)
   :target: https://github.com/goodwright/nextflow.py/blob/master/LICENSE
 
-**IMPORTANT: The name of the package on PyPI has now changed from `nextflow` to `nextflowpy`.**
-
 nextflow.py is a Python wrapper around the Nextflow pipeline framework. It lets
 you run Nextflow pipelines from Python code.
 
@@ -41,11 +39,6 @@ nextflow.py can be installed using pip::
 
     $ pip install nextflowpy
 
-If you get permission errors, try using ``sudo``::
-
-    $ sudo pip install nextflowpy
-
-
 Development
 ~~~~~~~~~~~
 
@@ -81,45 +74,94 @@ You can opt to only run unit tests or integration tests::
 Overview
 --------
 
-The starting point for any nextflow.py pipeline is the ``Pipeline``
-object. This is initialised with a path to the file in question, and,
-optionally, the location of an accompanying config file:
-
-    >>> pipeline1 = nextflow.Pipeline("pipelines/my-pipeline.nf")
-    >>> pipeline2 = nextflow.Pipeline("main.nf", config="nextflow.config")
-
 Running
 ~~~~~~~
 
-To actually execute the pipeline, the ``run`` method is used:
+To run a pipeline, the ``run`` function is used. The only required
+parameter is the path to the pipeline file:
 
+    >>> pipeline = nextflow.Pipeline("pipelines/my-pipeline.nf")
     >>> execution = pipeline.run()
 
 This will return an ``Execution`` object, which represents the pipeline
-execution that just took place. You can customise the execution with various
-options:
+execution that just took place (see below for details on this object). You can
+customise the execution with various options:
+
+    >>> execution = pipeline.run(location="./rundir", params={"param1": "123"}, profiles=["docker", "test"], version="22.0.1", configs=["env.config"])
 
-    >>> execution = pipeline.run(location="./rundir", params={"param1": "123"}, profile=["docker", "test"], version="22.0.1", config=["env.config"])
+* ``location`` - The location to run the pipeline from, which by default is just
+  the current working directory.
+
+* ``params`` - A dictionary of parameters to pass to the pipeline as command line arguments.
+  In the above example, this would run the pipeline with ``--param1=123``.
+
+* ``profiles`` - A list of Nextflow profiles to use when running the pipeline.
+  These are defined in the ``nextflow.config`` file, and can be used to
+  configure things like the executor to use, or the container engine to use.
+  In the above example, this would run the pipeline with ``-profile docker,test``.
+
+* ``version`` - The version of Nextflow to use when running the pipeline. By
+  default, the version of Nextflow installed on the system is used, but this
+  can be overridden with this parameter.
+
+* ``configs`` - A list of config files to use when running the pipeline. These
+  are merged with the config files specified in the pipeline itself, and can
+  be used to override any of the settings in the pipeline config.
+
+Custom Runners
+~~~~~~~~~~~~~~
 
-This sets the execution to take place in a different location, passes
-``--param1=123`` as a command line argument when the pipeline is run, uses the
-Nextflow profiles 'docker' and 'test', runs with Nextflow version 22.0.1
-(regardless of what version of Nextflow is installed), and passes in an extra
-config file to use on the run.
+When you run a pipeline with nextflow.py, it will generate the command string
+that you would use at the command line if you were running the pipeline
+manually. This will be some variant of ``nextflow run some-pipeline.nf``, and
+will include any parameters, profiles, versions, and config files that you
+passed in.
+
+By default, nextflow.py will then run this command using the standard Python
+``subprocess`` module. However, you can customise this behaviour by passing in
+a custom 'runner' function. This is a function which takes the command string
+and submits the job in some other way. For example, you could use a custom
+runner to submit the job to a cluster, or to a cloud platform.
+
+This runner function is passed to the ``run`` method as the
+``runner`` parameter:
+
+    >>> execution = pipeline.run("my-pipeline.nf", runner=my_custom_runner)
+
+Once the run command string has been passed to the runner, nextflow.py will
+wait for the pipeline to complete by watching the execution directory, and then
+return the ``Execution`` object as normal.
+
+Polling
+~~~~~~~
+
+The function described above will run the pipeline and wait while it does, with
+the completed ``Execution`` being returned only at the end.
+
+An alternate method is to use ``run_and_poll``, which returns an
+``Execution`` object every few seconds representing the state of the
+pipeline execution at that moment in time, as a generator::
+
+    for execution in pipeline.run_and_poll(sleep=2, location="./rundir", params={"param1": "123"}):
+        print("Processing intermediate execution")
+
+By default, an ``Execution`` will be returned every second, but you can
+adjust this as required with the ``sleep`` parameter. This is useful if you want
+to get information about the progress of the pipeline execution as it proceeds.
 
 Executions
-##########
+~~~~~~~~~~
 
-An ``Execution`` represents a single execution of a
-``Pipeline``. It has properties for:
+An ``Execution`` represents a single execution of a pipeline. It has
+properties for:
 
-* ``id`` - The unique ID of that run, generated by Nextflow.
+* ``identifier`` - The unique ID of that run, generated by Nextflow.
 
-* ``started`` - When the pipeline ran (as a UNIX timestamp).
+* ``started`` - When the pipeline ran (as a Python datetime).
 
-* ``started_dt`` - When the pipeline ran (as a Python datetime).
+* ``finished`` - When the pipeline completed (as a Python datetime).
 
-* ``duration`` - how long the execution took in seconds.
+* ``duration`` - how long the pipeline ran for (if finished).
 
 * ``status`` - the status Nextflow reports on completion.
 
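The ``run`` options documented in the hunk above map onto pieces of a ``nextflow run`` command line. The sketch below shows one plausible mapping. ``build_command`` is a hypothetical helper written for illustration, not part of nextflow.py, and the real library may order or quote things differently. The README only confirms ``--param1=123`` and ``-profile docker,test``; using Nextflow's ``-c`` option for extra config files and the ``NXF_VER`` environment variable for the version is an assumption about how the other two options are passed. ``location`` is left out because it changes the working directory rather than the command.

```python
import shlex


def build_command(path, params=None, profiles=None, version=None, configs=None):
    """Assemble a `nextflow run` command string from run()-style options.

    Hypothetical helper for illustration; not part of nextflow.py's API.
    """
    parts = []
    if version:
        # Assumption: the version is selected via Nextflow's NXF_VER variable.
        parts.append(f"NXF_VER={shlex.quote(version)}")
    parts.append("nextflow")
    for config in configs or []:
        # Assumption: each extra config file is added with Nextflow's -c option.
        parts += ["-c", shlex.quote(config)]
    parts += ["run", shlex.quote(path)]
    for name, value in (params or {}).items():
        # Pipeline parameters use a double dash, e.g. --param1=123.
        parts.append(f"--{name}={shlex.quote(str(value))}")
    if profiles:
        # Multiple profiles are joined with commas after -profile.
        parts += ["-profile", ",".join(profiles)]
    return " ".join(parts)


command = build_command(
    "pipelines/my-pipeline.nf",
    params={"param1": "123"},
    profiles=["docker", "test"],
    version="22.0.1",
    configs=["env.config"],
)
print(command)
```

With the option values from the README example, this prints ``NXF_VER=22.0.1 nextflow -c env.config run pipelines/my-pipeline.nf --param1=123 -profile docker,test``.
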
@@ -131,17 +173,17 @@ An ``Execution`` represents a single execution of a
 
 * ``log`` - the full text of the log file produced.
 
-* ``returncode`` - the exit code of the run - usually 0 or 1.
+* ``return_code`` - the exit code of the run - usually 0 or 1.
 
-* ``pipeline`` - the ``Pipeline`` that created the execution.
+* ``path`` - the path to the execution directory.
 
 It also has a ``process_executions`` property, which is a list of
 ``ProcessExecution`` objects. Nextflow processes data by chaining
 together isolated 'processes', and each of these has a
 ``ProcessExecution`` object representing its execution. These have the
 following properties:
 
-* ``hash`` - The unique ID generated by Nextflow, of the form ``xx/xxxxxx``.
+* ``identifier`` - The unique ID generated by Nextflow, of the form ``xx/xxxxxx``.
 
 * ``process`` - The name of the process that spawned the process execution.
 
@@ -153,13 +195,19 @@ following properties:
 
 * ``stderr`` - the stderr of the process execution.
 
-* ``started`` - When the process execution ran (as a UNIX timestamp).
+* ``started`` - When the process execution ran (as a Python datetime).
 
-* ``started_dt`` - When the process execution ran (as a Python datetime).
+* ``finished`` - When the process execution completed (as a Python datetime).
 
 * ``duration`` - how long the process execution took in seconds.
 
-* ``returncode`` - the exit code of the process execution - usually 0 or 1.
+* ``return_code`` - the exit code of the process execution - usually 0 or 1.
+
+* ``path`` - the local path to the process execution directory.
+
+* ``full_path`` - the absolute path to the process execution directory.
+
+* ``bash`` - the bash file contents generated for the process execution.
 
 Process executions can have various files passed to them, and will create files
 during their execution too. These can be obtained as follows:
@@ -175,37 +223,18 @@ during their execution too. These can be obtained as follows:
 distinguish these once execution is complete, so nextflow.py reports all
 output files, not just those which are 'published'.
 
-Polling
-~~~~~~~
-
-The method described above will run the pipeline and wait while it does, with
-the completed ``Execution`` being returned only at the end.
-
-An alternate method is to use ``run_and_poll``, which returns an
-``Execution`` object every few seconds representing the state of the
-pipeline execution at that moment in time, as a generator::
-
-    for execution in pipeline.run_and_poll(sleep=2, location="./rundir", params={"param1": "123"}, profile=["docker", "test"], version="22.0.1"):
-        print("Processing intermediate execution")
-
-By default, an ``Execution`` will be returned every 5 seconds, but you
-can adjust this as required with the ``sleep`` paramater. This is useful if you
-want to get information about the progress of the pipeline execution as it
-proceeds.
+Changelog
+---------
 
-Direct Running
-~~~~~~~~~~~~~~
+Release 0.6.0
+~~~~~~~~~~~~~
 
-If you just want to run a single pipeline without initialising a
-``Pipeline`` object first, you can ``run`` or
-``run_and_poll`` directly, without needing to create a
-``Pipeline``:
+`24th May 2023`
 
->>> import nextflow
->>> execution = nextflow.run(path="pipeline.nf", config=["settings.config"], params={"param1": "123"})
+* Added ability to use custom runners for starting jobs.
+* Removed pipeline class.
+* Overhauled architecture.
 
-Changelog
----------
 
 Release 0.5.0
 ~~~~~~~~~~~~~

docs/source/installing.rst

Lines changed: 2 additions & 2 deletions
@@ -6,11 +6,11 @@ pip
 
 nextflow.py can be installed using pip::
 
-    $ pip install nextflow
+    $ pip install nextflowpy
 
 If you get permission errors, try using ``sudo``::
 
-    $ sudo pip install nextflow
+    $ sudo pip install nextflowpy
 
 
 Development
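
The "Custom Runners" section added to README.rst in this commit describes a runner as a function that takes the command string and submits the job some other way. A minimal sketch of such a runner, assuming a Slurm cluster, is shown below. ``make_slurm_runner`` and ``fake_submit`` are hypothetical names introduced here. The README does not say what a runner should return or which working directory it runs in, so this sketch returns nothing and relies on nextflow.py watching the execution directory, as the README describes.

```python
import subprocess


def make_slurm_runner(submit=subprocess.run):
    """Return a runner that hands the command string to Slurm's sbatch.

    Hypothetical helper; `submit` is injectable so the sketch can be
    exercised on a machine without Slurm installed.
    """
    def runner(command):
        # sbatch --wrap turns a plain command string into a batch script.
        submit(["sbatch", f"--wrap={command}"], check=True)
    return runner


# Dry run with a stand-in for subprocess.run that only records its call.
calls = []


def fake_submit(args, check):
    calls.append(args)


runner = make_slurm_runner(submit=fake_submit)
runner("nextflow run my-pipeline.nf -profile docker,test")
print(calls[0])
```

In real use, the default ``subprocess.run`` would be left in place and the resulting function passed as the ``runner`` argument shown in the README.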
