Commit a93c69d

update README.md
1 parent 0efb366 commit a93c69d

1 file changed

+75
-58
lines changed

Diff for: README.md

+75-58
@@ -65,9 +65,10 @@ For this example to work, you will need:
 </ul>
 </details>
 
+
 ### Converted Workflow with ZnTrack
 
-To make this workflow reproducible, we convert it into a graph structure:
+To make this workflow reproducible, we convert it into a **directed graph structure** where each step is represented as a **Node**. Nodes define their inputs, outputs, and the computational logic to execute. Here's the graph structure for our example:
 
 ```mermaid
 flowchart LR
@@ -78,141 +79,157 @@ MACE_MP --> StructureOptimization
 
 #### Node Definitions
 
-Within ZnTrack, each Node is defined by a class. The class attributes define the
-inputs and outputs for each Node, while the `run` method provides the actual
-code that will be executed at runtime.
+In ZnTrack, each **Node** is defined as a Python class. The class attributes define the **inputs** (parameters and dependencies) and **outputs**, while the `run` method contains the computational logic to be executed.
 
-> [INFO] ZnTrack uses Python dataclasses under the hood to provide you with an
-> automatic `__init__`. Starting from Python 3.11 most IDEs should also reliably
-> provide type hints for ZnTrack nodes.
+> [!NOTE]
+> ZnTrack uses Python dataclasses under the hood, providing an automatic `__init__` method. Starting from Python 3.11, most IDEs should reliably provide type hints for ZnTrack Nodes.
 
-> [NOTE] For files produces during the `run`, ZnTrack provides a unique node
-> working directort (`zntrack.nwd`) which should be used to store files within.
+> [!TIP]
+> For files produced during the `run` method, ZnTrack provides a unique **Node Working Directory** (`zntrack.nwd`). Always use this directory to store files to ensure reproducibility and avoid conflicts.
 
 ```python
 import zntrack
 import ase.io
 from pathlib import Path
 
 class Smiles2Conformers(zntrack.Node):
-    smiles: str = zntrack.params() # a required parameter
-    numConfs: int = zntrack.params(32) # a default parameter
+    smiles: str = zntrack.params() # A required parameter
+    numConfs: int = zntrack.params(32) # A parameter with a default value
 
-    frames_path: Path = zntrack.outs_path(zntrack.nwd / "frames.xyz") # node output in the node working directory
+    frames_path: Path = zntrack.outs_path(zntrack.nwd / "frames.xyz") # Output file path
 
     def run(self) -> None:
+        # Generate molecular conformers from a SMILES string
         frames = smiles2conformers(smiles=self.smiles, numConfs=self.numConfs)
+        # Save the frames to the output file
         ase.io.write(frames, self.frames_path)
 
     @property
     def frames(self) -> list[ase.Atoms]:
-        with self.state.fs.open(self.frames_path, "r") as f: # we use the node state filesystem to read the Node to enable automatic data download and comparison of results. This will become important later.
+        # Load the frames from the output file using the node's filesystem
+        with self.state.fs.open(self.frames_path, "r") as f:
             return list(ase.io.iread(f, ":", format="extxyz"))
 
 
 class Pack(zntrack.Node):
-    data: list[list[ase.Atoms]] = zntrack.deps() # in addition to parameters we can define dependencies as inputs
-    counts: list[int] = zntrack.params()
-    density: float = zntrack.params()
+    data: list[list[ase.Atoms]] = zntrack.deps() # Input dependency (list of ASE Atoms)
+    counts: list[int] = zntrack.params() # Parameter (list of counts)
+    density: float = zntrack.params() # Parameter (density value)
 
-    frames_path: Path = zntrack.outs_path(zntrack.nwd / "frames.xyz")
+    frames_path: Path = zntrack.outs_path(zntrack.nwd / "frames.xyz") # Output file path
 
     def run(self) -> None:
+        # Pack the molecular frames into a periodic box
         box = pack(data=self.data, counts=self.counts, density=self.density)
+        # Save the packed structure to the output file
         ase.io.write(box, self.frames_path)
 
     @property
     def frames(self) -> list[ase.Atoms]:
+        # Load the packed structure from the output file
         with self.state.fs.open(self.frames_path, "r") as f:
             return list(ase.io.iread(f, ":", format="extxyz"))
 
 
 # We could hardcode the MACE_MP model into the StructureOptimization Node, but we
-# can also define it as a dependency. In contrast to `Smiles2Conformers` and
-# `Pack` the model does not require a `run` method and thus we can define it as a
-# `@dataclass`
+# can also define it as a dependency. Since the model doesn't require a `run` method,
+# we define it as a `@dataclass`.
 
 @dataclass
 class MACE_MP:
-    model: str = "medium"
+    model: str = "medium" # Default model type
 
     def get_calculator(self, **kwargs):
+        # Return a MACE-MP calculator instance
         return mace_mp(model=self.model)
 
 
 class StructureOptimization(zntrack.Node):
-    model: MACE_MP = zntrack.deps() # model dependency
-    data: list[ase.Atoms] = zntrack.deps() # ase.Atoms dependency
-    data_id: int = zntrack.params()
-    fmax: float = zntrack.params(0.05)
+    model: MACE_MP = zntrack.deps() # Dependency (MACE_MP model)
+    data: list[ase.Atoms] = zntrack.deps() # Dependency (list of ASE Atoms)
+    data_id: int = zntrack.params() # Parameter (index of the structure to optimize)
+    fmax: float = zntrack.params(0.05) # Parameter (force convergence threshold)
 
-    frames_path: Path = zntrack.outs_path(zntrack.nwd / "frames.traj")
+    frames_path: Path = zntrack.outs_path(zntrack.nwd / "frames.traj") # Output file path
 
     def run(self):
+        # Select the structure to optimize
         atoms = self.data[self.data_id]
+        # Attach the MACE-MP calculator
         atoms.calc = self.model.get_calculator()
+        # Run the geometry optimization
         dyn = LBFGS(atoms, trajectory=self.frames_path)
         dyn.run(fmax=0.5)
 
     @property
     def frames(self) -> list[ase.Atoms]:
+        # Load the optimization trajectory from the output file
         with self.state.fs.open(self.frames_path, "rb") as f:
             return list(ase.io.iread(f, ":", format="traj"))
 ```
 
 #### Building and Running the Workflow
 
-Now that we have defined all necessary Nodes we can put them to use and build
-our graph. Best to go into a new and empty directory, run `git init` followed by
-`dvc init`. Then we create a file `src/__init__.py` and place the Node
-definitions in there. Finally we create a new file `main.py` as described bellow
-and execute it using `python main.py` to build and access our workflow.
+Now that we’ve defined all the necessary Nodes, we can build and execute the workflow. Follow these steps:
 
-```python
-import zntrack
-from src import MACE_MP, Smiles2Conformers, Pack, StructureOptimization
+1. **Initialize a new directory** for your project:
+   ```bash
+   git init
+   dvc init
+   ```
 
-project = zntrack.Project()
+2. **Create a Python module** for the Node definitions:
+   - Create a file `src/__init__.py` and place the Node definitions inside it.
 
-model = MACE_MP()
+3. **Define and execute the workflow** in a `main.py` file:
+   ```python
+   import zntrack
+   from src import MACE_MP, Smiles2Conformers, Pack, StructureOptimization
 
-with project:
-    # within the project context we can define and connect nodes
-    etoh = Smiles2Conformers(smiles="CCO", numConfs=32)
-    box = Pack(data=[etoh.frames], counts=[32], density=789)
-    optm = StructureOptimization(model=model, data=box.frames, data_id=-1, fmax=0.5)
+   # Initialize the ZnTrack project
+   project = zntrack.Project()
 
-# the nodes will only be executed afterwards in seperate python kernels.
-# if you don't want to execute the graph immediatly, use `project.build()` instead
-# and run the graph alter using `dvc repro` or the paraffin package.
-project.repro()
-```
+   # Define the MACE-MP model
+   model = MACE_MP()
+
+   # Build the workflow graph
+   with project:
+       etoh = Smiles2Conformers(smiles="CCO", numConfs=32) # Generate conformers
+       box = Pack(data=[etoh.frames], counts=[32], density=789) # Pack the structures
+       optm = StructureOptimization(model=model, data=box.frames, data_id=-1, fmax=0.5) # Optimize the structure
+
+   # Execute the workflow
+   project.repro()
+   ```
+
+> **TIP**
+> If you don’t want to execute the graph immediately, use `project.build()` instead. You can run the graph later using `dvc repro` or the [paraffin](https://github.com/zincware/paraffin) package.
 
 #### Accessing Results
 
-Once the graph has been executed, the respective files will have been written.
-For example, you could load the `nodes/StructureOptimization/frames.traj`
-trajectory directly from the file path.
+Once the workflow has been executed, the results are stored in the respective files. For example, the optimized trajectory is saved in `nodes/StructureOptimization/frames.traj`.
 
-Alternatively, you can load ZnTrack nodes after they have been executed and need
-not to worry about where the file was stored or in which format, because you can
-look at the `list[ase.Atoms]` direclty from within Python by loading the node as
-follows:
+You can load the results directly using ZnTrack, without worrying about file paths or formats:
 
 ```python
 import zntrack
 
+# Load the StructureOptimization Node
 optm = zntrack.from_rev(name="StructureOptimization")
+
+# Access the optimization trajectory
 print(optm.frames)
 ```
 
-For more examples, check out the following packages that build on top of
-ZnTrack:
+---
 
-- [MLIPx](https://mlipx.readthedocs.io/en/latest/)
-- [IPSuite](https://github.com/zincware/IPSuite)
+### More Examples
+
+For additional examples and advanced use cases, check out these packages built on top of ZnTrack:
+
+- [MLIPx](https://mlipx.readthedocs.io/en/latest/) - Machine Learning Interatomic Potentials.
+- [IPSuite](https://github.com/zincware/IPSuite) - Interatomic Potential Suite for materials science.
 
-______________________________________________________________________
 
 ## Technical Details
 
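The directed graph that `main.py` assembles can be pictured without ZnTrack. The following is a minimal sketch, not ZnTrack's or DVC's implementation: it uses the standard-library `graphlib` module to derive one valid execution order from the dependencies wired up in the diff (`Pack` consumes `Smiles2Conformers`; `StructureOptimization` consumes `Pack` and `MACE_MP`). The `dependencies` mapping and the `execution_order` helper are illustrative names introduced here.

```python
from graphlib import TopologicalSorter

# Each key depends on the nodes in its value set, mirroring main.py.
# MACE_MP is a plain dataclass in the README, listed here only as an input.
dependencies = {
    "Smiles2Conformers": set(),
    "MACE_MP": set(),
    "Pack": {"Smiles2Conformers"},
    "StructureOptimization": {"Pack", "MACE_MP"},
}


def execution_order(graph: dict[str, set[str]]) -> list[str]:
    """Return one order in which every node comes after its dependencies."""
    return list(TopologicalSorter(graph).static_order())


order = execution_order(dependencies)
print(order)
```

Several orders are valid for this graph (for example, `MACE_MP` may appear before or after `Smiles2Conformers`); the only guarantee is that each node follows everything it depends on.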
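The note added in this commit says ZnTrack relies on Python dataclasses to generate `__init__`. As a rough illustration using only the standard library (plain dataclass fields, not `zntrack.params`), a field without a default becomes a required argument and a field with a default becomes optional. `ConformerParams` is a hypothetical stand-in for the parameters of `Smiles2Conformers`, not a ZnTrack class.

```python
from dataclasses import dataclass, fields


@dataclass
class ConformerParams:
    smiles: str         # no default: required by the generated __init__
    numConfs: int = 32  # has a default: optional in the generated __init__


# Only the required field has to be passed; numConfs falls back to 32.
params = ConformerParams(smiles="CCO")
print(params)

# Leaving out the required field raises a TypeError from the generated __init__.
try:
    ConformerParams()
except TypeError as err:
    print(f"missing argument: {err}")

# The declared fields can be inspected in declaration order.
field_names = [f.name for f in fields(ConformerParams)]
print(field_names)
```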