Commit b27a981

Merge pull request #2 from schwallergroup/checks
Improved typing, returning lists instead of tuples, docstrings, coverage
2 parents 6bc61c8 + bb82d0f commit b27a981

20 files changed, with 1831 additions and 562 deletions

.github/workflows/test.yaml

+21
@@ -0,0 +1,21 @@
+name: Run tests
+on: push
+
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        python-version: ["3.10", "3.11"]
+    steps:
+      - uses: actions/checkout@v3
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v2
+        with:
+          python-version: ${{ matrix.python-version }}
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install tox tox-gh-actions
+      - name: Test with tox
+        run: tox

.pre-commit-config.yaml

+14
@@ -0,0 +1,14 @@
+repos:
+  - repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: v4.3.0
+    hooks:
+      - id: check-added-large-files
+        args: [--maxkb=5000]
+      - id: check-merge-conflict
+
+  # - repo: https://github.com/astral-sh/ruff-pre-commit
+  #   rev: v0.3.2
+  #   hooks:
+  #     - id: ruff
+  #       args: [ --fix, --exit-non-zero-on-fix ]
+  #     - id: ruff-format
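
Review note: the `check-added-large-files` hook above is configured with `--maxkb=5000`. The following is a rough sketch of the kind of size check that setting implies, not the hook's actual implementation. `find_large_files` is a hypothetical helper, and it scans a directory tree rather than the files staged in Git.

```python
import math
from pathlib import Path


def find_large_files(root: Path, max_kb: int = 5000) -> list[tuple[Path, int]]:
    """Return (path, size in KB) for every file under `root` larger than `max_kb`."""
    oversized = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        # Round partial kilobytes up so a 5000.5 KB file counts as 5001 KB.
        size_kb = math.ceil(path.stat().st_size / 1024)
        if size_kb > max_kb:
            oversized.append((path, size_kb))
    return oversized


if __name__ == "__main__":
    # Hypothetical usage: report anything in the current tree above the limit.
    for path, size_kb in find_large_files(Path("."), max_kb=5000):
        print(f"{path} ({size_kb} KB) exceeds 5000 KB")
```

Running such a scan locally before committing data files (for example the parquet files under `data/`) gives an early hint of what the hook would reject.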

README.md

+27-8
@@ -1,5 +1,7 @@
 # Rxn-INSIGHT: Fast Chemical Reaction Analysis Using Bond-Electron Matrices
 
+![Coverage Status](coverage-badge.svg)
+
 Rxn-INSIGHT is an open-source algorithm, written in python, to classify and name chemical reactions, and suggest reaction conditions based on similarity and popularity.
 * https://doi.org/10.1186/s13321-024-00834-z: Peer-reviewed publication on Rxn-INSIGHT
 ## 1. Installation
@@ -8,28 +10,45 @@ Rxn-INSIGHT relies on NumPy, Pandas, RDKit, RDChiral, and RXNMapper.
 A virtual environment can be installed with Anaconda as follows:
 
 ```console
-conda env create -f environment.yml
+conda create -n rxn-insight python=3.10
 conda activate rxn-insight
 ```
 
-To add the rxn-insight environment to Jupyter Notebook:
+```
+git clone https://github.com/schwallergroup/Rxn-INSIGHT.git
+cd Rxn-INSIGHT
+pip install .
+```
 
-```console
-python -m ipykernel install --user --name=rxn-insight
+Or, for developing with the optional dependencies, which are required to run the tests
+and build the docs:
+```
+pip install -e ".[test,doc]"
+```
+
+All of the test environments can be run using the command `tox` from the top directory.
+Alternatively, individual test environments can be run using the `-e` flag as
+in `tox -e env-name`. To run the tests, tests with coverage report, style checks, and
+docs build, respectively:
+```
+tox -e py3
+tox -e py3-coverage
+tox -e style
+tox -e docs
 ```
 
 ## 2. Usage
 
 ### Basic Usage
-```console
+```python
 from rxn_insight.reaction import Reaction
 r = "c1ccccc1I.C=CC(=O)OC>>COC(=O)/C=C/c1ccccc1" # Define a Reaction SMILES identifier
 rxn = Reaction(r)
 ri = rxn.get_reaction_info()
 ```
 
 The reaction info contains most of the information:
-```console
+```python
 {'REACTION': 'C=CC(=O)OC.Ic1ccccc1>>COC(=O)/C=C/c1ccccc1',
  'MAPPED_REACTION': '[CH3:1][O:2][C:3](=[O:4])[CH:5]=[CH2:6].I[c:7]1[cH:8][cH:9][cH:10][cH:11][cH:12]1>>[CH3:1][O:2][C:3](=[O:4])/[CH:5]=[CH:6]/[c:7]1[cH:8][cH:9][cH:10][cH:11][cH:12]1',
  'N_REACTANTS': 2,
@@ -52,13 +71,13 @@ The reaction info contains most of the information:
 
 ### Similarity Search
 A similarity search can be performed when a database with similar reactions is provided as a pandas DataFrame (df in this case). Another Pandas DataFrame is returned.
-```console
+```python
 df_nbs = rxn.find_neighbors(df, fp="MACCS", concatenate=True, threshold=0.5, broaden=True, full_search=False)
 ```
 
 ### Condition Suggestion
 Reaction conditions can be suggested when a Pandas DataFrame is provided.
-```console
+```python
 rxn.suggest_conditions(df)
 suggested_solvents = rxn.suggested_solvent
 suggested_catalysts = rxn.suggested_catalyst

coverage-badge.svg

+1

docs/source/_static/.gitkeep

Whitespace-only changes.

docs/source/api/modules.rst

+7
@@ -0,0 +1,7 @@
+rxn_insight
+===========
+
+.. toctree::
+   :maxdepth: 4
+
+   rxn_insight

docs/source/api/rxn_insight.rst

+45
@@ -0,0 +1,45 @@
+rxn\_insight package
+====================
+
+Submodules
+----------
+
+rxn\_insight.classification module
+----------------------------------
+
+.. automodule:: rxn_insight.classification
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+rxn\_insight.reaction module
+----------------------------
+
+.. automodule:: rxn_insight.reaction
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+rxn\_insight.representation module
+----------------------------------
+
+.. automodule:: rxn_insight.representation
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+rxn\_insight.utils module
+-------------------------
+
+.. automodule:: rxn_insight.utils
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+Module contents
+---------------
+
+.. automodule:: rxn_insight
+   :members:
+   :undoc-members:
+   :show-inheritance:

docs/source/conf.py

+92
@@ -0,0 +1,92 @@
+"""Sphinx configuration."""
+
+# This file is execfile()d with the current directory set to its containing dir.
+#
+# This file only contains a selection of the most common options. For a full
+# list see the documentation:
+# https://www.sphinx-doc.org/en/master/usage/configuration.html
+#
+# All configuration values have a default; values that are commented out
+# serve to show the default.
+
+import os
+import shutil
+import sys
+from importlib.metadata import metadata
+
+# -- Path setup
+
+__location__ = os.path.dirname(__file__)
+
+# If extensions (or modules to document with autodoc) are in another directory,
+# add these directories to sys.path here. If the directory is relative to the
+# documentation root, use os.path.abspath to make it absolute, like shown here.
+sys.path.insert(0, os.path.join(__location__, "../../src"))
+
+# -- Run sphinx-apidoc
+# This hack is necessary since RTD does not issue `sphinx-apidoc` before running
+# `sphinx-build -b html . _build/html`. See Issue:
+# https://github.com/readthedocs/readthedocs.org/issues/1139
+# DON'T FORGET: Check the box "Install your project inside a virtualenv
+# Additionally it helps us to avoid running apidoc manually
+
+try:  # for Sphinx >= 1.7
+    from sphinx.ext import apidoc
+except ImportError:
+    from sphinx import apidoc
+
+output_dir = os.path.join(__location__, "api")
+module_dir = os.path.join(__location__, "../../src/rxn_insight")
+try:
+    shutil.rmtree(output_dir)
+except FileNotFoundError:
+    pass
+
+try:
+    import sphinx
+
+    cmd_line = f"sphinx-apidoc --implicit-namespaces -f -o {output_dir} {module_dir}"
+
+    args = cmd_line.split(" ")
+    if tuple(sphinx.__version__.split(".")) >= ("1", "7"):
+        # This is a rudimentary parse_version to avoid external dependencies
+        args = args[1:]
+
+    apidoc.main(args)
+except Exception as e:
+    print(f"Running `sphinx-apidoc` failed!\n{e}")
+
+# -- Project information
+
+_metadata = metadata("rxn_insight")
+
+project = _metadata["Name"]
+author = _metadata["Author-email"].split("<", 1)[0].strip()
+copyright = f"2024, {author}"
+
+version = _metadata["Version"]
+release = ".".join(version.split(".")[:2])
+
+
+# -- General configuration
+
+extensions = [
+    "myst_parser",
+    "sphinx_copybutton",
+    "sphinx.ext.autodoc",
+    # "sphinx.ext.intersphinx",
+    "sphinx.ext.viewcode",
+]
+
+templates_path = ["_templates"]
+
+exclude_patterns = [
+    "Thumbs.db",
+    ".DS_Store",
+    ".ipynb_checkpoints",
+]
+
+# -- Options for HTML output
+
+html_theme = "furo"
+html_static_path = ["_static"]

docs/source/index.md

+15
@@ -0,0 +1,15 @@
+```{include} ../../README.md
+:relative-images:
+```
+
+```{toctree}
+:caption: 'Contents:'
+:maxdepth: 2
+api/modules
+```
+
+# Indices and tables
+
+- {ref}`genindex`
+- {ref}`modindex`
+- {ref}`search`
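
Review note on `docs/source/conf.py`: the "rudimentary parse_version" there compares tuples of strings, and string components compare character by character rather than numerically. The sketch below shows the difference and one dependency-free alternative. `version_tuple` is a hypothetical helper, not code from this commit.

```python
import re


def version_tuple(version: str, width: int = 2) -> tuple[int, ...]:
    """Return the first `width` numeric components of a version string as ints."""
    numbers = []
    for piece in version.split(".")[:width]:
        match = re.match(r"\d+", piece)
        numbers.append(int(match.group()) if match else 0)
    # Pad short versions such as "7" so comparisons see (7, 0).
    numbers.extend([0] * (width - len(numbers)))
    return tuple(numbers)


# String tuples compare character by character, so "10" sorts before "7".
print(("1", "10") >= ("1", "7"))          # False
print(version_tuple("1.10.0") >= (1, 7))  # True
print(version_tuple("7.2.6") >= (1, 7))   # True
print(version_tuple("1.6.5") >= (1, 7))   # False
```

For released Sphinx versions the string comparison in `conf.py` happens to give the intended answer, so this is a robustness point rather than a reported failure.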

notebooks/demo.ipynb

+32-5
@@ -15,7 +15,13 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "from rxn_insight.reaction import *\n",
+    "import pandas as pd\n",
+    "import numpy as np\n",
+    "from rxnmapper import RXNMapper\n",
+    "\n",
+    "from rxn_insight.reaction import Reaction\n",
+    "from rxn_insight.utils import draw_chemical_reaction, curate_smirks, get_similarity, get_fp\n",
+    "from IPython.display import SVG, display\n",
     "import time"
    ]
   },
@@ -34,7 +40,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "df_uspto = pd.read_parquet(\"data/example.gzip\")"
+    "df_uspto = pd.read_parquet(\"../data/example.gzip\")"
    ]
   },
   {
@@ -107,6 +113,27 @@
     "rxn.get_reaction_info()"
    ]
   },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "1eed892f",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "rxn = '[CH3:1][O:2][C:3](=[O:4])[CH:5]=[CH2:6].I[c:7]1[cH:8][cH:9][cH:10][cH:11][cH:12]1>>[CH3:1][O:2][C:3](=[O:4])/[CH:5]=[CH:6]/[c:7]1[cH:8][cH:9][cH:10][cH:11][cH:12]1'"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "91a73156",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "rxn2 = Reaction(rxn, keep_mapping=True)\n",
+    "rxn2.get_reaction_info()"
+   ]
+  },
   {
    "cell_type": "markdown",
    "id": "b8a0b8de",
@@ -133,9 +160,9 @@
    "outputs": [],
    "source": [
     "rxn_mapper = RXNMapper()\n",
-    "smirks = pd.read_json(\"src/rxn_insight/json/smirks.json\", orient='records', lines=True)\n",
+    "smirks = pd.read_json(\"../src/rxn_insight/json/smirks.json\", orient='records', lines=True)\n",
     "smirks = curate_smirks(smirks)\n",
-    "fg = pd.read_json(\"src/rxn_insight/json/functional_groups.json\", orient='records', lines=True)"
+    "fg = pd.read_json(\"../src/rxn_insight/json/functional_groups.json\", orient='records', lines=True)"
    ]
   },
   {
@@ -185,7 +212,7 @@
    "outputs": [],
    "source": [
     "# df_analyzed = pd.read_parquet(\"data/uspto.gzip\")\n",
-    "df_analyzed = pd.read_parquet(\"data/1000rxns.gzip\")"
+    "df_analyzed = pd.read_parquet(\"../data/1000rxns.gzip\")"
    ]
   },
   {
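
Review note on `notebooks/demo.ipynb`: the new cells pass an already atom-mapped reaction SMILES to `Reaction(..., keep_mapping=True)`. The diff does not show what the flag does internally, so the sketch below only illustrates the input format. It uses plain string handling (no RDKit or Rxn-INSIGHT calls) to read the atom-map numbers on each side of `>>`. `atom_map_numbers` is a hypothetical helper.

```python
import re

# Matches the atom-map number in bracket atoms such as "[CH3:1]" or "[c:7]".
ATOM_MAP = re.compile(r":(\d+)\]")


def atom_map_numbers(smiles: str) -> set[int]:
    """Collect the atom-map numbers that appear in one side of a reaction SMILES."""
    return {int(number) for number in ATOM_MAP.findall(smiles)}


# The mapped reaction string added in the notebook cell above.
mapped = (
    "[CH3:1][O:2][C:3](=[O:4])[CH:5]=[CH2:6].I[c:7]1[cH:8][cH:9][cH:10][cH:11][cH:12]1"
    ">>[CH3:1][O:2][C:3](=[O:4])/[CH:5]=[CH:6]/[c:7]1[cH:8][cH:9][cH:10][cH:11][cH:12]1"
)
reactant_side, product_side = mapped.split(">>")

reactant_maps = atom_map_numbers(reactant_side)
product_maps = atom_map_numbers(product_side)

# Every mapped product atom should trace back to a mapped reactant atom.
print(product_maps <= reactant_maps)  # True
print(len(reactant_maps))             # 12
```

The iodine atom carries no map number because it does not appear in the product, which is why it is written as a bare `I` in the reactant.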
