Initial commit of PyValem 2.0
xnx committed Mar 24, 2020
0 parents commit e21e084
Showing 111 changed files with 30,727 additions and 0 deletions.
8 changes: 8 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
*.pyc
*.swp
*.swo
.DS_Store
build/
dist/
pyvalem.egg-info/
res/
674 changes: 674 additions & 0 deletions LICENSE

Large diffs are not rendered by default.

123 changes: 123 additions & 0 deletions README.md
@@ -0,0 +1,123 @@
Introduction to PyValem
***********************

PyValem is a Python package for parsing, validating, manipulating and
interpreting the chemical formulas, quantum states and labels of atoms, ions
and small molecules.

Species and states are specified as strings using a simple and flexible syntax,
and may be compared, output in different formats and manipulated using a
variety of predefined Python methods.

Installation
============

For now::

    python setup.py install


Examples
========

The basic (state-less) chemical formula class is ``Formula``. A ``Formula`` object
can be created by passing its constructor a valid string. This object contains
attributes for producing its plain text, HTML and LaTeX representations, and
for calculating its molar mass::

    In [1]: from pyvalem.formula import Formula

    In [2]: f = Formula('C2H5OH')

    In [3]: print(f)
    C2H5OH

    In [4]: print(f.html)
    C<sub>2</sub>H<sub>5</sub>OH

    In [5]: print(f.latex)
    $\mathrm{C}_{2}\mathrm{H}_{5}\mathrm{O}\mathrm{H}$

    In [6]: print(f.rmm)  # g.mol-1
    46.069
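
The molar mass above can be checked by hand. The following is a minimal, standalone sketch of that arithmetic and is not PyValem's implementation: the names ``ATOMIC_WEIGHTS``, ``count_atoms`` and ``relative_molecular_mass`` are hypothetical, the weight table is an illustrative subset, and only flat formulas without brackets, isotopes or charges are handled.

```python
import re

# Abridged standard atomic weights for a few elements (illustrative subset,
# an assumption of this sketch rather than PyValem's own data).
ATOMIC_WEIGHTS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}

# An element symbol (capital letter, optional lowercase letter) followed by
# an optional integer stoichiometry, with no underscore in between.
TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def count_atoms(formula):
    """Return {element symbol: count} for a flat formula such as 'C2H5OH'."""
    counts = {}
    pos = 0
    while pos < len(formula):
        match = TOKEN.match(formula, pos)
        if match is None:
            raise ValueError(f"Cannot parse {formula!r} at position {pos}")
        symbol, digits = match.groups()
        if symbol not in ATOMIC_WEIGHTS:
            raise ValueError(f"No atomic weight tabulated for {symbol!r}")
        # A missing stoichiometry means a single atom; repeated symbols add up.
        counts[symbol] = counts.get(symbol, 0) + (int(digits) if digits else 1)
        pos = match.end()
    return counts


def relative_molecular_mass(formula):
    """Sum the atomic weights of every atom in a flat formula."""
    return sum(ATOMIC_WEIGHTS[s] * n for s, n in count_atoms(formula).items())


print(count_atoms("C2H5OH"))
print(round(relative_molecular_mass("C2H5OH"), 3))
```

For ``C2H5OH`` the two hydrogen groups are summed, giving 2 C, 6 H and 1 O, and the weights add up to the value of ``rmm`` shown above.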

Note that there is no underscore character (``_``) between the element
symbol and its stoichiometry. Isotopes are specified with the mass number
placed before the element symbol, with both surrounded by parentheses. Do not
use a caret (``^``) to indicate a superscript::

    In [7]: f = Formula('(14C)')

    In [8]: print(f.html)
    <sup>14</sup>C

    In [9]: print(f.rmm)
    14.0032419884

    In [10]: f = Formula('H2(18O)')

    In [11]: print(f.rmm)
    20.015159612799998

For isotopically-pure compounds, the mass returned is the atomic mass.

Charges are specified as ``+n`` or ``-n``, where ``n`` may be omitted if it is 1.
Do not use a caret (``^``) to indicate a superscript::

    In [12]: f = Formula('H3O+')
    In [13]: print(f.charge)
    1

    In [14]: print(f.html)
    H<sub>3</sub>O<sup>+</sup>

    In [15]: f = Formula('Co(H2O)6+2')
    In [16]: print(f.charge)
    2

    In [17]: print(f.html)
    Co(H<sub>2</sub>O)<sub>6</sub><sup>2+</sup>
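
The ``+n`` / ``-n`` convention can be illustrated with a short regular expression. This is only a sketch under stated assumptions, not how PyValem parses charges: ``split_charge`` is a hypothetical helper, and it looks only at a single charge at the very end of the string, so it does not handle charges embedded inside bracketed groups.

```python
import re

# A trailing charge: a sign followed by an optional magnitude, e.g. "+", "-2".
# Anchoring at the end of the string keeps prefixes such as "cis-" from
# being mistaken for a charge.
CHARGE = re.compile(r"([+-])(\d*)$")


def split_charge(formula):
    """Split 'Co(H2O)6+2' into ('Co(H2O)6', 2); neutral species give charge 0."""
    match = CHARGE.search(formula)
    if match is None:
        return formula, 0
    sign, digits = match.groups()
    magnitude = int(digits) if digits else 1  # "n may be omitted if it is 1"
    charge = magnitude if sign == "+" else -magnitude
    return formula[:match.start()], charge


for text in ("H3O+", "Co(H2O)6+2", "CO3-2", "C2H5OH"):
    print(text, split_charge(text))
```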

"Stateful" species are formulas which consist of a valid ``Formula`` string,
followed by whitespace, followed by a semicolon-delimited sequence of valid
quantum state or label specifications. Stateful species know which states they possess and can render these states in different ways. For example::

    In [18]: from pyvalem.stateful_species import StatefulSpecies
    In [19]: ss1 = StatefulSpecies('Ne+ 1s2.2s2.2p5; 2P_1/2')
    In [20]: ss1.states
    Out[20]: [1s2.2s2.2p5, 2P_1/2]

    In [21]: ss1.states[1].__class__
    Out[21]: pyvalem.atomic_term_symbol.AtomicTermSymbol

    In [22]: ss1.html
    Out[22]: 'Ne<sup>+</sup> 1s<sup>2</sup>2s<sup>2</sup>2p<sup>5</sup>; <sup>2</sup>P<sub>1/2</sub>'

This HTML renders as:

.. raw:: html

    Ne<sup>+</sup> 1s<sup>2</sup>2s<sup>2</sup>2p<sup>5</sup>; <sup>2</sup>P<sub>1/2</sub>

.. raw:: latex

    $\mathrm{Ne}^+ \; 1s^22s^22p^5; \; {}^2P_{1/2}$

Another example::

    In [24]: ss2 = StatefulSpecies('(52Cr)(1H) 1σ2.2σ1.1δ2.1π2; 6Σ+; v=0; J=2')
    In [25]: ss2.html
    Out[25]: '<sup>52</sup>Cr<sup>1</sup>H 1σ<sup>2</sup>.2σ<sup>1</sup>.1δ<sup>2</sup>.1π<sup>2</sup>; <sup>6</sup>Σ<sup>+</sup>; v=0; J=2'

which produces:

.. raw:: html

    <sup>52</sup>Cr<sup>1</sup>H 1σ<sup>2</sup>.2σ<sup>1</sup>.1δ<sup>2</sup>.1π<sup>2</sup>; <sup>6</sup>Σ<sup>+</sup>; v=0; J=2

.. raw:: latex

    $\mathrm{{}^{52}Cr^1H} \; 1\sigma^2.2\sigma^1.1\delta^2.1\pi^2; \; {}^6\Sigma^+; \; v=0; \; J=2$

The syntax for writing different types of quantum state is described in later pages of this documentation.
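
The layout of a stateful species string (formula, whitespace, then semicolon-delimited states) can be sketched with plain string operations. This is an illustration only, not PyValem's parser: ``split_stateful_species`` is a hypothetical helper that separates the pieces without validating or classifying any of the states.

```python
def split_stateful_species(text):
    """Split 'FORMULA state1; state2; ...' into (formula, [state strings])."""
    # The formula is everything up to the first run of whitespace.
    parts = text.strip().split(None, 1)
    formula = parts[0]
    state_part = parts[1] if len(parts) > 1 else ""
    # States are delimited by semicolons; surrounding whitespace is dropped.
    states = [s.strip() for s in state_part.split(";") if s.strip()]
    return formula, states


print(split_stateful_species("Ne+ 1s2.2s2.2p5; 2P_1/2"))
print(split_stateful_species("(52Cr)(1H) 1σ2.2σ1.1δ2.1π2; 6Σ+; v=0; J=2"))
```

Deciding which kind of state each piece represents (configuration, term symbol, quantum number and so on) is the part that PyValem itself handles.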

20 changes: 20 additions & 0 deletions doc/Makefile
@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = .
BUILDDIR = _build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
Binary file added doc/_build/doctrees/environment.pickle
Binary file not shown.
Binary file added doc/_build/doctrees/formula.doctree
Binary file not shown.
Binary file added doc/_build/doctrees/index.doctree
Binary file not shown.
Binary file added doc/_build/doctrees/introduction.doctree
Binary file not shown.
Binary file added doc/_build/doctrees/reaction.doctree
Binary file not shown.
Binary file added doc/_build/doctrees/stateful_species.doctree
Binary file not shown.
Binary file added doc/_build/doctrees/states.doctree
Binary file not shown.
4 changes: 4 additions & 0 deletions doc/_build/html/.buildinfo
@@ -0,0 +1,4 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
config: 9743c862002b28e0acd6d81b228ff948
tags: 645f666f9bcd5a90fca523b33c5a78b7
109 changes: 109 additions & 0 deletions doc/_build/html/_sources/formula.rst.txt
@@ -0,0 +1,109 @@
Formula
*******

A ``Formula`` object instance represents the chemical formula of an atom, isotope, ion, molecule, molecular ion, or certain sorts of other particle. The ``Formula`` class has methods for producing representations of the formula in HTML, LaTeX and plain text.

``Formula`` objects are not supposed to be unique (different ``Formula`` objects can represent the same formula); nor is their syntax or this library designed to be applied to large or complex molecules. PyValem is intended as a lightweight, easy-to-use library with an expressive syntax for representing many common small atoms and molecules and their isotopes, states and reactions.

Furthermore, whilst some validation functionality is built into the library, PyValem does not attempt to verify that a provided formula is chemically plausible. In particular, it knows nothing about valence or oxidation state.


Instantiation
=============

A ``Formula`` object may be instantiated by passing a valid string, conforming to the following grammar:

* Single atoms, with atomic weights given by a default natural isotopic abundance, are specified by their element symbol, e.g. ``H``, ``Be``, ``Fr``.

* Isotopes are specified in parentheses (round brackets) with the isotope mass number preceding the element symbol, e.g. ``(12C)``, ``(35Cl)``, ``(235U)``. Note that no caret (``^``) is used to indicate a superscript.

* Charged species are specified with the charge following the formula in the format ``+n`` or ``-n``, where ``n`` may be omitted if it is 1. Do not use a caret (``^``) to indicate a superscript. For example, ``He+``, ``C+2``, ``W-``, ``(79Br)-2``.

* Molecular formulas are written as a sequence of element symbols (which may be repeated for clarity over the structure), with their stoichiometries specified as an integer following the symbol. No underscore (``_``) character is used. For example, ``H2O``, ``(1H)2(16O)``, ``C2H5OH``, ``CH3CH2OH``, ``NH+``, ``CO3+2``.

* Moieties within a formula can be bracketed for clarity, for example ``CH3C(CH3)2CH3``.

* A limited number of formula prefixes are supported, for example ``L-CH3CH(NH2)CO2H``, ``cis-CH3CHCHCH3``, ``ortho-C6H4(CH3)2``.

* There are some special species:

  * ``e-`` is the electron;
  * ``e+`` is the positron;
  * ``M`` is a generic third-body with no specific identity (and no defined mass or charge);
  * ``hν`` (or ``hv``) is the photon.
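
As an illustration of the isotope rule in the grammar above, the following sketch picks out parenthesised isotope tokens with a regular expression. It is not PyValem's parser: ``ISOTOPE`` and ``find_isotopes`` are hypothetical names, and no check is made that the mass number is a real isotope of the element.

```python
import re

# An isotope token: "(", a mass number, an element symbol, ")".
# A bracketed moiety such as "(CH3)" does not match because it has no
# leading mass number.
ISOTOPE = re.compile(r"\((\d+)([A-Z][a-z]?)\)")


def find_isotopes(formula):
    """Return a list of (mass number, element symbol) pairs found in formula."""
    return [(int(mass), symbol) for mass, symbol in ISOTOPE.findall(formula)]


for text in ("(14C)", "H2(18O)", "(1H)2(16O)", "(79Br)-2", "CH3C(CH3)2CH3"):
    print(text, find_isotopes(text))
```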


Output as HTML, LaTeX and slugs
===============================

The ``Formula`` attributes ``html`` and ``latex`` return strings representing the formula in HTML and LaTeX respectively. The attribute ``slug`` returns a URL-safe slug which uniquely identifies the formula's plain text string. For example::

    In [1]: from pyvalem.formula import Formula

    In [2]: f = Formula('Co(H2O)6+2')
    In [3]: print(f.formula)  # or simply print(f)
    Co(H2O)6+2

    In [4]: print(f.html)
    Co(H<sub>2</sub>O)<sub>6</sub><sup>2+</sup>

    In [5]: print(f.latex)
    \mathrm{Co}(\mathrm{H}_{2}\mathrm{O})_{6}^{2+}

    In [6]: print(f.slug)
    Co-_l_H2O_r_6_p2

The HTML and LaTeX representations render as:

.. raw:: html

    Co(H<sub>2</sub>O)<sub>6</sub><sup>2+</sup>



Formula Attributes
==================

``Formula`` objects can count atoms, calculate masses and record the total species charge::

    In [7]: f = Formula('CO3-2')
    In [8]: print(f.natoms)
    4
    In [9]: print(f.rmm)
    60.008

    In [10]: print(f.charge)
    -2

    In [11]: lys = Formula('(NH3+)(CH2)4CH(NH2)CO2-')
    In [12]: print(lys.natoms)
    24

    In [13]: print(lys.rmm)  # relative molecular mass
    146.19

    In [14]: print(lys.charge)
    0

This last example is the lysine zwitterion:

.. raw:: html

    (NH<sub>3</sub><sup>+</sup>)(CH<sub>2</sub>)<sub>4</sub>CH(NH<sub>2</sub>)CO<sub>2</sub><sup>-</sup>

.. raw:: latex

    (\mathrm{N}\mathrm{H}_{3}^{+})(\mathrm{C}\mathrm{H}_{2})_{4}\mathrm{C}\mathrm{H}(\mathrm{N}\mathrm{H}_{2})\mathrm{C}\mathrm{O}_{2}^{-}

The same applies to isotopes and isotopically-pure molecules, in which case the exact mass is held by the ``mass`` attribute::

    In [15]: f = Formula('(1H)(35Cl)+')
    In [16]: print(f.mass)
    35.9766777262

The stoichiometric formula can be output either in order of increasing atomic number (the default) or in alphabetical order::

    In [17]: print(lys.stoichiometric_formula())
    H14C6N2O2

    In [18]: print(lys.stoichiometric_formula('alphabetical'))
    C6H14N2O2
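
The two orderings can be reproduced from a table of atom counts. The sketch below is an assumption-laden illustration rather than PyValem's code: ``format_stoichiometry`` and ``ATOMIC_NUMBERS`` are hypothetical names, the table covers only the elements needed here, and omitting a count of 1 is an assumption of this sketch.

```python
# Atomic numbers for the elements in the lysine example (illustrative subset).
ATOMIC_NUMBERS = {"H": 1, "C": 6, "N": 7, "O": 8}


def format_stoichiometry(counts, order="atomic number"):
    """Join {symbol: count} into a formula string in the requested order."""
    if order == "atomic number":
        symbols = sorted(counts, key=lambda symbol: ATOMIC_NUMBERS[symbol])
    elif order == "alphabetical":
        symbols = sorted(counts)
    else:
        raise ValueError(f"Unknown ordering: {order!r}")
    # Assumption of this sketch: a count of 1 is left implicit.
    return "".join(
        f"{symbol}{counts[symbol] if counts[symbol] > 1 else ''}"
        for symbol in symbols
    )


lysine = {"C": 6, "H": 14, "N": 2, "O": 2}
print(format_stoichiometry(lysine))
print(format_stoichiometry(lysine, "alphabetical"))
```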
25 changes: 25 additions & 0 deletions doc/_build/html/_sources/index.rst.txt
@@ -0,0 +1,25 @@
.. pyvalem documentation master file, created by
   sphinx-quickstart on Fri Mar 6 13:56:04 2020.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Welcome to pyvalem's documentation!
===================================

.. toctree::
   :maxdepth: 2
   :caption: Contents:

   introduction.rst
   formula.rst
   states.rst
   stateful_species.rst
   reaction.rst


Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`