Commit ca6bb4e

Merge pull request #44 from jeromekelleher/more-docs
More docs
2 parents 8c0321a + ea3e94c commit ca6bb4e

9 files changed

+114
-176
lines changed

docs/interchange.rst renamed to docs/data-model.rst

Lines changed: 16 additions & 16 deletions
Original file line number | Diff line number | Diff line change
@@ -1,29 +1,29 @@
1-
.. _sec_interchange:
1+
.. _sec_data_model:
22

3-
#########################
4-
Tree sequence interchange
5-
#########################
3+
##########
4+
Data model
5+
##########
66

77
The correlated genealogical trees that describe the shared ancestry of a set of
8-
samples are stored concisely in ``msprime`` as a collection of
8+
samples are stored concisely in ``tskit`` as a collection of
99
easy-to-understand tables. These are output by coalescent simulation in
1010
``msprime`` or can be read in from another source. This page documents
1111
the structure of the tables, and the different methods of interchanging
12-
genealogical data to and from the msprime API. We begin by defining
12+
genealogical data to and from the tskit API. We begin by defining
1313
the basic concepts that we need and the structure of the tables in the
1414
`Data model`_ section. We then describe the tabular text formats that can
1515
be used as a simple interchange mechanism for small amounts of data in the
1616
`Text file formats`_ section. The `Binary interchange`_ section then describes
1717
the efficient Python API for table interchange using numpy arrays. Finally,
18-
we describe the binary format used by msprime to efficiently
18+
we describe the binary format used by tskit to efficiently
1919
store tree sequences on disk in the `Tree sequence file format`_ section.
2020

2121

22-
.. _sec_data_model:
22+
.. _sec_data_model_definitions:
2323

24-
**********
25-
Data model
26-
**********
24+
***********
25+
Definitions
26+
***********
2727

2828
To begin, here are definitions of some key ideas encountered later.
2929

@@ -156,7 +156,7 @@ term "genome" at times, for concreteness.
156156
Several properties naturally associated with individuals are in fact assigned
157157
to nodes in what follows: birth time and population. This is for two reasons:
158158
First, since coalescent simulations naturally lack a notion of polyploidy, earlier
159-
versions of ``msprime`` lacked the notion of an individual. Second, ancestral
159+
versions of ``tskit`` lacked the notion of an individual. Second, ancestral
160160
nodes are not naturally grouped together into individuals -- we know they must have
161161
existed, but have no way of inferring this grouping, so in fact many nodes in
162162
an empirically-derived tree sequence will not be associated with individuals,
@@ -405,7 +405,7 @@ helpful for inferring demographic history to record this history.
405405
Migrations are performed by individual ancestors, but most likely not by an
406406
individual whose genome is tracked as a ``node`` (as in a discrete-deme model they are
407407
unlikely to be both a migrant and a most recent common ancestor). So,
408-
``msprime`` records when a segment of ancestry has moved between
408+
``tskit`` records when a segment of ancestry has moved between
409409
populations. This table is not required, even if different nodes come from
410410
different populations.
411411

@@ -491,7 +491,7 @@ the library itself can use. All other information is considered to be
491491
tables.
492492

493493
Arbitrary binary data can be stored in ``metadata`` columns, and the
494-
``msprime`` library makes no attempt to interpret this information. How the
494+
``tskit`` library makes no attempt to interpret this information. How the
495495
information held in this field is encoded is entirely the choice of client code.
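Because the encoding of ``metadata`` is left to client code, one common choice is to serialise a small record as JSON and store the resulting bytes. The following is a minimal sketch of that idea, not part of the ``tskit`` API; the helper names ``encode_metadata`` and ``decode_metadata`` and the example record are hypothetical.

```python
import json


def encode_metadata(record):
    """Serialise a dict to UTF-8 JSON bytes (hypothetical client-side helper)."""
    # sort_keys gives a stable byte representation for equal records.
    return json.dumps(record, sort_keys=True).encode("utf-8")


def decode_metadata(raw):
    """Inverse of encode_metadata: parse UTF-8 JSON bytes back into a dict."""
    return json.loads(raw.decode("utf-8"))


# Example record; the field names are illustrative assumptions only.
record = {"sample_id": "ind_7", "depth": 30}
raw = encode_metadata(record)
restored = decode_metadata(raw)
```

The library would treat ``raw`` as opaque bytes; only the client code that wrote it knows it is JSON.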
496496

497497
To ensure that metadata can be safely interchanged using the :ref:`sec_text_file_format`,
@@ -1046,7 +1046,7 @@ length. To encode such columns in the tables API, we store **two** columns:
10461046
one contains the flattened array of data and another stores the **offsets**
10471047
of each row into this flattened array. Consider an example::
10481048

1049-
>>> s = msprime.SiteTable()
1049+
>>> s = tskit.SiteTable()
10501050
>>> s.add_row(0, "A")
10511051
>>> s.add_row(0, "")
10521052
>>> s.add_row(0, "TTT")
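The flattened-array-plus-offsets encoding described above can be sketched in plain Python for the same three rows. This illustrates the layout only and is not the library's implementation; ``pack_rows`` and ``unpack_rows`` are hypothetical helper names.

```python
def pack_rows(rows):
    """Flatten variable-length strings into one string plus row offsets."""
    flattened = "".join(rows)
    # offsets[j] is where row j starts; offsets[j + 1] is where it ends,
    # so there is always one more offset than there are rows.
    offsets = [0]
    for row in rows:
        offsets.append(offsets[-1] + len(row))
    return flattened, offsets


def unpack_rows(flattened, offsets):
    """Recover the original rows by slicing between consecutive offsets."""
    return [
        flattened[offsets[j]:offsets[j + 1]]
        for j in range(len(offsets) - 1)
    ]


rows = ["A", "", "TTT"]
flattened, offsets = pack_rows(rows)
print(flattened)  # ATTT
print(offsets)    # [0, 1, 1, 4]
```

An empty row, such as the second one here, shows up as two equal consecutive offsets.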
@@ -1231,7 +1231,7 @@ Legacy Versions
12311231
===============
12321232

12331233
Tree sequence files written by older versions of tskit are not readable by
1234-
newer versions of msprime. For major releases of tskit, ``tskit upgrade``
1234+
newer versions of tskit. For major releases of tskit, ``tskit upgrade``
12351235
will convert older tree sequence files to the latest version.
12361236

12371237
File formats from version 11 onwards are based on

docs/development.rst

Lines changed: 54 additions & 4 deletions
@@ -1,7 +1,57 @@
11
.. _sec_development:
22

3-
=======================
4-
Developer documentation
5-
=======================
3+
===========
4+
Development
5+
===========
66

7-
.. todo:: Port developer docs
7+
If you would like to add some features to ``tskit``, please read the
8+
following. If you think there is anything missing,
9+
please open an `issue <http://github.com/tskit-dev/tskit/issues>`_ or
10+
`pull request <http://github.com/tskit-dev/tskit/pulls>`_ on GitHub!
11+
12+
**********
13+
Quickstart
14+
**********
15+
16+
- Make a fork of the tskit repo on `GitHub <https://github.com/tskit-dev/tskit>`_
17+
- Clone your fork into a local directory, making sure that the **submodules
18+
are correctly initialised**::
19+
20+
$ git clone git@github.com:tskit-dev/tskit.git --recurse-submodules
21+
22+
For an already checked out repo, the submodules can be initialised using::
23+
24+
$ git submodule update --init --recursive
25+
26+
- Install the Python development requirements using
27+
``pip install -r python/requirements/development.txt``.
28+
- Build the low level module by running ``make`` in the ``python`` directory. Python 3.x
29+
is the default for development (Python 2.x is discouraged).
30+
- Run the tests to ensure everything has worked: ``python -m nose -vs``. These should
31+
all pass.
32+
- Make your changes in a local branch, and open a pull request on GitHub when you
33+
are ready. Please make sure that (a) the tests pass; and
34+
(b) your code passes PEP8 checks (see below for a git commit hook to ensure this
35+
happens automatically) before opening the PR.
36+
37+
****************************
38+
Continuous integration tests
39+
****************************
40+
41+
Three different continuous integration providers are used, which run different
42+
combinations of tests on different platforms:
43+
44+
1. `Travis CI <https://travis-ci.org/>`_ runs tests on Linux and OSX using the
45+
`Conda <https://conda.io/docs/>`__ infrastructure for the system level
46+
requirements. All supported versions of Python are tested here.
47+
48+
2. `CircleCI <https://circleci.com/>`_ runs all Python tests using the apt-get
49+
infrastructure for system requirements. Additionally, the low-level tests
50+
are run, coverage statistics calculated using `CodeCov <https://codecov.io/gh>`__,
51+
and the documentation built.
52+
53+
3. `AppVeyor <https://www.appveyor.com/>`_ runs Python tests on 32 and 64 bit
54+
Windows using conda.
55+
56+
57+
.. todo:: Complete porting the documentation from msprime

docs/index.rst

Lines changed: 2 additions & 3 deletions
@@ -14,11 +14,10 @@ Welcome to tskit's documentation!
1414
installation
1515
python-api
1616
c-api
17+
data-model
18+
provenance
1719
development
1820
changelogs
19-
interchange
20-
provenance
21-
tutorial
2221

2322

2423
Indices and tables

docs/installation.rst

Lines changed: 10 additions & 1 deletion
@@ -4,4 +4,13 @@
44
Installation
55
============
66

7-
.. todo:: Port installation docs.
7+
.. note:: This documentation is incomplete. Once we have a conda package that
8+
is installable we'll update the documentation to use that route also. However,
9+
as there are no external dependencies, pip should work well for all
10+
non-Windows users.
11+
12+
Please install ``tskit`` from PyPI using pip::
13+
14+
$ python -m pip install tskit
15+
16+

docs/introduction.rst

Lines changed: 4 additions & 2 deletions
@@ -4,6 +4,8 @@
44
Introduction
55
============
66

7-
.. note:: This documentation is incomplete and still under developement. If
8-
you would like to help, please open an issue on
7+
This is the documentation for tskit, the tree sequence toolkit.
8+
9+
.. note:: This documentation is incomplete and under development. If
10+
you would like to help, please open an issue or pull request at
911
`GitHub <https://github.com/tskit-dev/tskit>`_.

docs/python-api.rst

Lines changed: 26 additions & 20 deletions
@@ -4,6 +4,12 @@
44
Python API
55
==========
66

7+
This page provides detailed documentation for the ``tskit`` Python API.
8+
9+
************************
10+
Trees and tree sequences
11+
************************
12+
713
The :class:`.TreeSequence` class represents a sequence of correlated trees
814
output by a simulation. The :class:`.Tree` class represents a single
915
tree in this sequence.
@@ -13,6 +19,7 @@ There are also methods for loading data into these objects, either from the nati
1319
format using :func:`tskit.load`, or from other sources
1420
using :func:`tskit.load_text` or :meth:`.TableCollection.tree_sequence`.
1521

22+
1623
+++++++++++++++++
1724
Top-level classes
1825
+++++++++++++++++
@@ -49,7 +56,7 @@ Simple container classes
4956
++++++++++++++++++++++++
5057

5158
These classes are simple shallow containers representing the entities defined
52-
in the :ref:`sec_data_model`. These classes are not intended to be instantiated
59+
in the :ref:`sec_data_model_definitions`. These classes are not intended to be instantiated
5360
directly, but are the return types for the various iterators provided by the
5461
:class:`.TreeSequence` and :class:`.Tree` classes.
5562

@@ -96,22 +103,11 @@ using the :ref:`Tables API <sec_tables_api>`.
96103
.. autofunction:: tskit.load_text
97104

98105

99-
**********************
100-
Calculating statistics
101-
**********************
102-
103-
The ``tskit`` API provides methods for efficiently calculating
104-
population genetics statistics from a given :class:`.TreeSequence`.
105-
106-
.. autoclass:: tskit.LdCalculator(tree_sequence)
107-
:members:
108-
109-
110106
.. _sec_tables_api:
111107

112-
***********
113-
Tables API
114-
***********
108+
******
109+
Tables
110+
******
115111

116112
The :ref:`tables API <sec_binary_interchange>` provides an efficient way of working
117113
with and interchanging :ref:`tree sequence data <sec_data_model>`. Each table
@@ -442,12 +438,24 @@ Table functions
442438

443439
.. autofunction:: tskit.unpack_bytes
444440

441+
.. _sec_stats_api:
442+
443+
**********
444+
Statistics
445+
**********
446+
447+
The ``tskit`` API provides methods for efficiently calculating
448+
population genetics statistics from a given :class:`.TreeSequence`.
449+
450+
.. autoclass:: tskit.LdCalculator(tree_sequence)
451+
:members:
452+
445453

446454
.. _sec_provenance_api:
447455

448-
**************
449-
Provenance API
450-
**************
456+
**********
457+
Provenance
458+
**********
451459

452460
We provide some preliminary support for validating JSON documents against the
453461
:ref:`provenance schema <sec_provenance>`. Programmatic access to provenance
@@ -457,5 +465,3 @@ information is planned for future versions.
457465

458466
.. autoexception:: tskit.ProvenanceValidationError
459467

460-
461-

docs/rtd_requirements.txt

Lines changed: 2 additions & 0 deletions
@@ -1,3 +1,5 @@
11
numpy
22
six
3+
svgwrite
4+
jsonschema
35
breathe
