Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
0 parents
commit b28a1ea
Showing 244 changed files with 85,526 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
# Sphinx build info version 1 | ||
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done. | ||
config: 9ea53df5e29e4199cacaaa405e3b21db | ||
tags: 645f666f9bcd5a90fca523b33c5a78b7 |
Binary file not shown.
Binary file added
BIN
+104 KB
.doctrees/autoapi/grelu/transforms/prediction_transforms/index.doctree
Empty file.
Binary file added
BIN
+26.9 KB
_images/04794fdfa25154ce4597ed72ccee16bf5baf704e4de03d3d9e7b0451fd03089b.png
Binary file added
BIN
+84.7 KB
_images/100e6ddd697f6bb7d211fabc7e219f5e9921e2e9b3ddea33884988e219d4a2b1.png
Binary file added
BIN
+217 KB
_images/1434bc1b803242b9170480d8b00516cbedfc2093c4708f68833f93a9db8d4d45.png
Binary file added
BIN
+16.2 KB
_images/1586b59295f0fa9085c1191aa61d06d881389c19bd25b3916dacaa7f53ac37f9.png
Binary file added
BIN
+33.1 KB
_images/171c55617b5eeefbd31d23a1b82d523373baf6a6eb4783e8c18d8a80852b5fa2.png
Binary file added
BIN
+48.1 KB
_images/18321c29fe6c8de1293a00591dc4f4642a16d266420626425be5d1a7863cb0a1.png
Binary file added
BIN
+290 KB
_images/1ce34283e8b3ab00b4ab63e6a12392cfe6c7c7b5f8b7737b4fd2732c8e5c2894.png
Binary file added
BIN
+35.1 KB
_images/1d3a4dd5fdb3f0b1aa3a636d77d22d95c9d3bdb98f2e2a1748550046c60e177d.png
Binary file added
BIN
+23.8 KB
_images/243c0a551b4b0165d174b31fee579443be1be35240869a11e7865b49bba99ec3.png
Binary file added
BIN
+36.5 KB
_images/294252300b2f6fdf157b2452bc5db8c4f99d6c8093709e5b282754148b4693e5.png
Binary file added
BIN
+51.4 KB
_images/44a21024121e986faa764484c97d949477e9331fa9af4899539441a9e2de82f3.png
Binary file added
BIN
+21.8 KB
_images/68f227cc3b4f79102b8902ada1717bdbc1390438db5cc9535d6bf0aff2e22aae.png
Binary file added
BIN
+62.8 KB
_images/698ad3a678ec306c38a6f1b99620b792c2cb6770e6c54765d72a20da3696cbde.png
Binary file added
BIN
+32.7 KB
_images/6e1580a222ff22a3441b819a9191fe4ab637c7969ff452142b489a52e2a88d9f.png
Binary file added
BIN
+20.7 KB
_images/8639fe25a4626315fe54993de8302bada1baa249149732aa38ffeac75f0ffad7.png
Binary file added
BIN
+29.8 KB
_images/9574a452eabc64a270d9819616a46f1ba1a03cb07537abd814a81490a5743e30.png
Binary file added
BIN
+113 KB
_images/a394de89206b1d8d796e0c931bd4621e40efcc93664a2ac1bd9a5c6aeb4d0717.png
Binary file added
BIN
+18.2 KB
_images/a6e18102767da3d1a3a3e494083826b37d95959075649b046d672d1983e090ed.png
Binary file added
BIN
+46.7 KB
_images/b0bb29b3c05067b19489735ffdd9d9d2078d3e161c82f34149d5177c35599750.png
Binary file added
BIN
+12.2 KB
_images/b0dc8936f9a54a85f3d567beb8ad97fb4af7bd4e6ddd7567fc69be29a0a39416.png
Binary file added
BIN
+412 KB
_images/c0834933458a9643846aaa60a621b3dfb2b30869cab91f0a9b18d16234350825.png
Binary file added
BIN
+16.6 KB
_images/c443c7e0c686335938a77190631d52d4d14bbf35a2780614bc5e47c46fbd99d1.png
Binary file added
BIN
+9.56 KB
_images/c6d402aa5ba422a06a93a5ca3edd6bff968447c78d0c1a398d02ea31b26393de.png
Binary file added
BIN
+20.6 KB
_images/df8df9bd13d7fc112998f661dabb765bd64a00de6e8572ba08fd73fc7d87840f.png
Binary file added
BIN
+43 KB
_images/e873666edbd444bb73d9a3d728da11b2f8f708c99b04a923f115a78a7719a740.png
Binary file added
BIN
+34.8 KB
_images/ed83c88f49d43afdf7a351c160ee1d799a309aa6ebb53423299d2135322336eb.png
Binary file added
BIN
+67.6 KB
_images/ee21815a290e5b34800ccda7427acadb25d590d6ae6714ba180a8d831beae716.png
Binary file added
BIN
+33.5 KB
_images/fe60b33a7790ba0fd2f60b671a54e0013eb316603ad610e74f7f6d801833879d.png
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
.. _authors: | ||
.. include:: ../AUTHORS.md | ||
:parser: myst_parser.sphinx_ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,171 @@ | ||
grelu.data.augment | ||
================== | ||
|
||
.. py:module:: grelu.data.augment | ||
.. autoapi-nested-parse:: | ||
|
||
Functions to augment data. All functions assume that the input is a numpy array containing an integer | ||
encoded DNA sequence of shape (L,) or a numpy array containing a label of shape (T, L). | ||
The augmented output will be in the same format. | ||
|
||
|
||
|
||
Attributes | ||
---------- | ||
|
||
.. autoapisummary:: | ||
|
||
grelu.data.augment.AUGMENTATION_MULTIPLIER_FUNCS | ||
|
||
|
||
Classes | ||
------- | ||
|
||
.. autoapisummary:: | ||
|
||
grelu.data.augment.Augmenter | ||
|
||
|
||
Functions | ||
--------- | ||
|
||
.. autoapisummary:: | ||
|
||
grelu.data.augment.random_mutate | ||
grelu.data.augment.reverse_complement | ||
grelu.data.augment._get_multipliers | ||
grelu.data.augment._split_overall_idx | ||
grelu.data.augment.shift | ||
grelu.data.augment.rc_seq | ||
grelu.data.augment.rc_label | ||
|
||
|
||
Module Contents | ||
--------------- | ||
|
||
.. py:function:: random_mutate(seq: Union[str, numpy.ndarray], rng: Optional[numpy.random.RandomState] = None, pos: Optional[int] = None, drop_ref: bool = True, input_type: Optional[str] = None, protect: List[int] = []) -> Union[str, numpy.ndarray] | ||
Introduce a random single-base substitution into a DNA sequence. | ||
|
||
:param seq: A single DNA sequence in string or integer encoded format. | ||
:param rng: np.random.RandomState object for reproducibility | ||
:param pos: Position at which to introduce the substitution. If None, a random position will be chosen. | ||
:param drop_ref: If True, the reference base will be dropped from the list of possible bases at the mutated position. | ||
If False, there is a possibility that the original sequence will be returned. | ||
:param input_type: Format of the input sequence. Accepted values are "strings" or "indices". | ||
:param protect: A list of positions to protect from mutation. Only needed if `pos` is None. | ||
|
||
:returns: A mutated sequence in the same format as the input sequence | ||
|
||
:raises ValueError: if the input is not a string or integer encoded DNA sequence. | ||
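The behaviour described for `random_mutate` can be illustrated with a minimal sketch. This is not the gReLU implementation: the helper name `random_substitution` is hypothetical, it handles only integer-encoded input, and it assumes the bases A, C, G, T are encoded as 0 to 3.

```python
import numpy as np


def random_substitution(seq, rng=None, pos=None, drop_ref=True, protect=()):
    """Substitute one base in an integer-encoded sequence (assumed A, C, G, T -> 0..3)."""
    if rng is None:
        rng = np.random.RandomState()
    seq = np.asarray(seq).copy()  # never modify the caller's array
    if pos is None:
        # Pick a position at random, skipping protected positions.
        protected = set(protect)
        allowed = [i for i in range(len(seq)) if i not in protected]
        pos = rng.choice(allowed)
    # With drop_ref=True the reference base is excluded, so the output always differs.
    bases = [b for b in range(4) if not (drop_ref and b == seq[pos])]
    seq[pos] = rng.choice(bases)
    return seq


original = np.array([0, 1, 2, 3, 0, 1])
mutated = random_substitution(original, rng=np.random.RandomState(0), protect=[0, 1])
```

Passing a seeded `RandomState` makes the result reproducible, and with `drop_ref=True` exactly one position differs from the input.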
|
||
|
||
.. py:function:: reverse_complement(seqs: Union[str, List[str], numpy.ndarray], input_type: Optional[str] = None) -> Union[str, List[str], numpy.ndarray] | ||
Reverse complement input DNA sequences. | ||
|
||
:param seqs: DNA sequences as strings or index encoding | ||
:param input_type: Format of the input sequences. Accepted values | ||
are "strings" or "indices". | ||
|
||
:returns: reverse complemented sequences in the same format as the input. | ||
|
||
:raises ValueError: If the input DNA sequence is not in string or index encoded format. | ||
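As a rough illustration of reverse complementation in both formats, the sketch below uses two hypothetical helpers, `revcomp_string` and `revcomp_indices`. It assumes an index encoding of A, C, G, T as 0 to 3 (so the complement of `x` is `3 - x`) and ignores ambiguous bases such as N; the library's own encoding and handling may differ.

```python
import numpy as np

# Translation table for string input (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp_string(seq: str) -> str:
    """Complement every base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


def revcomp_indices(seq: np.ndarray) -> np.ndarray:
    """Assumes A, C, G, T -> 0, 1, 2, 3, so the complement of x is 3 - x."""
    return (3 - np.asarray(seq))[::-1]


rc_str = revcomp_string("AACG")
rc_idx = revcomp_indices(np.array([0, 0, 1, 2]))
```

Both calls describe the same sequence: `AACG` becomes `CGTT`, which is `[1, 2, 3, 3]` under the assumed encoding.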
|
||
|
||
.. py:data:: AUGMENTATION_MULTIPLIER_FUNCS | ||
.. py:function:: _get_multipliers(**kwargs) -> List[int] | ||
.. py:function:: _split_overall_idx(idx: int, max_values: List[int]) -> List[List[int]] | ||
Given an integer index, split it into multiple indices, each ranging from 0 | ||
to a specified maximum value | ||
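Splitting one overall index into several bounded indices is a mixed-radix decomposition. The sketch below shows one way to do it with `numpy.unravel_index`; the helper name `split_index` is hypothetical, and the ordering (last index varies fastest) is an assumption rather than the documented behaviour of `_split_overall_idx`.

```python
import numpy as np


def split_index(idx, max_values):
    """Split idx into one index per entry of max_values, each in [0, max_value)."""
    # unravel_index treats max_values as the shape of a grid; the last axis varies fastest.
    return [int(i) for i in np.unravel_index(idx, tuple(max_values))]


first = split_index(0, [2, 3, 4])
middle = split_index(7, [2, 3, 4])
last = split_index(23, [2, 3, 4])
```

With maximum values `[2, 3, 4]` there are 24 valid overall indices, and each maps to a distinct combination.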
|
||
|
||
.. py:function:: shift(arr: numpy.ndarray, seq_len: int, idx: int) -> numpy.ndarray | ||
Shift a sliding window along a sequence or label by the given number of bases. | ||
|
||
:param arr: Numpy array with length as the last dimension. | ||
:param seq_len: Desired length for the output sequence. | ||
:param idx: Start position | ||
|
||
:returns: Shifted sequence | ||
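Going by the parameter descriptions alone, shifting amounts to taking a window of `seq_len` positions starting at `idx` along the last axis. A minimal sketch under that assumption (the helper name `take_window` is hypothetical):

```python
import numpy as np


def take_window(arr, seq_len, idx):
    """Return seq_len positions starting at idx along the last (length) axis."""
    return arr[..., idx : idx + seq_len]


seq = np.arange(10)                    # stand-in for a sequence of shape (L,)
label = np.arange(20).reshape(2, 10)   # stand-in for a label of shape (T, L)
seq_window = take_window(seq, 4, 3)
label_window = take_window(label, 4, 3)
```

Using `...` in the slice lets the same function handle both a 1-D sequence and a 2-D label.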
|
||
|
||
.. py:function:: rc_seq(seq: numpy.ndarray, idx: bool) -> numpy.ndarray | ||
Reverse complement a sequence if `idx` is True; otherwise return it unchanged. | ||
|
||
:param seq: Integer-encoded sequence. | ||
:param idx: If True, the reverse complement sequence will be returned. | ||
If False, the sequence will be returned unchanged. | ||
|
||
:returns: Same or reverse complemented sequence | ||
|
||
|
||
.. py:function:: rc_label(label: numpy.ndarray, idx: bool) -> numpy.ndarray | ||
Reverse a label along the length axis if `idx` is True; otherwise return it unchanged. | ||
|
||
:param label: Numpy array with length as the last dimension | ||
:param idx: If True, the label will be reversed along the length axis. | ||
If False, the label will be returned unchanged. | ||
|
||
:returns: Same or reversed label | ||
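A label is reversed rather than complemented because it holds values per position, not bases. A minimal sketch of the conditional reversal, using a hypothetical helper `maybe_reverse_label` and `numpy.flip` on the last axis:

```python
import numpy as np


def maybe_reverse_label(label, idx):
    """Reverse along the last (length) axis when idx is True; otherwise return as is."""
    return np.flip(label, axis=-1) if idx else label


label = np.array([[1, 2, 3], [4, 5, 6]])  # shape (T, L) = (2, 3)
reversed_label = maybe_reverse_label(label, True)
unchanged_label = maybe_reverse_label(label, False)
```

Each row (task) is reversed independently, so the task order is preserved.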
|
||
|
||
.. py:class:: Augmenter(rc: bool = False, max_seq_shift: int = 0, max_pair_shift: int = 0, n_mutated_seqs: int = 0, n_mutated_bases: Optional[int] = None, protect: List[int] = [], seq_len: Optional[int] = None, label_len: Optional[int] = None, seed: Optional[int] = None, mode: str = 'serial') | ||
A class that generates augmented DNA sequences or (sequence, label) pairs. | ||
|
||
:param rc: If True, augmentation by reverse complementation will be performed. | ||
:param max_seq_shift: Maximum number of bases by which the sequence alone can be shifted. | ||
This is normally a small value (< 10). | ||
:param max_pair_shift: Maximum number of bases by which the sequence and label can be jointly | ||
shifted. This can be a larger value. | ||
:param n_mutated_seqs: Number of augmented sequences to generate by random mutation | ||
:param n_mutated_bases: The number of bases to mutate in each augmented sequence. Only used | ||
if n_mutated_seqs is greater than 0. | ||
:param protect: A list of positions to protect from random mutation. Only used | ||
if n_mutated_seqs is greater than 0. | ||
:param seq_len: Length of the augmented sequences | ||
:param label_len: Length of the augmented labels | ||
:param seed: Random seed for reproducibility. | ||
:param mode: "random" or "serial" | ||
|
||
|
||
.. py:method:: __len__() -> int | ||
The total number of augmented sequences that can be produced from a single | ||
DNA sequence | ||
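One plausible way to arrive at this total is to multiply a per-augmentation multiplier for each setting. The multipliers below are assumptions made for illustration (2 for reverse complementation, `2 * shift + 1` for shifts in either direction, and at least 1 for mutated sequences); they are not taken from `AUGMENTATION_MULTIPLIER_FUNCS`, and the function name `n_augmented` is hypothetical.

```python
import math


def n_augmented(rc=False, max_seq_shift=0, max_pair_shift=0, n_mutated_seqs=0):
    """Count augmented versions of one sequence under the assumed multipliers."""
    multipliers = [
        2 if rc else 1,          # original plus reverse complement
        2 * max_seq_shift + 1,   # shifts of -k..+k, including no shift
        2 * max_pair_shift + 1,  # joint sequence/label shifts of -k..+k
        max(n_mutated_seqs, 1),  # no mutation still yields one sequence
    ]
    return math.prod(multipliers)


no_augmentation = n_augmented()
rc_and_shift = n_augmented(rc=True, max_seq_shift=1)
everything = n_augmented(rc=True, max_seq_shift=1, max_pair_shift=2, n_mutated_seqs=5)
```

Under these assumptions, an overall index in `range(len(augmenter))` could be split back into one index per augmentation type, which is what a method such as `_split` would need.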
|
||
|
||
|
||
.. py:method:: _split(idx: int) -> List[tuple] | ||
Function to split an input index into indices specifying each type | ||
of augmentation | ||
|
||
|
||
|
||
.. py:method:: _get_random_idxs() -> List[tuple] | ||
Function to select indices for each type of augmentation randomly | ||
|
||
|
||
|
||
.. py:method:: __call__(idx: int, seq: numpy.ndarray, label: Optional[numpy.ndarray] = None) -> Union[numpy.ndarray, Tuple[numpy.ndarray, numpy.ndarray]] | ||
Perform augmentation on a given integer-encoded DNA sequence or (sequence, label) pair | ||
|
||
:param idx: Index specifying the augmentation to be performed. | ||
:param seq: A single integer encoded DNA sequence | ||
:param label: A numpy array of shape (T, L) containing the label | ||
|
||
:returns: The augmented DNA sequence or (sequence, label) pair if label is supplied. | ||
|
||
|
||
|