CNery

CNery is a copy-number-variation (CNV) extension for breseq. It reads the sequencing coverage output from breseq and predicts CNV across the genome. Predictions are corrected for two coverage biases: GC-content bias introduced by sequencing chemistry, and origin-to-terminus (OTR) bias caused by the replication state of prokaryotic cells at the time of DNA isolation.

Recent updates (latest commits):

Multi-genome CNV analysis — CNery now processes all reference sequences found in the breseq BAM/FASTA in one pass. Each reference (chromosome, plasmid, contig, etc.) is preprocessed per genome, pooled for a shared LOWESS GC-bias fit, and then bias-corrected and CN-called independently.
Input/output flexibility — inputs default to <input>/data/reference.bam and <input>/data/reference.fasta; output prefix defaults to <input>/CNV_out/. Output subfolders (CNV_plt/, CNV_csv/, GC_bias/, OTR_corr/) are created automatically.
Modular bias correction — the --bias flag lets you choose all (GC + OTR), gc, otr, or none.
Pip-installable package — requirements.txt and a fixed pyproject.toml allow install directly from GitHub via pip install git+....
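
The pooling idea behind the shared GC-bias fit can be illustrated with a small sketch. This is not CNery's implementation: CNery fits a LOWESS curve, while the example below swaps in a simpler binned-median fit so that it runs with the Python standard library alone. The function names, the 0.05 bin width, and the (GC fraction, coverage) window tuples are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median


def fit_pooled_gc_curve(windows_by_ref, bin_width=0.05):
    """Pool windows from every reference; return {GC bin: median coverage}."""
    pooled = defaultdict(list)
    for windows in windows_by_ref.values():
        for gc, cov in windows:
            pooled[round(gc / bin_width)].append(cov)
    return {gc_bin: median(covs) for gc_bin, covs in pooled.items()}


def correct_reference(windows, curve, bin_width=0.05):
    """Rescale one reference's coverage using the shared (pooled) curve."""
    # Target level: the median of the per-bin medians in the pooled curve.
    target = median(curve.values())
    corrected = []
    for gc, cov in windows:
        expected = curve[round(gc / bin_width)]
        # Leave the value untouched if the bin's expected coverage is zero.
        corrected.append(cov * target / expected if expected else cov)
    return corrected


# Hypothetical windows: (GC fraction, mean coverage) per reference.
windows_by_ref = {
    "chromosome": [(0.40, 80.0), (0.50, 100.0), (0.60, 120.0)],
    "plasmid": [(0.40, 80.0), (0.50, 100.0)],
}
curve = fit_pooled_gc_curve(windows_by_ref)
for name, windows in windows_by_ref.items():
    print(name, correct_reference(windows, curve))
```

The fit uses windows from all references at once, but the correction is applied one reference at a time, which mirrors the pooled-fit, independent-correction order described in the update note.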

Installation

Recommended: create a conda/mamba environment from the provided spec.

mamba env create -f environment.yml
mamba activate CNery

Install CNery (a.k.a. breseq-ext-cnv) from GitHub:

pip install git+https://github.com/barricklab/breseq-ext-cnv.git

Quick start

Run CNery inside a breseq output folder that contains the data/ and output/ subfolders:

CNery [-o <output folder>] [-w <window>] [-s <step size>] [-f <fragment length>]

To run from a different working directory, point -i at the breseq output folder (or supply -ref and the BAM path manually):

CNery -i <breseq output folder> \
      -ref <reference.fasta> \
      -o  <output folder> \
      -w  <window> \
      -s  <step size> \
      -f  <fragment length>
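
When several breseq runs need the same settings, it can help to assemble the command in Python. The sketch below is a hypothetical helper, not part of CNery; it uses only flags listed in the CNery help text, takes its defaults from that help text, and the `runs/sample1` paths are placeholders.

```python
import shlex


def build_cnery_command(breseq_dir, output_dir, window=200, step=100,
                        frag_size=500, bias="all"):
    """Return the CNery argument list for one breseq output folder."""
    if step > window:
        raise ValueError("step size must be <= window size")
    if bias not in {"all", "none", "gc", "otr"}:
        raise ValueError(f"unknown bias mode: {bias!r}")
    return [
        "CNery",
        "-i", str(breseq_dir),
        "-o", str(output_dir),
        "-w", str(window),
        "-s", str(step),
        "-f", str(frag_size),
        "--bias", bias,
    ]


cmd = build_cnery_command("runs/sample1", "runs/sample1/CNV_out",
                          window=500, step=250, frag_size=300)
print(shlex.join(cmd))
# CNery -i runs/sample1 -o runs/sample1/CNV_out -w 500 -s 250 -f 300 --bias all
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)` once CNery is installed and on the PATH.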

Usage examples

Calculate coverage with a 500 bp window sliding in 250 bp steps; sequencing fragment length is 300 bp:

CNery -o <output folder> -w 500 -s 250 -f 300
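
To see what a window and step size mean in practice, the following sketch lists the intervals that a 500 bp window moving in 250 bp steps would cover. It is an illustration only: the 1-based inclusive coordinates, the truncation of the last window at the genome end, and the function name are assumptions, and CNery may handle the final partial window differently.

```python
def sliding_windows(genome_length, window=500, step=250):
    """Yield 1-based inclusive (start, end) windows across the genome."""
    if genome_length < 1:
        raise ValueError("genome_length must be at least 1")
    if not 0 < step <= window:
        raise ValueError("step must satisfy 0 < step <= window")
    start = 1
    while True:
        # Truncate the final window at the end of the genome.
        end = min(start + window - 1, genome_length)
        yield start, end
        if end == genome_length:
            break
        start += step


for start, end in sliding_windows(1200, window=500, step=250):
    print(start, end)
```

With the step set to half the window, most positions fall into two overlapping windows; setting the step equal to the window gives non-overlapping windows.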

Analyze coverage across the whole genome, but restrict CNV/coverage plots to a specific genomic segment:

CNery -o <output folder> -reg 3497890-3955678 -w 1000 -s 500

The -reg argument also accepts open-ended intervals: -reg 3497890- runs from the given start position to the end of the genome, and -reg -3955678 runs from the start of the genome to the given end position.
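
The closed and open-ended interval forms described above can be resolved as in the sketch below. This is a hypothetical parser written for illustration, not CNery's own code; it assumes 1-based inclusive coordinates, and the genome length of 4,600,000 bp is a made-up example value.

```python
def parse_region(text, genome_length):
    """Resolve 'START-END', 'START-' or '-END' to a (start, end) tuple."""
    start_text, sep, end_text = text.partition("-")
    if not sep:
        raise ValueError(f"region must contain '-': {text!r}")
    # A missing bound falls back to the corresponding end of the genome.
    start = int(start_text) if start_text else 1
    end = int(end_text) if end_text else genome_length
    if not 1 <= start <= end <= genome_length:
        raise ValueError(f"region out of range: {text!r}")
    return start, end


genome_length = 4_600_000  # hypothetical genome size
for region in ("3497890-3955678", "3497890-", "-3955678"):
    print(region, parse_region(region, genome_length))
```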

Control which bias correction is applied before CN prediction:

# Both GC + OTR corrections (default)
CNery -o <output folder> -w 500 -s 250 --bias all

# Only correct OTR (replication) bias
CNery -o <output folder> -w 500 -s 250 --bias otr

# Only correct GC-content bias
CNery -o <output folder> -w 500 -s 250 --bias gc

# No bias correction before CN prediction
CNery -o <output folder> -w 500 -s 250 --bias none

When OTR correction is applied, the origin and terminus of replication are automatically inferred from the coverage profile — no manual coordinates are required.
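
This README does not describe how the inference works, so the sketch below shows only one simple way such an inference could be done: smooth the per-window coverage around a circular chromosome, then take the highest point as the origin and the lowest as the terminus. The function name, the moving-average smoothing, and the circularity assumption are all illustrative and should not be read as CNery's method.

```python
def infer_origin_terminus(coverage, smooth=5):
    """Return (origin index, terminus index, origin-to-terminus ratio)."""
    n = len(coverage)
    if n == 0:
        raise ValueError("coverage must not be empty")
    half = smooth // 2
    # Circular moving average: indices wrap around the chromosome ends.
    smoothed = [
        sum(coverage[(i + k) % n] for k in range(-half, half + 1)) / (2 * half + 1)
        for i in range(n)
    ]
    origin = max(range(n), key=smoothed.__getitem__)
    terminus = min(range(n), key=smoothed.__getitem__)
    if smoothed[terminus] == 0:
        raise ValueError("terminus coverage is zero; ratio is undefined")
    return origin, terminus, smoothed[origin] / smoothed[terminus]


# Hypothetical profile: coverage peaks at window 5 and dips at window 15.
profile = [200 - 10 * min(abs(i - 5), 20 - abs(i - 5)) for i in range(20)]
origin, terminus, ratio = infer_origin_terminus(profile, smooth=3)
print(origin, terminus, round(ratio, 4))
```

In an actively replicating population, regions near the origin are present in more copies than regions near the terminus, which is why a coverage maximum and minimum are reasonable proxies in this toy version.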

Outputs

Given an output folder CNV_out/, CNery writes:

CNV_out/CNV_plt/ — per-reference CNV prediction plots.
CNV_out/CNV_csv/ — per-window coverage + CN calls as CSV.
CNV_out/GC_bias/ — pooled LOWESS GC-bias diagnostic plot.
CNV_out/OTR_corr/ — per-reference OTR bias plots and a JSON summary (*_otr_results.json) containing the inferred origin window, terminus window, normalized coverage at each, and the origin-to-terminus ratio.

Each reference sequence in the BAM/FASTA produces its own set of outputs, named with the reference / genome identifier.
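
After a run, the per-reference OTR summaries can be collected for downstream analysis. The helper below is a hypothetical convenience function, not part of CNery. It relies only on the documented OTR_corr/ folder and the *_otr_results.json naming pattern, and it returns the parsed JSON as-is because the key names inside the files are not listed in this README.

```python
import json
from pathlib import Path


def load_otr_summaries(output_dir):
    """Return {file name: parsed JSON} for each OTR summary in OTR_corr/."""
    otr_dir = Path(output_dir) / "OTR_corr"
    return {
        path.name: json.loads(path.read_text())
        for path in sorted(otr_dir.glob("*_otr_results.json"))
    }


if __name__ == "__main__":
    # "CNV_out" is the default output folder name used in this README.
    for name, summary in load_otr_summaries("CNV_out").items():
        print(name, summary)
```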

All command-line options

$ CNery -h

usage: CNery [-h] [-i I] [-ref REF] [-reg REG] [-o O] [-w W] [-s S] [-f F] [-e E]
             [--bias {all,none,gc,otr}]

CNery is a Python package extension to breseq that analyzes the sequencing
coverage across the genome to predict copy number variation (CNV).

options:
  -h, --help            show this help message and exit
  -i, --input I         input folder path (the breseq output folder with
                        'data' and 'output' folders). Defaults to the current
                        folder.
  -ref REF              select the reference file used for breseq. Defaults
                        to data/reference.fasta.
  -reg REG              select the region of the genome to evaluate
                        (format: START-END, e.g. 1000-50000).
  -o, --output O        output file prefix / storage location. Defaults to
                        the 'CNV_out' folder in the current dir.
  -w, --window W        Window length used to parse the genome and compute
                        coverage and GC statistics. Default: 200.
  -s, --step-size S     Step size (<= window size) for each progression of
                        the window across the genome. Set step-size = window
                        size for non-overlapping windows. Default: 100.
  -f, --frag_size F     Average fragment size of the sequencing reads.
                        Default: 500.
  -e, --error-rate E    Approximate error rate in sequencing read coverage /
                        reference alignment. Default: 0.05.
  --bias {all,none,gc,otr}
                        Select which bias correction to apply before CN
                        prediction. 'all' applies GC + OTR, 'gc' or 'otr'
                        applies only that one, 'none' skips bias correction.
                        Default: all.

Run this script in the breseq output folder that contains 'data' and 'output'
folders.
