version update for LRSDAY: v1.3.1 -> v1.4.0
yjx1217 committed Mar 21, 2019
1 parent c690370 commit 8f05225
Showing 29 changed files with 568 additions and 421 deletions.
9 changes: 9 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,15 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.

 ## [Unreleased]

+## [1.4.0] - 2019-03-21
+### Changed
+- Support for multi-round assembly polishing using both long and short reads.
+- Setting adjustment for long-read filtering and downsampling.
+- Software version updates for a number of dependencies.
+### Fixed
+- Compatibility issues due to recent version updates of conda and bioconda.
+- Typos in the installation script.
+
 ## [1.3.1] - 2019-01-22
 ### Added
 - A script for generating demultiplexed fastq reads based on nanopore's guppy demultiplexing summary file.
Binary file modified Example_Outputs/SK1.assembly.final.fa.gz
Binary file not shown.
Binary file modified Example_Outputs/SK1.assembly.final.filter.mummer2vcf.INDEL.vcf.gz
Binary file not shown.
Binary file modified Example_Outputs/SK1.assembly.final.filter.mummer2vcf.SNP.vcf.gz
Binary file not shown.
Binary file modified Example_Outputs/SK1.assembly.final.filter.pdf
Binary file not shown.
4 changes: 2 additions & 2 deletions Example_Outputs/SK1.assembly.final.stats.txt
@@ -1,8 +1,8 @@
 total sequence count: 34
-total sequence length: 12448004
+total sequence length: 12448003
 min sequence length: 1248
 max sequence length: 1480301
-mean sequence length: 366117.76
+mean sequence length: 366117.74
 median sequence length: 60826.50
 N50: 923676
 L50: 6
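
The two changed values in this stats file are consistent with each other, since the mean sequence length is the total sequence length divided by the sequence count. A minimal shell sketch of that check, using only the numbers shown in this diff; the function name `mean_length` is hypothetical and not part of LRSDAY:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of LRSDAY):
# mean = total sequence length / sequence count, printed with two decimals
# to match the formatting of the stats file.
mean_length() {
    awk -v total="$1" -v count="$2" 'BEGIN { printf "%.2f\n", total / count }'
}

mean_length 12448004 34   # total length before this commit
mean_length 12448003 34   # total length after this commit
```

With 34 sequences, a one-base change in total length is enough to move the second decimal of the mean.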
Binary file modified Example_Outputs/SK1.final.cds.fa.gz
Binary file not shown.
Binary file modified Example_Outputs/SK1.final.gff3.gz
Binary file not shown.
25 changes: 11 additions & 14 deletions Example_Outputs/SK1.final.manual_check.list
@@ -218,25 +218,22 @@ SK1_G0056290|SK1_G0056290.mRNA.1 unexpected start & end codons based on standard
SK1_G0057300|SK1_G0057300.mRNA.1 unexpected start & end codons based on standard genentic code;your selected code table is 1;incorrect CDS length
SK1_G0057390|SK1_G0057390.mRNA.1 unexpected start & end codons based on standard genentic code;your selected code table is 1;internal stop codon(s)
SK1_G0057410|SK1_G0057410.mRNA.1 unexpected start & end codons based on standard genentic code;your selected code table is 1;internal stop codon(s)
SK1_G0057460|SK1_G0057460.mRNA.1 unexpected start & end codons based on standard genentic code;your selected code table is 1;internal stop codon(s)
SK1_G0057480|SK1_G0057480.mRNA.1 unexpected start & end codons based on standard genentic code;your selected code table is 1;internal stop codon(s)
SK1_G0057470|SK1_G0057470.mRNA.1 unexpected start & end codons based on standard genentic code;your selected code table is 1;internal stop codon(s)
SK1_G0057490|SK1_G0057490.mRNA.1 unexpected start & end codons based on standard genentic code;your selected code table is 1;internal stop codon(s)
SK1_G0057500|SK1_G0057500.mRNA.1 unexpected start & end codons based on standard genentic code;your selected code table is 1;internal stop codon(s)
SK1_G0057510|SK1_G0057510.mRNA.1 unexpected start & end codons based on standard genentic code;your selected code table is 1;internal stop codon(s)
SK1_G0057570|SK1_G0057570.mRNA.1 unexpected start & end codons based on standard genentic code;your selected code table is 1;internal stop codon(s)
SK1_G0057590|SK1_G0057590.mRNA.1 unexpected start & end codons based on standard genentic code;your selected code table is 1;internal stop codon(s)
SK1_G0057750|SK1_G0057750.mRNA.1 incorrect CDS length
SK1_G0057780|SK1_G0057780.mRNA.1 incorrect CDS length
SK1_G0057800|SK1_G0057800.mRNA.1 incorrect CDS length
SK1_G0057560|SK1_G0057560.mRNA.1 unexpected start & end codons based on standard genentic code;your selected code table is 1;internal stop codon(s)
SK1_G0057580|SK1_G0057580.mRNA.1 unexpected start & end codons based on standard genentic code;your selected code table is 1;internal stop codon(s)
SK1_G0057770|SK1_G0057770.mRNA.1 incorrect CDS length
SK1_G0057950|SK1_G0057950.mRNA.1 unexpected start & end codons based on standard genentic code;your selected code table is 1;internal stop codon(s)
SK1_G0057960|SK1_G0057960.mRNA.1 unexpected start & end codons based on standard genentic code;your selected code table is 1;internal stop codon(s)
SK1_G0057970|SK1_G0057970.mRNA.1 unexpected start & end codons based on standard genentic code;your selected code table is 1;internal stop codon(s)
SK1_G0057980|SK1_G0057980.mRNA.1 unexpected start & end codons based on standard genentic code;your selected code table is 1;internal stop codon(s)
SK1_G0058000|SK1_G0058000.mRNA.1 unexpected start & end codons based on standard genentic code;your selected code table is 1;internal stop codon(s)
SK1_G0058010|SK1_G0058010.mRNA.1 unexpected start & end codons based on standard genentic code;your selected code table is 1;internal stop codon(s)
SK1_G0058020|SK1_G0058020.mRNA.1 unexpected start & end codons based on standard genentic code;your selected code table is 1;internal stop codon(s)
SK1_G0058030|SK1_G0058030.mRNA.1 unexpected start & end codons based on standard genentic code;your selected code table is 1;internal stop codon(s)
SK1_G0058070|SK1_G0058070.mRNA.1 unexpected start & end codons based on standard genentic code;your selected code table is 1;internal stop codon(s)
SK1_G0058160|SK1_G0058160.mRNA.1 unexpected start codon based on standard genentic code;your selected code table is 1;incorrect CDS length
SK1_G0058260|SK1_G0058260.mRNA.1 unexpected start & end codons based on standard genentic code;your selected code table is 1;internal stop codon(s)
SK1_G0058280|SK1_G0058280.mRNA.1 unexpected start & end codons based on standard genentic code;your selected code table is 1;internal stop codon(s)
SK1_G0058060|SK1_G0058060.mRNA.1 unexpected start & end codons based on standard genentic code;your selected code table is 1;internal stop codon(s)
SK1_G0058150|SK1_G0058150.mRNA.1 unexpected start codon based on standard genentic code;your selected code table is 1;incorrect CDS length
SK1_G0058250|SK1_G0058250.mRNA.1 unexpected start & end codons based on standard genentic code;your selected code table is 1;internal stop codon(s)
SK1_G0058270|SK1_G0058270.mRNA.1 unexpected start & end codons based on standard genentic code;your selected code table is 1;internal stop codon(s)
cox2|cox2.mRNA.1 unexpected stop codon based on standard genentic code;your selected code table is 3;incorrect CDS length
orf474|orf474.mRNA.1 unexpected start codon based on standard genentic code;your selected code table is 3;incorrect CDS length
orf90|orf90.mRNA.1 unexpected start codon based on standard genentic code;your selected code table is 3
Binary file modified Example_Outputs/SK1.final.pep.fa.gz
Binary file not shown.
Binary file modified Example_Outputs/SK1.final.trimmed_cds.fa.gz
Binary file not shown.
Binary file modified Manual.pdf
Binary file not shown.
@@ -12,27 +12,31 @@ reads="./../00.Long_Reads/YGL3210.fq.gz" # The file path of the long reads file
 reads_type="nanopore-raw" # The long reads data type: "pacbio-raw" or "pacbio-corrected" or "nanopore-raw" or "nanopore-corrected".
 run_filtering="yes" # Whether to filter the reads: "yes" or "no". Default = "yes".
 genome_size="12500000" # The haploid genome size (in bp) of the sequenced organism. Default = "12500000" (i.e. 12.5 Mb for the budding yeast S. cerevisiae genome). This is used to calculate targeted sequencing coverage after read filtering (see below).
-post_filtering_coverage="30" # Targeted sequencing coverage after read filtering. Default = "30" (i.e. 30x coverage).
+post_filtering_coverage="40" # Targeted sequencing coverage after read filtering. Default = "40" (i.e. 40x coverage).
 threads=1 # The number of threads to use. Default = "1".

 #######################################
 # process the pipeline

-filtlong_target_bases=$(($genome_size * $post_filtering_coverage))
-echo ""
-echo "genome_size=$genome_size, post_filtering_coverage=$post_filtering_coverage, filtlong_target_bases=$filtlong_target_bases"
-echo ""
 if [[ "$reads_type" == "nanopore-raw" || "$reads_type" == "nanopore-corrected" ]]
 then
 $porechop_dir/porechop -i $reads -o $prefix.porechop.fastq.gz --discard_middle --threads $threads > $prefix.porechop.summary.txt
 if [[ "$run_filtering" == "yes" ]]
 then
-$filtlong_dir/filtlong --min_length 1000 --keep_percent 90 --target_bases $filtlong_target_bases $prefix.porechop.fastq.gz | gzip > $prefix.filtlong.fastq.gz
+filtlong_target_bases=$(($genome_size * $post_filtering_coverage))
+echo ""
+echo "genome_size=$genome_size, post_filtering_coverage=$post_filtering_coverage, filtlong_target_bases=$filtlong_target_bases"
+echo ""
+$filtlong_dir/filtlong --min_length 1000 --mean_q_weight 10 --target_bases $filtlong_target_bases $prefix.porechop.fastq.gz | gzip > $prefix.filtlong.fastq.gz
 fi
 else
 if [[ "$run_filtering" == "yes" ]]
 then
-$filtlong_dir/filtlong --min_length 1000 --keep_percent 90 --target_bases $filtlong_target_bases $reads | gzip > $prefix.filtlong.fastq.gz
+filtlong_target_bases=$(($genome_size * $post_filtering_coverage))
+echo ""
+echo "genome_size=$genome_size, post_filtering_coverage=$post_filtering_coverage, filtlong_target_bases=$filtlong_target_bases"
+echo ""
+$filtlong_dir/filtlong --min_length 1000 --mean_q_weight 10 --target_bases $filtlong_target_bases $reads | gzip > $prefix.filtlong.fastq.gz
 fi
 fi
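
The value passed to `--target_bases` in the script above is the product of the genome size and the targeted coverage, so raising the default coverage from 30 to 40 raises the number of bases kept after filtering. A minimal sketch of that arithmetic, assuming the default `genome_size` from the script; the function name `target_bases` is hypothetical and not part of LRSDAY:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of LRSDAY):
# target bases = haploid genome size (bp) x targeted coverage.
target_bases() {
    local genome_size="$1"
    local coverage="$2"
    echo $(( genome_size * coverage ))
}

genome_size="12500000"                         # default from the script
old_target=$(target_bases "$genome_size" 30)   # previous default coverage
new_target=$(target_bases "$genome_size" 40)   # new default coverage
echo "old_target=$old_target new_target=$new_target"
```

For the default 12.5 Mb genome this gives 375,000,000 bases at 30x and 500,000,000 bases at 40x.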

@@ -8,7 +8,7 @@ source ./../../env.sh
 # set project-specific variables

 prefix="SK1.SMRTCell.1" # The file name prefix for the output files. For the testing example, run this script four times with the prefix of "SK1.SMRTCell.1", "SK1.SMRTCell.2", "SK1.SMRTCell.3", and "SK1.SMRTCell.4" respectively.
-pacbio_RSII_bax_fofn_file="./pacbio_fofn_files/SK1.SMRTCell.1.RSII_bax.fofn" # The fofn file containing file paths to the PacBio RSII bax reads from the same SMRT cell. If you have data from multiple SMRT cells, please run this script separately for each of them. Do not mix reads from different SMRT cells even if they come from the same sample. For the testing example, you can set pacbio_RSII_bax_fofn_file="./pacbio_fofn_files/$prefix.RSII_bax.fofn" so that this parameter is set automatically from the prefix parameter.
+pacbio_RSII_bax_fofn_file="./pacbio_fofn_files/$prefix.RSII_bax.fofn" # The fofn file containing file paths to the PacBio RSII bax reads from the same SMRT cell. If you have data from multiple SMRT cells, please run this script separately for each of them. Do not mix reads from different SMRT cells even if they come from the same sample. For the testing example, you can set pacbio_RSII_bax_fofn_file="./pacbio_fofn_files/$prefix.RSII_bax.fofn" so that this parameter is set automatically from the prefix parameter.

 #######################################
 # process the pipeline
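
With the fofn path now derived from `prefix`, only the prefix needs to change between the four runs of the testing example. A small sketch of how the path resolves for each prefix named in the comment above; the function name `fofn_for_prefix` is hypothetical, and the loop only prints paths rather than running the pipeline:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of LRSDAY):
# build the fofn path for a prefix, following the pattern used in the script.
fofn_for_prefix() {
    echo "./pacbio_fofn_files/$1.RSII_bax.fofn"
}

# The four prefixes listed for the testing example.
for i in 1 2 3 4; do
    prefix="SK1.SMRTCell.$i"
    echo "$prefix -> $(fofn_for_prefix "$prefix")"
done
```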
@@ -14,7 +14,7 @@ prefix="SK1" # The file name prefix for output files of the testing example.
 # process the pipeline

 echo "download the bam file from the ENA database ..."
-wget $file_url
+wget --no-check-certificate $file_url
 echo "bam2fastq ..."
 $bedtools_dir/bedtools bamtofastq -i $file_name -fq $prefix.filtered_subreads.fastq
 echo "gzip fastq ..."
@@ -24,12 +24,12 @@ rm $file_name
 cd pacbio_fofn_files
 echo "download the metadata and raw PacBio reads in .h5 format ..."

-wget ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080522/m150811_092723_00127_c100844062550000001823187612311514_s1_p0.metadata.xml
-wget ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080529/m150813_110541_00127_c100823112550000001823177111031581_s1_p0.metadata.xml
-# wget ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080536/m150814_201337_00127_c100823152550000001823177111031541_s1_p0.metadata.xml
-wget ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080537/m150814_233250_00127_c100823152550000001823177111031542_s1_p0.metadata.xml
-# wget ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR112/ERR1124245/m150910_184604_00127_c100822732550000001823176011031536_s1_p0.metadata.xml
-wget ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR114/ERR1140978/m150911_220012_00127_c100861772550000001823190702121671_s1_p0.metadata.xml
+wget --no-check-certificate ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080522/m150811_092723_00127_c100844062550000001823187612311514_s1_p0.metadata.xml
+wget --no-check-certificate ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080529/m150813_110541_00127_c100823112550000001823177111031581_s1_p0.metadata.xml
+# wget --no-check-certificate ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080536/m150814_201337_00127_c100823152550000001823177111031541_s1_p0.metadata.xml
+wget --no-check-certificate ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080537/m150814_233250_00127_c100823152550000001823177111031542_s1_p0.metadata.xml
+# wget --no-check-certificate ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR112/ERR1124245/m150910_184604_00127_c100822732550000001823176011031536_s1_p0.metadata.xml
+wget --no-check-certificate ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR114/ERR1140978/m150911_220012_00127_c100861772550000001823190702121671_s1_p0.metadata.xml

 if [[ ! -d Analysis_Results ]]
 then
@@ -38,35 +38,35 @@ fi

 cd Analysis_Results

-wget ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080522/m150811_092723_00127_c100844062550000001823187612311514_s1_p0.1.bax.h5
-wget ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080522/m150811_092723_00127_c100844062550000001823187612311514_s1_p0.2.bax.h5
-wget ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080522/m150811_092723_00127_c100844062550000001823187612311514_s1_p0.3.bax.h5
-wget ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080522/m150811_092723_00127_c100844062550000001823187612311514_s1_p0.bas.h5
-
-wget ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080529/m150813_110541_00127_c100823112550000001823177111031581_s1_p0.1.bax.h5
-wget ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080529/m150813_110541_00127_c100823112550000001823177111031581_s1_p0.2.bax.h5
-wget ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080529/m150813_110541_00127_c100823112550000001823177111031581_s1_p0.3.bax.h5
-wget ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080529/m150813_110541_00127_c100823112550000001823177111031581_s1_p0.bas.h5
-
-# wget ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080536/m150814_201337_00127_c100823152550000001823177111031541_s1_p0.1.bax.h5
-# wget ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080536/m150814_201337_00127_c100823152550000001823177111031541_s1_p0.2.bax.h5
-# wget ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080536/m150814_201337_00127_c100823152550000001823177111031541_s1_p0.3.bax.h5
-# wget ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080536/m150814_201337_00127_c100823152550000001823177111031541_s1_p0.bas.h5
-
-wget ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080537/m150814_233250_00127_c100823152550000001823177111031542_s1_p0.1.bax.h5
-wget ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080537/m150814_233250_00127_c100823152550000001823177111031542_s1_p0.2.bax.h5
-wget ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080537/m150814_233250_00127_c100823152550000001823177111031542_s1_p0.3.bax.h5
-wget ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080537/m150814_233250_00127_c100823152550000001823177111031542_s1_p0.bas.h5
-
-# wget ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR112/ERR1124245/m150910_184604_00127_c100822732550000001823176011031536_s1_p0.1.bax.h5
-# wget ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR112/ERR1124245/m150910_184604_00127_c100822732550000001823176011031536_s1_p0.2.bax.h5
-# wget ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR112/ERR1124245/m150910_184604_00127_c100822732550000001823176011031536_s1_p0.3.bax.h5
-# wget ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR112/ERR1124245/m150910_184604_00127_c100822732550000001823176011031536_s1_p0.bas.h5
-
-wget ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR114/ERR1140978/m150911_220012_00127_c100861772550000001823190702121671_s1_p0.1.bax.h5
-wget ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR114/ERR1140978/m150911_220012_00127_c100861772550000001823190702121671_s1_p0.2.bax.h5
-wget ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR114/ERR1140978/m150911_220012_00127_c100861772550000001823190702121671_s1_p0.3.bax.h5
-wget ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR114/ERR1140978/m150911_220012_00127_c100861772550000001823190702121671_s1_p0.bas.h5
+wget --no-check-certificate ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080522/m150811_092723_00127_c100844062550000001823187612311514_s1_p0.1.bax.h5
+wget --no-check-certificate ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080522/m150811_092723_00127_c100844062550000001823187612311514_s1_p0.2.bax.h5
+wget --no-check-certificate ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080522/m150811_092723_00127_c100844062550000001823187612311514_s1_p0.3.bax.h5
+wget --no-check-certificate ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080522/m150811_092723_00127_c100844062550000001823187612311514_s1_p0.bas.h5
+
+wget --no-check-certificate ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080529/m150813_110541_00127_c100823112550000001823177111031581_s1_p0.1.bax.h5
+wget --no-check-certificate ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080529/m150813_110541_00127_c100823112550000001823177111031581_s1_p0.2.bax.h5
+wget --no-check-certificate ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080529/m150813_110541_00127_c100823112550000001823177111031581_s1_p0.3.bax.h5
+wget --no-check-certificate ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080529/m150813_110541_00127_c100823112550000001823177111031581_s1_p0.bas.h5
+
+# wget --no-check-certificate ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080536/m150814_201337_00127_c100823152550000001823177111031541_s1_p0.1.bax.h5
+# wget --no-check-certificate ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080536/m150814_201337_00127_c100823152550000001823177111031541_s1_p0.2.bax.h5
+# wget --no-check-certificate ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080536/m150814_201337_00127_c100823152550000001823177111031541_s1_p0.3.bax.h5
+# wget --no-check-certificate ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080536/m150814_201337_00127_c100823152550000001823177111031541_s1_p0.bas.h5
+
+wget --no-check-certificate ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080537/m150814_233250_00127_c100823152550000001823177111031542_s1_p0.1.bax.h5
+wget --no-check-certificate ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080537/m150814_233250_00127_c100823152550000001823177111031542_s1_p0.2.bax.h5
+wget --no-check-certificate ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080537/m150814_233250_00127_c100823152550000001823177111031542_s1_p0.3.bax.h5
+wget --no-check-certificate ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR108/ERR1080537/m150814_233250_00127_c100823152550000001823177111031542_s1_p0.bas.h5
+
+# wget --no-check-certificate ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR112/ERR1124245/m150910_184604_00127_c100822732550000001823176011031536_s1_p0.1.bax.h5
+# wget --no-check-certificate ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR112/ERR1124245/m150910_184604_00127_c100822732550000001823176011031536_s1_p0.2.bax.h5
+# wget --no-check-certificate ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR112/ERR1124245/m150910_184604_00127_c100822732550000001823176011031536_s1_p0.3.bax.h5
+# wget --no-check-certificate ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR112/ERR1124245/m150910_184604_00127_c100822732550000001823176011031536_s1_p0.bas.h5
+
+wget --no-check-certificate ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR114/ERR1140978/m150911_220012_00127_c100861772550000001823190702121671_s1_p0.1.bax.h5
+wget --no-check-certificate ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR114/ERR1140978/m150911_220012_00127_c100861772550000001823190702121671_s1_p0.2.bax.h5
+wget --no-check-certificate ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR114/ERR1140978/m150911_220012_00127_c100861772550000001823190702121671_s1_p0.3.bax.h5
+wget --no-check-certificate ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR114/ERR1140978/m150911_220012_00127_c100861772550000001823190702121671_s1_p0.bas.h5

 ###
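
The download lines above all follow one pattern: a run accession, its first six characters as the parent directory, a movie name, and four `.h5` suffixes. A minimal sketch of generating those URLs from the accession and movie name, which is not how the committed script is written; the function name `h5_urls` is hypothetical, and the sketch only prints URLs rather than downloading anything:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of LRSDAY): print the four .h5 URLs for one SMRT cell.
# $1 = ENA run accession, $2 = movie name (both copied from the wget lines in this diff).
h5_urls() {
    local acc="$1"
    local movie="$2"
    local dir
    local part
    # The parent directory is the first six characters of the accession, e.g. ERR108.
    dir=$(printf '%s' "$acc" | cut -c1-6)
    for part in 1.bax.h5 2.bax.h5 3.bax.h5 bas.h5; do
        echo "ftp://ftp.sra.ebi.ac.uk/vol1/run/${dir}/${acc}/${movie}.${part}"
    done
}

h5_urls ERR1080522 m150811_092723_00127_c100844062550000001823187612311514_s1_p0
```

Each printed URL could then be passed to `wget --no-check-certificate`, for example in a `while read -r url` loop, which keeps the flag in one place instead of repeating it on every line.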
