version update for LRSDAY: v1.1.0 -> v1.2.0
Showing 86 changed files with 64,238 additions and 26,500 deletions.
@@ -0,0 +1,17 @@
total sequence count: 33
total sequence length: 12490496
min sequence length: 1248
max sequence length: 1480288
mean sequence length: 378499.88
median sequence length: 84643.00
N50: 923711
L50: 6
N90: 341493
L90: 14
A%: 30.88
T%: 30.79
G%: 19.13
C%: 19.16
AT%: 61.67
GC%: 38.29
N%: 0.04
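The commit does not show how the N50 and L50 values above were produced. As a rough illustration only, the sketch below uses one common definition: sort the sequence lengths in descending order, accumulate them until the running sum reaches half of the total length, and report the length at that point (N50) and the number of sequences used (L50). The function name `n50_l50` and the toy input are hypothetical and are not part of LRSDAY.

```shell
#!/bin/bash
# Hypothetical helper (not from LRSDAY): read one sequence length per line
# on stdin and print N50 and L50 under the common "reach half of the total
# length" definition.
n50_l50() {
    sort -rn | awk '
        { len[NR] = $1; total += $1 }
        END {
            half = total / 2
            for (i = 1; i <= NR; i++) {
                running += len[i]
                if (running >= half) {
                    printf "N50: %d\nL50: %d\n", len[i], i
                    exit
                }
            }
        }'
}

# Toy input with made-up lengths (not taken from the assembly above).
printf '%s\n' 100 400 200 300 | n50_l50
```

For the toy input the total is 1000, so the running sum first reaches 500 at the second-longest sequence (400 + 300), giving an N50 of 300 and an L50 of 2.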
49 changes: 49 additions & 0 deletions
Project_Template/00.Long_Reads/LRSDAY.00.Long_Reads_Preprocessing.sh
@@ -0,0 +1,49 @@
#!/bin/bash
set -e -o pipefail
#######################################
# load environment variables for LRSDAY
source ./../../env.sh

#######################################
# set project-specific variables

prefix="YGL3210" # The file name prefix for the output files.
reads="./../00.Long_Reads/YGL3210.fq.gz" # The file path of the long reads file (in fastq or fastq.gz format).
reads_type="nanopore-raw" # The long reads data type: "pacbio-raw" or "pacbio-corrected" or "nanopore-raw" or "nanopore-corrected".
run_filtering="yes" # Whether to filter the reads: "yes" or "no". Default = "yes".
genome_size="12500000" # The haploid genome size (in bp) of the sequenced organism. Default = "12500000" (i.e. 12.5 Mb for the budding yeast S. cerevisiae genome). This is used to calculate the targeted sequencing coverage after read filtering (see below).
post_filtering_coverage="30" # Targeted sequencing coverage after read filtering. Default = "30" (i.e. 30x coverage).
threads=1 # The number of threads to use. Default = "1".

#######################################
# process the pipeline

# total number of bases to keep = haploid genome size x targeted coverage
filtlong_target_bases=$(($genome_size * $post_filtering_coverage))
echo ""
echo "genome_size=$genome_size, post_filtering_coverage=$post_filtering_coverage, filtlong_target_bases=$filtlong_target_bases"
echo ""
if [[ "$reads_type" == "nanopore-raw" || "$reads_type" == "nanopore-corrected" ]]
then
    # nanopore reads: trim adapters with porechop first, then optionally filter
    "$porechop_dir/porechop" -i "$reads" -o "$prefix.porechop.fastq.gz" --threads "$threads" > "$prefix.porechop.summary.txt"
    if [[ "$run_filtering" == "yes" ]]
    then
        "$filtlong_dir/filtlong" --min_length 1000 --keep_percent 90 --target_bases "$filtlong_target_bases" "$prefix.porechop.fastq.gz" | gzip > "$prefix.filtlong.fastq.gz"
    fi
else
    # pacbio reads: no adapter trimming, only the optional filtering
    if [[ "$run_filtering" == "yes" ]]
    then
        "$filtlong_dir/filtlong" --min_length 1000 --keep_percent 90 --target_bases "$filtlong_target_bases" "$reads" | gzip > "$prefix.filtlong.fastq.gz"
    fi
fi

############################
# checking bash exit status
# (with "set -e -o pipefail" above, a failed command normally aborts the script before this check)
if [[ $? -eq 0 ]]
then
    echo ""
    echo "LRSDAY message: This bash script has been successfully processed! :)"
    echo ""
    echo ""
    exit 0
fi
############################
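The preprocessing script derives the filtlong target from the genome size and the targeted coverage, and its comments list four accepted `reads_type` values without checking them. The sketch below works through that arithmetic with the script's default values and shows one possible way to validate `reads_type` up front. Both helper functions, `compute_target_bases` and `is_valid_reads_type`, are hypothetical illustrations and are not part of LRSDAY.

```shell
#!/bin/bash
# Hypothetical helper (not from LRSDAY): target bases = genome size x coverage.
compute_target_bases() {
    echo $(( $1 * $2 ))
}

# Hypothetical helper (not from LRSDAY): succeed only for the four
# reads_type values documented in the script's comments.
is_valid_reads_type() {
    case "$1" in
        pacbio-raw|pacbio-corrected|nanopore-raw|nanopore-corrected) return 0 ;;
        *) return 1 ;;
    esac
}

# Default values taken from the script above.
genome_size="12500000"
post_filtering_coverage="30"
echo "filtlong_target_bases=$(compute_target_bases "$genome_size" "$post_filtering_coverage")"
# prints: filtlong_target_bases=375000000

if is_valid_reads_type "nanopore-raw"
then
    echo "reads_type OK"
fi
```

With the defaults, 12,500,000 bp at 30x coverage gives a target of 375,000,000 bases. Rejecting an unknown `reads_type` early would avoid silently falling through to the PacBio branch on a typo, though whether that is desirable for LRSDAY is a design choice for its maintainers.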
41 changes: 41 additions & 0 deletions
Project_Template/00.Long_Reads/LRSDAY.00.PacBio.RSII_bax2bam.sh
@@ -0,0 +1,41 @@
#!/bin/bash
set -e -o pipefail
#######################################
# load environment variables for LRSDAY
source ./../../env.sh

#######################################
# set project-specific variables

prefix="SK1.SMRTCell.1" # The file name prefix for the output files. For the testing example, run this script four times with the prefix set to "SK1.SMRTCell.1", "SK1.SMRTCell.2", "SK1.SMRTCell.3", and "SK1.SMRTCell.4" respectively.
pacbio_RSII_bax_fofn_file="./pacbio_fofn_files/SK1.SMRTCell.1.RSII_bax.fofn" # The fofn file containing the file paths to the PacBio RSII bax reads from the same SMRT cell. If you have data from multiple SMRT cells, please run this script separately for each of them. Do not mix reads from different SMRT cells even if they come from the same sample. For the testing example, you can set pacbio_RSII_bax_fofn_file="./pacbio_fofn_files/$prefix.RSII_bax.fofn" so that this parameter is derived automatically from the prefix parameter.

#######################################
# process the pipeline

source "$miniconda2_dir/activate" "$conda_pacbio_dir/../../conda_pacbio_env"
"$conda_pacbio_dir/bax2bam" \
    --fofn="$pacbio_RSII_bax_fofn_file" \
    -o "./pacbio_fofn_files/$prefix.bax2bam" \
    --subread \
    --pulsefeatures=DeletionQV,DeletionTag,InsertionQV,IPD,MergeQV,SubstitutionQV,PulseWidth,SubstitutionTag

# remove the scraps files and record the absolute path of the subreads bam file in a fofn file
cd pacbio_fofn_files
rm "$prefix.bax2bam.scraps.bam"
rm "$prefix.bax2bam.scraps.bam.pbi"
echo "$(pwd)/$prefix.bax2bam.subreads.bam" > "$prefix.bam.fofn"
cd ..

############################
# checking bash exit status
# (with "set -e -o pipefail" above, a failed command normally aborts the script before this check)
if [[ $? -eq 0 ]]
then
    echo ""
    echo "LRSDAY message: This bash script has been successfully processed! :)"
    echo ""
    echo ""
    exit 0
fi
############################
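The comments in the bax2bam script ask for one run per SMRT cell, with the fofn path following the prefix. As a minimal sketch of that naming convention, the loop below derives the fofn path for each of the four testing-example prefixes. It only prints the pairs and does not call bax2bam; the helper `fofn_for_prefix` is hypothetical and is not part of LRSDAY.

```shell
#!/bin/bash
# Hypothetical helper (not from LRSDAY): derive the per-SMRT-cell fofn path
# from the prefix, following the pattern given in the script's comments.
fofn_for_prefix() {
    echo "./pacbio_fofn_files/$1.RSII_bax.fofn"
}

# The four prefixes named for the testing example.
for i in 1 2 3 4
do
    prefix="SK1.SMRTCell.$i"
    echo "$prefix -> $(fofn_for_prefix "$prefix")"
done
```

Each printed pair corresponds to one separate run of the script, which keeps reads from different SMRT cells apart as the comments require.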