fastq_screen_documentation.md (+13 -17)
@@ -31,7 +31,7 @@ To assist your understanding of FastQ Screen and how it should be used, we have
 
 Project Homepage
 ================
-The FastQ Screen Homepage can be found `here <http://www.bioinformatics.babraham.ac.uk/projects/fastq_screen">`_
+The FastQ Screen Homepage can be found `here <http://www.bioinformatics.babraham.ac.uk/projects/fastq_screen>`_
 
 
 Download
@@ -56,31 +56,24 @@ Before running FastQ Screen there are a few prerequisites that will need to be i
 
 1. A sequence aligner. FastQ Screen is compatible with `Bowtie <http://bowtie-bio.sourceforge.net>`_, `Bowtie2 <http://bowtie-bio.sourceforge.net>`_ or `BWA <http://bio-bwa.sourceforge.net>`_. It's easier if you put the chosen aligner in your path, but if not you can configure its location in the config file.
 
-2. We recommend running FastQ Screen in a Linux system, on which the programming language Perl should already be installed. Perl should also be pre-installed on OSX systems, or if trying to run FastQ Screen on a Windows system you may obtain Perl from `ActiveState. <http://www.activestate.com/activeperl/downloads>`_
+2. We recommend running FastQ Screen in a Linux system, on which the programming language Perl should already be installed.
 
 3. GD::Graph FastQ Screen uses the GD::Graph module to draw PNG format graphs summarising the mapping results. FastQ Screen will still produce both text and HTML format summaries of the results if GD::Graph is not installed.
 
-Windows ActivePerl users can install this using;
+You can use the built in CPAN shell to install
+this module:
 
-``ppm install GD-Graph``
+``perl -MCPAN -e "install GD"``
 
-Other platforms can use the built in CPAN shell to install
-this:
+Because GD graph uses GD this will be brought in as a dependency. GD may be easier to install using a package manager on many linux distributions. On Fedora for example you can install GD using:
 
-``perl -MCPAN -e "install GD"``
+``yum install perl-GD``
 
-Because GD graph uses GD this will be brought in as a
-dependency. GD may be easier to install using a package
-manager on many linux distributions. On Fedora for example
-you can install GD using:
-
-``yum install perl-GD``
-
-..before doing the CPAN install of GD::Graph
+..before doing the CPAN install of GD::Graph
 
 Actually installing Fastq Screen is very simple. Download the tar.gz distribution file and then do:
 
-``tar -xzf fastq_screen_v0.x.x.tar.gz``
+``tar -xzf fastq_screen_v0.x.x.tar.gz``
 
 You will see a folder called fastq\_screen\_v0.x.x has been created and the program is inside that. You can add the program to your path either by linking the program into:
 ``/usr/local/bin`` or by adding the program installation directory to your search path.
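Prerequisite 1 in the hunk above says it is easier if the chosen aligner is in your path. As a rough illustration only (not part of FastQ Screen), the following Python sketch reports which aligners can be found on the search path. The executable names `bowtie`, `bowtie2` and `bwa` are assumptions based on the usual command names and may differ on a given system; the helper `find_on_path` is hypothetical.

```python
import shutil

# Assumed executable names for Bowtie, Bowtie2 and BWA; local installs may differ.
ALIGNERS = ("bowtie", "bowtie2", "bwa")


def find_on_path(names=ALIGNERS):
    """Map each executable name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_on_path().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

An aligner reported as not found is not necessarily a problem, since the documentation notes that its location can be configured in the config file instead.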
@@ -118,9 +111,10 @@ Alternatively, pre-built Bowtie2 indices of commonly used genomes may be downloa
 The genome indices will be downloaded to a folder named "FastQ_Screen_Genomes" in your current working directory (or to another location if --outdir is specified). In addition to the genome indices, the folder FastQ_Screen_Genomes will contain a configuration file named "fastq_screen.conf", which is ready to use and lists the correct paths to the newly downloaded reference genomes. This configuration file can be passed to fastq_screen with the --conf command, or may be used as the default configuration by copying the file to the folder containing the fastq_screen script.
 
 
+
 Test Dataset
 ============
-To confirm FastQ Screen functions correctly on your system please download the Test Dataset. The file 'fastq\_screen\_test\_dataset.fastq.gz' contains reads in Sanger FASTQ format.
+To confirm FastQ Screen functions correctly on your system please download the `Test Dataset <https://www.bioinformatics.babraham.ac.uk/projects/fastq_screen/fastq_screen_test_dataset.tar.gz>`_. The file 'fastq\_screen\_test\_dataset.fastq.gz' contains reads in Sanger FASTQ format.
 
 1. Extract the tar archive before processing:
 ``tar xvzf fastq_screen_test_dataset.tar.gz``
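After extracting the archive, it can be useful to confirm that the gzipped FASTQ file is intact before running FastQ Screen on it. The sketch below is an illustration, not part of the FastQ Screen distribution: it counts records under the assumption that every read occupies exactly four lines, and the helper name `count_fastq_reads` is hypothetical.

```python
import gzip


def count_fastq_reads(path):
    """Count records in a gzipped FASTQ file, assuming four lines per record."""
    with gzip.open(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4:
        # A truncated download or a multi-line record would trip this check.
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4
```

For example, `count_fastq_reads("fastq_screen_test_dataset.fastq.gz")` would return the number of reads in the test file, provided the file sits in the current directory after extraction.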
@@ -186,6 +180,8 @@ The option --nohits is equivalent to --tag --filter 0000 (zero for every genome
 
 By adjusting the filters and, if necessary, undergoing several rounds of filtering it should be possible for a user to extract the desired reads.
 
+Filtering paired-end reads files separately will generate files with un-paired reads e.g. a read may be present in File1, but its corresponding pair may not be found in File2. Also, the order of the reads in processed files may not correspond to one another. Consequently, the resulting file pairs will need processing after filtering with FastQ Screen. `Several tools are available (although not currently produced by us) to achieve this re-pairing <https://jgi.doe.gov/data-and-tools/bbtools/bb-tools-user-guide/repair-guide>`_
+
 There may also be occasions when, after filtering a FASTQ file, the tags need to be removed from the headers of each read. This can be achieved using the script Misc/remove_tags.pl.
 
 A video tutorial explaining how to filter FASTQ files may be found `here <https://www.youtube.com/watch?v=eJcAv-Dt57I&t=1s_>`__
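The re-pairing problem described in the added paragraph can be illustrated with a small Python sketch. This is not the method used by the linked tool or by FastQ Screen; it is a minimal in-memory example under stated assumptions: each record is exactly four lines, mates share the same read name apart from an optional `/1` or `/2` suffix, and any FastQ Screen tags have already been removed from the headers. The function names `read_id`, `parse_fastq` and `repair` are hypothetical.

```python
def read_id(header):
    """Return the read name shared by both mates of a pair."""
    name = header[1:].split()[0]      # drop the leading '@' and any description
    if name.endswith(("/1", "/2")):   # old-style mate suffix (assumed convention)
        name = name[:-2]
    return name


def parse_fastq(lines):
    """Yield (read name, four-line record) for each complete FASTQ entry."""
    lines = [line.rstrip("\n") for line in lines]
    for i in range(0, len(lines) - 3, 4):
        record = lines[i:i + 4]
        yield read_id(record[0]), record


def repair(lines1, lines2):
    """Return (pairs, singles1, singles2); pairs follow the order of file 1."""
    reads2 = dict(parse_fastq(lines2))  # holds all of file 2 in memory
    pairs, singles1 = [], []
    for rid, rec1 in parse_fastq(lines1):
        rec2 = reads2.pop(rid, None)
        if rec2 is None:
            singles1.append(rec1)       # mate was filtered out of file 2
        else:
            pairs.append((rec1, rec2))
    return pairs, singles1, list(reads2.values())
```

Because the whole of the second file is held in a dictionary, this approach only suits small files; for real datasets a dedicated re-pairing tool is the more practical choice.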