HullUni-bioinformatics/genomisc-training
genomisc-training

Miscellaneous materials (tutorials, exercises, testdata) to develop essential 'survival'-skills in bioinformatics.

git clone..

contact: [email protected]

  1. Introduction
  2. Reproducible science
  3. The command line
    1. Basics
    2. Writing/running scripts
  4. Docker
  5. Using HPC
    1. Connecting
    2. Job submission
      1. SLURM
      2. PBS
  6. Manipulating FASTQ data
  7. FASTQ basics
  8. FASTQ file manipulation using command line skills
  9. FASTQ trimming
    1. Fastx-toolkit
    2. Trimmomatic
  10. Error correction
    1. Illumina data
    2. PacBio data
  11. Paired-end read merging
    1. FLASh
    2. PEAR
  12. Demultiplexing
  13. Read mapping
  14. BWA
  15. Bowtie
  16. RADseq data
  17. Stacks
  18. pyRAD
  19. SNP calling
  20. Freebayes
  21. SNP annotation
  22. SnpEff
  23. Genome assembly
  24. Illumina data
    1. Velvet
    2. SPAdes
    3. Celera
  25. PacBio
    1. Canu
    2. FALCON
    3. MIRA
  26. Hybrid
    1. MIRA
    2. SPAdes
    3. Celera
  27. Meta-assembly, gap filling and polishing
    1. PBJelly
    2. quickmerge
  28. Assembly evaluation
    1. Basic stats
    2. Completeness
    3. Contamination
  29. RNAseq data
  30. De novo
  31. Reference genome based
  32. Metabarcoding
  33. A basic BLAST search
  34. MEGAN
  35. Structural genome annotation
  36. Functional genome annotation

Introduction

Manipulating FASTQ data

FASTQ file manipulation using command line skills

The UNIX command line provides simple, efficient and very powerful tools for text file manipulation. Much of the NGS data you will be processing is nothing more than plain text. Proficiency with a few of these basic tools will get you a long way (you may not need anything else), so I have prepared a number of simple exercises that should help you develop your command line skills, specifically in the context of FASTQ data manipulation. They will hopefully also help you get a feel for the kind of data you will be working with. Get started here.
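
To give a flavour of what this looks like in practice, here is a minimal sketch using only standard tools (`printf`, `wc`, `awk`). It is not taken from the exercises themselves. The file name `example.fastq` and the two reads in it are made up for illustration, and the commands assume the common four-lines-per-record FASTQ layout (sequences not wrapped over several lines).

```shell
# Create a tiny two-read FASTQ file for illustration (made-up reads, not real data).
printf '@read1\nACGTACGT\n+\nIIIIIIII\n@read2\nTTGCA\n+\nIIIII\n' > example.fastq

# Assumption: every record spans exactly four lines
# (header, sequence, '+', qualities), so reads = lines / 4.
n_reads=$(( $(wc -l < example.fastq) / 4 ))
echo "reads: $n_reads"

# Print only the sequence lines (line 2 of every 4-line record).
seqs=$(awk 'NR % 4 == 2' example.fastq)
echo "$seqs"

# Convert FASTQ to FASTA: keep header and sequence, swap the leading '@' for '>'.
awk 'NR % 4 == 1 { print ">" substr($0, 2) }
     NR % 4 == 2 { print }' example.fastq > example.fasta
```

Counting lines rather than searching for `@` is deliberate: `@` is also a valid quality character, so a pattern such as `grep -c '^@'` can overcount reads.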

Read mapping

RADseq data

SNP calling

SNP annotation

Genome assembly

RNAseq data

Metabarcoding
