update docs
nvnieuwk committed Jul 5, 2023
1 parent d4e7951 commit 9143453
Showing 4 changed files with 10 additions and 41 deletions.
8 changes: 3 additions & 5 deletions README.md
@@ -8,11 +8,9 @@

 **CenterForMedicalGeneticsGhent/nf-cmgg-qdnaseq** is a bioinformatics pipeline for creating qDNAseq annotations

-1. Trim FASTQ files to read lengths of 50 with Trimgalore
-2. Align the reads with BWA (aln and samse/sampe)
-3. Create a mappability WIG file with GenMap
-4. Convert the WIG to BigWig with UCSC WigToBigWig
-5. Create the annotations using a custom R script
+1. Create a mappability WIG file with GenMap
+2. Convert the WIG to BigWig with UCSC WigToBigWig
+3. Create the annotations using a custom R script

 ## Usage

1 change: 0 additions & 1 deletion docs/parameters.md
@@ -24,7 +24,6 @@ Reference genome related files and options required for the workflow.
 | `annotation_genome` | The name of the genome used to create the annotations. This will default to the value supplied with --genome. | `string` | None | | |
 | `fasta` | Path to FASTA genome file. <details><summary>Help</summary><small>This parameter is _mandatory_ if `--genome` is not specified.</small></details> | `string` | | | |
 | `fai` | Path to FASTA genome index file. | `string` | | | |
-| `bwa` | The BWA index. | `string` | | | |
 | `blacklist` | The blacklist BED file. | `string` | | | |
 | `igenomes_base` | Directory / URL base for iGenomes references. | `string` | | | True |
 | `igenomes_ignore` | Do not load the iGenomes reference config. <details><summary>Help</summary><small>Do not load `igenomes.config` when running the pipeline. You may choose this option if you observe clashes between custom parameters and those supplied in `igenomes.config`.</small></details> | `boolean` | | | True |
35 changes: 7 additions & 28 deletions docs/usage.md
@@ -12,39 +12,18 @@ You will need to create a samplesheet with information about the samples you wou
 --input '[path to samplesheet file]'
 ```

-### Multiple runs of the same sample
-
-The `sample` identifiers have to be the same when you have re-sequenced the same sample more than once e.g. to increase sequencing depth. The pipeline will concatenate the raw reads before performing any downstream analysis. Below is an example for the same sample sequenced across 3 lanes:
-
-```console
-sample,fastq_1,fastq_2
-CONTROL_REP1,AEG588A1_S1_L002_R1_001.fastq.gz,AEG588A1_S1_L002_R2_001.fastq.gz
-CONTROL_REP1,AEG588A1_S1_L003_R1_001.fastq.gz,AEG588A1_S1_L003_R2_001.fastq.gz
-CONTROL_REP1,AEG588A1_S1_L004_R1_001.fastq.gz,AEG588A1_S1_L004_R2_001.fastq.gz
-```
-
 ### Full samplesheet

-The pipeline will auto-detect whether a sample is single- or paired-end using the information provided in the samplesheet. The samplesheet can have as many columns as you desire, however, there is a strict requirement for the first 3 columns to match those defined in the table below.
-
-A final samplesheet file consisting of both single- and paired-end data may look something like the one below. This is for 6 samples, where `TREATMENT_REP3` has been sequenced twice.
-
 ```console
-sample,fastq_1,fastq_2
-CONTROL_REP1,AEG588A1_S1_L002_R1_001.fastq.gz,AEG588A1_S1_L002_R2_001.fastq.gz
-CONTROL_REP2,AEG588A2_S2_L002_R1_001.fastq.gz,AEG588A2_S2_L002_R2_001.fastq.gz
-CONTROL_REP3,AEG588A3_S3_L002_R1_001.fastq.gz,AEG588A3_S3_L002_R2_001.fastq.gz
-TREATMENT_REP1,AEG588A4_S4_L003_R1_001.fastq.gz,
-TREATMENT_REP2,AEG588A5_S5_L003_R1_001.fastq.gz,
-TREATMENT_REP3,AEG588A6_S6_L003_R1_001.fastq.gz,
-TREATMENT_REP3,AEG588A6_S6_L004_R1_001.fastq.gz,
+cram,crai
+test.cram,test.cram.crai
+test2.cram,
 ```

-| Column | Description |
-| --------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `sample` | Custom sample name. This entry will be identical for multiple sequencing libraries/runs from the same sample. Spaces in sample names are automatically converted to underscores (`_`). |
-| `fastq_1` | Full path to FastQ file for Illumina short reads 1. File has to be gzipped and have the extension ".fastq.gz" or ".fq.gz". |
-| `fastq_2` | Full path to FastQ file for Illumina short reads 2. File has to be gzipped and have the extension ".fastq.gz" or ".fq.gz". |
+| Column | Description |
+| ------ | ---------------------------------------------------- |
+| `cram` | A input BAM or CRAM file to use for bins calculation |
+| `crai` | The index for the BAM or CRAM file. |

 An [example samplesheet](../assets/samplesheet.csv) has been provided with the pipeline.

7 changes: 0 additions & 7 deletions nextflow_schema.json
@@ -84,13 +84,6 @@
 "description": "Path to FASTA genome index file.",
 "fa_icon": "far fa-file-code"
 },
-"bwa": {
-"type": "string",
-"format": "path",
-"mimetype": "text/plain",
-"description": "The BWA index.",
-"fa_icon": "far fa-file-code"
-},
 "blacklist": {
 "type": "string",
 "format": "file-path",

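Note on the new samplesheet layout in `docs/usage.md`: the updated example has a `cram` column and a `crai` column, and its second row leaves `crai` empty. A minimal Python sketch of how such a sheet could be read is given below. It is illustrative only and not part of the pipeline: the names `SamplesheetRow` and `parse_samplesheet` are hypothetical, and the rules it enforces (header must start with `cram,crai`, `cram` must end in `.bam` or `.cram`, an empty `crai` is allowed) are assumptions inferred from the example rows and the column table, not from the pipeline's actual validation.

```python
import csv
import io
from typing import List, NamedTuple, Optional


class SamplesheetRow(NamedTuple):
    """One samplesheet entry (hypothetical helper type)."""
    cram: str
    crai: Optional[str]  # None when the index column is left empty


def parse_samplesheet(text: str) -> List[SamplesheetRow]:
    """Parse CSV text with a `cram,crai` header into rows.

    Assumed rules (not confirmed by the pipeline source):
    - the first two header columns are `cram` and `crai`
    - `cram` is required and ends in .bam or .cram
    - `crai` may be empty
    """
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    if header[:2] != ["cram", "crai"]:
        raise ValueError(f"unexpected header: {header}")

    rows: List[SamplesheetRow] = []
    # Data starts on line 2 because line 1 is the header.
    for line_no, record in enumerate(reader, start=2):
        cram = (record.get("cram") or "").strip()
        crai = (record.get("crai") or "").strip()
        if not cram:
            raise ValueError(f"line {line_no}: missing cram/bam path")
        if not cram.endswith((".bam", ".cram")):
            raise ValueError(f"line {line_no}: not a BAM/CRAM file: {cram}")
        rows.append(SamplesheetRow(cram, crai or None))
    return rows


example = "cram,crai\ntest.cram,test.cram.crai\ntest2.cram,\n"
for row in parse_samplesheet(example):
    print(row)
```

Passing the old `sample,fastq_1,fastq_2` header to this sketch raises `ValueError`, which mirrors the change in this commit: FASTQ-based samplesheets are no longer documented as valid input.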