Commit 6216303

Author: makirc
Update README.md
1 parent f23b55f commit 6216303

1 file changed, +6 -5 lines changed

README.md

@@ -49,7 +49,7 @@ Almost ready to go. After you prepared the files above, you may need to adjust t

 - Models and scripts as cloned from this GIT repository
 - Annotations in the `annotations/` folder
-- CADD-SV scores SV in a sorted BED format on the GRCh38 genome build. The type of SV needs to be included for each variant in the 4th column. We recommend to split files containing more than 10,000 SVs into smaller files. An example input file can be found in `input/`. The file needs to have the suffix `id_`. If you plan to process variants from another genome build or SVs in VCF format, see below.
+- CADD-SV scores SV in a coordinate sorted BED format on the GRCh38 genome build. The type of SV needs to be included for each variant in the 4th column. We recommend to split files containing more than 10,000 SVs into smaller files. An example input file can be found in `input/`. The file needs to have the suffix `id_`. If you plan to process variants from another genome build or SVs in VCF format, see below.

 ## Running the pipeline

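The input requirements in the changed line (coordinate-sorted BED, SV type in the 4th column, at most 10,000 SVs per file) could be met with a preprocessing step along these lines. This is only a sketch: the temporary directory, the toy records, and the `chunkNN_id.bed` output names are assumptions made for illustration and are not part of the repository. The chunk size is set to 2 so that the toy input is actually split; for real data it would be 10000.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical paths and names, for illustration only.
workdir="$(mktemp -d)"
input="$workdir/example.bed"
sorted="$workdir/example_sorted.bed"
max_per_file=2   # assumption for the toy input; use 10000 for real data

# Toy input: chrom, start, end, and the SV type in the 4th column (tab-separated).
printf 'chr2\t500\t900\tDEL\nchr1\t3000\t4000\tDUP\nchr1\t100\t200\tDEL\n' > "$input"

# Coordinate sort: chromosome name first, then numeric start position.
LC_ALL=C sort -t $'\t' -k1,1 -k2,2n "$input" > "$sorted"

# Write consecutive chunks of at most max_per_file records each.
awk -v n="$max_per_file" -v prefix="$workdir/chunk" '
    { out = sprintf("%s%02d_id.bed", prefix, int((NR - 1) / n)); print > out }
' "$sorted"

ls "$workdir"
```

Because the chunks are cut from the already sorted file, each chunk stays coordinate sorted on its own.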
@@ -71,10 +71,12 @@ The pipeline outputs your SV set containing all annotations in BED format in a f
 Further information about individual annotations are kept in a subfolder named after your input dataset.


-# Further Informations
+# Further Information

 ## Annotations

+CADD-SV integrates different annotations, here some links to its annotation sources. A complete list can be found as Suppl. Table 1 of the manuscript/pre-print.
+
 ##### Integrated Scores
 CADD (https://krishna.gs.washington.edu/download/CADD/bigWig/) \
 LINSIGHT (http://compgen.cshl.edu/LINSIGHT/LINSIGHT.bw)
@@ -110,7 +112,7 @@ Fantom5 enhancers (https://zenodo.org/record/556775#.Xkz3G0oo-70)
 ## Converting VCF and other genome builds

 If you want to score SVs in a VCF format or your SVs are not in GRCh38 genomebuild coordinates:
-We provide an environment to handle this.
+We provide an environment to handle this. It uses the SURVIVOR tools (https://github.com/fritzsedlazeck/SURVIVOR).

 ```bash
 conda env create -n prepBED --file envs/prepBED.yml
@@ -124,11 +126,10 @@ Fantom5 enhancers (https://zenodo.org/record/556775#.Xkz3G0oo-70)

 ```

-To lift hg19 coordinates to GRCh38 apply following steps:
+To lift hg19 coordinates to GRCh38 apply the following steps:

 ```
 conda activate prepBED
 liftOver beds/setname_hg19_id.bed dependencies/hg19ToHg38.over.chain.gz beds/setname_id.bed beds/setname_unlifted.bed
 ```

-
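After a liftOver run like the one in the hunk above, it can be worth checking how many SVs ended up in the unlifted file. UCSC liftOver writes a `#` comment line with the reason before each record it could not map, so counting the non-comment lines gives the number of dropped SVs. The sketch below uses a toy file in a temporary directory as a stand-in for `beds/setname_unlifted.bed`; the records in it are invented for illustration.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical toy file standing in for beds/setname_unlifted.bed.
workdir="$(mktemp -d)"
unlifted="$workdir/setname_unlifted.bed"
printf '#Deleted in new\nchr1\t100\t200\tDEL\n#Partially deleted in new\nchr3\t50\t80\tDUP\n' > "$unlifted"

# Lines starting with "#" hold the reason; all other lines are unmapped records.
# grep -c exits non-zero when the count is 0, hence the "|| true" under set -e.
n_unlifted="$(grep -vc '^#' "$unlifted" || true)"
echo "records not lifted: $n_unlifted"
```

Lifted coordinates may no longer be in coordinate order, so re-sorting `beds/setname_id.bed` before scoring is a reasonable precaution.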