- Install Conda. Skip this step if you already have an equivalent Conda distribution (Anaconda Python). Download and run the installer. Agree to the license terms by typing `yes`. The installer will ask for an installation location. On Stanford clusters (Sherlock and SCG4), we recommend installing it outside of your `$HOME` directory, since that filesystem is slow and has very limited space. At the end of the installation, choose `yes` to add Miniconda's binary to `$PATH` in your BASH startup script.

      $ wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
      $ bash Miniconda3-latest-Linux-x86_64.sh
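Before answering the installer's location prompt, it can help to confirm that the path you plan to enter really is outside `$HOME`. The following is a minimal sketch: `is_outside_home` is a hypothetical helper name, not part of Miniconda or this pipeline, and it assumes you pass an absolute path.

```shell
# Hypothetical helper (not part of Miniconda or the pipeline):
# succeeds if the given absolute path is NOT $HOME or anything below it.
is_outside_home() {
  case "$1" in
    "$HOME" | "$HOME"/*) return 1 ;;  # inside the home directory
    *) return 0 ;;                    # anywhere else
  esac
}
```

For example, `is_outside_home /scratch/me/miniconda3 && echo "OK to install here"` prints the message only when the (illustrative) scratch path does not sit under your home directory.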
- Install Conda dependencies.

      $ bash conda/uninstall_dependencies.sh  # to remove any existing pipeline env
      $ bash conda/install_dependencies.sh
- Choose `[GENOME]` from `hg19`, `hg38`, `mm9` and `mm10` and specify a destination directory. This will take several hours. We recommend against running this installer on a login node of your cluster, since it needs more than 8 GB of memory and more than 2 hours of run time.

      $ bash conda/build_genome_data.sh [GENOME] [DESTINATION_DIR]
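Because the build runs for hours, a typo in the genome name is expensive to discover late. The sketch below checks the name against the four built-in genomes before anything is started. Both function names are hypothetical and not part of the pipeline, and `build_genome_cmd` only prints the command rather than running it.

```shell
# Hypothetical pre-flight check: accept only the four built-in genome names.
is_supported_genome() {
  case "$1" in
    hg19 | hg38 | mm9 | mm10) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical wrapper: print the build command for a valid genome,
# or an error on stderr for anything else. It does not run the build.
build_genome_cmd() {
  genome="$1"
  dest="$2"
  if is_supported_genome "$genome"; then
    printf 'bash conda/build_genome_data.sh %s %s\n' "$genome" "$dest"
  else
    printf 'unsupported genome: %s\n' "$genome" >&2
    return 1
  fi
}
```

On a cluster, the printed command can then be submitted as a job that requests more than 8 GB of memory and more than 2 hours; the exact submission syntax depends on your scheduler.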
- Find a TSV file in the destination directory and use it for `"atac.genome_tsv"` in your input JSON.
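One way to locate the TSV file is sketched below. It assumes the file name ends in `.tsv` and that the file sits at the top level of the destination directory; neither detail is stated by the installer, so adjust the pattern if your layout differs. `find_genome_tsv` is a hypothetical helper name.

```shell
# Hypothetical helper: list TSV files directly inside the destination directory.
# Assumes the genome TSV ends in ".tsv" and is not nested in a subdirectory.
find_genome_tsv() {
  find "$1" -maxdepth 1 -type f -name '*.tsv'
}
```

Calling `find_genome_tsv [DESTINATION_DIR]` prints one path per matching file; that path is the value to use for `"atac.genome_tsv"`.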
- You can build your own genome database if your reference genome has one of the following file types: `.fasta.gz`, `.fa.gz`, `.fasta.bz2`, `.fa.bz2`, `.2bit`.
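A quick extension check can catch an unsupported reference file before any time is spent on the build. This is a minimal sketch: `is_supported_reference` is a hypothetical helper name, it looks only at the file name or URL suffix (not the file contents), and it takes the short bzip2 form to be `.fa.bz2`.

```shell
# Hypothetical check: does the file name (or URL) end in a supported extension?
# Only the suffix is inspected; the file itself is never opened.
is_supported_reference() {
  case "$1" in
    *.fasta.gz | *.fa.gz | *.fasta.bz2 | *.fa.bz2 | *.2bit) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `is_supported_reference my_genome.fa.gz` succeeds, while an uncompressed `my_genome.fa` is rejected and would need to be compressed first.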
- Get a URL for your reference genome. You may need to upload the file somewhere on the internet first.
- Get a URL for a gzipped blacklist BED file for your genome. If you don't have one, skip this step. An example blacklist for hg38 is here.
- Find the following lines in `conda/build_genome_data.sh` and modify them. Give your genome a good name `[YOUR_OWN_GENOME]`.

      ...
      elif [[ $GENOME == "YOUR_OWN_GENOME" ]]; then
        REF_FA="URL_FOR_YOUR_FASTA_OR_2BIT"
        BLACKLIST= # leave it empty if you don't have it
      ...
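To show how a filled-in branch sits inside the surrounding `if`/`elif` chain, here is a self-contained sketch. It is not the content of the real `conda/build_genome_data.sh`: the function name `select_genome`, the genome name `my_genome`, and every URL are hypothetical placeholders. The real script uses bash's `[[ ... ]]`; the sketch uses the portable `[ ... ]` form so it runs in any POSIX shell.

```shell
# Sketch only: the real script has more branches and more variables.
# All names and URLs below are hypothetical placeholders.
select_genome() {
  GENOME="$1"
  if [ "$GENOME" = "hg38" ]; then
    REF_FA="URL_FOR_HG38_FASTA"                   # placeholder, not the real URL
    BLACKLIST="URL_FOR_HG38_BLACKLIST"            # placeholder, not the real URL
  elif [ "$GENOME" = "my_genome" ]; then
    REF_FA="https://example.org/my_genome.fa.gz"  # hypothetical URL
    BLACKLIST=                                    # empty: no blacklist available
  else
    echo "unknown genome: $GENOME" >&2
    return 1
  fi
}
```

After `select_genome my_genome`, `REF_FA` holds the reference URL and `BLACKLIST` is empty, which mirrors the "leave it empty if you don't have it" case.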
- Specify a destination directory for your genome database and run the installer. This will take several hours.

      $ bash conda/build_genome_data.sh [YOUR_OWN_GENOME] [DESTINATION_DIR]
- Find a TSV file in the destination directory and use it for `"atac.genome_tsv"` in your input JSON.
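As an illustration of where the TSV path ends up, the sketch below prints a JSON fragment containing only the `"atac.genome_tsv"` key. A real input JSON for the pipeline has further keys that are not shown here. `genome_tsv_json` is a hypothetical helper name, and it assumes the path contains no characters that need JSON escaping.

```shell
# Hypothetical helper: print a minimal JSON fragment for a given TSV path.
# Assumes the path has no quotes or backslashes that would need escaping.
genome_tsv_json() {
  printf '{\n  "atac.genome_tsv": "%s"\n}\n' "$1"
}
```

Running `genome_tsv_json /path/to/your_genome.tsv` prints the fragment to standard output, from where the key can be copied into your input JSON.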