Make output bam name unique (by using sample uuid)
This ensures bams aren't overwritten if the same output is used for multiple samples
jvivian committed Jan 25, 2017
1 parent dcc39a7 commit ab3c710
Showing 2 changed files with 7 additions and 5 deletions.
6 changes: 3 additions & 3 deletions README.md
@@ -18,7 +18,7 @@ RNA-seq fastqs are combined, aligned, and quantified with 2 different methods (R

 This pipeline produces a tarball (tar.gz) file for a given sample with 3 main subdirectories: Kallisto, RSEM, and QC.
 If the pipeline is run with all possible options (`fastqc`, `bamqc`, etc), the output tar
-will have the following structure (once uncompressed):
+will have the following structure (once uncompressed), where **SAMPLE** is the unique name of the sample:

```
SAMPLE
@@ -49,10 +49,10 @@ SAMPLE

 If the user selects options such as `save-bam` or `wiggle`, additional files will appear in the output directory:
 
-- SAMPLE.sorted.bam OR rnaAligned.sortedByCoord.md.bam if `bamQC` step is enabled.
+- SAMPLE.sorted.bam OR SAMPLE.sortedByCoord.md.bam if `bamQC` step is enabled.
 - SAMPLE.wiggle.bg
 
-The output tarball is prepended with the UUID for the sample (e.g. UUID.tar.gz).
+The output tarball is prepended with the unique name for the sample (e.g. SAMPLE.tar.gz).

# Dependencies

6 changes: 4 additions & 2 deletions src/toil_rnaseq/qc.py
@@ -40,9 +40,11 @@ def run_bam_qc(job, aligned_bam_id, config):
     # Save output BAM
     if config.save_bam:
         bam_path = os.path.join(work_dir, 'rnaAligned.sortedByCoord.md.bam')
+        new_bam_path = os.path.join(work_dir, config.uuid + '.sortedByCoord.md.bam')
+        os.rename(bam_path, new_bam_path)
         if urlparse(config.output_dir).scheme == 's3' and config.ssec:
-            s3am_upload(fpath=bam_path, s3_dir=config.output_dir, s3_key_path=config.ssec)
+            s3am_upload(fpath=new_bam_path, s3_dir=config.output_dir, s3_key_path=config.ssec)
         elif urlparse(config.output_dir).scheme != 's3':
-            copy_files(file_paths=[bam_path], output_dir=config.output_dir)
+            copy_files(file_paths=[new_bam_path], output_dir=config.output_dir)

     return fail_flag, job.fileStore.writeGlobalFile(os.path.join(work_dir, 'bam_qc.tar.gz'))
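
The effect of the rename can be reproduced outside the pipeline with a small standalone sketch. It is not part of this commit: the helpers `rename_bam_for_sample` and `demo`, the temporary directories, and the sample names `sample-A` and `sample-B` are hypothetical, and `shutil.copy` stands in for the pipeline's `copy_files`. Only the two file-name patterns and the `os.rename` step are taken from the diff.

```python
import os
import shutil
import tempfile

# Fixed name the BAM file has before the rename, as in the diff.
DEFAULT_BAM = 'rnaAligned.sortedByCoord.md.bam'


def rename_bam_for_sample(work_dir, uuid):
    """Rename the fixed-name BAM in work_dir to a sample-specific name."""
    bam_path = os.path.join(work_dir, DEFAULT_BAM)
    new_bam_path = os.path.join(work_dir, uuid + '.sortedByCoord.md.bam')
    os.rename(bam_path, new_bam_path)
    return new_bam_path


def demo():
    """Process two hypothetical samples that share one output directory."""
    output_dir = tempfile.mkdtemp()
    try:
        for uuid in ('sample-A', 'sample-B'):
            work_dir = tempfile.mkdtemp()
            try:
                # Stand-in for the BAM produced by the QC step.
                with open(os.path.join(work_dir, DEFAULT_BAM), 'w') as handle:
                    handle.write(uuid)
                new_bam_path = rename_bam_for_sample(work_dir, uuid)
                # Stand-in for copy_files(...) in the pipeline.
                shutil.copy(new_bam_path, output_dir)
            finally:
                shutil.rmtree(work_dir)
        return sorted(os.listdir(output_dir))
    finally:
        shutil.rmtree(output_dir)


if __name__ == '__main__':
    print(demo())
```

Without the rename, both samples would be copied as `rnaAligned.sortedByCoord.md.bam` and the second copy would replace the first. With it, the shared output directory ends up holding one BAM per sample.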
