WIP: production hotfixes #98
base: main
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -36,7 +36,7 @@ def __init__(self, **kwargs): | |
self.fsr = FailedSamplesRecord(self.kwargs['output_dir'], | ||
self.pipeline.sample_sheet.samples) | ||
|
||
self.master_qiita_job_id = None | ||
self.master_qiita_job_id = self.kwargs['job_id'] | ||
|
||
self.lane_number = self.kwargs['lane_number'] | ||
self.is_restart = bool(self.kwargs['is_restart']) | ||
|
@@ -74,45 +74,25 @@ def execute_pipeline(self): | |
Executes steps of pipeline in proper sequence. | ||
:return: None | ||
''' | ||
if not self.is_restart: | ||
self.pre_check() | ||
self.pre_check() | ||
|
||
# this is performed even in the event of a restart. | ||
self.generate_special_map() | ||
|
||
# even if a job is being skipped, it's being skipped because it was | ||
# determined that it already completed successfully. Hence, | ||
# increment the status because we are still iterating through them. | ||
|
||
self.update_status("Converting data", 1, 9) | ||
if "ConvertJob" not in self.skip_steps: | ||
# converting raw data to fastq depends heavily on the instrument | ||
# used to generate the run_directory. Hence this method is | ||
# supplied by the instrument mixin. | ||
# NB: convert_raw_to_fastq() now generates fsr on its own | ||
results = self.convert_raw_to_fastq() | ||
|
||
self.convert_raw_to_fastq() | ||
|
||
self.update_status("Performing quality control", 2, 9) | ||
if "NuQCJob" not in self.skip_steps: | ||
# NB: quality_control generates its own fsr | ||
self.quality_control(self.pipeline) | ||
|
||
self.quality_control() | ||
|
||
self.update_status("Generating reports", 3, 9) | ||
if "FastQCJob" not in self.skip_steps: | ||
# reports are currently implemented by the assay mixin. This is | ||
# only because metaranscriptomic runs currently require a failed- | ||
# samples report to be generated. This is not done for amplicon | ||
# runs since demultiplexing occurs downstream of SPP. | ||
results = self.generate_reports() | ||
self.fsr_write(results, 'FastQCJob') | ||
|
||
self.generate_reports() | ||
|
||
self.update_status("Generating preps", 4, 9) | ||
if "GenPrepFileJob" not in self.skip_steps: | ||
# preps are currently associated with array mixin, but only | ||
# because there are currently some slight differences in how | ||
# FastQCJob gets instantiated(). This could get moved into a | ||
# shared method, but probably still in Assay. | ||
self.generate_prep_file() | ||
|
||
self.generate_prep_file() | ||
It seems like the stuff in this method is the same as what's in the
Off the top of my head, I could push much of both into the MetaOmics base class. Those two child classes are very similar; as far as I recall, they currently differ only in the string constants we define. More than happy to include that in my next push.
Seems like a good idea?
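A rough sketch of what that consolidation could look like, assuming the two child classes really do differ only in string constants. The class names, the constant names and values, and the step list below are hypothetical placeholders rather than the project's actual code, and the steps are stubbed so the example runs on its own.

```python
class MetaOmicsSketch:
    """Hypothetical shared base class: the step sequence lives here once."""

    # Assumption: subclasses override only these string constants.
    ASSAY_NAME = "undefined"
    PREP_TYPE = "undefined"

    # Illustrative step order; the real pipeline may differ.
    STEPS = (
        "convert_raw_to_fastq",
        "quality_control",
        "generate_reports",
        "generate_prep_file",
    )

    def __init__(self):
        self.log = []

    def run_step(self, name):
        # Stub: a real implementation would dispatch to the job objects.
        self.log.append(f"{self.ASSAY_NAME}:{name}")

    def execute_pipeline(self):
        # Shared by every subclass, so the sequence is defined in one place.
        for name in self.STEPS:
            self.run_step(name)
        return self.log


class MetagenomicSketch(MetaOmicsSketch):
    ASSAY_NAME = "Metagenomic"
    PREP_TYPE = "Metagenomic"


class MetatranscriptomicSketch(MetaOmicsSketch):
    ASSAY_NAME = "Metatranscriptomic"
    PREP_TYPE = "Metatranscriptomic"
```

With this shape, a change to the step sequence is made once in the base class, and each child class shrinks to its constants.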
||
|
||
# moved final component of genprepfilejob outside of object. | ||
# obtain the paths to the prep-files generated by GenPrepFileJob | ||
|
Is this file consistent in schema with Demultiplex_Stats.csv used in the other object for reports_path?
No. Demultiplex_Stats.csv's schema is defined and maintained by Illumina and carries additional stats beyond sequence counts. SeqCounts.csv is defined and maintained by us and records only the total raw-read counts for forward and reverse reads, plus the lane number, per sample.
Consistency is not an issue in this case, since the consumer of the metadata is the metapool module, and it uses Demultiplex_Stats.csv only for its raw-read counts. The equivalent output file for bcl2fastq is similarly different. Our new method is more accurate in that it counts both the forward and reverse reads, whereas I believe with Demultiplex_Stats.csv we just double the forward count.
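For illustration only, here is a minimal sketch of counting forward and reverse reads separately and writing one row per sample. The column names (`sample`, `forward_reads`, `reverse_reads`, `lane`) and the function names are assumptions made for this example and are not the actual SeqCounts.csv schema. It also assumes plain four-line FASTQ records.

```python
import csv
import io

# Hypothetical column names; the real SeqCounts.csv header may differ.
FIELDS = ["sample", "forward_reads", "reverse_reads", "lane"]


def count_fastq_reads(handle):
    """Count records in an open FASTQ text handle (assumes 4 lines per record)."""
    lines = sum(1 for _ in handle)
    if lines % 4:
        raise ValueError("FASTQ line count is not a multiple of 4")
    return lines // 4


def seq_counts_rows(samples, lane):
    """Build one row per sample, counting forward and reverse independently.

    `samples` maps a sample name to a (forward_handle, reverse_handle) pair.
    """
    rows = []
    for name, (fwd, rev) in sorted(samples.items()):
        rows.append({
            "sample": name,
            "forward_reads": count_fastq_reads(fwd),
            "reverse_reads": count_fastq_reads(rev),
            "lane": lane,
        })
    return rows


def write_seq_counts(rows, out):
    """Write the rows as CSV to an open text handle."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


if __name__ == "__main__":
    record = "@read\nACGT\n+\nIIII\n"
    demo = {"s1": (io.StringIO(record * 3), io.StringIO(record * 2))}
    buffer = io.StringIO()
    write_seq_counts(seq_counts_rows(demo, lane=1), buffer)
    print(buffer.getvalue())
```

Counting each direction on its own means a sample whose reverse file is short or truncated shows up as a mismatch, which doubling the forward count would hide.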
Isn't it risky, and unexpected, for an attribute common across subclasses to have a different interpretation?
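One possible way to reduce that risk, sketched under assumptions: rather than having callers read a shared path attribute and guess the file's schema, each subclass could expose the same accessor with the same meaning. All names below are hypothetical, and the doubling behaviour for the forward-only source reflects only what was described earlier in this thread, not verified code.

```python
from abc import ABC, abstractmethod


class ReadCountSource(ABC):
    """Hypothetical common interface: callers never parse the file themselves."""

    @abstractmethod
    def total_raw_reads(self, sample):
        """Return the forward plus reverse raw-read total for one sample."""


class ForwardOnlyCounts(ReadCountSource):
    """Assumed case: only forward counts are available, so the total is doubled."""

    def __init__(self, forward):
        self._forward = dict(forward)

    def total_raw_reads(self, sample):
        return 2 * self._forward[sample]


class PairedCounts(ReadCountSource):
    """Assumed case: forward and reverse reads were counted separately."""

    def __init__(self, forward, reverse):
        self._forward = dict(forward)
        self._reverse = dict(reverse)

    def total_raw_reads(self, sample):
        return self._forward[sample] + self._reverse[sample]
```

The consumer then depends on `total_raw_reads()` alone, so the differing file schemas stay an implementation detail of each subclass.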