test full runs in WorkflowFactory.py #131

antgonza · 2025-05-01T00:44:42Z

No description provided.

antgonza · 2025-05-05T17:56:12Z

tests/qiita-sql/91.sql

        study_abstract
    )
 VALUES
+    (


Amplicon actually inserts preps into study 1 (which exists by default in Qiita) and study 2, which doesn't exist.

antgonza · 2025-05-05T19:22:53Z

@AmandaBirmingham, this is ready for review. This supersedes #104. This version actually runs all the pipelines to the end, including submitting new preps to Qiita, which is nice and should help with future improvements. It also means that we now know all the steps for every pipeline, and that all pipelines have the same steps (even if some are just "pass").

AmandaBirmingham

A few questions and a typo :)

AmandaBirmingham · 2025-05-05T23:15:05Z

tests/test_WorkflowFactory.py

+
+        # Metagenomic is a valid data type in the default qiita test
+        # database but job-id: 78901 doesn't exist; however, if we get
+        # to here, it means that all steps have ran to completion


typo: should be "have run"

AmandaBirmingham · 2025-05-05T23:15:22Z

tests/test_WorkflowFactory.py

+        self._inject_data(wf)
+        # Metatranscriptomic is not a valid data type in the default qiita test
+        # database; however, if we get to here, it means that all steps have
+        # ran to completion and the system is trying to create the preps.


same typo as above

AmandaBirmingham · 2025-05-05T23:18:53Z

tests/test_WorkflowFactory.py

+        # to here, it means that all steps have ran to completion
+        # and the system is trying to create the preps.
+        with self.assertRaisesRegex(RuntimeError, 'invalid input '
+                                    'syntax for type uuid: "78901"'):


I am confused by the error message and the comment here. What does "invalid input syntax for type uuid" mean? The comment mentions that 78901 is a job id, so why does the error message refer to it as a "type uuid"?

Qiita job ids are UUIDs in the database; this test uses 78901 as the job id, so the database complains about the format. IMO it is fine to let it fail here, as it is the last step of the processing.

Aha, got it! For the benefit of non-experts like me reading the tests, would you consider having these repeated comments say essentially what you just did?

        # Metagenomic is a valid data type in the default qiita test
        # database but the test job-id, 78901, isn't in the
        # format of a uuid as expected; however, if we get
        # to here, it means that all steps have run to completion
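To make the point about the job id concrete, here is a minimal sketch using only Python's standard `uuid` module. It does not touch the Qiita database; the helper name `is_valid_uuid` and the sample UUID string are assumptions made for illustration. It only shows that a string like "78901" cannot be parsed as a UUID, which is the same kind of format complaint the database raises.

```python
import uuid


def is_valid_uuid(value: str) -> bool:
    """Return True if `value` parses as a UUID (hypothetical helper)."""
    try:
        uuid.UUID(value)
    except ValueError:
        # uuid.UUID raises ValueError for badly formed strings
        return False
    return True


# The test job id used in this PR is not UUID-shaped
print(is_valid_uuid("78901"))
# An arbitrary, well-formed UUID string for comparison
print(is_valid_uuid("3c9991ab-6c14-4368-a48c-841e8837a79c"))
```

A check like this could, in principle, be used to pick a UUID-shaped test id, but per the discussion above the test deliberately lets the last step fail.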

AmandaBirmingham · 2025-05-05T23:19:26Z

tests/test_WorkflowFactory.py

+
+        # Amplicon is a valid data type in the default qiita test
+        # database but job-id: 78901 doesn't exist; however, if we get
+        # to here, it means that all steps have ran to completion


same typo as above

AmandaBirmingham · 2025-05-05T23:19:40Z

tests/test_WorkflowFactory.py

+
+        # Metagenomic is a valid data type in the default qiita test
+        # database but job-id: 78901 doesn't exist; however, if we get
+        # to here, it means that all steps have ran to completion


same typo as above

AmandaBirmingham · 2025-05-05T23:30:08Z

src/qp_klp/Assays.py

+        """
+        pass
+
+    def quality_control(self):


So, post_process_raw_fastq_output is only for amplicons and quality_control is only for metaomics; they are mutually exclusive, and whichever one the assay in question has gets called in the "QC-ing reads" step of execute_pipeline:

    self.update_status("QC-ing reads", 2, 9)
    if "NuQCJob" not in self.skip_steps:
        self.post_process_raw_fastq_output()
        self.quality_control()

Given this, why have two separate methods? It seems like we could have a def qc_reads() method on Assay with no default contents, and we could override it in Amplicon with whatever is currently in Amplicon.post_process_raw_fastq_output and in MetaOmics with whatever is currently in MetaOmics.quality_control. Then execute_pipeline could just say:

    self.update_status("QC-ing reads", 2, 9)
    if "NuQCJob" not in self.skip_steps:
        self.qc_reads()

Great catch and suggestion, thank you!
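The suggested refactor can be sketched roughly as follows. This is not the code from src/qp_klp/Assays.py: the constructor, the `log` list, and the method bodies are placeholders assumed for illustration, and only the names `Assay`, `Amplicon`, `MetaOmics`, `qc_reads`, `skip_steps`, `update_status`, and `execute_pipeline` come from the discussion above.

```python
class Assay:
    """Simplified stand-in for the base class (bodies are assumptions)."""

    def __init__(self, skip_steps=()):
        self.skip_steps = set(skip_steps)
        self.log = []  # hypothetical: records what ran, for demonstration

    def update_status(self, msg, step, total):
        self.log.append(f"{msg} ({step}/{total})")

    def qc_reads(self):
        # No default behavior; each assay type overrides this hook
        pass

    def execute_pipeline(self):
        self.update_status("QC-ing reads", 2, 9)
        if "NuQCJob" not in self.skip_steps:
            self.qc_reads()


class Amplicon(Assay):
    def qc_reads(self):
        # Placeholder for what post_process_raw_fastq_output did
        self.log.append("amplicon: post-process raw fastq output")


class MetaOmics(Assay):
    def qc_reads(self):
        # Placeholder for what quality_control did
        self.log.append("metaomics: quality control")


amplicon = Amplicon()
amplicon.execute_pipeline()
print(amplicon.log)
```

With a single overridable hook, execute_pipeline no longer needs to know which of the two mutually exclusive methods applies to the assay at hand.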

AmandaBirmingham · 2025-05-05T23:30:56Z

src/qp_klp/Assays.py

+        if "GenPrepFileJob" not in self.skip_steps:
+            # preps are currently associated with array mixin, but only
+            # because there are currently some slight differences in how
+            # FastQCJob gets instantiated(). This could get moved into a


A little confused by the mention of FastQCJob here since I thought we handled that at line 189 ... is this some sort of downstream consequence of that stuff?

removing, left behind by mistake.

AmandaBirmingham · 2025-05-05T23:31:27Z

src/qp_klp/Assays.py

+        # prep is pointing to.
+        self.load_preps_into_qiita()
+
+        self.fsr.generate_report()


Could we toss this line a comment, too? I see that it was in MetaOmic.execute_pipeline but not in StandardAmpliconWorkflow.execute_pipeline, so now I'm curious about why ... :D. I also notice that at one point in MetaOmic the code checks if hasattr(self, 'fsr'): before calling self.fsr.<something>() ... do we need to worry that not every assay type or every individual assay instance will have an fsr property?

fsr is an instance of FailedSamplesRecord, which is its own object and, as far as I can tell, is used with Job.audit to keep track of the samples lost at each step of the pipeline. Here we call FailedSamplesRecord.generate_report to write out the report, so that it is moved to the final output and the user can access it. Adding this info in the code.
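As a rough illustration of the `hasattr(self, 'fsr')` guard raised in the question, here is a sketch with stand-in classes. `FailedSamplesRecordStub` and `AssayStub` are hypothetical and do not reflect the real FailedSamplesRecord interface beyond the `generate_report` name mentioned above; the `update` method and the report format are assumptions.

```python
class FailedSamplesRecordStub:
    """Hypothetical stand-in that tracks the first step where a sample was lost."""

    def __init__(self):
        self.failed = {}  # sample name -> step where it was first lost

    def update(self, samples, step):
        for sample in samples:
            # Keep only the earliest step recorded for each sample
            self.failed.setdefault(sample, step)

    def generate_report(self):
        return [f"{s}\t{step}" for s, step in sorted(self.failed.items())]


class AssayStub:
    """Hypothetical assay that may or may not have an `fsr` attribute."""

    def finish(self):
        # Guard against instances that never created an fsr
        if hasattr(self, "fsr"):
            return self.fsr.generate_report()
        return []


assay = AssayStub()
print(assay.finish())  # no fsr attribute, so nothing to report

assay.fsr = FailedSamplesRecordStub()
assay.fsr.update(["s1", "s2"], "NuQCJob")
print(assay.finish())
```

Whether the guard is needed depends on whether every assay instance is guaranteed to set `fsr`, which is the open question in the comment above.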

AmandaBirmingham

Thank you for the changes and explanations. One more request for changes--only to comments, though!--and it looks good to me.

AmandaBirmingham · 2025-05-06T16:39:41Z

tests/test_WorkflowFactory.py

+        # to here, it means that all steps have ran to completion
+        # and the system is trying to create the preps.
+        with self.assertRaisesRegex(RuntimeError, 'invalid input '
+                                    'syntax for type uuid: "78901"'):


Aha, got it! For the benefit of non-experts like me reading the tests, would you consider having these repeated comments say essentially what you just did?

        # Metagenomic is a valid data type in the default qiita test
        # database but the test job-id, 78901, isn't in the
        # format of a uuid as expected; however, if we get
        # to here, it means that all steps have run to completion
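The assertion pattern under discussion can be tried in isolation with a small sketch. The function `create_prep` is a hypothetical stand-in for the final pipeline step and simply raises the same message; the real test exercises the full workflow instead.

```python
import unittest


def create_prep(job_id):
    # Hypothetical stand-in: mimics the database rejecting a non-UUID job id
    raise RuntimeError(f'invalid input syntax for type uuid: "{job_id}"')


class ExampleTest(unittest.TestCase):
    def test_reaches_last_step(self):
        # The test job id, 78901, isn't in UUID format, so reaching this
        # error means every earlier step ran to completion
        with self.assertRaisesRegex(RuntimeError, 'invalid input '
                                    'syntax for type uuid: "78901"'):
            create_prep("78901")


# Run the single test case programmatically
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTest)
result = unittest.TextTestRunner().run(suite)
```

Note that `assertRaisesRegex` searches the exception message with a regular expression, so the expected text only needs to appear somewhere in the message.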

antgonza · 2025-05-06T18:46:15Z

Thank you @AmandaBirmingham !

antgonza added 26 commits April 22, 2025 14:35

inserting data to qiita for testing

c66c80e

testing creation of study in qiit

3443a82

add sbatch

1ea19c6

shopt

c505773

echo .profile

db3258d

slurm

23d428d

partition

026da91

scontrol create partition

e0bb95c

mv scontrol

cff7adb

partitionname

7b71a72

sudo scontrol

a886761

add some prints

fb190ab

/usr/bin/sbatch

e2d236b

sudo

980d35e

env

1ce4e3d

.local/bin/sbatch

ea9b962

ls

e77048c

sbatch in conda

2faf214

squeue

2af7794

improve error display and running tests

43e0395

sbatch

6c32c70

GITHUB_PATH

15a84b0

adding files to tests/bin

88e3b70

test_metagenomic_workflow_creation

dc5b1d0

adding _inject_data

3842dc6

merging main

68ecebc

antgonza changed the title ~~test full runs in WorkflowFactory.py~~ [WIP] test full runs in WorkflowFactory.py May 1, 2025

antgonza added 3 commits May 5, 2025 08:12

fixing some tests

5d89746

fixing other tests

9d34625

copyfile -> touch()

b368754

antgonza commented May 5, 2025

antgonza added 2 commits May 5, 2025 12:36

more copyfile -> touch()

62f3ab8

fix test_tellseq_workflow_creation

c0d2820

antgonza requested a review from AmandaBirmingham May 5, 2025 19:23

antgonza mentioned this pull request May 5, 2025

rm duplicated execute_pipeline #104

Closed

antgonza changed the title ~~[WIP] test full runs in WorkflowFactory.py~~ test full runs in WorkflowFactory.py May 5, 2025

AmandaBirmingham reviewed May 5, 2025

antgonza added 2 commits May 6, 2025 05:06

addressing @AmandaBirmingham comments

3ee3e49

flake8

7f2b47a

AmandaBirmingham reviewed May 6, 2025

addressing @AmandaBirmingham note about comments

110aeed

AmandaBirmingham approved these changes May 6, 2025

antgonza merged commit c0230b9 into qiita-spots:main May 6, 2025
2 checks passed

2 participants