Skip to content

Fix bug in SIF generation for replicated blanks #136

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 20 commits into from
Jul 25, 2025

Conversation

AmandaBirmingham
Copy link
Contributor

@AmandaBirmingham AmandaBirmingham commented Jul 10, 2025

  • centralize string operations and identification of blanks
  • fix creation of sample information files for blanks from sample sheets with replicates: use original blank names (without replicate destination well suffix) rather than replicate blank name

@AmandaBirmingham AmandaBirmingham changed the title centralize string operations and identification of blanks Fix bug in SIF generation for replicated blanks Jul 11, 2025
@AmandaBirmingham AmandaBirmingham requested a review from Copilot July 11, 2025 21:15
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes SIF generation for replicated blanks by centralizing blank identification and ensuring original blank names (without replicate suffixes) are used.

  • Refactored generate_sample_info_files to take a list of dereplicated input file paths and dedupe controls.
  • Propagated dereplicated_input_file_paths through GenPrepFileJob, Assays, and Workflows.
  • Extended tests with demultiplexed sample‐sheet fixtures and helper logic for multiple replicates with/without context.

Reviewed Changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated no comments.

Show a summary per file
File Description
tests/test_Pipeline.py Updated tests to call generate_sample_info_files with lists, added helper for multi-replicate scenarios
tests/data/good_sheet_w_replicates*demux*.csv New demultiplexed CSV fixtures covering 3, 2, and 1 replicate cases
tests/data/good_sheet_w_replicates_and_context*demux*.csv New demux + context fixtures for multiple project mappings
src/sequence_processing_pipeline/Pipeline.py Changed generate_sample_info_files signature to accept file paths list and drop duplicate controls
src/sequence_processing_pipeline/GenPrepFileJob.py Stored dereplicated_input_file_paths after demuxing sample sheets
src/qp_klp/Workflows.py Updated generate_sifs and replaced blank filtering to use new method
src/qp_klp/Assays.py Converted _replace_tube_ids_w_sample_names to instance method and integrated pipeline state
Comments suppressed due to low confidence (4)

src/sequence_processing_pipeline/Pipeline.py:612

  • [nitpick] Changing this signature to require a list breaks backward compatibility. Consider giving dereplicated_input_file_paths a default value of None and falling back to the single input file (or previous behavior) when not provided.
    def generate_sample_info_files(self, dereplicated_input_file_paths):

tests/data/good_sheet_w_replicates_and_context_demux_3.csv:48

  • [nitpick] Typo in the test fixture: 'Eqiiperiment' should likely be 'Experiment' to match intended field naming.
TMI_10317,10317,False,AACC,GGTT,False,Knight Lab Kapa HP,Eqiiperiment,,,,

src/qp_klp/Workflows.py:472

  • The method sample_is_a_blank does not match the pattern used elsewhere (e.g., sample_name_is_blank). This will cause an AttributeError at runtime. Use the correct method name sample_name_is_blank or delegate to the shared is_blank utility.
                       not self.pipeline.sample_sheet.sample_is_a_blank(smpl)}

tests/test_Pipeline.py:518

  • [nitpick] Helper methods typically start with _ and are not collected by test runners - consider renaming to clarify that it is not a standalone test and document its purpose.
    def _help_test_generate_sample_information_files_with_multiple_preps(

Copy link
Member

@antgonza antgonza left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks to me but I'm a little concern about not testing in qiita-rc before merge/deploy; what do you think?

@AmandaBirmingham
Copy link
Contributor Author

No, I don't think it would be a good idea to try to get it in to the deploy so close to the wire. But I would like to get it into main soon so we could test it in qiita-rc on some of our known problem cases :)

@antgonza
Copy link
Member

Can you pull from master to update this PR? Thank you.

@antgonza
Copy link
Member

I just did another pass here and the only comment is that the new tests sample-sheets have "Sample_Project" need to be updated to not contain names.

@antgonza
Copy link
Member

Nice, thank you very much!

@antgonza antgonza merged commit 6e92d3f into qiita-spots:main Jul 25, 2025
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants