feat: add tinit #215

JoshLoecker · 2025-01-31T19:50:39Z

Description

This pull request adds the t-INIT method from Troppo to the model-building options.

⚠️⚠️⚠️ BREAKING CHANGES ⚠️⚠️⚠️

The function create_context_specific_model no longer accepts the output_model_filetypes parameter.
The output_model_filepaths parameter of create_context_specific_model should be a list of paths, each ending in .json, .mat, .xml, or .sbml, depending on the requested output type.
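
As a rough illustration of the second breaking change, the sketch below checks that every requested output path carries one of the four suffixes listed above. The helper name `validate_output_model_filepaths` is hypothetical and not part of COMO; whatever validation `create_context_specific_model` performs internally may differ.

```python
from __future__ import annotations

from pathlib import Path

# Suffixes named in the breaking-change note above.
ALLOWED_SUFFIXES = {".json", ".mat", ".xml", ".sbml"}


def validate_output_model_filepaths(filepaths: list[str | Path]) -> list[Path]:
    """Return the inputs as Path objects, raising if any suffix is unsupported.

    Hypothetical helper for illustration only; not part of COMO's API.
    """
    paths = [Path(p) for p in filepaths]
    invalid = [str(p) for p in paths if p.suffix.lower() not in ALLOWED_SUFFIXES]
    if invalid:
        raise ValueError(f"Unsupported model file extension(s): {invalid}")
    return paths
```

Under this sketch, the output format is implied by each path's suffix, which is why a separate list of file types is no longer needed.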

Copilot

Copilot reviewed 13 out of 28 changed files in this pull request and generated 3 comments.

Files not reviewed (15)

main/como/custom_types.py: Evaluated as low risk
tests/test_rnaseq_preprocess.py: Evaluated as low risk
main/como/proteomics_gen.py: Evaluated as low risk
main/como/proteomics_preprocessing.py: Evaluated as low risk
ruff.toml: Evaluated as low risk
pyproject.toml: Evaluated as low risk
main/como/utils.py: Evaluated as low risk
main/como/graph/__init__.py: Evaluated as low risk
main/como/proteomics/FTPManager.py: Evaluated as low risk
main/como/proteomics/proteomics_preprocess.py: Evaluated as low risk
main/como/cluster_rnaseq.py: Evaluated as low risk
tests/test_proteomics.py: Evaluated as low risk
.github/workflows/continuous_integration.yml: Evaluated as low risk
main/como/__init__.py: Evaluated as low risk
.pre-commit-config.yaml: Evaluated as low risk

Comments suppressed due to low confidence (6)

tests/unit/test_rnaseq_preprocess.py:70

The test should validate the content of the DataFrame, not just the column names.

result: pd.DataFrame = await _process_first_multirun_sample(strand_filepath, all_gene_count_filepaths)

main/como/data_types.py:147

[nitpick] The method get_shorthand in the class CobraCompartments is inconsistent with the naming convention used in the rest of the codebase. It should be renamed to get to match the existing method names.

def get_shorthand(cls, longhand: str) -> str | None:

main/como/data_types.py:152

[nitpick] The method get_longhand in the class CobraCompartments is inconsistent with the naming convention used in the rest of the codebase. It should be renamed to get_long to match the existing method names.

def get_longhand(cls, shorthand: str) -> str | None:
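
For readers without the file open, the two lookups under discussion can be pictured roughly as follows. Only the two method signatures come from the review; the compartment table is an assumed three-entry subset, and the real `CobraCompartments` class in COMO may store and normalise its data differently.

```python
from __future__ import annotations


class CobraCompartments:
    # Assumed subset of compartment codes, for illustration only.
    _SHORT_TO_LONG = {"c": "cytosol", "e": "extracellular", "m": "mitochondria"}

    @classmethod
    def get_shorthand(cls, longhand: str) -> str | None:
        """Return the short code for a compartment name, or None if unknown."""
        wanted = longhand.lower()
        for short, long_name in cls._SHORT_TO_LONG.items():
            if long_name == wanted:
                return short
        return None

    @classmethod
    def get_longhand(cls, shorthand: str) -> str | None:
        """Return the compartment name for a short code, or None if unknown."""
        return cls._SHORT_TO_LONG.get(shorthand.lower())
```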

main/como/data_types.py:82

The SourceTypes enum has a member PROTEOMICS, but there is no corresponding logic or implementation for this source type in the codebase.

PROTEOMICS = "proteomics"

main/como/data_types.py:216

The _validate_attribute method in the class _BaseDataType should use the _log_and_raise_error function for consistency.

def _validate_attribute(self, key):

tests/fixtures/collect_files.py:7

[nitpick] The variable name _fragment_size_filepaths is ambiguous. It should be renamed to fragment_size_filepaths.

_fragment_size_filepaths = list(Path("main/data/COMO_input").absolute().rglob("*fragment_size*.txt"))

Copilot · 2025-01-31T19:52:29Z

tests/unit/test_data_types.py

+
+
+def test_source_types():
+    """Validate that source types always go in the order of 'trna', 'mrna', 'scrna', 'protemics'."""


The word 'protemics' should be corrected to 'proteomics'.

Suggested change

-    """Validate that source types always go in the order of 'trna', 'mrna', 'scrna', 'protemics'."""
+    """Validate that source types always go in the order of 'trna', 'mrna', 'scrna', 'proteomics'."""
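
A minimal sketch of what such an ordering test might check, relying on the fact that iterating a Python `Enum` yields members in definition order. `PROTEOMICS = "proteomics"` appears in the review above; the other member names and values are assumptions, and the actual test body in the PR is not shown here.

```python
from enum import Enum


class SourceTypes(Enum):
    # Only PROTEOMICS is confirmed by the review; the other names are assumed.
    TRNA = "trna"
    MRNA = "mrna"
    SCRNA = "scrna"
    PROTEOMICS = "proteomics"


def test_source_types():
    """Validate that source types always go in the order of 'trna', 'mrna', 'scrna', 'proteomics'."""
    expected = ["trna", "mrna", "scrna", "proteomics"]
    # Enum iteration follows definition order, so reordering members fails this test.
    assert [member.value for member in SourceTypes] == expected
```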

Copilot · 2025-01-31T19:52:29Z

tests/unit/test_rnaseq_preprocess.py

+    @pytest.mark.asyncio
+    async def test_build_from_tab_invalid_file(self):
+        """Validate error on invalid file."""
+        with pytest.raises(ValueError, match="Building STAR information requires a '.tab' file"):


The word 'requires' should be 'require'.
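
For context on what the test asserts: `pytest.raises(..., match=...)` treats `match` as a regular expression and applies `re.search` to the string form of the raised exception, so the pattern must agree with the message text the code actually raises. The stdlib-only sketch below mimics that check. `build_from_tab` is a hypothetical synchronous stand-in for the async method under test, and the message text beyond the matched phrase is an assumption.

```python
import re
from pathlib import Path


def build_from_tab(filepath: Path) -> None:
    """Hypothetical stand-in: reject anything that is not a '.tab' file."""
    if filepath.suffix != ".tab":
        raise ValueError(f"Building STAR information requires a '.tab' file; received: '{filepath}'")


def raises_matching(pattern: str, filepath: Path) -> bool:
    """Return True if build_from_tab raises ValueError whose message matches pattern."""
    try:
        build_from_tab(filepath)
    except ValueError as error:
        # Same check pytest performs for `match`: a regex search over str(error).
        return re.search(pattern, str(error)) is not None
    return False
```

In this sketch, changing the pattern to "require a" would stop it from matching unless the raised message were changed as well.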

Copilot · 2025-01-31T19:52:30Z

main/como/data_types.py

+    GIMME = "GIMME"
+    FASTCORE = "FASTCORE"
+    IMAT = "IMAT"
+    TINIT = "TINIT"


The enum Algorithm has a new member TINIT, but there is no corresponding logic or implementation for this algorithm in the codebase.
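
One way to make this kind of concern checkable is a dispatch table plus a small completeness check, sketched below. The four enum members come from the diff above; `BUILDERS`, `_placeholder`, and `unimplemented_algorithms` are hypothetical names, and nothing here reflects how COMO actually wires its algorithms to Troppo.

```python
from __future__ import annotations

from enum import Enum
from typing import Callable


class Algorithm(Enum):
    GIMME = "GIMME"
    FASTCORE = "FASTCORE"
    IMAT = "IMAT"
    TINIT = "TINIT"


def _placeholder(name: str) -> Callable[[], str]:
    """Return a dummy builder; a real one would reconstruct a model."""
    return lambda: f"built with {name}"


# Hypothetical dispatch table mapping each algorithm to its builder.
BUILDERS: dict[Algorithm, Callable[[], str]] = {
    Algorithm.GIMME: _placeholder("GIMME"),
    Algorithm.FASTCORE: _placeholder("FASTCORE"),
    Algorithm.IMAT: _placeholder("IMAT"),
    Algorithm.TINIT: _placeholder("TINIT"),
}


def unimplemented_algorithms() -> list[Algorithm]:
    """List enum members that have no entry in the dispatch table."""
    return [algorithm for algorithm in Algorithm if algorithm not in BUILDERS]
```

A unit test asserting that `unimplemented_algorithms()` is empty would fail as soon as a member is added without a matching builder.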

JoshLoecker added 30 commits December 9, 2024 15:12

feat: bring zfpkm_filter from rnaseq.py

279f7cc

feat: added matrix builder

7e2fe3d

This is functionally equivalent to the _read_counts_matrix function from rnaseq.py

feat: added tpm calculation

9eaa2ac

feat: added fpkm calculation

3670016

feat: added zfpkm transformation and calculation

1cd4dbb

feat: added zfpkm plotting

dd8698e

feat: added calculate z score

01db1ef

feat: added cpm filtering

a47f752

feat: added tpm quantile filtering

b6460c2

feat: added root filtering logic

e0c98b4

feat: added logic for performing statistical tests

a4d6f2f

feat: create metadata df

e9665bf

refactor: allow passing specific filepaths

dc7c1fe

refactor: rename variable names for easier reuse

f35e09a

style: update log message, more pythonic code

fc803bc

style: variable rename

8bdddd9

feat: update to match new approach

e0d84be

revert: use mrna instead of polya

9674459

refactor: check files returned

cde6606

Instead of checking if the stem of directories match, check that the same number of files are returned. Because the directories are sorted (and this comes from FastqToGeneCounts), they should always match

feat: allow specifying specific directories

b14d3f9

style: use more descriptive variable names

d52b0ee

refactor: use early continue

fe1d406

Removes an extra indentation

style: update warning messages

8f1d027

style: rename variables

0ef736e

refactor: move filtering technique to types

3825aa2

fix: do not use more cores than necessary

b0fa60e

Merge branch 'main' into develop

911aaaa

# Conflicts: # main/como/rnaseq.py

Merge branch 'main' into develop

2bfe760

# Conflicts: # pyproject.toml

chore: merge dependabot updates into hotfix

9539711

Merge branch 'main' into hotfix

06403a4
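
The approach described in the `refactor: check files returned` commit (cde6606) above can be sketched as follows: rather than comparing directory stems, confirm that both collections contain the same number of files and pair them in sorted order. The function name `pair_sorted_files` and the example paths are hypothetical; the sketch assumes, as the commit message does, that sorted order lines the two sets up correctly.

```python
from __future__ import annotations

from pathlib import Path


def pair_sorted_files(first: list[Path], second: list[Path]) -> list[tuple[Path, Path]]:
    """Pair two file collections by sorted position after checking their sizes agree.

    Hypothetical helper for illustration only; not taken from COMO.
    """
    if len(first) != len(second):
        raise ValueError(f"Expected the same number of files, got {len(first)} and {len(second)}")
    # Sorting both sides makes the pairing deterministic regardless of input order.
    return list(zip(sorted(first), sorted(second)))
```

The count check catches a missing file, but not two files that sort into the wrong positions, so it depends on the upstream naming staying consistent.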

JoshLoecker and others added 28 commits January 12, 2025 16:49

fix: do not create virtual environment to format notebooks

bd65dbe

Signed-off-by: Josh Loecker <[email protected]>

fix: use python 3.10 to evade errors

e419816

Signed-off-by: Josh Loecker <[email protected]>

fix: use uv tool to run nbconvert

7fa8d00

Signed-off-by: Josh Loecker <[email protected]>

style: format code, Jupyter Notebook(s), and Python imports with ruff

c6f02bc

feat: expand test suite to include python 3.11 and 3.12

0646150

Signed-off-by: Josh Loecker <[email protected]>

Merge remote-tracking branch 'origin/fix/single-cell-processing' into…

3412c31

… fix/single-cell-processing

fix: check if provided data is a path before checking if it exists

9a72ae5

Signed-off-by: Josh Loecker <[email protected]>

fix: read StringIO data instead of attempting to make dataframe from it

8259105

Signed-off-by: Josh Loecker <[email protected]>

fix: statsmodels version dependent on python version

330ed84

Signed-off-by: Josh Loecker <[email protected]>

fix: use tpm instead of quantile

b6b6633

Signed-off-by: Josh Loecker <[email protected]>

fix: typo

21f780b

Signed-off-by: Josh Loecker <[email protected]>

chore: update uv lock

b119787

Signed-off-by: Josh Loecker <[email protected]>

chore: code re-arrangement

a441231

Signed-off-by: Josh Loecker <[email protected]>

chore: remove aiofiles as a dependency

09e1852

Signed-off-by: Josh Loecker <[email protected]>

fix: renamed RNAType.(trna,mrna) to RNAType.(TRNA,MRNA)

80f32d4

Signed-off-by: Josh Loecker <[email protected]>

fix: re-attempt processing if JSON decode error occurs

2c1687b

Signed-off-by: Josh Loecker <[email protected]>

refactor: remove dependency on aiofiles

0f317f3

Signed-off-by: Josh Loecker <[email protected]>

fix: do not call listify twice

2229c51

Signed-off-by: Josh Loecker <[email protected]>

refactor: remove dependency on aiofiles

9a6a42d

Signed-off-by: Josh Loecker <[email protected]>

refactor: allow providing a list of model filepaths to write to

f288015

Signed-off-by: Josh Loecker <[email protected]>

fix: check that license information is present if using gurobi

c0d3816

Signed-off-by: Josh Loecker <[email protected]>

feat: add NONE for no logging

df86903

Signed-off-by: Josh Loecker <[email protected]>

refactor: remove dependency on aiofiles

f925078

Signed-off-by: Josh Loecker <[email protected]>

style: ruff formatting

13aa4ae

fix: make parent directories before saving files

1ac47f0

refactor: use critical log instead of warning

72c3630

refactor: reset execution output and count

8be741d

fix: rename column names

35a0bf2

Copilot bot review requested due to automatic review settings January 31, 2025 19:50
