Add TicChromatogram #58
base: develop
Conversation
Walkthrough

Adds "Feature table" and "TIC Chromatogram" UI components and wiring; requires the spec1 TSV in parseDeconv; builds feature_table, mass_table (with FeatureIndex arrays), and tic datasets; switches several Polars collects to streaming; and adds feature-level filtering in the update path.
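Since the walkthrough highlights the move to streaming collects, here is a tiny illustration of the Polars pattern involved — a minimal sketch, with an illustrative file path:

```python
import polars as pl

# Lazy scan plus streaming collect: the query runs in the streaming
# engine instead of materializing all intermediate data in memory at once.
lf = pl.scan_csv("spec1.tsv", separator="\t")  # path is illustrative
df = lf.collect(streaming=True)
```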
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    actor User
    participant UI as Layout Manager / Browser
    participant Init as Renderer Initialize
    participant Parser as parseDeconv / Data Store
    participant Comp as Component (Chromatogram / Tabulator)
    User->>UI: Select component ("TIC Chromatogram" or "Feature table")
    UI->>Init: Request initialization with comp_name
    Init->>Parser: Request datasets (tic, feature_table, feature_dfs, mass_table)
    Note over Parser: parseDeconv returns streamed datasets (feature_dfs, feature_table, mass_table, tic)
    Parser-->>Init: Return requested streamed dataset(s)
    Init->>Comp: Instantiate component with data_to_send (tic / feature_table / feature_dfs)
    Init-->>UI: component_arguments + data_to_send
    UI-->>User: Render component with streamed data
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Actionable comments posted: 3
🧹 Nitpick comments (1)
src/parse/deconv.py (1)
276-283: Consider a vectorized alternative for MS2 array creation.

The map_elements approach with a lambda function works correctly but may be slower for large datasets than vectorized operations. If performance becomes a concern, consider this vectorized alternative:
```diff
 pl.when(pl.col('MSLevel') == 2)
 .then(
-    pl.col('num_masses').map_elements(
-        lambda n: [-1] * n,
-        return_dtype=pl.List(pl.Int64)
-    )
+    pl.int_range(0, pl.col('num_masses'))
+    .map_elements(lambda _: -1, return_dtype=pl.Int64)
+    .implode()
 )
 .otherwise(pl.col('FeatureIndices'))
```

Or use pl.repeat if available in your Polars version. However, the current implementation is correct and may be sufficient unless profiling indicates a bottleneck.
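Building on the pl.repeat remark, a runnable sketch on a toy frame using the related Expr.repeat_by (available in recent Polars); the column names mirror the snippet above, and the data values are made up:

```python
import polars as pl

df = pl.DataFrame({
    "MSLevel": [1, 2, 2],
    "num_masses": [2, 3, 1],
    "FeatureIndices": [[0, 1], None, None],
})

out = df.with_columns(
    pl.when(pl.col("MSLevel") == 2)
    # repeat_by expands the scalar -1 into a per-row list of length
    # num_masses, entirely inside the engine -- no Python lambda involved
    .then(pl.lit(-1, dtype=pl.Int64).repeat_by(pl.col("num_masses")))
    .otherwise(pl.col("FeatureIndices"))
    .alias("FeatureIndices")
)
print(out)  # MS2 rows become [-1, -1, -1] and [-1]; MS1 rows keep their indices
```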
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- content/FLASHDeconv/FLASHDeconvLayoutManager.py (2 hunks)
- openms-streamlit-vue-component (1 hunks)
- src/parse/deconv.py (5 hunks)
- src/render/components.py (3 hunks)
- src/render/initialize.py (3 hunks)
- src/render/update.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- openms-streamlit-vue-component
🚧 Files skipped from review as they are similar to previous changes (1)
- content/FLASHDeconv/FLASHDeconvLayoutManager.py
🧰 Additional context used
🧬 Code graph analysis (2)
src/render/initialize.py (2)
src/render/components.py (4)
- FDRPlotly (72-75)
- FLASHQuant (101-104)
- Chromatogram (107-110)
- Tabulator (48-64)
src/workflow/FileManager.py (1)
- get_results (428-478)
src/parse/deconv.py (1)
src/workflow/FileManager.py (2)
- store_data (327-345)
- get_results (428-478)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-openms
- GitHub Check: build-full-app
🔇 Additional comments (8)
src/render/initialize.py (3)
6-6: LGTM! The Chromatogram import and feature data wiring follow the existing patterns. The use_polars=True flag is correctly used to load the LazyFrame stored in deconv.py.
Also applies to: 36-41

182-185: LGTM! The TIC chromatogram component initialization follows the established pattern and correctly wires the data stored in deconv.py.

186-189: LGTM! The feature table component initialization is consistent with other Tabulator components and correctly references the data stored in deconv.py.
src/render/components.py (3)
62-64: LGTM! The FeatureTable addition to Tabulator follows the established pattern and naming conventions.

107-110: LGTM! The Chromatogram component class follows the established pattern for component definitions.

9-9: No action required. The _RELEASE = False setting is intentional and correct for development mode. The comments on lines 7-8 clearly document this pattern: development uses localhost:5173 (the local dev server), while production uses the built distribution from js-component/dist. This is the proper state for an active development branch.
Likely an incorrect or invalid review comment.
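For context, the dev/prod split described above follows the usual Streamlit custom-component pattern; a minimal sketch, where the component name and exact paths are assumptions rather than taken from the repo:

```python
import streamlit.components.v1 as components

_RELEASE = False  # False: serve the Vue component from the Vite dev server

if _RELEASE:
    # Production: load the built bundle from the component's dist folder
    component_func = components.declare_component(
        "flash_viewer",            # name is an assumption
        path="js-component/dist",  # path is an assumption
    )
else:
    # Development: hot-reload from the local dev server
    component_func = components.declare_component(
        "flash_viewer", url="http://localhost:5173"
    )
```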
src/parse/deconv.py (2)
81-104: LGTM! The feature table aggregation logic correctly computes per-feature statistics, including apex detection, retention-time ranges, and intensity summaries.
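The aggregation code itself is not quoted in this review, so the following is only a hypothetical Polars sketch of the kind of per-feature statistics described (column and output names are assumptions):

```python
import polars as pl

# Toy input shaped like per-scan feature rows; values are illustrative
features = pl.DataFrame({
    "FeatureIndex": [0, 0, 1],
    "MonoisotopicMass": [1200.5, 1200.5, 980.2],
    "RetentionTime": [10.0, 11.0, 12.0],
    "SumIntensity": [5.0, 9.0, 3.0],
})

feature_table = (
    features.group_by("FeatureIndex", maintain_order=True)
    .agg(
        pl.col("MonoisotopicMass").first(),
        pl.col("RetentionTime").min().alias("RTStart"),
        pl.col("RetentionTime").max().alias("RTEnd"),
        pl.col("SumIntensity").sum().alias("TotalIntensity"),
        # apex: retention time of the most intense point in the feature
        pl.col("RetentionTime").sort_by("SumIntensity").last().alias("ApexRT"),
    )
)
print(feature_table)
```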
154-175: LGTM! The TIC table construction correctly aggregates total ion current across MS levels and maintains chronological order.
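Likewise hypothetical, since the TIC construction is not quoted here — one way such an aggregation typically looks in Polars (all names assumed):

```python
import polars as pl

# Toy per-peak rows; summing intensity per scan yields the TIC trace
peaks = pl.DataFrame({
    "ScanNum": [1, 1, 2, 3],
    "MSLevel": [1, 1, 2, 1],
    "RetentionTime": [10.0, 10.0, 11.0, 12.0],
    "Intensity": [100.0, 50.0, 30.0, 80.0],
})

tic = (
    peaks.group_by("ScanNum", maintain_order=True)
    .agg(
        pl.col("MSLevel").first(),
        pl.col("RetentionTime").first(),
        pl.col("Intensity").sum().alias("TIC"),
    )
    .sort("RetentionTime")  # keep the trace in chronological order
)
print(tic)
```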
```diff
 def parseDeconv(
     file_manager, dataset_id, out_deconv_mzML, anno_annotated_mzML,
-    spec1_tsv=None, spec2_tsv=None, logger=None
+    spec1_tsv, spec2_tsv=None, logger=None
```
Handle optional spec2_tsv parameter.
While spec1_tsv is now required, spec2_tsv remains optional (default None). However, Lines 33-38 attempt to read spec2_tsv without checking if it's None, which will cause a crash.
Apply this diff to handle the optional parameter:
```diff
     spec1_df.loc[:,'Level'] = 1
     file_manager.store_data(dataset_id, 'spec1_df', spec1_df)
-    spec2_df = pd.read_csv(
-        spec2_tsv, sep='\t', usecols=[
-            'FeatureIndex', 'MonoisotopicMass', 'SumIntensity', 'RetentionTime',
-            'ScanNum'
-        ]
-    )
-    spec2_df.loc[:,'Level'] = 2
-    file_manager.store_data(dataset_id, 'spec2_df', spec2_df)
-    del spec1_df
-    del spec2_df
+    if spec2_tsv is not None:
+        spec2_df = pd.read_csv(
+            spec2_tsv, sep='\t', usecols=[
+                'FeatureIndex', 'MonoisotopicMass', 'SumIntensity', 'RetentionTime',
+                'ScanNum'
+            ]
+        )
+        spec2_df.loc[:,'Level'] = 2
+        file_manager.store_data(dataset_id, 'spec2_df', spec2_df)
+        del spec2_df
+    else:
+        # Store empty dataframe to maintain consistent interface
+        spec2_df = pd.DataFrame(columns=['FeatureIndex', 'MonoisotopicMass', 'SumIntensity', 'RetentionTime', 'ScanNum', 'Level'])
+        file_manager.store_data(dataset_id, 'spec2_df', spec2_df)
+        del spec2_df
+    del spec1_df
```

Also applies to: 33-40
🤖 Prompt for AI Agents
In src/parse/deconv.py around lines 11 and 33-40, the function signature left
spec2_tsv defaulting to None but the code later unconditionally attempts to read
and process spec2_tsv, which will crash when None; update the code to check if
spec2_tsv is not None before attempting to open/read it and only build related
variables (e.g., spec2_df/spec2_spectra) when provided, otherwise set those
variables to None or empty structures and skip any downstream processing that
requires spec2 data; ensure any logging reflects whether spec2 was supplied.
```python
features = file_manager.get_results(
    dataset_id, ['spec1_df', 'spec2_df'], use_polars=True
)
# Build the base once
base = pl.concat([features["spec1_df"], features["spec2_df"]])

# Sort first so indices reflect first appearance order in the data
sorted_base = base.sort("RetentionTime")

# Create a ScanNum -> ScanIndex mapping in order of first occurrence
scan_index_map = (
    sorted_base
    .select("ScanNum")
    .unique(maintain_order=True)
    .with_row_count("ScanIndex")
)

# Build dataframe
features = (
    sorted_base
    # needed for MassIndex; global index after sort
    .with_row_count("RowID")
    .with_columns(
        # per-ScanNum 0-based MassIndex using RowID
        (pl.col("RowID") - pl.col("RowID").min().over("ScanNum")).alias("MassIndex"),
        # Retention time in seconds to comply with other datastructures
        (pl.col("RetentionTime") * 60).alias("RetentionTime"),
    )
    # Attach scan index
    .join(scan_index_map, on="ScanNum", how="left")
    # For now we only consider features at ms1 level
    .filter(pl.col("Level") == 1)
    # Drop helper columns
    .drop(["Level", "RowID"])
)
file_manager.store_data(dataset_id, 'feature_dfs', features)
```
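As a quick standalone sanity check of the per-scan MassIndex expression quoted above (the data values are made up):

```python
import polars as pl

df = pl.DataFrame({"ScanNum": [7, 7, 9], "RetentionTime": [1.0, 1.1, 2.0]})
out = (
    df.with_row_count("RowID")
    .with_columns(
        # subtracting the per-scan minimum RowID restarts the index at 0
        (pl.col("RowID") - pl.col("RowID").min().over("ScanNum")).alias("MassIndex")
    )
)
print(out)  # MassIndex: 0, 1 for ScanNum 7; 0 for ScanNum 9
```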
🧩 Analysis chain

🏁 Script executed:

```shell
rg "spec2_df" --type py -C 3
```

Repository: OpenMS/FLASHApp
Length of output: 174

🏁 Script executed:

```shell
rg "spec2_tsv" --type py -C 3
```

Repository: OpenMS/FLASHApp
Length of output: 6886

🏁 Script executed:

```shell
rg "file_manager.get_results\|file_manager.store_data" --type py -C 2 -A 2
```

Repository: OpenMS/FLASHApp
Length of output: 41

🏁 Script executed:

```shell
cat -n src/parse/deconv.py | head -100
```

Repository: OpenMS/FLASHApp
Length of output: 4566

🏁 Script executed:

```shell
rg "store_data.*spec2_df|spec2_df.*store" --type py -C 3
```

Repository: OpenMS/FLASHApp
Length of output: 347
spec2_df unconditional read will crash when spec2_tsv is None.
Lines 33–40 read spec2_tsv without checking if it's None. When called from Workflow.py with spec2_tsv=None (when MS2 data is absent), pd.read_csv(None, ...) will raise a TypeError. Add a conditional check: wrap lines 33–40 in if spec2_tsv is not None: before attempting to read and store spec2_df.
🤖 Prompt for AI Agents
In src/parse/deconv.py around lines 44 to 79, the code unconditionally reads and
stores spec2_df which will crash when spec2_tsv is None; wrap the block that
reads/creates/stores spec2_df (the lines that call pd.read_csv and
file_manager.store_data for spec2) in an if spec2_tsv is not None: guard so you
only attempt to read and store spec2_df when a path is provided; ensure
downstream code that concatenates features handles the missing spec2_df (e.g.,
only include spec2_df in pl.concat if it exists or create an empty DataFrame
placeholder) so concatenation and subsequent operations do not fail.
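A minimal sketch of the guarded concat the prompt asks for; names follow the quoted snippet, and the surrounding function is assumed:

```python
import polars as pl

# Stand-in for the dict returned by file_manager.get_results above
features = {
    "spec1_df": pl.DataFrame({"ScanNum": [1], "RetentionTime": [10.0]}),
    "spec2_df": None,  # i.e. spec2_tsv was not supplied
}

# Include spec2_df only when MS2 data was supplied; pl.concat of a
# single frame is effectively a no-op, so downstream sorting still works.
frames = [features["spec1_df"]]
if features.get("spec2_df") is not None:
    frames.append(features["spec2_df"])
base = pl.concat(frames)
print(base)
```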
```python
)
mass_row = feature_info.collect(streaming=True)
if mass_row.height == 0:
    data['feature_data'] = pd.DataFrame()
```
Fix type inconsistency in empty DataFrame assignments.
The empty cases assign pd.DataFrame() (Pandas), but Line 205 collects a Polars DataFrame with collect(streaming=True). This creates a type inconsistency for data['feature_data'].
Apply this diff to use Polars consistently:
```diff
     mass_row = feature_info.collect(streaming=True)
     if mass_row.height == 0:
-        data['feature_data'] = pd.DataFrame()
+        data['feature_data'] = pl.DataFrame()
     else:
         idx = mass_row.row(0, named=True)['FeatureIndex']
         if idx is None:
-            data['feature_data'] = pd.DataFrame()
+            data['feature_data'] = pl.DataFrame()
         else:
             feature_data = (
                 feature_data
                 .filter(pl.col("FeatureIndex") == idx)
                 .sort("RetentionTime")
             )
             data['feature_data'] = feature_data.collect(streaming=True)
 else:
-    data['feature_data'] = pd.DataFrame()
+    data['feature_data'] = pl.DataFrame()
```

Also applies to: 198-198, 207-207
🤖 Prompt for AI Agents
In src/render/update.py around lines 194, 198 and 207, empty cases assign pandas
DataFrame (pd.DataFrame()) which conflicts with later code that expects a Polars
DataFrame collected via collect(streaming=True); replace those pd.DataFrame()
assignments with an equivalent empty polars DataFrame (e.g., pl.DataFrame()) so
the variable type is consistently a Polars DataFrame throughout; ensure the
import for polars (pl) is available at the top of the file if not already.
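One concrete reason the consistency matters, as a tiny standalone check (the .height access mirrors the snippet above):

```python
import polars as pl
import pandas as pd

empty = pl.DataFrame()
assert empty.height == 0  # Polars frames expose .height

# pandas frames have no .height attribute, so a caller testing
# data['feature_data'].height would hit AttributeError on the empty branch
assert not hasattr(pd.DataFrame(), "height")
```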
- Send feature_table and feature_dfs data to TIC chromatogram component
- Vue component displays selected feature as red line trace
- Both TIC and feature traces share same y-axis scale
Actionable comments posted: 0
🧹 Nitpick comments (2)
src/render/initialize.py (2)
182-202: Consider using use_polars=True for consistency, and simplify type handling.

Two observations:

1. Inconsistency: Line 183 doesn't specify use_polars=True, so feature_dfs defaults to pandas. However, ms1_deconv_heat_map (line 39) fetches the same feature_dfs with use_polars=True. This makes the LazyFrame check (lines 190-192) and the Polars DataFrame check (lines 193-195) dead code in the current implementation.

2. Type checking: Using isinstance is more explicit and maintainable than duck-typing with hasattr.

If you want to use Polars consistently:
```diff
-    data = file_manager.get_results(selected_data, ['tic', 'feature_table', 'feature_dfs'])
+    data = file_manager.get_results(selected_data, ['tic', 'feature_table', 'feature_dfs'], use_polars=True)
     data_to_send['tic'] = data['tic']
     data_to_send['feature_table'] = data.get('feature_table')
     # feature_dfs contains per-scan intensity data for each feature
     feature_dfs = data.get('feature_dfs')
     if feature_dfs is not None:
         # Convert DataFrame to list of dicts for JSON serialization
-        if hasattr(feature_dfs, 'collect'):
+        if isinstance(feature_dfs, pl.LazyFrame):
             # It's a Polars LazyFrame
             df = feature_dfs.collect()
-        elif hasattr(feature_dfs, 'to_dicts'):
+        elif isinstance(feature_dfs, pl.DataFrame):
             # It's a Polars DataFrame
             df = feature_dfs
         else:
```
36-41: Inconsistent handling of Polars LazyFrame across branches.

The ms1_deconv_heat_map branch (lines 36-41) passes the feature_dfs LazyFrame to data_to_send without collecting, while the tic_chromatogram branch (lines 190-201) collects and converts it to dicts immediately. This inconsistency makes the code harder to follow. Consider collecting and converting to dicts here for consistency:

```python
feature_data = file_manager.get_results(
    selected_data, ['feature_dfs'], use_polars=True
)['feature_dfs']
if hasattr(feature_data, 'collect'):
    feature_data = feature_data.collect().to_dicts()
data_to_send['feature_data'] = feature_data
```
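For reference, a self-contained sketch of the LazyFrame-to-JSON-friendly conversion suggested above (the data is illustrative):

```python
import polars as pl

lf = pl.LazyFrame({"FeatureIndex": [0, 0, 1], "Intensity": [1.0, 2.0, 3.0]})

# collect() materializes the lazy query; to_dicts() yields list[dict],
# which serializes cleanly for the Vue component
records = lf.collect().to_dicts()
print(records)  # [{'FeatureIndex': 0, 'Intensity': 1.0}, ...]
```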
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- openms-streamlit-vue-component (1 hunks)
- src/render/initialize.py (3 hunks)
✅ Files skipped from review due to trivial changes (1)
- openms-streamlit-vue-component
🧰 Additional context used
🧬 Code graph analysis (1)
src/render/initialize.py (2)
src/render/components.py (1)
- Chromatogram (107-110)
src/workflow/FileManager.py (1)
- get_results (428-478)
🔇 Additional comments (2)
src/render/initialize.py (2)
6-6: LGTM! Import is valid and used in the new tic_chromatogram branch.

203-206: LGTM! Follows the established pattern for table components.
Clicking on the red feature trace line automatically sets integration bounds to the feature's RTStart/RTEnd and calculates both TIC and feature area.
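The area calculation itself is not shown in this excerpt; one common approach is trapezoidal integration over the selected RT window. A hypothetical sketch (function and argument names are made up):

```python
import numpy as np

def integrate_trace(rt, intensity, rt_start, rt_end):
    """Trapezoidal area of a trace restricted to [rt_start, rt_end]."""
    rt = np.asarray(rt, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    mask = (rt >= rt_start) & (rt <= rt_end)
    # np.trapezoid is the current name; np.trapz on NumPy < 2.0
    return np.trapezoid(intensity[mask], rt[mask])

# e.g. area under a feature trace between its RTStart and RTEnd
print(integrate_trace([0, 1, 2, 3], [0, 10, 10, 0], 0.0, 3.0))  # 20.0
```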
Feature trace and integration are MS1-only. When the MS2 filter is active: hide the feature trace, disable the toggle button, and clear any existing integration.
Summary by CodeRabbit
- New Features
- Performance
- Chores
- Bug Fixes
- Documentation