Missing peak after grouping #786

Open
plyush1993 opened this issue Jan 25, 2025 · 6 comments

Comments


plyush1993 commented Jan 25, 2025

Hi,
I observe a huge difference between the output of peak grouping / peak filling and that of featureValues, even though I set minFraction = 0.1 and minSamples = 1 and each sample is its own group in my case (the sampleGroups vector is a vector of unique numeric values).
I manually found one interesting peak in one particular sample in the peak grouping object, but it did not appear in the feature table returned by featureValues.

In addition, I noticed that this peak is recognized at the MS2 level; however, I can plot the peak with both the xcms tools and the vendor software.

Regards,
Ivan

@sneumann
Owner

Hi, thanks for reaching out. Could you be a bit more specific?
Possibly by giving a reproducible example and/or more description.
"Huge difference" in what? The number of features? Or the intensities of peaks detected by, e.g., centWave and those imputed by fillPeaks()? Can you show the interesting peak you mentioned?
Yours, Steffen

@plyush1993
Author

Dear @sneumann thank you for your reply!

I attached mzXML files (acquired on a TIMS-TOF Pro2 operated in AutoMSMS (DDA) mode and converted with MSConvert for MS levels 1-2): reproducible example.zip
The peak in question has m/z 459.28 and rt 295-300. I can find it in pk_gr (after grouping), plot it with xcms::chromatogram and observe it in the vendor software; however, it isn't present in the final peak table.
Below is the code to reproduce the problem:

library(xcms)
library(parallel)
library(doParallel)
library(BiocParallel)

setwd("C:/.../reproducible.example") # main folder
wd_1 <- c("C:/.../reproducible example/") # folder with mzXML files 
files_all <- list.files(wd_1, recursive = TRUE, full.names = TRUE, pattern = "\\.mzXML$")

# Sample names: file names without the .mzXML extension
pd <- data.frame(sample_name = sub(basename(files_all), pattern = ".mzXML",
                                   replacement = "", fixed = TRUE), stringsAsFactors = FALSE)

# Read the files in on-disk mode, with pd as the phenodata
raw_data <- MSnbase::readMSData(files = files_all, pdata = new("NAnnotatedDataFrame", pd), mode = "onDisk")

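cores <- detectCores() - 1
register(bpstart(SnowParam(cores)))
# Note: this call replaces the SnowParam backend registered above as the default, so the steps below run serially
BiocParallel::register(BiocParallel::SerialParam())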

cwp <- xcms::CentWaveParam(ppm = 10, 
                           peakwidth = c(5, 50), 
                           snthresh = 10, 
                           prefilter = c(3, 300),
                           noise = 1000) 

feat_det <- xcms::findChromPeaks(raw_data, param = cwp)

data_merge <- refineChromPeaks(feat_det, MergeNeighboringPeaksParam(minProp = 0.75, expandRt = 5, ppm = 5)) 

BiocParallel::register(BiocParallel::SerialParam())
app <- xcms::ObiwarpParam(
  center = 2)

ret_cor <- xcms::adjustRtime(data_merge, param = app)

pgp <- xcms::PeakDensityParam(sampleGroups = seq_along(files_all), # each sample is its own group
                              bw = 1, 
                              minFraction =  0, 
                              minSamples = 0
) 

pk_gr <- xcms::groupChromPeaks(ret_cor, param = pgp)
ft_tbl <- featureValues(pk_gr, value = "into")
ft_inf <- featureDefinitions(pk_gr)
det_mz <- data.frame(mzmed = ft_inf$mzmed) # median m/z of each feature

plotChromPeaks(pk_gr, ylim = c(459.2700, 459.2900), xlim = c(200, 400), file = 2)
mzr_1 <- 459.28 + c(-0.005, 0.005)
chr_1 <- xcms::chromatogram(pk_gr, mz = mzr_1)
plot(chr_1)

@jorainer
Collaborator

What version of R and xcms are you using? I tried to run your example code above with the current stable version of xcms (4.4.0) but cannot find the peak you mention. plot(chr_1) generates the plot below, which shows signal in MS1 but no identified chromatographic peak (which would be shaded with a grey background). In fact, it looks like you have just 3 data points, and doing peak detection with so few data points is difficult...

Image

It may be worth mentioning that there have been fixes to the chromatogram() function to ensure that signal is only extracted for the MS level of interest. The default is chromatogram(msLevel = 1), while it is also possible to call chromatogram(msLevel = 2) to extract the MS2 data.
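
To illustrate why restricting extraction to one MS level matters for DDA data, here is a minimal base-R sketch. It is not the xcms implementation: the `pts` table, its values and the `extract_eic()` helper are hypothetical and are not taken from the attached files. It only shows how the number of data points in an m/z slice changes once MS1 and MS2 scans are separated.

```r
# Hypothetical data points from a DDA run (invented values, for illustration only)
pts <- data.frame(
  rt        = c(295.0, 295.4, 296.1, 296.5, 297.2, 297.6, 298.3),
  msLevel   = c(1L, 2L, 2L, 1L, 2L, 2L, 1L),
  mz        = c(459.280, 459.281, 459.279, 459.281, 459.280, 459.282, 459.279),
  intensity = c(1200, 400, 380, 2500, 610, 590, 1400)
)

# Same m/z window as in the code above
mzr <- 459.28 + c(-0.005, 0.005)

# Hypothetical helper: keep only points of one MS level inside the m/z window
extract_eic <- function(x, mzr, msLevel = 1L) {
  keep <- x$msLevel == msLevel & x$mz >= mzr[1] & x$mz <= mzr[2]
  x[keep, c("rt", "intensity")]
}

eic_ms1 <- extract_eic(pts, mzr, msLevel = 1L)
eic_ms2 <- extract_eic(pts, mzr, msLevel = 2L)

nrow(eic_ms1)  # number of MS1 points in the window
nrow(eic_ms2)  # number of MS2 points in the window
```

In this mock example the seven points in the window split into three MS1 and four MS2 points, so a trace that looks reasonably sampled when both levels are mixed can be very sparse at the MS1 level alone.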

@plyush1993
Author

Thank you @jorainer!
I forgot to mention that I usually use 3.17.3, but I also tried 4.4.0. Anyway, the problem seems to be related to the number of data points.
But in general, if the peak is in the feature_grouping object, why is it not included in the final peak table (the output of the featureValues function)? Is it possible to control this somehow?

plyush1993 changed the title from "Huge difference in outputs" to "Missing peak after grouping" on Jan 27, 2025
@jorainer
Collaborator

By feature_grouping object, do you mean the xcms result object after correspondence? Any LC-MS feature you have in the featureDefinitions() of that object will be (and has to be!) in the matrix returned by the featureValues() call. It may well be that some chromatographic peaks from the chromPeaks() matrix are not included/considered (i.e. if they are not part of any feature).
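
The relationship described above can be illustrated with a small base-R sketch on mock data. The `peaks` table and the `peakidx` list are hypothetical stand-ins for what `chromPeaks()` and `featureDefinitions()$peakidx` hold in an xcms result object; none of the values come from the attached files, and the matrix built at the end only mimics the shape of a `featureValues()` result.

```r
# Mock stand-in for chromPeaks(): one row per detected chromatographic peak
peaks <- data.frame(
  mz     = c(459.280, 459.281, 300.100, 150.050),
  rt     = c(297.0, 298.0, 120.0, 60.0),
  into   = c(5000, 5200, 800, 300),
  sample = c(1, 2, 1, 2)
)

# Mock stand-in for featureDefinitions()$peakidx:
# for each feature, the row indices of the peaks assigned to it
peakidx <- list(FT1 = c(1L, 2L), FT2 = 3L)

# Peaks that are not part of any feature
assigned   <- unique(unlist(peakidx))
unassigned <- setdiff(seq_len(nrow(peaks)), assigned)
peaks[unassigned, ]

# Feature-by-sample matrix built only from assigned peaks;
# unassigned peaks cannot show up here
n_samples <- 2
ft_vals <- t(sapply(peakidx, function(idx) {
  vals <- rep(NA_real_, n_samples)
  vals[peaks$sample[idx]] <- peaks$into[idx]
  vals
}))
ft_vals
```

Here the fourth mock peak is detected but belongs to no feature, so it has no row in the feature matrix. On a real result object the analogous check would presumably be `setdiff(seq_len(nrow(chromPeaks(pk_gr))), unlist(featureDefinitions(pk_gr)$peakidx))`, assuming `peakidx` indexes rows of the `chromPeaks()` matrix as in current xcms versions.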

@plyush1993
Author

I meant: if a feature is in the groupChromPeaks object, why is it not carried over into the peak table by featureDefinitions or featureValues? Let's say I observed a feature in the groupChromPeaks object, but in the final peak table there is no peak with a similar m/z and/or rt.
