Adding chromPeaks metadata to the Spectra output of chromPeakSpectra() #779

Merged
merged 19 commits into from
Dec 15, 2024
5 changes: 3 additions & 2 deletions DESCRIPTION
@@ -1,5 +1,5 @@
Package: xcms
Version: 4.5.1
Version: 4.5.2
Title: LC-MS and GC-MS Data Analysis
Description: Framework for processing and visualization of chromatographically
separated and single-spectra mass spectral data. Imports from AIA/ANDI NetCDF,
@@ -158,4 +158,5 @@ Collate:
'writemzdata.R'
'writemztab.R'
'xcmsSource.R'
'zzz.R'
'zzz.R'

13 changes: 10 additions & 3 deletions NEWS.md
@@ -1,4 +1,13 @@
# xcms 4.5.1
# xcms 4.5.2

## Changes in version 4.5.2

- Small update to `featureSpectra()` and `chromPeakSpectra()` to allow
  `chromPeaks()` and `featureDefinitions()` columns to be added to the
  `Spectra` output.
- Tidied the `xcms` vignette, to order the filtering of features and remove
  the outdated normalisation paragraph. In-depth discussion on this subject can
  be found on `metabonaut`.

## Changes in version 4.5.1

@@ -13,8 +22,6 @@
- Small fix to the .yml file for the github actions, so they do not crash on
warnings.



## Changes in version 4.3.3

- Fix issue #755: `chromatogram()` with `msLevel = 2` fails to extract
57 changes: 42 additions & 15 deletions R/AllGenerics.R
@@ -411,14 +411,19 @@ setGeneric("chromPeakData<-", function(object, value)
#'
#' Parameter `return.type` allows to specify the *type* of the result object.
#' With `return.type = "Spectra"` (the default) a [Spectra] object with all
#' matching spectra is returned. The spectra variable `"peak_id"` of the
#' returned `Spectra` contains the ID of the chromatographic peak (i.e., the
#' rowname of the peak in the `chromPeaks` matrix) for each spectrum.
#' With `return.type = "Spectra"` a `List` of `Spectra` is returned. The
#' length of the list is equal to the number of rows of `chromPeaks`. Each
#' element of the list contains thus a `Spectra` with all spectra for one
#' chromatographic peak (or a `Spectra` of length 0 if no spectrum was found
#' for the respective chromatographic peak).
#' matching spectra is returned. With `return.type = "List"` a `List` of
#' `Spectra` is returned. The length of the list is equal to the number of rows
#' of `chromPeaks`. Each element of the list thus contains a `Spectra` with all
#' spectra for one chromatographic peak (or a `Spectra` of length 0 if no
#' spectrum was found for the respective chromatographic peak).
#'
#' Parameter `chromPeakColumns` allows the user to add specific metadata
#' columns from the chromatographic peaks (`chromPeaks`) to the returned
#' spectra object. This can be useful to keep information such as retention
#' time (`rt`) or m/z (`mz`). The columns are named as in the `chromPeaks`
#' object, with the prefix `"chrom_peak_"` (e.g., `"chrom_peak_rt"`). The
#' *peak ID* (i.e., the row name of the peak in the `chromPeaks` matrix) is
#' always added to the spectra object as a metadata column named
#' `"chrom_peak_id"`.
#'
#' See also the *LC-MS/MS data analysis* vignette for more details and examples.
#'
@@ -453,6 +458,11 @@ setGeneric("chromPeakData<-", function(object, value)
#' @param return.type `character(1)` defining the type of result object that
#' should be returned.
#'
#' @param chromPeakColumns `character` vector with the names of the columns
#' from `chromPeaks` that should be added to the returned spectra object.
#' The columns will be named as they are written in the `chromPeaks` object
#' with the prefix `"chrom_peak_"`. Defaults to `c("rt", "mz")`.
#'
#' @param BPPARAM parallel processing setup. Defaults to [bpparam()].
#'
#' @param ... ignored.
@@ -500,10 +510,10 @@ setGeneric("chromPeakData<-", function(object, value)
#' ms2_sps <- chromPeakSpectra(dda)
#' ms2_sps
#'
#' ## spectra variable *peak_id* contain the row names of the peaks in the
#' ## spectra variable *chrom_peak_id* contains the row names of the peaks in
#' ## the chromPeaks matrix and thus allows mapping chromatographic peaks to
#' ## the returned MS2 spectra
#' ms2_sps$peak_id
#' ms2_sps$chrom_peak_id
#' chromPeaks(dda)
#'
#' ## Alternatively, return the result as a List of Spectra objects. This list
@@ -799,10 +809,21 @@ setGeneric("featureDefinitions<-", function(object, value)
#' spectrum **per chromatographic peak** will be returned (hence multiple
#' spectra per feature).
#'
#' The ID of each chromatographic peak (i.e. its row name in `chromPeaks`)
#' and each feature (i.e., its row name in `featureDefinitions`) are
#' available in the returned [Spectra()] with spectra variables `"peak_id"`
#' and `"feature_id"`, respectively.
#' The information from `featureDefinitions` for each feature can be included
#' in the returned [Spectra()] object using the `featureColumns` parameter.
#' This is useful for keeping details such as the median retention time (`rtmed`)
#' or median m/z (`mzmed`). The columns will retain their names as specified
#' in the `featureDefinitions` object, prefixed by `"feature_"`
#' (e.g., `"feature_mzmed"`). Additionally, the *feature ID* (i.e., the row
#' name of the feature in the `featureDefinitions` data.frame) is always added
#' as a metadata column named `"feature_id"`.
#'
#' See also [chromPeakSpectra()], as it supports a similar parameter for
#' including columns from the chromatographic peaks in the returned spectra object.
#' These parameters can be used in combination to include information from both
#' the chromatographic peaks and the features in the returned [Spectra()].
#' The *peak ID* (i.e., the row name of the peak in the `chromPeaks` matrix)
#' is added as a metadata column named `"chrom_peak_id"`.
#'
#' @param object [XcmsExperiment] or [XCMSnExp] object with feature definitions.
#'
@@ -815,6 +836,12 @@ setGeneric("featureDefinitions<-", function(object, value)
#' `featureDefinitions(x)`). This parameter overrides `skipFilled` and is
#' only supported for `return.type` being either `"Spectra"` or `"List"`.
#'
#' @param featureColumns `character` vector with the names of the columns
#' from `featureDefinitions` that should be added to the returned spectra
#' object. The columns will be named as they are written in the
#' `featureDefinitions` object with the prefix `"feature_"`.
#' Defaults to `c("rtmed", "mzmed")`.
#'
#' @param ... additional arguments to be passed along to [chromPeakSpectra()],
#' such as `method`.
#'
@@ -825,7 +852,7 @@ setGeneric("featureDefinitions<-", function(object, value)
#' the order and the length matches parameter `features` (or if no `features`
#' is defined the order of the features in `featureDefinitions(object)`).
#'
#' Spectra variables `"peak_id"` and `"feature_id"` define to which
#' Spectra variables `"chrom_peak_id"` and `"feature_id"` define which
#' chromatographic peak or feature each individual spectrum is associated
#' with.
#'
39 changes: 33 additions & 6 deletions R/XcmsExperiment-functions.R
@@ -793,10 +793,16 @@
"largest_bpi"),
msLevel = 2L, expandRt = 0, expandMz = 0,
ppm = 0, skipFilled = FALSE,
peaks = integer(), BPPARAM = bpparam()) {
peaks = integer(),
chromPeakColumns = c("rt", "mz"),
BPPARAM = bpparam()) {
method <- match.arg(method)
pks <- .chromPeaks(x)[, c("mz", "mzmin", "mzmax", "rt",
"rtmin", "rtmax", "maxo", "sample")]
if (!all(chromPeakColumns %in% colnames(.chromPeaks(x))))
stop("One or more of the columns in 'chromPeakColumns' are not ",
"available in the 'chromPeaks' data.")
pks <- .chromPeaks(x)[, union(c("mz", "mzmin", "mzmax", "rt",
"rtmin", "rtmax", "maxo", "sample"),
chromPeakColumns)]
if (ppm != 0)
expandMz <- expandMz + pks[, "mz"] * ppm / 1e6
if (expandMz[1L] != 0) {
@@ -818,7 +824,7 @@
res <- bpmapply(
split.data.frame(pks, f),
split(spectra(x), factor(fromFile(x), levels = levels(f))),
FUN = function(pk, sp, msLevel, method) {
FUN = function(pk, sp, msLevel, method, chromPeakColumns) {
sp <- filterMsLevel(sp, msLevel)
idx <- switch(
method,
@@ -829,14 +835,35 @@
largest_bpi = .spectra_index_list_largest_bpi(sp, pk, msLevel))
ids <- rep(rownames(pk), lengths(idx))
res <- sp[unlist(idx)]
res$peak_id <- ids
pk_data <- as.data.frame(pk[ids, chromPeakColumns, drop = FALSE])
pk_data$id <- ids
colnames(pk_data) <- paste0("chrom_peak_", colnames(pk_data))
res <- .add_spectra_data(res, pk_data)
res
},
MoreArgs = list(msLevel = msLevel, method = method),
MoreArgs = list(msLevel = msLevel, method = method,
chromPeakColumns = chromPeakColumns),
BPPARAM = BPPARAM)
Spectra:::.concatenate_spectra(res)
}
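
The renaming step in the hunk above can be tried in isolation. The following is a minimal base R sketch, not the package code: the matrix `pk` and the index list `idx` are made-up stand-ins for the `chromPeaks` matrix and for the per-peak spectra indices that the `.spectra_index_list_*` helpers return.

```r
## Stand-in for the chromPeaks matrix (assumed values): one row per peak,
## row names are the peak IDs.
pk <- matrix(c(100.1, 200.2, 300.3, 10, 20, 30), ncol = 2,
             dimnames = list(c("CP1", "CP2", "CP3"), c("mz", "rt")))

## Hypothetical matching result: CP1 matched two spectra, CP2 none, CP3 one.
idx <- list(c(4L, 7L), integer(), 9L)

## Repeat each peak ID once per matched spectrum.
ids <- rep(rownames(pk), lengths(idx))

## Select the requested columns for each matched spectrum, append the peak
## ID and prefix every column name with "chrom_peak_".
chromPeakColumns <- c("rt", "mz")
pk_data <- as.data.frame(pk[ids, chromPeakColumns, drop = FALSE])
pk_data$id <- ids
colnames(pk_data) <- paste0("chrom_peak_", colnames(pk_data))

colnames(pk_data)
```

Because the ID column is appended before the prefix is applied, it ends up as `chrom_peak_id` without special handling; peaks with no matching spectrum (here `CP2`) contribute no rows.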

#' @param x `Spectra` object.
#'
#' @param data `data.frame` or `matrix` with the data to be added to the
#' spectra object.
#'
#' @noRd
.add_spectra_data <- function(x, data) {
if (is(data, "matrix"))
data <- as.data.frame(data)
if (nrow(data) != length(x))
stop("Length of 'data' does not match the number of spectra in 'x'")
for (i in colnames(data)) {
x[[i]] <- data[, i]
}
x
}
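
To illustrate the behaviour of `.add_spectra_data()` without a `Spectra` object, here is a hedged sketch in which a plain `data.frame` (one row per spectrum) stands in for `x`. The function name `add_columns` and all values are hypothetical; the size check uses `nrow()` because `length()` of a `data.frame` counts columns, unlike `length()` of a `Spectra`.

```r
## Hypothetical stand-in for `.add_spectra_data()`: `x` is a data.frame with
## one row per spectrum instead of a `Spectra` object.
add_columns <- function(x, data) {
    if (is.matrix(data))
        data <- as.data.frame(data)
    if (nrow(data) != nrow(x))
        stop("Number of rows of 'data' does not match the number of rows of 'x'")
    ## Add each column of `data` as a new variable of `x`.
    for (i in colnames(data))
        x[[i]] <- data[, i]
    x
}

sps <- data.frame(msLevel = c(2L, 2L, 2L), rtime = c(10.2, 10.9, 30.1))
meta <- cbind(chrom_peak_rt = c(10, 10, 30),
              chrom_peak_mz = c(100.1, 100.1, 300.3))

res <- add_columns(sps, meta)
colnames(res)

## A size mismatch raises an error instead of silently recycling values.
err <- tryCatch(add_columns(sps, meta[1:2, ]),
                error = function(e) conditionMessage(e))
```

Checking the size up front matters because assigning a shorter vector could otherwise be recycled or fail with a less informative message, depending on the container.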

#' @param peaks `matrix` with chrom peaks.
#'
#' @param peakIdx `list` of `integer` indices defining which chromatographic
21 changes: 16 additions & 5 deletions R/XcmsExperiment.R
@@ -1229,6 +1229,7 @@ setMethod(
"largest_tic", "largest_bpi"),
msLevel = 2L, expandRt = 0, expandMz = 0, ppm = 0,
skipFilled = FALSE, peaks = character(),
chromPeakColumns = c("rt", "mz"),
return.type = c("Spectra", "List"), BPPARAM = bpparam()) {
if (hasAdjustedRtime(object))
object <- applyAdjustedRtime(object)
@@ -1244,14 +1245,15 @@ setMethod(
else pkidx <- integer()
res <- .mse_spectra_for_peaks(object, method, msLevel, expandRt,
expandMz, ppm, skipFilled, pkidx,
chromPeakColumns,
BPPARAM)
if (!length(pkidx))
peaks <- rownames(.chromPeaks(object))
else peaks <- rownames(.chromPeaks(object))[pkidx]
if (return.type == "Spectra")
res <- res[as.matrix(findMatches(peaks, res$peak_id))[, 2L]]
res <- res[as.matrix(findMatches(peaks, res$chrom_peak_id))[, 2L]]
else
as(split(res, factor(res$peak_id, levels = peaks)), "List")
as(split(res, factor(res$chrom_peak_id, levels = peaks)), "List")
})

#' @rdname reconstructChromPeakSpectra
@@ -1773,11 +1775,16 @@ setMethod(
"featureSpectra", "XcmsExperiment",
function(object, msLevel = 2L, expandRt = 0, expandMz = 0, ppm = 0,
skipFilled = FALSE, return.type = c("Spectra", "List"),
features = character(), ...) {
features = character(),
featureColumns = c("rtmed", "mzmed"),
...) {
return.type <- match.arg(return.type)
if (!hasFeatures(object))
stop("No feature definitions present. Please run ",
"'groupChromPeaks' first.")
if (!all(featureColumns %in% colnames(featureDefinitions(object))))
stop("One or more of the requested 'featureColumns' are not ",
"present in the feature definitions.")
if (hasAdjustedRtime(object))
object <- applyAdjustedRtime(object)
features_all <- rownames(featureDefinitions(object))
@@ -1794,11 +1801,15 @@ setMethod(
expandMz = expandMz, ppm = ppm, skipFilled = skipFilled,
peaks = unique(pindex), ...)
mtch <- as.matrix(
findMatches(sps$peak_id, rownames(.chromPeaks(object))[pindex]))
findMatches(sps$chrom_peak_id,
rownames(.chromPeaks(object))[pindex]))
sps <- sps[mtch[, 1L]]
fid <- rep(
ufeatures, lengths(featureDefinitions(object)$peakidx[findex]))
sps$feature_id <- fid[mtch[, 2L]]
f_data <- featureDefinitions(object)[fid[mtch[, 2L]], featureColumns,
drop = FALSE]
f_data$id <- fid[mtch[, 2L]]
colnames(f_data) <- paste0("feature_", colnames(f_data))
sps <- .add_spectra_data(sps, f_data)
if (return.type == "List") {
sps <- List(split(sps, f = factor(sps$feature_id,
levels = ufeatures)))
2 changes: 1 addition & 1 deletion R/do_adjustRtime-functions.R
@@ -874,7 +874,7 @@ NULL
resid_ratio = 3,
zero_weight = 10,
bs = "tp"){
rt_map <- rt_map[order(rt_map$obs), ]
rt_map <- rt_map[order(rt_map$obs), c("ref", "obs")]
# add first row of c(0,0) to set a fix timepoint.
rt_map <- rbind(c(0,0), rt_map)
weights <- rep(1, nrow(rt_map))
31 changes: 21 additions & 10 deletions man/chromPeakSpectra.Rd
28 changes: 23 additions & 5 deletions man/featureSpectra.Rd