Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use of custom transofmration in linear models. #160

Open
SilasK opened this issue May 31, 2024 · 1 comment
Open

Use of custom transofmration in linear models. #160

SilasK opened this issue May 31, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

@SilasK
Copy link

SilasK commented May 31, 2024

I would like to use variations of CLR transformation.

e.g., non-zero or iqlr CLR, both using only a subset of the features to estimate the geometric mean.

I can transform a phyloseq object before the taxatree_models, but then it complains that the input contains non-zero values, which messes up the aggregation. I found no way to turn off the aggregation.

Ideally, one could add a custom function to tax_aggregate or directly implement these transformations in microViz.

Here is my function:

#' @title get otu table as a matrix with taxa are columns format
#' @description A wrapper to get otu table as a matrix with taxa are columns format
#' @param pseq The phyloseq object.
#' @return otu table as a matrix with taxa are columns format
#' @examples
#' get_otu_matrix(pseq)
#' @export
#' @import phyloseq
otu_matrix <- function(pseq) {
  OTU <- as(phyloseq::otu_table(pseq), "matrix")
  
  if (phyloseq::taxa_are_rows(pseq)) {
    OTU <- t(OTU)
  }
  
  OTU
}

#' @title robust (non zero) centered log transformation
#' @description A wrapper to perform Bayesian replacement of zeroes and CLR-transformation. It is rebust in the sense that it takes only values that are not zero in the calculation of the geometric mean.
#' @param pseq The phyloseq object.
#' @param log_fun The log function to use. Default is log2.
#' @return phyloseq object with CLR-transformed abundances.
#' @export
#'
rclr_transformation <- function(pseq, log_fun = log2) {
  if (class(pseq) != "phyloseq") {
    stop("pseq must be a phyloseq object.")
  }
  
  abundances <- otu_matrix(pseq)
  mdat <- microbiome::meta(pseq)
  if (!identical(rownames(abundances), rownames(mdat))) {
    stop("OTU table and metadata are not in order.")
  }
  
  
  # remove all columns with all zeros
  abundances <- abundances[, colSums(abundances) > 0]
  
  abundances.log <- zCompositions::cmultRepl(abundances, label = 0, method = "CZM",z.delete = FALSE) |>
    log_fun() |>
    as.matrix()
  
  abundances.log.nz <- abundances.log
  abundances.log.nz[abundances == 0] <- NaN
  
  
  geo_means <- rowMeans(abundances.log.nz, na.rm = TRUE)
  
  
  transformed <- abundances.log - geo_means
  
  
  
  ps <- phyloseq::phyloseq(
    phyloseq::otu_table(t(transformed), taxa_are_rows = TRUE),
    phyloseq::sample_data(pseq),
    phyloseq::phy_tree(pseq, errorIfNULL = FALSE),
    phyloseq::tax_table(pseq)
  )
  
  return(ps)
}
@david-barnett david-barnett added the enhancement New feature or request label Jun 5, 2024
@david-barnett
Copy link
Owner

interesting suggestion Silas,

I could allow users to supply any transform as a function instead of a string,

the transform would have to take and return an otu_table, acting as a drop-in replacement for this line within tax_transform

otu <- otuTransform(otu = otu, trans = trans, ...)

it would need to relax argument type checks in a few places to allow this...

I won't have time to look at this for a few weeks probably, feel free to attempt a PR if its more urgent and that will speed things up

cheers
David

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants