A helper function allowing arbitrarily complex custom contrasts #11
A little update for using the function in interactions:
Hi @atakanekiz, that does look like a useful and intuitive function to use. However, as I've mentioned in my warning (now on the README as well), I'm in general not confident of the "universality" of this approach anymore, in particular for imbalanced designs.

    # replace all control males in batch 1 with female
    dds$sex[c(1, 3, 7)] <- "female"

But we still want to adjust for average sex differences, so we fit your model:

    design(dds) <- ~ condition + sex + batch
    dds <- DESeq(dds)

Now, let's say I wanted to simply contrast treatment A vs control:

    contraster(dds,
               group1 = list(c("condition", "treatmentA")),
               group2 = list(c("condition", "control")))

The result is:

We get coefficient weights for

    results(dds, name = "condition_treatmentA_vs_control")

In summary, yes your function seems like a really nice modification, but the general strategy might not be as "universal" as we hoped it would be.
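To make the imbalance problem concrete, here is a minimal sketch in base R (no DESeq2 needed). It assumes a hypothetical toy design with one sample per condition/sex/batch cell, not the dataset used above, and the helper name `group_mean_contrast` is made up for illustration. It builds the contrast the way the group-mean strategy does (mean model-matrix row of one group minus the mean row of the other) and compares a balanced design against one where the control male in batch 1 is relabelled as female.

```r
# Hypothetical toy design: 2 conditions x 2 sexes x 3 batches, one sample per cell
coldata <- expand.grid(
  condition = c("control", "treatmentA"),
  sex       = c("female", "male"),
  batch     = c("1", "2", "3"),
  stringsAsFactors = TRUE
)

# Group-mean contrast: mean model-matrix row of treatmentA minus that of control
# (illustrative helper, not the contraster() function from this thread)
group_mean_contrast <- function(coldata) {
  mm <- model.matrix(~ condition + sex + batch, data = coldata)
  colMeans(mm[coldata$condition == "treatmentA", , drop = FALSE]) -
    colMeans(mm[coldata$condition == "control", , drop = FALSE])
}

balanced <- group_mean_contrast(coldata)

# Make the design imbalanced: the control male in batch 1 becomes female
imbalanced_coldata <- coldata
idx <- with(coldata, condition == "control" & sex == "male" & batch == "1")
imbalanced_coldata$sex[idx] <- "female"
imbalanced <- group_mean_contrast(imbalanced_coldata)

print(round(rbind(balanced, imbalanced), 3))
```

In the balanced case only `conditiontreatmentA` gets a non-zero weight. In the imbalanced case `sexmale` also picks up a weight (1/6 in this toy design), even though the comparison of interest is only treatment A vs control.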
I see. I find your approach of creating contrast vectors the most straightforward and intuitively understandable way to specify contrasts in DESeq2. Simple designs (one factor-two/three levels or two factor-two levels) are easy to deal with by using the list or coefficient names, but things get complicated quickly. I am curious about what you think about the following points:

1- I played around with it a bit and found out that even if there is a single

2- The other approach is, making all coefficients with decimals zero when

3- When would you need to assign weights in the contrasts? When multiple data subsets are grouped together, they should be assigned the same weight as you alluded to in #7.

That would be tremendous if we can find a way to fix this unwanted behavior -- I think it will be a great service to many in the field 🦸
I agree, except that I eventually realised that this approach breaks down for imbalanced designs. So, it can be dangerous to trust it blindly. Regarding your points:

1- For the example design I gave (one of the sexes missing in one of the batches), we don't want to find sex-by-batch interactions, we just want to account for an average effect of batch across our groups (that's what the additive design is doing effectively). We're assuming that there is an average offset for each batch. For example, in batch 2 on average the expression was 3 units higher and in batch 3 on average the expression was 1 unit lower, regardless of sex or treatment. Something like that. So, we do want to include

2- I'm not sure it would work, because sometimes we may want to have decimals. For example, if I want to compare the average of treatment2 and treatment3 against control, then the coefficients for treatment2 and treatment3 should be 0.5 each.

3- Yes, as I say in #7 I'm not sure the weights make sense at all. I need to revise the whole tutorial to remove that.

The more people open issues, the more skeptical I am of my initial approach!
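As a small illustration of point 2, the sketch below uses base R and a hypothetical one-factor design with two samples per level. It shows how averaging treatment2 and treatment3 against control leads to weights of 0.5 on each treatment coefficient. The object names are assumptions made for this example.

```r
# Hypothetical one-factor design with control as the reference level
coldata <- data.frame(
  condition = factor(
    rep(c("control", "treatment2", "treatment3"), each = 2),
    levels = c("control", "treatment2", "treatment3")
  )
)
mm <- model.matrix(~ condition, data = coldata)

# Mean model-matrix row for each condition
control    <- colMeans(mm[coldata$condition == "control", , drop = FALSE])
treatment2 <- colMeans(mm[coldata$condition == "treatment2", , drop = FALSE])
treatment3 <- colMeans(mm[coldata$condition == "treatment3", , drop = FALSE])

# Average of the two treatments versus control
contrast_vec <- (treatment2 + treatment3) / 2 - control
print(contrast_vec)
```

The intercept cancels out and each treatment coefficient gets 0.5, so zeroing out every decimal weight would remove exactly the terms this comparison needs. A numeric vector like this is the kind of input that can be passed to the `contrast` argument of `results()` in DESeq2.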
At the risk of making you scream "Enough already!", another possible solution comes to mind: The unbalanced experiments cause erroneously weighted (and different) contrasts. As you said, we may want to have numbers like 0.5, 0.33, 0.25 in the contrasts vector (representing averages of 2, 3 and 4 groups respectively). However, there should be more than one of these numbers in the contrasts vector (e.g. condA --> 0.5, condB --> 0.5). Would it work to engineer something into
Hi Hugo,

Thank you for the tremendously helpful tutorial! I like the `getNumericCoef()` function you mentioned in your Gist and I wanted to improve on that a little bit to allow specifying custom groups to compare across multiple conditions. I would appreciate it if you can take a look and let me know if this could be universally (?) used to streamline analyses. In a nutshell, the function has 4 inputs:

- `dds`, with design and colData
- `group1` and `group2` arguments to specify groups to compare
- `weighted` argument to weight differing group sizes differently

The `group1` and `group2` arguments have the same syntax: character vector of length 2+ where the first position points to one of the column names in `colData(dds)` (i.e. grouping variable), and the second position (and onwards) indicate(s) specific subsets to include in that comparison group. The contrast is extracted as `group1 - group2` (i.e. group1 is the group of interest, and group2 is the reference).

This is the function
This is the reprex
A more complex reprex