Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deviation contrast and excluding samples via design matrix #5

Open
vtorda opened this issue May 23, 2022 · 2 comments
Open

Deviation contrast and excluding samples via design matrix #5

vtorda opened this issue May 23, 2022 · 2 comments

Comments

@vtorda
Copy link

vtorda commented May 23, 2022

Hi Hugo,

Thank you for your material! It helped me understand how to handle design matrix and contrast vectors.

I have two specific topics in which I would ask for your help, and it would also be nice if you could implement these topics in your tutorial:
1) deviation contrast
2) excluding samples from contrasting

1) deviation contrast

I am trying to create a deviation contrast that is the difference between the mean of a condition to the grand mean.
I wonder whether this procedure is correct?

design(dds) <- ~ 0 + condition
dds <- DESeq(dds)
mod_mat <- model.matrix(design(dds), colData(dds))
grand_mean <- colMeans(mod_mat)
condition1 <- colMeans(mod_mat[which(dds$condition == "condition1"), ])
res3 <- results(dds, contrast = condition1 - grand_mean)

2) excluding samples from contrasting

If I want to exclude a sample from my contrast analysis but want to keep it to infer dispersion, is it ok to set all values to zero in the design matrix? But in this case, do I also need to consider that I have less sample in the contrast when I perform, e.g., a deviation contrast? More specifically, I shouldn't calculate the grand mean by colMeans, because the divisor should be the # of all samples minus the # of samples that were excluded.

By modifying the code above:

design(dds) <- ~ 0 + condition
dds <- DESeq(dds)
mod_mat <- model.matrix(design(dds), colData(dds))

I want to exclude the condition3 from the contrast:

excl_n <- sum(mod_mat[,str_detect(colnames(mod_mat), condition3)]) # number of excluded samples
mod_mat[,colnames(mod_mat) %in% condition3] <- 0
grand_mean <- apply(mod_mat, 2, function(x) sum(x) / (length(x) - excl_n))
condition1 <- colMeans(mod_mat[which(dds$condition == "condition1"), ])
res3 <- results(dds, contrast = condition1 - grand_mean)

Thank you!

Torda

@tavareshugo
Copy link
Owner

Hi @vtorda

Sorry for taking some time to reply to this. This reply is just based on looking at your code visually, without having tested it myself. But the short answer is that I think what you're doing is fine in both cases.

1)
Yes, I think setting the contrast that way would answer your question of interest.

2)
That seems like it's one way to do it, but I think perhaps you're over-complicating (unless I'm missing something).
You could perhaps do it this way:

grand_mean <- colMeans(mod_mat[which(dds$condition != "condition3"), ])

So, strictly speaking it's not the grand mean (it's the mean of all the terms except condition3), but it seems to be what you want, I think.


I'll keep this issue open, to remind me to include examples of this in the document, when I have some time. Thanks for the questions! 😃

@vtorda
Copy link
Author

vtorda commented Jun 22, 2022

Hi Hugo,

Thank you very much for your reply! You helped me a lot, not just with your answers but with your thorough explanations in your documents! Regarding the second point, I think you are right; your code is much more straightforward and does the same what I want: calculate the mean of all samples except one and contrast the mean of one of the samples and the mean of all samples except that one.

Cheers,
Torda

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants