I'm wondering how the genetic variants on CpG sites are treated by aligned_bam_to_cpg_scores.
For example, when a diploid genome has a heterozygous SNP on a CpG site, how will the coverage and modified/unmodified site counts in output files be affected?
Best,
Kyuto
First note that if the heterozygous SNP creates a CpG that isn't present in the reference, then you'll only see output for that site when --modsites-mode is set to the denovo option.
When output is generated for a heterozygous SNP site, I believe the current logic will give the non-CpG reads a methylation probability of zero, and count them towards the unmodified coverage.
To my understanding, if one has a heterozygous SNP on a fully methylated CpG site, the modification probability will be evaluated as ~0.5.
Such a site needs attention in interpretation, especially when the interest is in the effects of epigenetic regulation.
I think it could be helpful if the counts of non-CpG reads were shown for each site in the output.
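To make the arithmetic concrete, here is a minimal sketch of the counting behaviour described above. It assumes, as stated in the reply, that reads without the CpG are given a methylation probability of zero and counted as unmodified. The function `summarize_site`, its `threshold` parameter, and the output field names are hypothetical and written only for illustration; this is not the tool's actual code or output format.

```python
def summarize_site(cpg_read_probs, n_non_cpg_reads, threshold=0.5):
    """Hypothetical per-site tally; illustrative only, not the tool's implementation."""
    # Reads carrying the CpG allele: call a read modified if its
    # methylation probability exceeds the (assumed) threshold.
    mod = sum(1 for p in cpg_read_probs if p > threshold)
    unmod_cpg = len(cpg_read_probs) - mod

    # Assumption taken from the reply above: non-CpG reads are treated as
    # probability zero, so they are added to the unmodified count.
    unmod = unmod_cpg + n_non_cpg_reads
    coverage = mod + unmod

    return {
        "coverage": coverage,
        "mod": mod,
        "unmod": unmod,
        # The extra column suggested in this comment.
        "non_cpg": n_non_cpg_reads,
        "mod_fraction": mod / coverage if coverage else float("nan"),
        # Fraction computed only over reads that actually carry the CpG.
        "mod_fraction_cpg_only": (
            mod / len(cpg_read_probs) if cpg_read_probs else float("nan")
        ),
    }


# Heterozygous SNP on a fully methylated CpG: 15 reads carry the CpG
# (all confidently methylated), 15 reads carry the non-CpG allele.
site = summarize_site([0.95] * 15, n_non_cpg_reads=15)
print(site)
```

In this example the site is reported as half modified even though every read that carries the CpG is methylated. A per-site non-CpG read count would let users tell this case apart from a CpG that is genuinely 50% methylated.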
Thanks @qsonehara, I think this is a good suggestion for us to offer as an option. We can leave this open as a feature ticket and see if it can be added in a future update.