Is CpG status predicted at insertion variants? #40
Comments
Thanks @elcortegano, I just commented on the IGV ticket. In summary, HiFi reads always contain methylation information in insertions if primrose 5mCpG calling has been applied to them (this process runs on unaligned reads). The pileup tools do not provide output for insertions today. In the default
Thanks for all the clarifications @ctsa, especially for the input on the 5mCpG calls in the BAMs. Regarding the issue here with the pileup methods, I understand the difficulties in adding insertion data. The only standard format I can think of for that data would look like a VCF, with the different modification probabilities in the FORMAT field (?). I think that could be useful. I'll admit that for my personal use right now, IGV visualization of 5mC sites would be fine. However, relying on IGV visualization is not scalable, so I think other users might benefit from a feature like this in the future.
Okay, that sounds good. I'll be in touch with Jim to see whether we can help with the IGV visualization at the read level. For the pileup, we'll look out for a way to do this; perhaps VCF or another format will make it doable in the future.
Same requirement as @elcortegano. HiFi data is beneficial for insertion identification, and the 5mC methylation status may be closely related to these insertions, so support for 5mC calls within inserted sequence is desired.
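Until the pileup tools report insertions, the read-level calls can in principle be inspected directly. Below is a minimal sketch, not the pileup tools' own code, assuming the 5mCpG calls are stored in the standard SAM MM/ML tags, that the read is aligned to the forward strand, and that the MM tag carries a single modification type. The function names are hypothetical, and the read, CIGAR, and tag values are made up for illustration.

```python
import re

def insertion_intervals(cigar):
    """Return half-open (start, end) read-coordinate intervals of insertions."""
    intervals = []
    read_pos = 0
    for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        length = int(length)
        if op == "I":
            intervals.append((read_pos, read_pos + length))
        if op in "MIS=X":  # operations that consume read bases
            read_pos += length
    return intervals

def mm_positions(seq, mm_tag, base="C", code="m"):
    """Return 0-based read positions listed in the MM entry for base+code.

    Assumes a forward-strand read, so SEQ is in the original read orientation.
    """
    for entry in mm_tag.rstrip(";").split(";"):
        header, *deltas = entry.split(",")
        if not header.startswith(f"{base}+{code}"):
            continue
        base_idx = [i for i, b in enumerate(seq) if b == base]
        positions = []
        k = -1
        for d in deltas:
            k += int(d) + 1  # skip d unlisted bases, then take the next one
            positions.append(base_idx[k])
        return positions
    return []

def calls_in_insertions(seq, cigar, mm_tag, ml_values):
    """Return (read_pos, raw ML value 0-255) for calls inside insertions.

    Assumes ml_values holds only the values for the selected MM entry.
    """
    intervals = insertion_intervals(cigar)
    hits = []
    for pos, ml in zip(mm_positions(seq, mm_tag), ml_values):
        if any(start <= pos < end for start, end in intervals):
            hits.append((pos, ml))
    return hits

# Made-up read: 4 matched bases, a 4-base insertion, 4 matched bases.
hits = calls_in_insertions("ACGTTCGACGAA", "4M4I4M", "C+m,0,0,0;", [10, 240, 30])
print(hits)
```

Only the call on the C at read position 5 lies inside the insertion here. Reverse-strand reads would need the sequence and positions converted back to the original read orientation first, which this sketch does not attempt.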
Following a recent issue opened on the IGV site (igvteam/igv#1303), we wonder whether aligned_bam_to_cpg_scores.py would call 5mC methylated CpG sites at insertions (i.e. at sites that are not in the reference). I ran primrose on the CCS reads from my data, and 5mC sites were then called from a pbmm2-aligned BAM, similar to: