Hi @muntakimrafi, I wanted to create an issue relevant to our discussion on Bluesky.
Basically, it would be really useful to provide files with precomputed pairwise leakage for all genomic elements. I imagine one element every ~200 bp or so, for popular references (hg38, mm10, T2T, GRCm39).
Ideally there could also be a tiled whole genome pipeline.
If it's useful, I'm happy to contribute some development time to parts of this task.
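To make the ~200 bp tiling concrete, here is a minimal sketch of how a chromosome could be cut into fixed-size windows. The function name `tile_windows`, the BED-style half-open coordinates, and the toy chromosome `chrTest` are assumptions for illustration only, not part of the existing repo.

```python
from typing import Iterator, Tuple


def tile_windows(
    chrom: str, length: int, size: int = 200, step: int = 200
) -> Iterator[Tuple[str, int, int]]:
    """Yield (chrom, start, end) windows in 0-based, half-open coordinates.

    The final window is truncated at the chromosome end rather than dropped.
    """
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    for start in range(0, length, step):
        yield chrom, start, min(start + size, length)


# Toy example: a hypothetical 450 bp chromosome tiled into 200 bp windows.
windows = list(tile_windows("chrTest", 450))
for chrom, start, end in windows:
    print(f"{chrom}\t{start}\t{end}")
```

A `step` smaller than `size` would give overlapping tiles, which may or may not be wanted depending on how leakage is scored.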
Hello @ejarmand, continuing from our discussion on Bluesky.
This would be an enormous amount of computation, but I think we should do it if we want to create splits of the genome with the least possible leakage.
The first step would be to run blastn genome-wide. I am thinking of creating multiple databases (one per chromosome) and running the blastn_array modules (@bkiyota just pushed them to the repo) with each chromosome as the query set. This way we can modularize the computation and divide it among multiple interested parties.
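As a rough illustration of how the per-chromosome split could be divided up, the sketch below only builds the command lines for one database per chromosome and one blastn job per (query, database) pair; it does not run anything. It does not use the blastn_array modules, whose interface is not described here. The file layout (`db/`, `tiles/`, `hits/`) and the helper names are assumptions; the `makeblastdb` and `blastn` flags shown are standard BLAST+ options.

```python
from itertools import product
from typing import List


def makeblastdb_cmd(chrom: str) -> List[str]:
    # One nucleotide database per chromosome (hypothetical file layout).
    return ["makeblastdb", "-in", f"{chrom}.fa", "-dbtype", "nucl",
            "-out", f"db/{chrom}"]


def blastn_cmd(query: str, db: str) -> List[str]:
    # Tabular output (-outfmt 6), one hits file per (query, db) pair.
    return ["blastn", "-query", f"tiles/{query}.fa", "-db", f"db/{db}",
            "-outfmt", "6", "-out", f"hits/{query}_vs_{db}.tsv"]


def plan_jobs(chroms: List[str]) -> List[List[str]]:
    # Every chromosome is queried against every database, including its own,
    # so n chromosomes give n * n independent jobs.
    return [blastn_cmd(q, d) for q, d in product(chroms, repeat=2)]


chroms = ["chr1", "chr2", "chrX"]
db_jobs = [makeblastdb_cmd(c) for c in chroms]
blast_jobs = plan_jobs(chroms)
print(len(db_jobs), "database builds,", len(blast_jobs), "blastn jobs")
```

Because each job is independent, the list could be handed out by query chromosome so that each contributor runs one row of the grid.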