Hi,
I have recently tested haploflow on a complex metagenomics dataset, and it seems to be performing very well compared to other tools in producing correctly assembled viral contigs. For testing, I have used half of my dataset (i.e. only forward reads), with an uncompressed file size of 8 GB, and this has run without issues. I have used the conda installation, and am running this on a Linux system with 250 GB RAM. However, when I try to use the full dataset (16 GB), the RAM usage increases until it is maxed out and the program eventually crashes. Is there any way to control the memory use to avoid these issues?
Hi, unfortunately there is currently no way to control memory usage manually, and the de Bruijn graph implementation is not optimised for large metagenomic datasets (its memory footprint scales with the number of distinct k-mers). Improving the memory behaviour is on my list, though, and there may also be a bug or memory leak somewhere, since I would not expect the reverse reads to add that many new k-mers - I will investigate.
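In the meantime, one way to sanity-check whether the reverse reads really do introduce many new k-mers is to count distinct canonical k-mers in a subsample of each file and compare. The sketch below is only illustrative and is not part of Haploflow: the k-mer size (41), the file names, and the read limit are placeholder assumptions, and an exact Python set will itself use substantial RAM, so it should only be run on a subsample.

```python
#!/usr/bin/env python3
"""Rough check of how many *new* k-mers the reverse reads contribute.

Illustrative sketch only: k-mer size, file names, and the read limit are
placeholders. Run it on a subsample, not the full 16 GB dataset.
"""
import gzip
from itertools import islice

K = 41              # assumed k-mer size; adjust to the value used for Haploflow
LIMIT = 1_000_000   # number of reads to sample per file, to keep this script's own RAM low
COMP = str.maketrans("ACGT", "TGCA")


def read_seqs(path, limit=None):
    """Yield read sequences from a (possibly gzipped) FASTQ file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as fh:
        stop = None if limit is None else 4 * limit
        for i, line in enumerate(islice(fh, stop)):
            if i % 4 == 1:  # sequence lines are every 4th line, offset 1
                yield line.strip().upper()


def canonical_kmers(seq, k=K):
    """Yield canonical k-mers (lexicographic min of forward and reverse complement)."""
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if "N" in kmer:
            continue
        rc = kmer.translate(COMP)[::-1]
        yield min(kmer, rc)


def kmer_set(path, limit):
    """Collect the set of distinct canonical k-mers from the first `limit` reads."""
    kmers = set()
    for seq in read_seqs(path, limit):
        kmers.update(canonical_kmers(seq))
    return kmers


# Placeholder file names for the forward and reverse read files.
fwd = kmer_set("reads_R1.fastq.gz", LIMIT)
rev = kmer_set("reads_R2.fastq.gz", LIMIT)

print(f"distinct k-mers, forward only : {len(fwd):,}")
print(f"distinct k-mers, both files   : {len(fwd | rev):,}")
print(f"added by reverse reads        : {len((fwd | rev) - fwd):,}")
```

If the number of k-mers added by the reverse reads is roughly comparable to the forward-only count, the memory growth is expected graph scaling; if it is much larger, that would point towards a bug worth investigating.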