Hi,
I have recently tested haploflow on a complex metagenomics dataset, and it seems to be performing very well compared to other tools in producing correctly assembled viral contigs. For testing, I have used half of my dataset (i.e. only forward reads), with an uncompressed file size of 8 GB, and this has run without issues. I have used the conda installation, and am running this on a Linux system with 250 GB RAM. However, when I try to use the full dataset (16 GB), the RAM usage increases until it is maxed out and the program eventually crashes. Is there any way to control the memory use to avoid these issues?
Hi, unfortunately there is currently no way to control memory usage manually, and the de Bruijn graph implementation is not optimised for large metagenomic datasets (its memory footprint scales with the number of distinct k-mers). Improving the memory behaviour is on my list, though, and there may also be a bug or memory leak somewhere, since I would not expect the reverse reads to add that many new k-mers - I will investigate.
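In the meantime, one way to sanity-check whether the reverse reads really do introduce many new k-mers is to count distinct canonical k-mers in a subsample of each file and compare. The sketch below is only illustrative and is not part of Haploflow: the k-mer size (41), the file names, and the read limit are placeholder assumptions, and an exact Python set will itself use substantial RAM, so it should only be run on a subsample.

```python
#!/usr/bin/env python3
"""Rough check of how many *new* k-mers the reverse reads contribute.

Illustrative sketch only: k-mer size, file names, and the read limit are
placeholders. Run it on a subsample, not the full 16 GB dataset.
"""
import gzip
from itertools import islice

K = 41              # assumed k-mer size; adjust to the value used for Haploflow
LIMIT = 1_000_000   # number of reads to sample per file, to keep this script's own RAM low
COMP = str.maketrans("ACGT", "TGCA")


def read_seqs(path, limit=None):
    """Yield read sequences from a (possibly gzipped) FASTQ file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as fh:
        stop = None if limit is None else 4 * limit
        for i, line in enumerate(islice(fh, stop)):
            if i % 4 == 1:  # sequence lines are every 4th line, offset 1
                yield line.strip().upper()


def canonical_kmers(seq, k=K):
    """Yield canonical k-mers (lexicographic min of forward and reverse complement)."""
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if "N" in kmer:
            continue
        rc = kmer.translate(COMP)[::-1]
        yield min(kmer, rc)


def kmer_set(path, limit):
    """Collect the set of distinct canonical k-mers from the first `limit` reads."""
    kmers = set()
    for seq in read_seqs(path, limit):
        kmers.update(canonical_kmers(seq))
    return kmers


# Placeholder file names for the forward and reverse read files.
fwd = kmer_set("reads_R1.fastq.gz", LIMIT)
rev = kmer_set("reads_R2.fastq.gz", LIMIT)

print(f"distinct k-mers, forward only : {len(fwd):,}")
print(f"distinct k-mers, both files   : {len(fwd | rev):,}")
print(f"added by reverse reads        : {len((fwd | rev) - fwd):,}")
```

If the number of k-mers added by the reverse reads is roughly comparable to the forward-only count, the memory growth is expected graph scaling; if it is much larger, that would point towards a bug worth investigating.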