
[FEATURE REQUEST] Can we increase the memory efficiency of radEmu? #88

Open
adw96 opened this issue Oct 17, 2024 · 4 comments

adw96 (Contributor) commented Oct 17, 2024

In #84, @svteichman noted that one of the reasons it's expensive to run score tests, even on an HPC, is that it can be a very memory-intensive process for large J. I just caught up with @ailurophilia, who found this surprising, and we were wondering if @svteichman could give us some details on which functions are the most memory-expensive. David did some awesome stuff with sparse matrices back in the day, but we may be able to optimize further.

svteichman (Collaborator) commented

That's a great question, and I'll start looking into it. I had forgotten about the sparse matrix implementations, and I agree that it doesn't make sense for those to require a ton of memory. I've been assuming that the largest memory demands come from the score covariance matrix and the information matrices used in the Fisher scoring algorithm and in the robust score test statistic. However, that doesn't fully explain it: I can run the estimation algorithm locally on my laptop for ~8000 categories but cannot run score tests locally without running into memory issues. I see the same thing on the cluster, where I need to allocate far more memory for score tests than for estimation, or else jobs fail due to insufficient memory. So at this point, my thoughts on memory usage are just observations from running these score tests with up to 16,000 parameters (8000 × 2), but I'm happy to investigate further to see if I can understand why this is happening.
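For intuition, here is a back-of-envelope sketch (my own illustration, not radEmu internals; `dense_matrix_gib` is a hypothetical helper) of why dense matrices over all k = J × p parameters become prohibitive at this scale, since a dense k × k double-precision matrix costs k² × 8 bytes and therefore grows quadratically in J:

```python
# Back-of-envelope memory estimate for a dense k x k double-precision
# matrix, where k = J * p (J categories, p covariates). This is an
# illustration only, not a description of radEmu's actual allocations.

def dense_matrix_gib(J, p=2, bytes_per_entry=8):
    """Memory for one dense k x k matrix with k = J * p, in GiB."""
    k = J * p
    return k * k * bytes_per_entry / 2**30

# For J = 8000 categories and p = 2 covariates (16,000 parameters),
# one dense information or score-covariance matrix alone is ~1.9 GiB;
# holding several such matrices at once multiplies that cost.
print(round(dense_matrix_gib(8000), 2))  # ~1.91
```

If the score test materializes even a few dense k × k objects while estimation works column-by-column or with sparse structures, that alone could explain the gap between the two on both the laptop and the cluster.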

adw96 (Contributor, Author) commented Oct 17, 2024

Thank you for the information, @svteichman! No need to follow up further -- I know you're busy with other important things.

@gthopkins -- would you mind doing some memory profiling of radEmu for a single continuous predictor variable with growing $J$, and for a single binary predictor variable with growing $J$, and reporting some statistics on which steps/objects become large? This isn't urgent (but I don't want it to fall off our list). Thank you so much!!

gthopkins (Collaborator) commented

@adw96 yes, I will take a look! This is a surprising trend; I would have guessed the procedure was relatively memory-efficient. I'm intrigued to see what I find.

gthopkins added a commit to gthopkins/radEmu that referenced this issue Nov 4, 2024
svteichman (Collaborator) commented

Hi @gthopkins, I'm going to take a look at this and see what I can find, so no need to look further into it for now. If you have any thoughts about this memory efficiency issue, though, please let me know!
