[FEATURE REQUEST] Can we increase the memory efficiency of radEmu? #88
Comments
That's a great question that I will start looking into. I had forgotten about the sparse matrix implementations, and I agree that it doesn't make sense for those to require a ton of memory. I've been assuming that the largest memory requirement comes from the score covariance matrix and the information matrices used in the Fisher scoring algorithm and in the robust score test statistic. However, this doesn't fully explain it, because I can run the estimation algorithm locally on my laptop for ~8000 categories but cannot run score tests locally without running into memory issues. I find the same thing on the cluster: I need to allocate far more memory for running score tests than for estimation, or else my jobs fail due to insufficient memory. So at this point, my thoughts on memory usage are just observations from running these score tests for up to 16,000 parameters (8000 x 2), but I'm happy to investigate further to see if I can understand why this is happening.
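For a rough sense of scale, here is a minimal back-of-the-envelope sketch of how much memory a single square matrix over all parameters would take if it were stored densely in double precision. This is an assumption made for illustration only: the thread does not confirm how radEmu stores these matrices, and the helper name `dense_matrix_gib` is hypothetical.

```python
def dense_matrix_gib(n_params: int, bytes_per_entry: int = 8) -> float:
    """Memory in GiB for one dense n_params x n_params matrix.

    Assumes 8-byte double-precision entries and no sparsity,
    which may not match how radEmu actually stores its matrices.
    """
    return n_params * n_params * bytes_per_entry / 2**30


if __name__ == "__main__":
    # Two parameters per category, as in the 8000 x 2 case above.
    for n_categories in (1000, 4000, 8000):
        n_params = 2 * n_categories
        print(f"{n_categories} categories: {dense_matrix_gib(n_params):.2f} GiB per matrix")
```

Under these assumptions, one such matrix for 16,000 parameters is a little under 2 GiB, so a handful of dense copies or intermediates held at once could exhaust a laptop's memory.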
Thank you for the information, @svteichman ! No need to follow up further -- I know you're busy with other important things. @gthopkins -- would you mind doing some memory profiling of |
@adw96 yes, I will take a look! This is a surprising trend; I would have guessed the procedure is relatively memory efficient. I’m intrigued to see what I find.
This addresses feature request/issue statdivlab#88
Hi @gthopkins, I'm going to take a look at this and see what I can find, so no need to look further into this for now. If you have any thoughts about this memory efficiency issue, please let me know though!
In #84, @svteichman noted that one of the reasons why it's expensive to run score tests even on an HPC is that it can be a very memory-intensive process for large J. I just caught up with @ailurophilia, who found this surprising, and we were wondering if @svteichman could give us some details on which functions are the most memory expensive. David did some awesome stuff with sparse matrices back in the day, but we may be able to optimize further.