Skip to content

[WIP] Add specialization for getting cols and cleanup of rvalue/lvalue #2880

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

SteveBronder
Copy link
Collaborator

@SteveBronder SteveBronder commented Jan 21, 2020

Submission Checklist

  • Run unit tests: ./runTests.py src/test/unit
  • Run cpplint: make cpplint
  • Declare copyright holder and open-source license: see below

Summary

This PR is for two things

  1. In brms issue 833 Paul found that traversing a matrix row-wise was actually faster. Looking into the code I think it's because rvalue has a specialization for pulling out a single row but not a single column. So I added a specializations for pulling out a column or a slice of a column
A[, M] // index_list(index_omni(), index_uni(M));
A[J:K, M] // index_list(index_min_max(J, K), index_uni(M));

I also ran a test program that did a bunch of vec1 .* vec2 + ... with min_max slices though heaptrack and found that it was a pretty big culprit for a lot of allocations

image

The flame graph above shows that a large number of allocations happened when pulling from rvalue(). I think in these min_max cases we should be able to just grab a slice of a column and not allocate anything

  1. Some of the rvalue() code here couldn't take in fixed sized matrices, So I removed the Eigen::Dynamic values in the signature and replaced them with a template value

  2. In the case of a nil_index_list for rvalue and assign I added a perfect forwarding right hand side so that in the case of a temporary the value can be moved into the value to be assigned instead of copied. It maybe good here to have one specialization that does a copy for smaller types while forwarding for larger types

  3. We do our own bound checks so I changed instances of A(i, j) and A.coeff(i, j) with A.coeffRef(i, j). This should remove Eigen's own bounds checking to save a little time

  4. The templating here was a bit old school so I tried to update things to use the require library instead of boosts stuff

Intended Effect

This should give a very small performance bump to Stan code that does a lot of indexing. Though this code touches a lot of models and needs to go through more performance testing

How to Verify

I added two new tests to the cases mentioned in (1)

Side Effects

Hopefully less allocations. I am concerned that the auto's here may have unintended side effects that need more testing

Documentation

New docs for rvalue specializations mentioned in (1)

Copyright and Licensing

Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company): Steve Bronder

By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:

@SteveBronder
Copy link
Collaborator Author

I think this has to wait on 1628 in math

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants