[WIP] Add specialization for getting cols and cleanup of rvalue/lvalue #2880
Submission Checklist
./runTests.py src/test/unit
make cpplint
Summary
This PR is for two things
1. `rvalue` has a specialization for pulling out a single row but not a single column, so I added specializations for pulling out a column or a slice of a column. I also ran a test program that did a bunch of `vec1 .* vec2 + ...` with min_max slices through heaptrack and found that `rvalue()` was a pretty big culprit for a lot of allocations. The flame graph from that run shows that a large number of allocations happened when pulling from `rvalue()`. I think in these min_max cases we should be able to just grab a slice of a column and not allocate anything.
2. Some of the `rvalue()` code here couldn't take in fixed-size matrices, so I removed the `Eigen::Dynamic` values in the signatures and replaced them with template values.
3. In the case of a `nil_index_list` for `rvalue` and `assign`, I added a perfect-forwarding right-hand side so that a temporary can be moved into the value being assigned instead of copied. It may be good here to have one specialization that copies smaller types and forwards larger types.
4. We do our own bounds checks, so I changed instances of `A(i, j)` and `A.coeff(i, j)` to `A.coeffRef(i, j)`. This should remove Eigen's own bounds checking to save a little time.
5. The templating here was a bit old school, so I tried to update things to use the `require` library instead of Boost's facilities.
Intended Effect
This should give a very small performance bump to Stan code that does a lot of indexing, though this code touches a lot of models and needs to go through more performance testing.
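To make the perfect-forwarding change from the Summary concrete, here is a minimal sketch of the idea using `std::vector` rather than Eigen types. The name `assign_whole` and both helper functions are hypothetical and are not part of Stan's `assign` API. They only illustrate that a forwarded rvalue is moved while an lvalue is copied.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for assigning with an empty index list:
// the whole right-hand side replaces the left-hand side.
template <typename T, typename U>
void assign_whole(T& lhs, U&& rhs) {
  // std::forward preserves the value category of rhs: an lvalue is copied,
  // an rvalue (a temporary or a std::move'd object) is moved.
  lhs = std::forward<U>(rhs);
}

// Returns true if assigning from an rvalue reused the source's heap buffer,
// i.e. no new allocation or element copy was needed.
inline bool buffer_was_stolen_on_move() {
  std::vector<double> src(1000, 1.0);
  const double* before = src.data();
  std::vector<double> dst;
  assign_whole(dst, std::move(src));
  return dst.data() == before;
}

// Returns true if assigning from an lvalue made an independent copy.
inline bool buffer_was_copied_on_lvalue() {
  std::vector<double> src(1000, 1.0);
  std::vector<double> dst;
  assign_whole(dst, src);
  return dst.data() != src.data() && dst == src;
}
```

With a single forwarding overload, callers that pass a temporary avoid the copy, and callers that pass a named object keep the existing copy semantics.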
How to Verify
I added two new tests for the cases mentioned in (1).
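As a rough illustration of what the column cases in (1) rely on, the sketch below shows why a slice of a column can be handed out without allocating when storage is column-major (Eigen's default): the requested rows are contiguous in memory. `ColMajorMatrix`, `SliceView`, and `column_slice` are hypothetical names for this example only. It uses zero-based, inclusive bounds and does not reproduce Stan's `rvalue` or the tests added in this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical column-major matrix: element (i, j) lives at data[i + j * rows].
struct ColMajorMatrix {
  std::size_t rows;
  std::size_t cols;
  std::vector<double> data;

  ColMajorMatrix(std::size_t r, std::size_t c)
      : rows(r), cols(c), data(r * c, 0.0) {}

  double& at(std::size_t i, std::size_t j) {
    assert(i < rows && j < cols);
    return data[i + j * rows];
  }
};

// Non-owning view of a contiguous run of doubles; it never allocates.
struct SliceView {
  const double* ptr;
  std::size_t size;

  double operator[](std::size_t k) const {
    assert(k < size);
    return ptr[k];
  }
};

// Rows [min, max] (inclusive, zero-based in this sketch) of column j.
// The view points into the matrix's own storage, so nothing is copied.
inline SliceView column_slice(const ColMajorMatrix& m, std::size_t min,
                              std::size_t max, std::size_t j) {
  assert(min <= max && max < m.rows && j < m.cols);
  return SliceView{m.data.data() + min + j * m.rows, max - min + 1};
}
```

A test in this spirit would compare each element of the slice against element-by-element indexing and confirm that the view aliases the original storage.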
Side Effects
Hopefully fewer allocations. I am concerned that the uses of `auto` here may have unintended side effects that need more testing.
Documentation
New docs for the `rvalue` specializations mentioned in (1).
Copyright and Licensing
Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company): Steve Bronder
By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses: