-
-
Notifications
You must be signed in to change notification settings - Fork 195
var<mat> versions of quad_form and trace_quad_form (Issue #2101)
#2164
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Questions and comments
stan/math/prim/err/check_square.hpp
Outdated
| @@ -17,7 +17,7 @@ namespace math { | |||
| * @param y Matrix to test | |||
| * @throw <code>std::invalid_argument</code> if the matrix is not square | |||
| */ | |||
| template <typename T_y, require_eigen_t<T_y>* = nullptr> | |||
| template <typename T_y, require_matrix_t<T_y>* = nullptr> | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a_few_varmats has this (#2106), so this will get fixed when that merges
stan/math/rev/fun/quad_form.hpp
Outdated
| auto arena_A = to_arena(A); | ||
| auto arena_B = to_arena(B); | ||
|
|
||
| check_not_nan("multiply", "A", value_of(arena_A)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to go through and make all the check functions work with var<mat> types. I'll make a list of the ones that need updated and add them to #2101. Here I just pass in the value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should prefer passing values into check functions anyway. That way we don't need them to support vars. Also checks on value will be faster than ones on Matrix<var>.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this is an expression thing, again. Like, if we pass in MatrixXd, then if we switch to value then we have to do two full reads of the matrix to do a check. If we pass in Matrix<var, R, C> it is one read. If we pass in an expression view of the values in a Matrix<var, R, C>, then it is still one read. This sound right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, but that is assuming the functions calling the check does not need to extract values anyway. Most functions do need values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most functions do need values.
That is true.
I guess if we just put a auto x_val = value_of_rec(x) inside all the checks then that would be good? If someone passes in values, that is good. If someone passes in mat<var>, it works via expressions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it would work, but for mat<var> it would not be efficient. Also I am trying to say if we only pass in values we don't need to change all checks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that makes sense. I don't want to accidentally make them slow.
I think our checks should work with var<mat> though. That means for ones that read values just adding a version that takes in a var<mat> and then calls itself with the value of the var<mat>.
stan/math/rev/fun/quad_form.hpp
Outdated
| arena_res = (arena_res + arena_res.transpose()).eval(); | ||
| } | ||
|
|
||
| return_t res = arena_res; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
arena_res is on the arena, so since return_t is a var<mat> it will just take the pointer arena_res has right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. arena_t<var<T>> is var<T>.
stan/math/rev/fun/quad_form.hpp
Outdated
| value_of(B).eval()), Mat1, Mat2>; | ||
|
|
||
| auto arena_A = to_arena(A); | ||
| auto arena_B = to_arena(B); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If these matrices are either var<mat> s or just regular mat s then this code is fine. If one of these is a mat<var> though then it's not as fine because I access values and adjoints repeatedly. If the vals/adjoints are accessed more than once, it would be more economical to just copy to a var<mat>.
So my question is we should just by default be copying to var<mat> in all these functions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about you do something like:
const auto& A_adj = to_ref(arena_A)
Also it would be nice if we only copied these matrices to arena if we actually needed them in the reverse pass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well I'm not worried about evaluatiang expressions as much as converting things to var<mat> land so I can be lazy.
I checked and all the copying here is fine if inputs are a mix of mat<double> and var<mat>. The A inputs and the B inputs are required for both reverse passes. It's not so good for mat<var>.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. to_arena does not convert to var<mat>, so there is steel a need to handle mat<var> efficiently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the solution is to just add a version of to_var_value (https://github.com/stan-dev/math/blob/develop/stan/math/rev/fun/to_var_value.hpp) that just forwards things through if it is given a var_value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"I think at a very high level" as a precursor to a sentence sounds like the most arrogant thing ever said. It is also very cringe on its own. I just meant "I'm not sure but", but it didn't come out that way lol.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kind of interesting, I did a search on github for quad_form in stan programs
https://github.com/search?l=&p=5&q=quad_form+language%3AStan&type=Code
It looks like the most used case is having data as the second argument. Mostly just something to note. (aka I'd rather not spend too much time optimizing this part)
If these matrices are either
var<mat>s or just regular mat s then this code is fine. If one of these is amat<var>though then it's not as fine because I access values and adjoints repeatedly. If the vals/adjoints are accessed more than once, it would be more economical to just copy to avar<mat>.
I agree with Tadej that mat<var> -> var<mat> is v expensive. For this func it may be fine since we do a lot of mat muls
I also don't like to_var_value_if, if we want a var_value from a mat<var> we could just do
var_value<promote_scalar_t<double, Mat1>>
imo I'm not a huge fan of the do_something_if functions with templates. I personally label them as "easy to write hard to read". They're also worse for small N (doing almost nothing is different from doing nothing like you get by pulling out the logic)
I went through and moved all the var/double logic out.
https://github.com/stan-dev/math/compare/feature-pr2164/quad_form
The other benefit of that is we don't have to use forward_as
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also don't like to_var_value_if, if we want a var_value from a mat we could just do
var_value<promote_scalar_t<double, Mat1>>
I don't think this constructor exists though, so we have to call a conversion somehow.
For this func it may be fine since we do a lot of mat muls
I agree the conversion is more expensive than not doing the conversion in ideal cases, but the comparison is between doing the conversion, calling value_of like a ton of extra times (and there are like 9-10 of these in the implementations here), or writing special versions that support combos of mat<var> and var<mat> (and selectively copying out values).
The conversion is like 5 extra read ops and we do 3 every time we do an uncached .val() or .adj(). So I'd much rather be lazy on this one. And then I want to avoid writing three versions of the function to support the argument combos.
I went through and moved all the var/double logic out.
Hmm I think what I wrote should be efficient for var<mat> mat<double>. I prefer the shorter code, but if there is an efficiency problem I can change it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also that is an interesting search neat trick
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the 3 read ops to do a .val() in a mat<var>, one of those is pointer chasing and I think will be way more expensive than loading up .val() three times from a var<mat>.
Edit: changed mat<var> to var<mat>... Got this exactly backwards on the first try lol.
…4.1 (tags/RELEASE_600/final)
…into feature/quad_form
Jenkins Console Log Machine informationProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010CPU: G++: Clang: |
…4.1 (tags/RELEASE_600/final)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments
| inline var trace_quad_form(const Mat1& A, const Mat2& B) { | ||
| check_square("trace_quad_form", "A", A); | ||
| check_multiplicable("trace_quad_form", "A", A, "B", B); | ||
|
|
||
| arena_t<Mat1> arena_A = A; | ||
| arena_t<Mat2> arena_B = B; | ||
| auto arena_A = to_arena(to_var_value_if<!is_constant<Mat1>::value>(A)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ended up going with to_var_value_if as the lazy support for mixed var<mat> and mat<var>.
The other thing I tried was templating out the functions but that is confusing in its own way. It looks like:
template <typename Mat1, typename Mat2,
require_var_matrix_t<Mat1>* = nullptr,
require_eigen_st<is_var, Mat2>* = nullptr>
inline var trace_quad_form(const Mat1& A, const Mat2& B) {
return trace_quad_form(A, to_var_value(B));
}
template <typename Mat1, typename Mat2,
require_eigen_st<is_var, Mat1>* = nullptr,
require_var_matrix_t<Mat2>* = nullptr>
inline var trace_quad_form(const Mat1& A, const Mat2& B) {
return trace_quad_form(to_var_value(A), B);
}
template <typename Mat1, typename Mat2,
require_all_matrix_t<Mat1, Mat2>* = nullptr,
require_any_var_matrix_t<Mat1, Mat2>* = nullptr>,
require_all_not_eigen_st<is_var, Mat1, Mat2>* = nullptr>
inline var trace_quad_form(const Mat1& A, const Mat2& B) {
...
}Edit: updated code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not following why they have to both be coerced to var matrices?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other options are:
-
Do not support mixes of
mat<var>andvar<mat>. I do not like that because we will have programs with both types (and so we could end up with mixed arguments) -
Write three different specializations. I do not like that because we found out optimizing for
mat<var>is trickier thanvar<mat>. I do not think this is particularly necessary cuz this will be faster than not having autodiff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or just don't move the matrix<var> to a var<matrix>. I think Tadej and I both prefer just leaving them instead of doing the conversion. In another comment I have branch with one quad_form that handles all the cases of vars with one function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to sit down one day and figure out this whole val() and val_op() thing. Something something const correctness
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The furthest I got was that it's something to do with Eigen's sizeof logic, there's a little more discussion in this stackoverflow thread. The error doesn't occur with float, only with double. I found a modification that fixed the compilation, but hard to know whether that would affect other behaviour. Tricky.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In another comment I have branch with one quad_form that handles all the cases of vars with one function
I was trying to avoid touching the pure mat<var> implementation since we decided to wait on benchmarks before we did that (which is why we decided not to merge #2019)
I think Tadej and I both prefer just leaving them
Yeah I saw the comments and I made a response where I argued the case that the conversion here was better: #2164 (comment)
Since I didn't see a response to that, I just went with the conversion to move this along.
I'm not on the same page with stuff like:
arena_t<promote_scalar_t<var, Mat1>> arena_A(A);
arena_t<promote_scalar_t<var, Mat2>> arena_B(B);
return_t res(arena_B.val_op().transpose() * arena_A.val_op() * arena_B.val_op());
Like up until this point I've been assuming if we read the .val() of a mat<var> more than once, we should make a copy of it on the arena.
Since this function does a lot of .val() calls on A and B, my understanding is that we need to make arena copies of the values, otherwise we're going to pointer-chase for every .val() that happens on a mat<var>, and it's the pointer chasing that is slow. I can just ditch the to_var_value stuff (it's gonna be faster than no autodiff either way) but I thought I made a pretty good argument back here: #2164 (comment).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the big Q btw. Can do the control flow thing sure.
|
|
||
| var res | ||
| = (value_of(arena_B).transpose() * value_of(arena_A) * value_of(arena_B)) | ||
| .trace(); | ||
|
|
||
| reverse_pass_callback([arena_A, arena_B, res]() mutable { | ||
| if (!is_constant<Mat1>::value) | ||
| forward_as<promote_scalar_t<var, Mat1>>(arena_A).adj() | ||
| forward_as<promote_var_matrix_t<Mat1, Mat1, Mat2>>(arena_A).adj() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should redo promote_var_matrix_t in another pull request.
I think it should just be promote_var_matrix_t<T> returns the var_mat with the same shape as T.
I think our current design of promote_var_matrix_t would be useful when writing two argument functions that took all combinations of var<mat> and mat<var>.
It seems now we are writing functions where the signatures all separate:
One function does (mat<var>, mat<var>) and returns mat<var>, and then another function handles (var<mat>, mat<var>), (mat<var>, var<mat>), (var<mat>, var<mat>) and returns a var<mat> type.
So it's easy enough from the separation of functions to know what the return types should be, and so promote_var_matrix_t can simplify.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what your seeing here is moreso because of the pattern your using. imo I'd rather just not use this pattern
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I could just do var_value<Mat1> since you mentioned it elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wait that doesn't work cuz might end up with a var_value<var_value<...>>.
Anyway, you're right it's a result of the pattern so it's kinda downstream of the rest of the comments.
|
@rok-cesnovar @t4c1 @SteveBronder do we have an opencl computer I can log in to to test this? I wasn't able to get opencl running on my local computer |
|
Not really, I can give you access to our lab machine if needed, but I am not sure why would quad_form or trace_quad_form cause GLMs to fail. Maybe restart to double check? |
|
The test is |
|
Sure thing. Will run in about an hour or so. |
|
That hour was long. Runs fine locally. |
Jenkins Console Log Machine informationProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010CPU: G++: Clang: |
|
@SteveBronder ready for review! |
Jenkins Console Log Machine informationProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010CPU: G++: Clang: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple fixes, most of it really boils down to that I think a lot of the problems you are seeing come from the pattern where we use forward_as and promoting matrix of var to var matrix. I don't see why we need to do that promotion and I think breaking out the control flow logic will make those problems go away
stan/math/rev/fun/quad_form.hpp
Outdated
| auto arena_A = to_arena(to_var_value_if<!is_constant<Mat1>::value>(A)); | ||
| auto arena_B = to_arena(to_var_value_if<!is_constant<Mat2>::value>(B)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused about the logic here, why does this need to move over to a var_value conditionally or even at all? imo I'd rather have us use the scheme of moving the control flow up because it's a lot easier to read through
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it is a mat<var>, then it needs converted to a var<mat>, otherwise pass through.
The other option from these to_var_value_if statements are more function definitions and requires. Yes to_var_value_if and friends are confusing, but I do not think this is more confusing than:
require_eigen_st<is_var, Mat1>* = nullptr,
require_var_matrix_t<Mat2>* = nullptr
which is something like you'd need to add to the functions to specify that one of the things is an Eigen-like Matrix with a var and one of the other things is a var matrix.
The result of the to_var_value_if, regardless of what it is, needs to end up in the arena cause we use it in the reverse pass (hence everything is wrapped in a to_arena).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👏 Pull 👏 Out 👏 The 👏 Control 👏 Flow 👏
All of these problems go away if we just pull the control flow out into the function. There's a lot of good reasons to prefer pulling out control flow
-
Each path from the start to the return is linear. This makes it a lot easier to read and I don't have to go through each line and think, "Well what if X is var and y is not or x is var and y is var and....". In a block of the control flow I'm aware of which is which.
-
Removes the need for widgets like
forward_as<>()and more type trait layers liketo_*_if<>(). All the paths do different things and work on different types. Yinz keep saying this leads to code duplication but I disagree! The code is different in each path, it makes the code more verbose which, imo, is very different than duplication. -
Uses
value_of()less (which requires making aholder<>class forMatrix<var>) -
Adding more additional control flow is less annoying. Like if we want to do the
.noalias()optimization forvar<mat>like
if (is_var_matrix<Mat1>::value) {
arena_A.adj().noalias() += arena_B * (res.adj() * arena_B.transpose());
} else {
arena_A.adj() += arena_B * (res.adj() * arena_B.transpose());
}When we bunch the control flow together with the code this turns into
if (!is_constant<Mat1>::value) {
if (is_var_matrix<Mat1>) {
forward_as<arena_t<promote_scalar_t<var, Mat1>>>(arena_A).adj().noalias()
+= value_of(arena_B) * C_adj_B_t;
} else {
forward_as<arena_t<promote_scalar_t<var, Mat1>>>(arena_A).adj()
+= value_of(arena_B) * C_adj_B_t;
}
}That's a mouthful!
I have an example of breaking out the control flow for quad_form below. imo, way more clear! Yeah there's more literal lines of code, but like that's fine to me because I think it's simpler and easier to read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you have like 3 inputs or something then yeah I think it's a good idea to use the to_*_if() stuff cuz otherwise you end up with like a bunch of paths which feels silly, but for binary functions I think pulling out the control flow is much cleaner
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And also the one downside is that for some reason the version that pulls out the flow has to use val_op() otherwise we run into const correctness errors due to something internal in Eigen
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of these problems go away if we just pull the control flow
"Well what if X is var and y is not or x is var and y is var and...."
Yinz keep saying this leads to code duplication but I disagree!
Sure. I disagree. I don't think any of this code is easy, and I have to read it very carefully line by line anyway.
I can pull out the control flow though (though I don't think it gets rid of the to_var_value problem).
Uses value_of() less (which requires making a holder<> class for Matrix)
I was wondering why you were swapping those all the time. I just liked value_of more cause I don't have to worry about the types. If we add a plain type overload could value_of just point to .val() (or .val_op()) instead of using a holder?
Like if we want to do the .noalias() optimization for var like
Well if we converted mat<var> to var<mat> we wouldn't need the extra logic. But if it's necessary sure. .noalias() is good optimization.
If you have like 3 inputs or something
For what it's worth, it is COming Soon (TM) with trace_gen_quad_form, but we can do different code there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. I disagree. I don't think any of this code is easy, and I have to read it very carefully line by line anyway.
The asm etc that's generated for the two versions can be v different. Like for example with to_arena_if<>(), even when the bool template is false it's not a no-op and still has to do make an Eigen map. For smaller N stuff like that matters
I can pull out the control flow though (though I don't think it gets rid of the to_var_value problem).
For this I'd use something like the version I put up where it doesn't do the conversion
I was wondering why you were swapping those all the time. I just liked value_of more cause I don't have to worry about the types.
Yeah so value_of() for Eigen types calls
template <typename EigMat, require_eigen_t<EigMat>* = nullptr,
require_not_st_arithmetic<EigMat>* = nullptr>
inline auto value_of(EigMat&& M) {
return make_holder(
[](auto& a) {
return a.unaryExpr([](const auto& scal) { return value_of(scal); });
},
std::forward<EigMat>(M));
}So it's good to call value_of() of when we are doing stuff like
// A is an input argument
arena_t<promote_scalar_t<double, Mat1>> arena_A(value_of(A));But once we have the arena_t we should just call .val() etc.
If we add a plain type overload could value_of just point to .val() (or .val_op()) instead of using a holder?
Actually @t4c1 why is there a reason we don't have a separate version for plain and non-plain Eigen types? I think we could have an overload for when it's an Eigen matrix of vars that are not an expression and just return .val()
Well if we converted mat to var we wouldn't need the extra logic. But if it's necessary sure. .noalias() is good optimization.
Yeah it would be v cool if we could do that swap and have it be efficient but idt it's worth the cost of swapping stuff around.
For what it's worth, it is COming Soon (TM) with trace_gen_quad_form, but we can do different code there.
Oh yeah, I'd be a monster if I asked for breaking up the control flow for anything more than a binary function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uses value_of() less (which requires making a holder<> class for Matrix)
If we add a plain type overload could value_of just point to .val() (or .val_op()) instead of using a holder?
Holder is only needed for rvalue inputs (which would not work without it). For lvalues make_holder just calls the given functor with given argument.
Actually @t4c1 why is there a reason we don't have a separate version for plain and non-plain Eigen types? I think we could have an overload for when it's an Eigen matrix of vars that are not an expression and just return .val()
.val() does exactly the same thing as value_of called on Matrices.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.val() does exactly the same thing as value_of called on Matrices.
The slight difference is that value_of will always return a CWiseUnaryOp whereas .val() will return either a CWiseUnaryView or CWiseUnaryOp, depending on const-ness
stan/math/rev/fun/quad_form.hpp
Outdated
| auto arena_res = to_arena(value_of(arena_B).transpose() * value_of(arena_A) | ||
| * value_of(arena_B)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we access the value of arena_B a lot. It may be worth doing the read just once and saving it as arena_b_val
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point it is a var_value or an arena_t<Eigen::MatrixXd>, so it's safe to use value_of
| using return_t | ||
| = promote_var_matrix_t<decltype(value_of(B).transpose().eval() | ||
| * value_of(A) * value_of(B).eval()), | ||
| Mat1, Mat2>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
promote_var_matrix_t is really intended for when you don't know whether you will return a var<matrix> or matrix<var>. Here we know by the template requires that at least one of the inputs must be a var<matrix> so the return type is just going to be a var_value<Eigen::Matrix<double, -1, -1>>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I was a little nervous about compile time sized matrices, which is why I kept the template.
In that case I guess I could have just done var_value<decltype(value_of(B).transpose().eval() * value_of(A) * value_of(B).eval())>.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case I guess I could have just done
var_value<decltype(value_of(B).transpose().eval() * value_of(A) * value_of(B).eval())>.
Yes though you want
var_value<plain_type_t<decltype(value_of(B).transpose().eval() * value_of(A) * value_of(B).eval())>> because var_value<> can accept expressions as the template (For slices such as Block<>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
var_value<>can accept expressions
Oh right. Can't let the guard down for a second on this stuff :/
stan/math/rev/fun/quad_form.hpp
Outdated
| reverse_pass_callback([arena_A, arena_B, res]() mutable { | ||
| auto C_adj_B_t = (res.adj() * value_of(arena_B).transpose()).eval(); | ||
|
|
||
| if (!is_constant<Mat1>::value) | ||
| forward_as<promote_var_matrix_t<Mat1, Mat1, Mat2>>(arena_A).adj() | ||
| += value_of(arena_B) * C_adj_B_t; | ||
|
|
||
| if (!is_constant<Mat2>::value) | ||
| forward_as<promote_var_matrix_t<Mat2, Mat1, Mat2>>(arena_B).adj() | ||
| += value_of(arena_A) * C_adj_B_t.transpose() | ||
| + value_of(arena_A).transpose() * value_of(arena_B) * res.adj(); | ||
| }); | ||
|
|
||
| return res; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[optional/rant] Another thing that's nice about moving out the control flow is that it let's you have the immediete evaluation for C_adj_B_t when both are var types and for when just one is a var you can do C_adj_B_t inline for the accumulation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I see what you're saying.
Are we sure there's no temporary created in an expression like A * B * C though? I assume that is broken up into D = B * C and A * D if A, B, and C are all Eigen MatrixXd types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure, the godbolt for that is getting at the level that I can read asm. Though it looks like it makes at least one alloc then can reuse that allocate with a memset under certain conditions. The main thing is that
A += B * C * D;Needs to make at least one copy for (B * C * D) because of aliasing issues with the left and right hand side. Reading over the asm I think it does one alloc here and reuses the memory for the other multiply. And idt we can use .noalias() for eigen matrices of vars since .noalias() is like the restrict keyword and we access the value and adjoint through the same pointer on both sides (aka aliasing)
Though we can use .noalias() for var matrices when we do things like
A.adj().noalias() += ...Since the values and adjoints of the var matrix are actually different pointers so there is in fact no aliasing. Does the restrict thing make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure, the godbolt for that is getting at the level that I can read asm.
Alright I probably shouldn't have asked that question. Imo we're spending too long on this function.
We could do this efficiently like:
auto C_adj_B_t = to_ref_if<!is_constant<Mat1>::value && !is_constant<Mat2>::value>
(res.adj() * value_of(arena_B).transpose());But the bigger question is to_var_value_if and the thing here: #2164 (comment) so let's get those worked out
| * @tparam Mat1 type of the first (square) matrix | ||
| * @tparam Mat2 type of the second matrix |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More informative docs would be good here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is hard to be more specific. Both arguments are matrix types in the sense that they satisfy require_matrix_t, and one of them needs to be a reverse mode autodiff type.
We could say inherits from EigenBase or is a var_value<T> with T such that it inherits from EigenBase but that is sorta just redefining require_matrix_t in the docs.
Suggestions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could say inherits from EigenBase or is a var_value with T such that it inherits from EigenBase but that is sorta just redefining require_matrix_t in the docs.
I usually write that out. If someone is one the website the requires don't show up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More docs added
stan/math/rev/fun/to_var_value.hpp
Outdated
| /** | ||
| * Forward var_values on. | ||
| * | ||
| * @tparam T type of the input | ||
| * @param a matrix to convert | ||
| */ | ||
| template <typename T, require_var_t<T>* = nullptr> | ||
| auto to_var_value(T&& a) { | ||
| return std::forward<T>(a); | ||
| } | ||
|
|
||
| /** | ||
| * If the condition is true, calls `to_var_value` on the argument, otherwise | ||
| * forward the argument through. | ||
| * | ||
| * @tparam T type of argument | ||
| * @param a argument | ||
| * @return argument copied/evaluated on AD stack | ||
| */ | ||
| template <bool Condition, typename T, std::enable_if_t<!Condition>* = nullptr> | ||
| inline auto to_var_value_if(T&& a) { | ||
| return std::forward<T>(a); | ||
| } | ||
|
|
||
| template <bool Condition, typename T, std::enable_if_t<Condition>* = nullptr> | ||
| inline auto to_var_value_if(const T& a) { | ||
| return to_var_value(a); | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like having the to_var_value for var<mat> but imo I think to_var_value_if() is going to cause more problems than answers
| inline var trace_quad_form(const Mat1& A, const Mat2& B) { | ||
| check_square("trace_quad_form", "A", A); | ||
| check_multiplicable("trace_quad_form", "A", A, "B", B); | ||
|
|
||
| arena_t<Mat1> arena_A = A; | ||
| arena_t<Mat2> arena_B = B; | ||
| auto arena_A = to_arena(to_var_value_if<!is_constant<Mat1>::value>(A)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not following why they have to both be coerced to var matrices?
|
|
||
| var res | ||
| = (value_of(arena_B).transpose() * value_of(arena_A) * value_of(arena_B)) | ||
| .trace(); | ||
|
|
||
| reverse_pass_callback([arena_A, arena_B, res]() mutable { | ||
| if (!is_constant<Mat1>::value) | ||
| forward_as<promote_scalar_t<var, Mat1>>(arena_A).adj() | ||
| forward_as<promote_var_matrix_t<Mat1, Mat1, Mat2>>(arena_A).adj() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what your seeing here is moreso because of the pattern your using. imo I'd rather just not use this pattern
…4.1 (tags/RELEASE_600/final)
|
@SteveBronder I broke out the control flow and such here. I'm just removing the Anyway it'll need another review. I stuck with |
Jenkins Console Log Machine informationProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010CPU: G++: Clang: |
|
@SteveBronder this is ready for review again. It was greenlit before I just added some more docs for the template types and merged in develop |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Two comments though. imo I'd rather not pay for making the holder() which we can just do by using .val() instead of value_of() like I have below for quad_form but I think this code is also fine
| check_not_nan("multiply", "A", value_of(arena_A)); | ||
| check_not_nan("multiply", "B", value_of(arena_B)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once we do arena_t<promote_scalar_t<var, Mat1>> arena_A = A; we know that arena_A has a var scalar so we can use .val() directly. I'm pretty sure value_of() is going to have to make a holder() type where calling .val() directly won't have to. Is there a reason you can't just use the version I wrote below that handles var<mat> and mat<var>?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The conclusion of this discussion that the value_of s here are fine and they don't create holders: #2164 (comment) .
Is there a reason you can't just use the version I wrote below that handles var and mat?
I am basically using that code. The differences are:
-
value_ofinstead of worrying about the difference invalandval_op -
We talked about at the last release to stop updating
mat<var>code unless we have benchmarks showing it's faster, so I kept the oldmat<var>stuff in place. -
The code in that pull request returns a 1x1 matrix for the vector^T matrix vector case. This is not correct (see: https://mc-stan.org/docs/2_25/functions-reference/dot-products-and-specialized-products.html). I think there is a bug in the test framework that is letting this pass. I haven't made an issue for that yet.
Sound good?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made an issue for the 1x1 thing: #2198
| if (is_var_matrix<Mat1>::value) { | ||
| arena_A.adj().noalias() | ||
| += res.adj() * value_of(arena_B) * value_of(arena_B).transpose(); | ||
| } else { | ||
| arena_A.adj() | ||
| += res.adj() * value_of(arena_B) * value_of(arena_B).transpose(); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[side note] I wonder if it would be useful or look nice to have something like
update_adjoint(arena_A, res.adj() * value_of(arena_B) * value_of(arena_B).transpose());Where for var<mat> it's just
lhs.adj().nolias() += rhsand for mat<var> it is
lhs.adj() += rhsSo we don't have to have these if-else statements all over
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about:
noalias(lhs.adj()) += rhs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah what I said doesn't make sense cause you'd still need the is_var_matrix logic.
I can add this if you want. I'm happy with ifs though.
Jenkins Console Log Machine informationProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010CPU: G++: Clang: |
Summary
var<mat>versions forquad_formandtrace_quad_formNot the cleanest. Comments inline.
Release notes
var<mat>versions ofquad_formandtrace_quad_formChecklist
Math issue Make functions with custom autodiff
var<mat>friendly #2101Copyright holder: Columbia University
The copyright holder is typically you or your assignee, such as a university or company. By submitting this pull request, the copyright holder is agreeing to the license the submitted work under the following licenses:
- Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
- Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
the basic tests are passing
./runTests.py test/unit)make test-headers)make test-math-dependencies)make doxygen)make cpplint)the code is written in idiomatic C++ and changes are documented in the doxygen
the new changes are tested