Skip to content

Commit f6ab8c7

Browse files
author
Andrew Johnson
committed
First draft
1 parent d447282 commit f6ab8c7

File tree

1 file changed

+240
-0
lines changed

1 file changed

+240
-0
lines changed
+240
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,240 @@
1+
- Feature Name: vector_unary_functions
2+
- Start Date: 2019-12-15
3+
- RFC PR: (leave this empty)
4+
- Stan Issue: (leave this empty)
5+
6+
# Summary
7+
[summary]: #summary
8+
9+
The proposed changes introduce a framework for extending unary vector functions to work with column vectors, row vectors, and std vectors, as well as std vectors of these. The proposed framework also works with ```Eigen``` expression templates as inputs.
10+
11+
# Motivation
12+
[motivation]: #motivation
13+
14+
Currently, for a function to work with these different vector types (and containers of these vectors types), a different specialisation is required for each. This results in a large amount of code duplication and additional maintenance burden. By implementing a general framework, rather than individual specialisations, the function only needs to be defined once for it to work with all vector types (as well as containers of these vector types).
15+
16+
# Guide-level explanation
17+
[guide-level-explanation]: #guide-level-explanation
18+
19+
There are three primary types of unary vector functions used in the Math library:
20+
21+
- ```f(vector) -> vector``` (e.g. ```log_softmax```)
22+
- ```f(vector) -> scalar``` (e.g. ```log_sum_exp```)
23+
- ```f(vector, scalar) -> vector``` (e.g. ```head```)
24+
25+
Most of these functions operate solely on ```Eigen``` column vectors. While some of these also operate on ```std::vector```s, this requires the re-specification of the function using only standard library functions. This introduces two unwanted side effects. Firstly, this requires an increasing amount of code to maintain as these functions are extended to work with different types as inputs. Secondly, as different coding approaches are required for different inputs (i.e. ```Eigen``` vs standard library), functions will perform differently depending on the vector type that is passed to them.
26+
27+
By introducing a vectorisation framework for unary vector functions, a developer need only define their function once using ```Eigen```'s matrix/array functions, and it will automatically be able to take all vector types as inputs.
28+
29+
# Reference-level explanation
30+
[reference-level-explanation]: #reference-level-explanation
31+
32+
The proposed ```apply_vector_unary``` framework is derived from the existing ```apply_scalar_unary``` framework (and possibly supercedes it). There are three functions within the vectorisation framework, to handle the three types of unary vector functions mentioned above.
33+
34+
The first, ```apply_vector_unary<T>::apply(x, f)```, is used for the ```f(vector) -> vector``` function type. Here, the template type ```T``` refers to the type of vector input (e.g. ```VectorXd``` or ```std::vector<std::vector<double>>```), ```x``` is the input vector, and ```f``` is the functor defining how the input vector should be manipulated. Using ```log_softmax``` as an example, this is how its use would look:
35+
```
36+
template <typename T>
37+
inline auto log_softmax(const T& x) {
38+
return apply_vector_unary<T>::apply(x, [](auto& v){
39+
check_nonzero_size("log_softmax", "v", v);
40+
return v.array() - log_sum_exp(v);
41+
});
42+
}
43+
```
44+
As you can see, very little code is added to the existing function, since the current implementation is simply wrapped inside a lambda expression at the call to ```apply_vector_unary<T>::apply()```.
45+
46+
The second function, ```apply_vector_unary<T>::reduce(x, f)```, is used for the ```f(vector) -> scalar``` function type. The use of this itself looks identical to that of ```apply_vector_unary<T>::apply()```, but the return type at the end is either a scalar or an ```std::vector``` of scalars (when a nested container is passed as an input). Using ```log_sum_exp``` as an example:
47+
```
48+
template <typename T>
49+
inline auto log_sum_exp(const T& x) {
50+
return apply_vector_unary<T>::reduce(x, [](auto& v){
51+
if (v.size() == 0) {
52+
return -std::numeric_limits<double>::infinity();
53+
}
54+
55+
const double max = v.maxCoeff();
56+
if (!std::isfinite(max)) {
57+
return max;
58+
}
59+
return max + std::log((v.array() - max).exp().sum());
60+
});
61+
}
62+
```
63+
64+
The third function, ```apply_vector_unary<T>::apply_scalar(x, y, f)```, is used for the ```f(vector, scalar) -> vector``` function type. This implementation allows for both an input vector ```x``` and additional scalar ```y``` to be used with the defined function ```f```. Additionally, when a nested container (e.g. ```std::vector<Eigen::VectorXd>```) is passed as the input vector, the additional scalar ```y``` can be either a single scalar, or a vector of scalars. Using ```head``` as an example:
65+
```
66+
template <typename T, typename T2>
67+
inline auto head(const T& x, const T2& n) {
68+
return apply_vector_unary<T>::apply_scalar(x, n, [](auto& v, auto& m){
69+
if (m != 0){
70+
if (v.rows() == 1){
71+
check_column_index("head", "n", v, m);
72+
} else {
73+
check_row_index("head", "n", v, m);
74+
}
75+
}
76+
return v.head(m);
77+
});
78+
}
79+
```
80+
# Drawbacks
81+
[drawbacks]: #drawbacks
82+
83+
The only possible drawback (that's immediately apparent to me), is that these implementations all rely on the use of ```Eigen::Map``` to work with ```std::vector```s. This approach assumes that the combination of ```Eigen::Map``` + ```Eigen``` functions is faster than using standard library functions. Given ```Eigen```'s SIMD vectorisation and lazy evaluation, this feels like a safe bet, but there may be currently-unknown edge cases where this is not true.
84+
85+
# Rationale and alternatives
86+
[rationale-and-alternatives]: #rationale-and-alternatives
87+
88+
This approach was chosen as it requires minimal changes to existing vector functions (as seen above) and results in greatly-expanded functionality. If this approach is not used, the same functionality would require an additional specialisation for each vector type for each function, which would result in a not-insignificant maintenance burden.
89+
90+
91+
# Prior art
92+
[prior-art]: #prior-art
93+
94+
As mentioned earlier, this approach was largely inspired by the existing ```apply_scalar_unary``` vectorisation framework.
95+
96+
# Unresolved questions
97+
[unresolved-questions]: #unresolved-questions
98+
99+
The naming of these functions (```apply```, ```reduce```, ```apply_scalar```) might not be very intuitive, and so may need to be changed. There may also be other vector function types that I'm missing, and other vector input types that I'm not considering and should be added.
100+
101+
# Appendix
102+
103+
```apply_vector_unary``` code:
104+
```
105+
template <typename T, typename Enable = void>
106+
struct apply_vector_unary {};
107+
108+
// Eigen types: Matrix, Array, Expression, other
109+
template <typename T>
110+
struct apply_vector_unary<T, require_eigen_t<T>> {
111+
template <typename F>
112+
static inline auto apply(const T& x, const F& f) {
113+
return f(x);
114+
}
115+
116+
template <typename F, typename T2>
117+
static inline auto apply_scalar(const T& x, const T2& y, const F& f) {
118+
return f(x, y);
119+
}
120+
121+
template <typename F>
122+
static inline auto reduce(const T& x, const F& f) {
123+
return f(x);
124+
}
125+
};
126+
127+
// std::vector<T> types, where T is a scalar
128+
template <typename T>
129+
struct apply_vector_unary<T, require_std_vector_vt<is_stan_scalar, T>> {
130+
using scalar_t = scalar_type_t<T>;
131+
using map_t =
132+
typename Eigen::Map<const Eigen::Matrix<scalar_t,Eigen::Dynamic,1>>;
133+
134+
template <typename F>
135+
static inline std::vector<scalar_t> apply(const T& x, const F& f) {
136+
Eigen::Matrix<scalar_t,Eigen::Dynamic,1> result
137+
= apply_vector_unary<map_t>::apply(as_column_vector_or_scalar(x), f);
138+
return std::vector<scalar_t>(result.data(),
139+
result.data() + result.size());
140+
}
141+
142+
template <typename F, typename T2>
143+
static inline std::vector<scalar_t> apply_scalar(const T& x, const T2& y, const F& f) {
144+
Eigen::Matrix<scalar_t,Eigen::Dynamic,1> result
145+
= apply_vector_unary<map_t>::apply_scalar(as_column_vector_or_scalar(x), y, f);
146+
return std::vector<scalar_t>(result.data(),
147+
result.data() + result.size());
148+
}
149+
150+
template <typename F>
151+
static inline scalar_t reduce(const T& x, const F& f) {
152+
return apply_vector_unary<map_t>::reduce(as_column_vector_or_scalar(x), f);
153+
}
154+
};
155+
156+
// std::vector<T> types, where T is an Eigen type
157+
template <typename T>
158+
struct apply_vector_unary<T, require_std_vector_vt<is_eigen, T>> {
159+
using eigen_t = typename T::value_type;
160+
using scalar_t = typename eigen_t::Scalar;
161+
using return_t = std::vector<Eigen::Matrix<scalar_t,
162+
eigen_t::RowsAtCompileTime,
163+
eigen_t::ColsAtCompileTime>>;
164+
165+
template <typename F>
166+
static inline return_t apply(const T& x, const F& f) {
167+
size_t x_size = x.size();
168+
return_t result(x_size);
169+
for(size_t i = 0; i < x_size; ++i)
170+
result[i] = apply_vector_unary<eigen_t>::apply(x[i], f);
171+
return result;
172+
}
173+
174+
template <typename F, typename T2>
175+
static inline return_t apply_scalar(const T& x, const T2& y, const F& f) {
176+
scalar_seq_view<T2> y_vec(y);
177+
size_t x_size = x.size();
178+
return_t result(x_size);
179+
for(size_t i = 0; i < x_size; ++i)
180+
result[i] = apply_vector_unary<eigen_t>::apply_scalar(x[i], y_vec[i], f);
181+
return result;
182+
}
183+
184+
template <typename F>
185+
static inline std::vector<scalar_t> reduce(const T& x, const F& f) {
186+
size_t x_size = x.size();
187+
std::vector<scalar_t> result(x_size);
188+
for(size_t i = 0; i < x_size; ++i)
189+
result[i] = apply_vector_unary<eigen_t>::reduce(x[i], f);
190+
return result;
191+
}
192+
};
193+
194+
// std::vector<T> types, where T is an std::vector
195+
template <typename T>
196+
struct apply_vector_unary<T, require_std_vector_vt<is_std_vector, T>> {
197+
using scalar_t = scalar_type_t<T>;
198+
using return_t = typename std::vector<std::vector<scalar_t>>;
199+
using map_t =
200+
typename Eigen::Map<const Eigen::Matrix<scalar_t,Eigen::Dynamic,1>>;
201+
202+
template <typename F>
203+
static inline return_t apply(const T& x, const F& f) {
204+
size_t x_size = x.size();
205+
return_t result(x_size);
206+
Eigen::Matrix<scalar_t,Eigen::Dynamic,1> inter;
207+
for(size_t i = 0; i < x_size; ++i){
208+
inter = apply_vector_unary<map_t>::apply(as_column_vector_or_scalar(x[i]), f);
209+
result[i] = std::vector<scalar_t>(inter.data(),
210+
inter.data() + inter.size());
211+
}
212+
return result;
213+
}
214+
215+
template <typename F, typename T2>
216+
static inline return_t apply_scalar(const T& x, const T2& y, const F& f) {
217+
scalar_seq_view<T2> y_vec(y);
218+
size_t x_size = x.size();
219+
return_t result(x_size);
220+
Eigen::Matrix<scalar_t,Eigen::Dynamic,1> inter;
221+
for(size_t i = 0; i < x_size; ++i){
222+
inter = apply_vector_unary<map_t>::apply_scalar(as_column_vector_or_scalar(x[i]), y_vec[i], f);
223+
result[i] = std::vector<scalar_t>(inter.data(),
224+
inter.data() + inter.size());
225+
}
226+
return result;
227+
}
228+
229+
template <typename F>
230+
static inline std::vector<scalar_t> reduce(const T& x, const F& f) {
231+
size_t x_size = x.size();
232+
std::vector<scalar_t> result(x_size);
233+
for(size_t i = 0; i < x_size; ++i)
234+
result[i] = apply_vector_unary<map_t>::reduce(as_column_vector_or_scalar(x[i]), f);
235+
236+
return result;
237+
}
238+
};
239+
240+
```

0 commit comments

Comments
 (0)