Commit 46364b5

author
Bob Carpenter
committed
interface guidelines from stan wiki
1 parent 6b1a5af commit 46364b5

1 file changed

+131
-0
lines changed

designs/interface-guidelines.md

Lines changed: 131 additions & 0 deletions
@@ -0,0 +1,131 @@
1+
- Feature Name: Interface Guidelines
2+
- Start Date: 2020-04-18
3+
- RFC PR: ??
4+
- Stan Issue:
5+
6+
This page recommends how user interfaces to Stan should let users compile programs and draw samples. The recommendations are being implemented in "RStan 3" and "PyStan 3". The page is intended for developers of any user interface to Stan.
7+
8+
# Significant Changes
9+
1. New user-facing API (see below for examples)
10+
2. Standardize on CmdStan parameter names. For example, use ``num_chains`` (CmdStan) everywhere rather than the current split (PyStan and RStan use ``chains``).
11+
3. The ``permuted`` option no longer exists
12+
4. Replace the verb ``sampling`` with ``sample`` and the verb ``optimizing`` with ``maximize``
13+
5. Remove the generic ``vb`` method and replace it with ``experimental_advi_meanfield`` and ``experimental_advi_fullrank``
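
The renames in this list could be captured in a small translation table, for instance to help migrate legacy calls. The sketch below is hypothetical: `translate_call` and its tables are illustrative names, not part of any Stan interface, and it assumes that the `algorithm` argument of a legacy `vb` call selects between the mean-field and full-rank variants.

```python
# Hypothetical migration helper; not part of RStan, PyStan, or CmdStan.
LEGACY_METHODS = {"sampling": "sample", "optimizing": "maximize"}
LEGACY_ARGS = {"chains": "num_chains"}
VB_VARIANTS = {
    "meanfield": "experimental_advi_meanfield",
    "fullrank": "experimental_advi_fullrank",
}


def translate_call(method, kwargs):
    """Map a legacy (method, kwargs) pair to the proposed names."""
    kwargs = dict(kwargs)  # do not mutate the caller's dict
    if "permuted" in kwargs:
        raise ValueError("`permuted` does not exist in the new API")
    if method == "vb":
        # Assumption: the legacy `algorithm` argument picks the variant.
        variant = kwargs.pop("algorithm", "meanfield")
        method = VB_VARIANTS[variant]
    else:
        method = LEGACY_METHODS.get(method, method)
    new_kwargs = {LEGACY_ARGS.get(k, k): v for k, v in kwargs.items()}
    return method, new_kwargs


print(translate_call("sampling", {"chains": 4}))
print(translate_call("vb", {"algorithm": "fullrank"}))
```

Names that are not in the tables pass through unchanged, so the helper only encodes the renames stated in the list.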
14+
15+
# Major Unresolved Issues
16+
17+
* ~Should the function/method previously known as ``optimize`` be renamed to be ``maximize``? Whatever decision is made, the renaming should take place in all interfaces (including CmdStan). (AR and BG agree on ``maximize``)~ (yes, discussed 2018-07-05)
18+
* ~Should the function/method previously known as ``vb`` be called via ``experimental_advi_meanfield`` and ``experimental_advi_fullrank`` (mirroring the C++ services names)?~ (yes, discussed on 2018-07-05)
19+
* ~To what extent should the steps in the interfaces mirror those of the C++ / CmdStan?~ (User-facing API shown below is OK.)
20+
* ~What should be stand-alone functions and what should be methods, as in `build(config)` vs. `config$build()`?~ (No separate compile / build step.)
21+
* ~Do we need a separate class that only exposes the algorithms in the C++ library or is it better for the class to expose both algorithms and lower-level functions like `log_prob`?~ (No separate classes. Advanced users will have workarounds.)
22+
* ~How granular should the estimation functions be, as in `$hmc_nuts_diag_e_adapt()` vs. `$hmc(adapt = TRUE, diag = TRUE, metric = "Euclidean")`?~ (Decided by the C++ API refactor. 99% of users will just call the ``sample`` method.)
23+
24+
## RStan-specific
25+
* Eliminate ``stan`` function? (PyStan has removed it.)
26+
27+
# Other Changes and Notes
28+
1. Store draws internally with draws/chains in last dimension (num_params, num_draws, num_chains) with an eye to ragged array support and/or adding additional draws.
29+
2. To the extent possible, RStan and PyStan should use the same names for internal operations and internal class attributes.
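
As one way to picture the proposed layout, the sketch below computes the offset of a single draw in a flat buffer of shape (num_params, num_draws, num_chains). Row-major ordering is an assumption made here; the guideline only fixes the order of the dimensions. `flat_index` is an illustrative name.

```python
def flat_index(param, draw, chain, num_draws, num_chains):
    """Offset into a row-major buffer of shape (num_params, num_draws, num_chains)."""
    return (param * num_draws + draw) * num_chains + chain


num_params, num_draws, num_chains = 3, 5, 2
buffer = [0.0] * (num_params * num_draws * num_chains)

# Store one value: parameter 1, draw 4, chain 0.
buffer[flat_index(1, 4, 0, num_draws, num_chains)] = 0.25

# With chains last, the chains for one (parameter, draw) pair are adjacent.
start = flat_index(1, 4, 0, num_draws, num_chains)
print(buffer[start:start + num_chains])
```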
30+
31+
# Typical User Sessions
32+
33+
## Typical R session
34+
We plan to use [ReferenceClasses](http://stat.ethz.ch/R-manual/R-devel/library/methods/html/refClass.html) throughout. See the examples section of [`rstan3/R/rstan.R`](https://github.com/stan-dev/rstan/blob/develop/rstan3/R/rstan.R) for a canonical R example.
35+
36+
## Typical PyStan session
37+
```python
import pystan
posterior = pystan.build(schools_program_code, data=schools_data)
fit = posterior.sample(num_chains=1, num_samples=2000)
mu = fit["mu"]  # shape (param_stan_dimensions, num_draws * num_chains) NEW!

fit.to_frame()  # shape (num_chains * num_draws, num_flat_params)

estimates = fit.maximize()
estimates = fit.hmc_nuts_diag_e_adapt(delta=0.9)  # advanced users, unlikely to use

# additional features
assert len(fit) == 3  # three parameters in the 8 schools model: mu, tau, eta
```
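
To make the shape comments in the session concrete, here is a toy stand-in for the fit object built on plain lists. `ToyFit` is hypothetical and is not the PyStan implementation. It only mimics the indexing, `to_frame`, and `len` behaviour described in the session, and its `to_frame` returns a list of rows rather than a data frame.

```python
class ToyFit:
    """Hypothetical stand-in for a fit object."""

    def __init__(self, draws):
        # draws maps a parameter name to a list of rows, one row per flat
        # element, each row holding num_draws * num_chains values.
        self._draws = draws

    def __getitem__(self, name):
        return self._draws[name]

    def __len__(self):
        return len(self._draws)  # number of parameters, not of draws

    def to_frame(self):
        """Rows are draws, columns are flattened parameters."""
        columns = [row for rows in self._draws.values() for row in rows]
        return [list(values) for values in zip(*columns)]


fit = ToyFit({
    "mu": [[1.0, 2.0, 3.0]],                    # scalar: 1 x 3
    "tau": [[0.5, 0.6, 0.7]],                   # scalar: 1 x 3
    "eta": [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],  # 2-vector: 2 x 3
})
frame = fit.to_frame()
print(len(fit), len(frame), len(frame[0]))
```

Indexing by name keeps the parameter's own dimensions first, while `to_frame` flattens every parameter into columns and puts draws on the rows.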
51+
52+
## Typical CmdStan session
53+
54+
TBD, but the Julia and MATLAB interfaces just call CmdStan and thus would be similar.
55+
56+
## Typical StataStan session
57+
58+
TBD. StataStan also calls CmdStan, but would probably have some quirks.
59+
60+
# Methods provided by the Stan library to all interfaces
61+
62+
The class would expose the following methods from the abstract base class (that needs to be implemented):
63+
64+
- scalar log_prob(unconstrained_params)
65+
- vector grad(unconstrained_params)
66+
- tuple log_prob_grad(unconstrained_params)
67+
- matrix hessian(unconstrained_params)
68+
- tuple log_prob_grad_hessian(unconstrained_params)
69+
- scalar laplace_approx(unconstrained_params)
70+
- vector constrain_params(unconstrained_params = <vector>)
71+
- vector unconstrain_params(constrained_params = <vector>)
72+
- tuple params_info() would return
73+
  - parameter names
74+
  - lower and upper bounds, if any
75+
  - parameter dimensions
76+
  - declared type (cov_matrix, etc.)
77+
  - which were declared in the parameters, transformed parameters, and generated quantities blocks
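
A minimal sketch of how an interface might model part of this method list, using Python's `abc` module. The class names `StanModelBase` and `ExponentialModel` are hypothetical, only a subset of the listed methods is shown, and the toy model assumes that `log_prob` on the unconstrained scale includes the log Jacobian of the transform.

```python
import math
from abc import ABC, abstractmethod


class StanModelBase(ABC):
    """Hypothetical abstract base class; method names follow the list above."""

    @abstractmethod
    def log_prob(self, unconstrained_params): ...

    @abstractmethod
    def grad(self, unconstrained_params): ...

    @abstractmethod
    def constrain_params(self, unconstrained_params): ...

    @abstractmethod
    def unconstrain_params(self, constrained_params): ...

    def log_prob_grad(self, unconstrained_params):
        # Default: combine the two calls; a real backend would share work.
        return (self.log_prob(unconstrained_params),
                self.grad(unconstrained_params))


class ExponentialModel(StanModelBase):
    """Toy model: sigma > 0 with sigma ~ exponential(1), u = log(sigma)."""

    def log_prob(self, unconstrained_params):
        (u,) = unconstrained_params
        # log density of sigma plus the log Jacobian of sigma = exp(u)
        return -math.exp(u) + u

    def grad(self, unconstrained_params):
        (u,) = unconstrained_params
        return [-math.exp(u) + 1.0]

    def constrain_params(self, unconstrained_params):
        return [math.exp(u) for u in unconstrained_params]

    def unconstrain_params(self, constrained_params):
        return [math.log(s) for s in constrained_params]


model = ExponentialModel()
value, gradient = model.log_prob_grad([0.0])
print(value, gradient)
```

Keeping `log_prob_grad` as a default method on the base class means a concrete model only has to supply the four primitive operations.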
78+
79+
# MCMC output containers
80+
81+
## RStan
82+
83+
The VarWriter would fill an Rcpp::NumericVector with appropriate dimensions. Then there would be a new (S4) class that is basically an array to hold MCMC output for a particular parameter, which hinges on the params_info() method. The (array slot of a) StanType has dimensions equal to the original dimensions plus two additional trailing dimensions, namely chains and iterations. Thus,
84+
- if the parameter is originally a scalar, on the interface side it acts like a 3D array that is 1 x chains x iterations
85+
- if the parameter is originally a (row) K-vector, on the interface side it acts like a 3D array that is K x chains x iterations
86+
- if the parameter is originally a matrix, on the interface side it acts like a 4D array that is rows x cols x chains x iterations
87+
- if the parameter is originally a ``std::vector`` of type ``foo``, on the interface side it acts like a multidimensional array with dimensions equal to the concatenation of the ``std::vector`` dimensions, the ``foo`` dimensions, chains, and iterations
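
The dimension rules in this list can be written as a single function. This is a sketch with an illustrative name, `output_shape`; it assumes the declared dimensions are available as a tuple (for example from `params_info()`).

```python
def output_shape(param_dims, chains, iterations):
    """Shape of the interface-side array for one parameter.

    param_dims is the declared shape: () for a scalar, (K,) for a vector,
    (rows, cols) for a matrix; for an array of `foo`, the array dimensions
    followed by the `foo` dimensions.
    """
    dims = tuple(param_dims) or (1,)  # a scalar is stored as length 1
    return dims + (chains, iterations)


print(output_shape((), 4, 1000))         # scalar
print(output_shape((3,), 4, 1000))       # 3-vector
print(output_shape((2, 3), 4, 1000))     # 2 x 3 matrix
print(output_shape((5, 2, 3), 4, 1000))  # array of five 2 x 3 matrices
```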
88+
89+
The advantages of having such a class hierarchy are
90+
- can do customized summaries; e.g., a ``cov_matrix`` would not have its upper triangle summarized, because that is redundant with the lower triangle, and its variances can be separated from its covariances. Likewise, a ``cholesky_factor_cov`` would not have its upper triangle summarized, because its elements are fixed zeros
91+
- can easily call unconstrain methods (in C++) for one unknown rather than having to do it for all unknowns
92+
- can implement probabilistic programming; e.g., if ``beta`` is an estimated vector in Stan, then in R ``mu <- X %*% beta`` is N x chains x iterations, and if ``variance`` is a scalar then ``sigma <- sqrt(variance)`` can be summarized appropriately (including n_eff, etc.)
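
The last advantage amounts to applying a function draw by draw and then summarizing the result. The sketch below illustrates that idea in Python with nested lists standing in for a K x chains x iterations array. `map_draws` and `posterior_mean` are hypothetical names, and only a mean is computed (no n_eff or other diagnostics).

```python
import math
import statistics


def map_draws(fn, draws):
    """Apply fn to every draw of a K x chains x iterations nested list."""
    return [[[fn(x) for x in chain] for chain in element] for element in draws]


def posterior_mean(draws):
    """Mean over all chains and iterations, one value per element."""
    return [statistics.fmean(x for chain in element for x in chain)
            for element in draws]


# A scalar `variance` stored as 1 x chains x iterations (2 chains, 3 iterations).
variance = [[[1.0, 4.0, 9.0], [16.0, 25.0, 36.0]]]
sigma = map_draws(math.sqrt, variance)
print(posterior_mean(sigma))
```

Because the transform is applied to each draw, the summary of `sigma` is a summary of the transformed posterior rather than a transform of a summary.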
93+
94+
## PyStan
95+
96+
TBD. Allen is ambivalent about doing something like a StanType class hierarchy.
97+
98+
## CmdStan
99+
100+
Everything gets written to a flat CSV file.
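
An interface that wraps CmdStan then has to read that file back. The following is a rough sketch using only Python's standard library. It assumes the file has one header row of column names and that lines starting with `#` are comments to skip; the example contents are made up for illustration and are not real CmdStan output. `read_flat_csv` is a hypothetical helper name.

```python
import csv
import io


def read_flat_csv(text):
    """Return (column_names, rows) from CSV text, skipping '#' comment lines."""
    lines = [ln for ln in text.splitlines()
             if ln.strip() and not ln.startswith("#")]
    reader = csv.reader(io.StringIO("\n".join(lines)))
    names = next(reader)  # first non-comment line is the header
    rows = [[float(x) for x in row] for row in reader]
    return names, rows


# Made-up contents for illustration only.
example = """# comment line
lp__,mu,tau
-4.5,1.2,0.8
# another comment
-3.9,0.7,1.1
"""
names, rows = read_flat_csv(example)
print(names, len(rows))
```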
101+
102+
## MCMC output bundle
103+
104+
### StanFitMCMC class in R
105+
106+
TBD
107+
108+
### StanFitMCMC class in PyStan
109+
110+
**instance fields**
111+
- param_draws [ params x chains x iterations ] : double. This needs to be in contiguous memory (or similar) if the C++ API for split_rhat is going to be called directly and fast.
112+
- param_names [ params ] : string
113+
- num_warmup [ 1 ] : long
114+
- timestamps : timestamp (long) for each iteration (possibly roll into param_draws)
115+
- mass matrix : NULL | [ params ] | [ params x params ] : double
116+
- diagnostic draws [ diagnostic params x chains x iterations ]
117+
- diagnostic names [ diagnostic params ] : string
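
As a rough illustration of these fields, the sketch below collects them in a Python dataclass. It is hypothetical and simplified: nested lists stand in for the contiguous array the list calls for, the snake_case field names are adapted from the list, and the consistency check is an added assumption about what a container might validate.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StanFitMCMC:
    """Hypothetical container mirroring the instance fields listed above."""

    param_draws: List[List[List[float]]]    # params x chains x iterations
    param_names: List[str]                  # one name per parameter
    num_warmup: int
    timestamps: Optional[List[int]] = None  # one per iteration
    mass_matrix: Optional[list] = None      # None, params, or params x params
    diagnostic_draws: Optional[List[List[List[float]]]] = None
    diagnostic_names: Optional[List[str]] = None

    def __post_init__(self):
        if len(self.param_draws) != len(self.param_names):
            raise ValueError("param_draws and param_names disagree on params")

    @property
    def shape(self):
        """(params, chains, iterations), assuming rectangular draws."""
        first = self.param_draws[0]
        return (len(self.param_draws), len(first), len(first[0]))


mcmc_fit = StanFitMCMC(
    param_draws=[[[0.1, 0.2], [0.3, 0.4]]],  # 1 param, 2 chains, 2 iterations
    param_names=["mu"],
    num_warmup=1000,
)
print(mcmc_fit.shape)
```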
118+
119+
120+
## Optimize output bundle
121+
122+
### StanFitOptimize
123+
Note: computing the Hessian is optional at optimization time.
124+
**instance fields**
125+
- param values
126+
- param names
127+
- value (value of function)
128+
- (optional) Hessian
129+
- (optional) diagnostic param values: diagnostic params x iterations
130+
- diagnostic param names: diagnostic params : string
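
The optional entries in this list map naturally onto fields that default to `None`. The dataclass below is a hypothetical sketch in Python: the snake_case field names are adapted from the list, and the `estimates` helper is an illustrative addition rather than part of the proposal.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class StanFitOptimize:
    """Hypothetical container mirroring the fields listed above."""

    param_values: List[float]
    param_names: List[str]
    value: float                                 # value of the function
    hessian: Optional[List[List[float]]] = None  # only if it was computed
    # diagnostic params x iterations
    diagnostic_param_values: Optional[List[List[float]]] = None
    diagnostic_param_names: Optional[List[str]] = None

    def estimates(self) -> Dict[str, float]:
        """Pair each parameter name with its optimized value."""
        return dict(zip(self.param_names, self.param_values))


opt_fit = StanFitOptimize([1.0, 2.0], ["mu", "tau"], value=-3.5)
print(opt_fit.estimates())
```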
131+
