Adding three distributions: Spherical, OrthogonalMatrices and NormalSingularValues (working names) #465
Over the long weekend, I dug into the BPCA topic again and implemented Nirwan et al., "Rotation Invariant Householder Parameterization for Bayesian PCA", which involved three distributions of potentially wider interest:
A distribution over direction vectors, i.e. a spherically symmetric distribution.
This basically means putting a prior on the Euclidean norm of the vector. There is a longer discussion on the Stan forums about which prior to use, but I'm thinking of implementing it generically so that the prior can be passed in as a parameter.
A uniform distribution over all (semi-)orthogonal matrices, i.e. the Stiefel manifold. This follows Mezzadri, "How to generate random matrices from the classical compact groups", and basically boils down to sampling random rotation vectors and using them as bases for sequential Householder transformations.
A distribution of the singular values of a matrix with i.i.d. standard normal entries. This one is adapted from James, Lee, "Concise Probability Distributions of Eigenvalues of Real-Valued Wishart Matrices".
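To make the first item concrete, here is a minimal sampling sketch in plain Python. It is not the code in this PR: the names `sample_spherical` and `sample_norm` are made up for illustration, and it only covers sampling (no log-density). The direction comes from normalizing a standard normal draw, and the norm comes from whatever prior is passed in.

```python
import math
import random


def sample_spherical(dim, sample_norm, rng):
    """Draw a vector with a uniformly random direction and a norm from `sample_norm`.

    `sample_norm` is a hypothetical callable taking the RNG and returning a
    positive float; it plays the role of the prior on the Euclidean norm.
    """
    # A standard normal vector is spherically symmetric, so normalizing it
    # gives a direction that is uniform on the unit sphere.
    g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    g_norm = math.sqrt(sum(x * x for x in g))
    radius = sample_norm(rng)
    return [radius * x / g_norm for x in g]
```

For example, `sample_spherical(3, lambda r: r.gammavariate(2.0, 1.0), random.Random(0))` would put a Gamma(2, 1) prior on the norm; that choice is arbitrary and only there to show the parameterization.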
While all three work for my current use case, there is definitely a lot of cleanup work to be done to make them generic. Most notably:
Before I take that on, though, there is some discussion to be had.
a) Do all three of the distributions make sense to add?
They are somewhat esoteric, but my gut feeling is that they have uses beyond the BPCA I needed them for. Spherical is probably the easiest of the three to find uses for: the Stan discussion linked above was sparked by the search for a better geometry than atan2 for the von Mises distribution.
b) What would be good names for them?
NormalSingularValues is the only one of the three I'm happy with right now, but it could instead be a distribution over the eigenvalues of a Wishart matrix (in which case NSV would simply be the square root of that).
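The relationship between the two options can be spelled out as follows. This is the standard SVD identity plus a change of variables, not something taken from the PR's code, and it assumes an $n \times p$ matrix with $n \ge p$:

```latex
X \in \mathbb{R}^{n \times p}, \quad X_{ij} \overset{\text{iid}}{\sim} \mathcal{N}(0, 1), \quad
X = U \,\mathrm{diag}(s_1, \dots, s_p)\, V^\top
\;\Longrightarrow\;
X^\top X = V \,\mathrm{diag}(s_1^2, \dots, s_p^2)\, V^\top \sim \mathcal{W}_p(n, I)

\lambda_i = s_i^2
\quad\Longrightarrow\quad
p_s(s_1, \dots, s_p) = p_\lambda(s_1^2, \dots, s_p^2) \prod_{i=1}^{p} 2 s_i
```

So the singular values are the square roots of the Wishart eigenvalues, and the density of one follows from the other through the Jacobian factor $\prod_i 2 s_i$.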
Orthogonal could be called Stiefel, but that is an esoteric term that not many people are familiar with. RandomOrthogonalMatrix is more descriptive, but the distribution also covers semi-orthogonal matrices, for instance, which that name does not convey.
And Spherical also probably needs some clarifying words next to it but I can't even propose good ones.
c) How should I implement OrthogonalMatrix as a proper distribution? Its only source of randomness is K draws from Spherical, so one option would be to let Spherical take a list of vector lengths and output a single vector that is the concatenation of the draws, and then implement OrthogonalMatrix as just a transformation on top of that (analogous to how Chi is implemented on top of ChiSquared). But this would add some rather odd complexity to Spherical, so I'm wondering whether there are any other good options.
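A rough sketch of the "transformation on top of concatenated draws" option, in plain Python. This is my reading of the Householder construction, not the PR's implementation; `flat_length`, `apply_householder` and `orthogonal_from_flat` are hypothetical names. The flat input holds K vectors of lengths N, N-1, ..., N-K+1, and each one defines a Householder reflection acting on the trailing rows.

```python
import math
import random


def flat_length(n, k):
    """Length of the concatenated vector: n + (n - 1) + ... + (n - k + 1)."""
    return sum(n - j for j in range(k))


def apply_householder(v, rows):
    """Apply H(v) = -sign(v[0]) * (I - 2 u u^T) to `rows`, in place.

    H(v) swaps e_1 and v / |v|, so it depends only on the direction of v.
    """
    s = 1.0 if v[0] >= 0 else -1.0
    norm_v = math.sqrt(sum(x * x for x in v))
    u = list(v)
    u[0] += s * norm_v  # same sign as v[0], so no cancellation
    norm_u = math.sqrt(sum(x * x for x in u))
    u = [x / norm_u for x in u]
    for c in range(len(rows[0])):
        dot = sum(u[i] * rows[i][c] for i in range(len(u)))
        for i in range(len(u)):
            rows[i][c] = -s * (rows[i][c] - 2.0 * u[i] * dot)


def orthogonal_from_flat(flat, n, k):
    """Map a flat vector of concatenated draws to an n x k matrix with
    orthonormal columns (list of rows)."""
    if len(flat) != flat_length(n, k):
        raise ValueError("flat has the wrong length for the given n and k")
    # Split into vectors of length n, n - 1, ..., n - k + 1.
    vectors, start = [], 0
    for j in range(k):
        vectors.append(flat[start:start + n - j])
        start += n - j
    # Start from the first k columns of the identity and apply the
    # reflections from the smallest to the largest.
    q = [[1.0 if r == c else 0.0 for c in range(k)] for r in range(n)]
    for j in reversed(range(k)):
        # q[j:] is a new list, but it shares the row objects with q,
        # so the in-place update changes q itself.
        apply_householder(vectors[j], q[j:])
    return q
```

With `flat = [rng.gauss(0.0, 1.0) for _ in range(flat_length(n, k))]`, the first column of the result is the normalized first draw. Since each reflection only uses the direction of its vector, the norms of the K draws would not affect the matrix in this sketch, which may matter for how the density is defined.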
In short - I need someone to sanity check all of this and help me think it through on the abstract level. @zaxtax, are you up for doing a third PR with me in a row?