VIPurPCA visualizes uncertainty propagated through the dimensionality-reduction technique Principal Component Analysis (PCA) by means of automatic differentiation.
VIPurPCA requires Python 3.7.3 or later and can be installed via:

```shell
pip install vipurpca
```
A website showing results and animations can be found here.
To propagate uncertainty through PCA, use the class `PCA`, which has the following parameters, attributes, and methods:
| Parameters | |
|---|---|
| `matrix` : array_like | Array of size `[n, p]` containing the mean values to which VIPurPCA should be applied. |
| `sample_cov` : array_like of shape `[n, n]` or `[n]`, default=None, optional | Input uncertainties in terms of the sample covariance matrix. If `sample_cov` is one-dimensional, its values are taken as the diagonal of a diagonal matrix. Used to compute the total covariance matrix over the input as the Kronecker product of `sample_cov` and `feature_cov`. |
| `feature_cov` : array_like of shape `[p, p]` or `[p]`, default=None, optional | Input uncertainties in terms of the feature covariance matrix. If `feature_cov` is one-dimensional, its values are taken as the diagonal of a diagonal matrix. Used to compute the total covariance matrix over the input as the Kronecker product of `sample_cov` and `feature_cov`. |
| `full_cov` : array_like of shape `[n*p, n*p]` or `[n*p]`, default=None, optional | Input uncertainties in terms of the full covariance matrix. If `full_cov` is one-dimensional, its values are taken as the diagonal of a diagonal matrix. Used as an alternative to the Kronecker product of `sample_cov` and `feature_cov`; requires more memory. |
| `n_components` : int or float, default=None, optional | Number of components to keep. |
| `axis` : {0, 1}, default=0, optional | The default expects samples in rows and features in columns. |
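As described for `sample_cov` and `feature_cov` above, the total input covariance is their Kronecker product. A minimal numpy sketch of this construction (toy shapes for illustration, not VIPurPCA's internals):

```python
import numpy as np

n, p = 4, 3
# toy uncertainties: per-sample variances, independent features
sample_cov = np.diag([0.1, 0.2, 0.1, 0.3])   # shape [n, n]
feature_cov = np.eye(p)                       # shape [p, p]

# total covariance over all n*p entries of the data matrix
full_cov = np.kron(sample_cov, feature_cov)
print(full_cov.shape)  # (12, 12)
```

As the parameter table notes, supplying `full_cov` directly requires more memory, since the `[n*p, n*p]` matrix is materialized in full.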
| Attributes | |
|---|---|
| `size` : `[n, p]` | Dimensions of `matrix` (n: number of samples, p: number of features). |
| `eigenvalues` : ndarray of size `[n_components]` | Eigenvalues obtained from the eigenvalue decomposition of the covariance matrix. |
| `eigenvectors` : ndarray of size `[n_components*p, n*p]` | Eigenvectors obtained from the eigenvalue decomposition of the covariance matrix. |
| `jacobian` : ndarray of size `[n_components*p, n*p]` | Jacobian containing the derivatives of the eigenvectors w.r.t. the input matrix. |
| `cov_eigenvectors` : ndarray of size `[n_components*p, n_components*p]` | Propagated uncertainties of the eigenvectors. |
| `transformed_data` : ndarray of size `[n, n_components]` | Low-dimensional representation of the data after applying PCA. |
| Methods | |
|---|---|
| `pca_value()` | Apply PCA to `matrix`. |
| `compute_cov_eigenvectors(save_jacobian=False)` | Compute the uncertainties of the eigenvectors. |
| `animate(pcx=1, pcy=2, n_frames=10, labels=None, outfile='animation.gif')` | Generate an animation of the PCA plot of PC `pcx` vs. PC `pcy` with `n_frames` frames. `labels` (list or 1-D array) assigns labels to the individual samples. |
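Conceptually, `compute_cov_eigenvectors` performs first-order uncertainty propagation: given the Jacobian J of the eigenvectors with respect to the input entries, an input covariance Σ maps to J Σ Jᵀ. The following is a minimal, self-contained numpy sketch of that principle on toy data, using finite differences as a stand-in for the automatic differentiation VIPurPCA uses; the function names and shapes are illustrative, not VIPurPCA's API:

```python
import numpy as np

def top_eigenvector(A_flat, n, p):
    # flattened input -> leading eigenvector of the feature covariance
    A = A_flat.reshape(n, p)
    C = np.cov(A, rowvar=False)
    w, V = np.linalg.eigh(C)        # eigenvalues in ascending order
    v = V[:, -1]
    return v * np.sign(v[0])        # fix sign so the map is smooth

n, p = 5, 3
rng = np.random.default_rng(0)
Y = rng.normal(size=(n, p))
y_flat = Y.ravel()

# finite-difference Jacobian J of the eigenvector w.r.t. the input entries
eps = 1e-6
f0 = top_eigenvector(y_flat, n, p)
J = np.empty((p, n * p))
for i in range(n * p):
    d = np.zeros(n * p)
    d[i] = eps
    J[:, i] = (top_eigenvector(y_flat + d, n, p) - f0) / eps

cov_input = 0.01 * np.eye(n * p)    # toy input covariance
cov_eigvec = J @ cov_input @ J.T    # propagated eigenvector uncertainty
print(cov_eigvec.shape)             # (3, 3)
```

VIPurPCA computes the exact Jacobian by automatic differentiation instead of finite differences, and propagates the uncertainty of all kept eigenvectors jointly.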
Two example datasets providing mean, covariance, and labels can be loaded after installing VIPurPCA:

```python
from vipurpca import load_data

# student grades dataset
Y, cov_Y, y = load_data.load_studentgrades_dataset()
# estrogen dataset
Y, cov_Y, y = load_data.load_estrogen_dataset()
```
More information on the datasets can be found here.
```python
from vipurpca import load_data
from vipurpca import PCA

# load mean (Y), uncertainty estimates (cov_Y) and labels (y)
Y, cov_Y, y = load_data.load_estrogen_dataset()
pca = PCA(matrix=Y, sample_cov=None, feature_cov=None,
          full_cov=cov_Y, n_components=3, axis=0)
# compute PCA
pca.pca_value()
# Bayesian inference
pca.compute_cov_eigenvectors(save_jacobian=False)
# create animation
pca.animate(1, 2, labels=y, outfile='animation.gif')
```
The resulting animation can be found here.