Mixture of Common Factor Analysers
A mixture of common factor analysers (MCFA) is a method for modelling multi-dimensional data. The model assumes that the data are generated by a set of mutually orthogonal latent factors that are common to all data, but the scoring (or extent) of those factors is different for each data point. It also assumes that the scoring in latent space can be modelled as a mixture of multivariate Gaussian distributions. The latent space is assumed to be lower dimensional than the data.
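The generative assumptions above can be sketched with NumPy. This is a minimal illustration rather than the package's own code: the sizes, seed, and parameter values are arbitrary assumptions, and the array shapes follow those documented for the `expectation` method below.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions): samples, features, latent factors, components.
N, D, J, K = 500, 8, 2, 3

# Common factor loads with mutually orthogonal (orthonormal) columns, shape [D, J].
A, _ = np.linalg.qr(rng.normal(size=(D, J)))

pi = np.array([0.5, 0.3, 0.2])            # mixing weights, sum to one
xi = rng.normal(scale=3.0, size=(J, K))   # component means in latent space, [J, K]
omega = np.stack([np.eye(J) * s for s in (0.5, 1.0, 2.0)], axis=-1)  # [J, J, K]
psi = rng.uniform(0.1, 0.5, size=D)       # noise variance in each data dimension

# Assign each sample to a component, then draw its latent score from that
# component's multivariate Gaussian.
labels = rng.choice(K, size=N, p=pi)
scores = np.empty((N, J))
for k in range(K):
    members = labels == k
    scores[members] = rng.multivariate_normal(
        xi[:, k], omega[:, :, k], size=members.sum()
    )

# Project the scores into data space and add independent noise per dimension.
X = scores @ A.T + rng.normal(size=(N, D)) * np.sqrt(psi)
```

Every data point shares the same factor loads `A`; only its row of `scores` differs, which is the sense in which the factors are common to all data.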
Model parameters are estimated using the expectation-maximization algorithm, given some fixed number of latent factors and components. If the numbers of latent factors and components are not known, they are found through a grid search that adopts the minimum message length as the objective function.
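The grid search described above can be sketched as follows. The `grid_search` helper and the toy objective are hypothetical names introduced for illustration; in practice the objective would be the message length of a fitted model.

```python
import itertools
import math


def grid_search(score, latent_factor_grid, component_grid):
    """Return the (n_latent_factors, n_components) pair with the lowest score.

    `score` is any callable taking (J, K) and returning a number to minimise.
    """
    best, best_score = None, math.inf
    for J, K in itertools.product(latent_factor_grid, component_grid):
        value = score(J, K)
        if value < best_score:
            best, best_score = (J, K), value
    return best, best_score


# Toy stand-in objective (an assumption), minimised at J = 3 and K = 5.
def toy_score(J, K):
    return (J - 3) ** 2 + (K - 5) ** 2


best, best_score = grid_search(toy_score, range(1, 6), range(1, 10))
```

With the class documented below, `score` could plausibly wrap `MCFA(n_components=K, n_latent_factors=J).fit(X).message_length(X)`, since `fit` is documented as returning the fitted model; that wiring is an untested assumption here.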
See the Getting started guide.
API
MCFA
-
class mcfa.mcfa.MCFA(n_components, n_latent_factors, covariance_regularization=0, max_iter=10000, tol=1e-05, init_components='kmeans++', init_factors='svd', verbose=1, random_seed=None, **kwargs)
A mixture of common factor analysers model.
-
bic(X, theta=None, log_likelihood=None)
Estimate the Bayesian Information Criterion given the model and the data.
Parameters: - X – The data, \(X\), which is expected to be an array of shape [n_samples, n_features].
- theta – [optional] The model parameters \(\theta\). If None is given then the model parameters from self.theta_ will be used.
Returns: The Bayesian Information Criterion (BIC) for the model and the data. A smaller BIC value is often used as a statistic to select a single model from a class of models.
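For reference, the textbook form of the criterion is \(\mathrm{BIC} = Q \ln N - 2 \ln \mathcal{L}\), where \(Q\) is the number of free parameters, \(N\) the number of samples, and \(\ln \mathcal{L}\) the maximised log-likelihood. The sketch below assumes that standard definition; the method's exact sign and scaling conventions are not stated here, and the numbers are made up for illustration.

```python
import math


def bic(log_likelihood, n_parameters, n_samples):
    """Standard BIC: complexity penalty minus twice the log-likelihood."""
    return n_parameters * math.log(n_samples) - 2.0 * log_likelihood


# Two hypothetical models fitted to the same 500 samples.
simple = bic(log_likelihood=-1200.0, n_parameters=43, n_samples=500)
complex_ = bic(log_likelihood=-1195.0, n_parameters=80, n_samples=500)

# The smaller value is preferred.
preferred = "simple" if simple < complex_ else "complex"
```

Here the extra 37 parameters buy only a small gain in log-likelihood, so the penalty term dominates and the simpler model is preferred.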
-
classmethod deserialize(data)
De-serialize the data and return an object.
Parameters: data – Serialized data describing the object.
Returns: A mcfa.MCFA object.
-
expectation(X, pi, A, xi, omega, psi, **kwargs)
Compute the conditional expectation of the complete-data log-likelihood given the observed data \(X\) and the given model parameters.
Parameters: - X – The data, which is expected to be an array with shape [n_samples, n_features].
- pi – The relative weights for the components in the mixture. This should have size n_components and the entries should sum to one.
- A – The common factor loads between mixture components. This should have shape [n_features, n_latent_factors].
- xi – The mean factors for the components in the mixture. This should have shape [n_latent_factors, n_components].
- omega – The covariance matrix of the mixture components in latent space. This array should have shape [n_latent_factors, n_latent_factors, n_components].
- psi – The variance in each dimension. This should have size [n_features].
Raises: scipy.linalg.LinAlgError – If the covariance matrix of any mixture component in latent space is ill-conditioned or singular.
Returns: A two-length tuple containing the sum of the log-likelihood for the data given the model, and the responsibility matrix \(\tau\) giving the partial associations between each data point and each component in the mixture.
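One way to picture this step: under the model, component \(k\) implies a Gaussian in data space with mean \(A\xi_k\) and covariance \(A\omega_k A^\top + \mathrm{diag}(\psi)\). The sketch below evaluates those densities directly with dense matrices and normalises them into responsibilities. It is an illustrative assumption, not the package's implementation, which may use a more efficient or more numerically careful route; `expectation_sketch` is a hypothetical name.

```python
import numpy as np


def expectation_sketch(X, pi, A, xi, omega, psi):
    """Return (total log-likelihood, responsibility matrix tau)."""
    N, D = X.shape
    K = pi.size
    log_prob = np.empty((N, K))
    for k in range(K):
        mean = A @ xi[:, k]                             # [D]
        cov = A @ omega[:, :, k] @ A.T + np.diag(psi)   # [D, D]
        diff = X - mean                                 # [N, D]
        _, logdet = np.linalg.slogdet(cov)
        # Squared Mahalanobis distance of every sample from the component mean.
        maha = np.sum(diff * np.linalg.solve(cov, diff.T).T, axis=1)
        log_prob[:, k] = np.log(pi[k]) - 0.5 * (
            D * np.log(2 * np.pi) + logdet + maha
        )
    # Log-sum-exp over components gives each sample's log-likelihood.
    peak = log_prob.max(axis=1, keepdims=True)
    log_norm = peak[:, 0] + np.log(np.exp(log_prob - peak).sum(axis=1))
    tau = np.exp(log_prob - log_norm[:, None])
    return log_norm.sum(), tau


# Small random example (all values are arbitrary assumptions).
rng = np.random.default_rng(1)
N, D, J, K = 50, 4, 2, 2
X = rng.normal(size=(N, D))
pi = np.array([0.6, 0.4])
A, _ = np.linalg.qr(rng.normal(size=(D, J)))
xi = rng.normal(size=(J, K))
omega = np.stack([np.eye(J)] * K, axis=-1)
psi = np.full(D, 0.5)

log_likelihood, tau = expectation_sketch(X, pi, A, xi, omega, psi)
```

Each row of `tau` sums to one, matching the responsibility matrix expected by `maximization`.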
-
factor_scores(X)
Estimate the posterior factor scores given the model parameters.
Parameters: X – The data, \(X\), which is expected to be an array of shape [n_samples, n_features].
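As a rough guide to what a posterior factor score is, standard Gaussian conditioning gives, for component \(k\), \(\mathrm{E}[z \mid x, k] = \xi_k + \omega_k A^\top (A\omega_k A^\top + \mathrm{diag}(\psi))^{-1}(x - A\xi_k)\), and the per-component scores can then be averaged using the responsibilities. The sketch below assumes exactly that; whether this method returns the averaged or per-component scores is not stated here, and `posterior_factor_scores` is a hypothetical name.

```python
import numpy as np


def posterior_factor_scores(X, tau, A, xi, omega, psi):
    """Responsibility-weighted posterior mean of the latent scores, [N, J]."""
    N, D = X.shape
    J, K = xi.shape
    scores = np.zeros((N, J))
    for k in range(K):
        cov = A @ omega[:, :, k] @ A.T + np.diag(psi)        # [D, D]
        # Transposed gain: cov^{-1} A omega_k, shape [D, J].
        gain_t = np.linalg.solve(cov, A @ omega[:, :, k])
        conditional = xi[:, k] + (X - A @ xi[:, k]) @ gain_t  # [N, J]
        scores += tau[:, [k]] * conditional
    return scores


# Sanity check with one component: data sitting exactly at the component
# mean in data space should score exactly the latent mean.
rng = np.random.default_rng(3)
N, D, J = 6, 5, 2
A, _ = np.linalg.qr(rng.normal(size=(D, J)))
xi = rng.normal(size=(J, 1))
omega = np.eye(J)[:, :, None]
psi = np.full(D, 0.2)
X = np.tile(A @ xi[:, 0], (N, 1))
tau = np.ones((N, 1))

scores = posterior_factor_scores(X, tau, A, xi, omega, psi)
```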
-
fit(X, init_params=None, **kwargs)
Fit the model to the data, \(X\).
Parameters: - X – The data, \(X\), which is expected to be an array of shape [n_samples, n_features].
- init_params – [optional] A dictionary of initial values to run expectation-maximization from.
Returns: The fitted model.
-
maximization(X, tau, pi, A, xi, omega, psi, **kwargs)
Compute the updated estimates of the model parameters given the data, the responsibility matrix \(\tau\), and the current estimates of the model parameters.
Parameters: - X – The data, which is expected to be an array with shape [n_samples, n_features].
- tau – The responsibility matrix, which is expected to have shape [n_samples, n_components]. The sum of each row is expected to equal one, and the value in the i-th row (sample) of the j-th column (component) indicates the partial responsibility (between zero and one) that the j-th component has for the i-th sample.
- pi – The relative weights for the components in the mixture. This should have size n_components and the entries should sum to one.
- A – The common factor loads between mixture components. This should have shape [n_features, n_latent_factors].
- xi – The mean factors for the components in the mixture. This should have shape [n_latent_factors, n_components].
- omega – The covariance matrix of the mixture components in latent space. This array should have shape [n_latent_factors, n_latent_factors, n_components].
- psi – The variance in each dimension. This should have size [n_features].
Returns: A five-length tuple containing the updated parameter estimates for the mixing weights \(\pi\), the common factor loads \(A\), the means of the components in latent space \(\xi\), the covariance matrices of components in latent space \(\omega\), and the variance in each dimension \(\psi\).
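The simplest of the five updates is the one for the mixing weights, which in a standard EM mixture is the column mean of the responsibility matrix. The sketch below shows only that update, under that standard assumption; the updates for \(A\), \(\xi\), \(\omega\), and \(\psi\) are more involved and are not reproduced here. `update_mixing_weights` is a hypothetical name.

```python
import numpy as np


def update_mixing_weights(tau):
    """Average responsibility of each component over all samples."""
    return tau.sum(axis=0) / tau.shape[0]


# Four samples, two components; each row sums to one.
tau = np.array([
    [0.9, 0.1],
    [0.2, 0.8],
    [0.5, 0.5],
    [1.0, 0.0],
])

pi = update_mixing_weights(tau)
```

Because every row of `tau` sums to one, the updated weights sum to one as well.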
-
message_length(X, theta=None, log_likelihood=None)
Estimate the explanation length given the model and the data.
Parameters: - X – The data, \(X\), which is expected to be an array of shape [n_samples, n_features].
- theta – [optional] The model parameters \(\theta\). If None is given then the model parameters from self.theta_ will be used.
-
number_of_parameters(D)
Return the number of model parameters \(Q\) required to describe data of \(D\) dimensions.
\[Q = (K - 1) + D + J(D + K) + \frac{1}{2}KJ(J + 1) - J^2\]
where \(K\) is the number of components, \(D\) is the number of dimensions in the data, and \(J\) is the number of latent factors.
Parameters: D – The dimensionality of the data (the number of features).
Returns: The number of model parameters, \(Q\).
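The formula can be checked by hand with a small standalone function. This is a direct transcription of the expression for \(Q\), not the method itself; it takes \(K\) and \(J\) as explicit arguments, whereas the method presumably reads them from the model.

```python
def number_of_parameters(K, D, J):
    """Q = (K - 1) + D + J(D + K) + K J (J + 1) / 2 - J^2."""
    # J * (J + 1) is always even, so integer division is exact.
    return (K - 1) + D + J * (D + K) + (K * J * (J + 1)) // 2 - J**2


# K = 3 components, D = 10 features, J = 2 latent factors:
# (3 - 1) + 10 + 2 * 13 + 9 - 4 = 43
Q = number_of_parameters(K=3, D=10, J=2)
```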
-
parameter_names
Return the names of the parameters in this model.
-
pseudo_bic(X, gamma=0.1, omega=1, theta=None)
Estimate the pseudo Bayesian Information Criterion given the model and the data, as per Gao and Carroll (2017).
Parameters: - X – The data, \(X\), which is expected to be an array of shape [n_samples, n_features].
- theta – [optional] The model parameters \(\theta\). If None is given then the model parameters from self.theta_ will be used.
Returns: The pseudo Bayesian Information Criterion (pseudo-BIC) for the model and the data. A smaller value is often used as a statistic to select a single model from a class of models.
-
rotate(R, X=None, ensure_valid_rotation=True, atol=0.001, rtol=1e-05)
Rotate the factor loads and factor scores by a valid rotation matrix.
Parameters: - R – A \(J \times J\) rotation matrix, where \(J\) is the number of latent factors.
- X – [optional] The data, which is expected to be an array with shape [n_samples, n_features]. If given, the log-likelihood will be evaluated before and after rotation. A warning will be raised if the log-likelihood changes by more than the convergence tolerance.
- ensure_valid_rotation – [optional] If the rotation matrix does not follow R @ R.T = I, then the nearest rotation matrix with this property will be used.
- atol – [optional] The absolute tolerance acceptable for individual entries in the matrix I - R @ R.T. Default is 1e-3.
- rtol – [optional] The relative tolerance acceptable for individual entries in the matrix I - R @ R.T. Default is 1e-5.
Returns: The actual rotation matrix applied.
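The sketch below illustrates two ideas behind this method, under stated assumptions. First, the validity check R @ R.T = I, with the nearest orthogonal matrix (in the Frobenius norm) obtained from a singular value decomposition; whether the package uses this particular construction is an assumption. Second, the reason a valid rotation should leave the log-likelihood unchanged: rotating the loads as `A @ R` while counter-rotating the latent means and covariances leaves the data-space mean and covariance of each component untouched. The helper names are hypothetical.

```python
import numpy as np


def is_orthogonal(R, atol=1e-3, rtol=1e-5):
    """Check R @ R.T against the identity, entry by entry."""
    return np.allclose(R @ R.T, np.eye(R.shape[0]), atol=atol, rtol=rtol)


def nearest_orthogonal(R):
    """Closest orthogonal matrix to R in the Frobenius norm, via the SVD."""
    U, _, Vt = np.linalg.svd(R)
    return U @ Vt


angle = 0.3
R = np.array([[np.cos(angle), -np.sin(angle)],
              [np.sin(angle), np.cos(angle)]])
R_noisy = R + 0.01                  # perturbed, no longer orthogonal
R_fixed = nearest_orthogonal(R_noisy)

# Arbitrary loads and one component's latent mean and covariance.
rng = np.random.default_rng(2)
A = rng.normal(size=(5, 2))
xi_k = rng.normal(size=2)
omega_k = np.array([[1.0, 0.3], [0.3, 0.5]])

# Rotate the loads and counter-rotate the latent-space parameters.
A_rot = A @ R_fixed
xi_rot = R_fixed.T @ xi_k
omega_rot = R_fixed.T @ omega_k @ R_fixed

# Data-space mean and covariance contribution before and after rotation.
mean_before, mean_after = A @ xi_k, A_rot @ xi_rot
cov_before, cov_after = A @ omega_k @ A.T, A_rot @ omega_rot @ A_rot.T
```

The before and after quantities agree to numerical precision, which is why a change in log-likelihood larger than the convergence tolerance signals a problem with the supplied matrix.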
-