Welcome to ManifoldLearning’s documentation

ManifoldLearning.jl is a Julia package for manifold learning and non-linear dimensionality reduction. It provides a set of nonlinear dimensionality reduction methods, such as Isomap, LLE, LTSA, and others.
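
All of the methods below share a common interface: transform fits a model, and accessor functions retrieve its properties. As a quick illustrative sketch (the random matrix here merely stands in for real data):

using ManifoldLearning

X = randn(3, 1000)                       # toy data: 3 features, 1000 observations
M = transform(Isomap, X; k = 12, d = 2)  # fit a model; other methods work the same way
Y = projection(M)                        # (2, 1000) matrix of embedded coordinates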

Methods:

Isomap

Isomap is a method for computing a quasi-isometric, low-dimensional embedding of a set of high-dimensional data points [1].
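
In outline, Isomap builds a k-nearest-neighbor graph, approximates geodesic distances by shortest paths in that graph, and then applies classical multidimensional scaling (MDS) to the geodesic distance matrix. The following is a minimal, self-contained sketch of that pipeline, not the package's implementation: it assumes a connected neighborhood graph and uses a dense O(n^3) shortest-path step, so it is only suitable for small n.

using LinearAlgebra

function isomap_sketch(X::AbstractMatrix, k::Int, d::Int)
    n = size(X, 2)
    D = [norm(X[:, i] - X[:, j]) for i in 1:n, j in 1:n]  # pairwise distances
    G = fill(Inf, n, n)                                   # graph (geodesic) distances
    for i in 1:n
        idx = sortperm(D[:, i])[2:k+1]                    # k nearest neighbors of i
        G[idx, i] .= D[idx, i]; G[i, idx] .= D[idx, i]
        G[i, i] = 0.0
    end
    for m in 1:n, i in 1:n, j in 1:n                      # Floyd-Warshall shortest paths
        G[i, j] = min(G[i, j], G[i, m] + G[m, j])
    end
    S = G .^ 2                                            # classical MDS on geodesics:
    J = I - ones(n, n) / n                                # centering matrix
    B = -J * S * J / 2                                    # double-centered Gram matrix
    F = eigen(Symmetric(B))
    ord = sortperm(F.values; rev = true)[1:d]             # top d eigenpairs
    return Diagonal(sqrt.(max.(F.values[ord], 0.0))) * F.vectors[:, ord]'  # (d, n)
end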

This package defines an Isomap type to represent Isomap results, and provides a set of methods to access its properties.

Properties

Let M be an instance of Isomap, n be the number of observations, and d be the output dimension.

outdim(M)

Get the output dimension d, i.e. the dimension of the subspace.

projection(M)

Get the projection matrix (of size (d, n)). Each column of the projection matrix corresponds to an observation in the projected subspace.

neighbors(M)

The number of nearest neighbors used for approximating local coordinate structure.

ccomponent(M)

The index array of observations in the largest connected component of the distance matrix.
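
For example, assuming (as is common for Isomap implementations) that the embedding is computed over this largest component only, the index array can be used to align the projection with the original observations:

# suppose X is a data matrix, with each observation in a column
M = transform(Isomap, X; k = 12, d = 2)
cc = ccomponent(M)   # indices into the columns of X
X_cc = X[:, cc]      # the observations that were actually embedded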

Data Transformation

One can use the transform method to perform Isomap over a given dataset.

transform(Isomap, X; ...)

Perform Isomap over the data given in a matrix X. Each column of X is an observation.

This method returns an instance of Isomap.

Keyword arguments:

name description default
k The number of nearest neighbors for determining local coordinate structure. 12
d Output dimension. 2

Example:

using ManifoldLearning

# suppose X is a data matrix, with each observation in a column
# fit an Isomap model, then extract the embedded coordinates
M = transform(Isomap, X; k = 12, d = 2)
Y = projection(M)    # a (2, n) matrix of embedded coordinates

References

[1] Tenenbaum, J. B., de Silva, V. and Langford, J. C. “A Global Geometric Framework for Nonlinear Dimensionality Reduction”. Science 290 (5500): 2319-2323, 22 December 2000. http://isomap.stanford.edu/

Diffusion maps

Diffusion maps leverages the relationship between heat diffusion and a random walk; an analogy is drawn between the diffusion operator on a manifold and a Markov transition matrix operating on functions defined on the graph whose nodes were sampled from the manifold [1].
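
Concretely, a common construction (sketched below under the assumption of a Gaussian kernel with scale ɛ; the function name diffmap_sketch is ours, and this is not necessarily the package's exact implementation) forms a kernel matrix, row-normalizes it into a Markov transition matrix, and uses its leading nontrivial eigenvectors, scaled by eigenvalues raised to the power t, as coordinates:

using LinearAlgebra

function diffmap_sketch(X::AbstractMatrix, d::Int, t::Int, ɛ::Float64)
    n = size(X, 2)
    K = [exp(-sum(abs2, X[:, i] - X[:, j]) / ɛ) for i in 1:n, j in 1:n]  # Gaussian kernel
    P = K ./ sum(K; dims = 2)                    # row-normalized Markov transition matrix
    F = eigen(P)                                 # eigenvalues are real for this P
    ord = sortperm(real.(F.values); rev = true)  # λ₁ = 1 belongs to the trivial eigenvector
    λ = real.(F.values[ord[2:d+1]])
    ψ = real.(F.vectors[:, ord[2:d+1]])
    return permutedims(ψ .* (λ .^ t)')           # (d, n) diffusion coordinates
end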

This package defines a DiffMap type to represent diffusion maps results, and provides a set of methods to access its properties.

Properties

Let M be an instance of DiffMap, n be the number of observations, and d be the output dimension.

outdim(M)

Get the output dimension d, i.e. the dimension of the subspace.

projection(M)

Get the projection matrix (of size (d, n)). Each column of the projection matrix corresponds to an observation in the projected subspace.

kernel(M)

The kernel matrix.

Data Transformation

One can use the transform method to perform DiffMap over a given dataset.

transform(DiffMap, X; ...)

Perform DiffMap over the data given in a matrix X. Each column of X is an observation.

This method returns an instance of DiffMap.

Keyword arguments:

name description default
d Output dimension. 2
t Number of time steps. 1
ɛ The scale parameter. 1.0

Example:

using ManifoldLearning

# suppose X is a data matrix, with each observation in a column
# fit a DiffMap model, then extract the embedded coordinates
M = transform(DiffMap, X; d = 2, t = 1, ɛ = 1.0)
Y = projection(M)

References

[1] Coifman, R. & Lafon, S. “Diffusion maps”. Applied and Computational Harmonic Analysis, Elsevier, 2006, 21, 5-30. DOI:10.1016/j.acha.2006.04.006

Laplacian Eigenmaps

The Laplacian Eigenmaps (LEM) method uses spectral techniques to perform dimensionality reduction. This technique relies on the basic assumption that the data lies in a low-dimensional manifold in a high-dimensional space. The algorithm provides a computationally efficient approach to non-linear dimensionality reduction that has locality-preserving properties [1].
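
A minimal sketch of the usual construction: heat-kernel weights on a symmetrized k-nearest-neighbor graph, followed by the generalized eigenproblem L v = λ D v (dense solver; assumes the graph is connected; lem_sketch is our name, not the package's implementation):

using LinearAlgebra

function lem_sketch(X::AbstractMatrix, k::Int, d::Int, t::Float64)
    n = size(X, 2)
    D2 = [sum(abs2, X[:, i] - X[:, j]) for i in 1:n, j in 1:n]
    W = zeros(n, n)
    for i in 1:n
        for j in sortperm(D2[:, i])[2:k+1]       # k nearest neighbors of i
            w = exp(-D2[j, i] / t)               # heat-kernel weight
            W[i, j] = w; W[j, i] = w             # symmetrize the graph
        end
    end
    Dg = Diagonal(vec(sum(W; dims = 2)))         # degree matrix
    F = eigen(Symmetric(Dg - W), Symmetric(Matrix(Dg)))  # L v = λ D v, ascending λ
    return permutedims(F.vectors[:, 2:d+1])      # skip the constant eigenvector
end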

This package defines an LEM type to represent Laplacian Eigenmaps results, and provides a set of methods to access its properties.

Properties

Let M be an instance of LEM, n be the number of observations, and d be the output dimension.

outdim(M)

Get the output dimension d, i.e. the dimension of the subspace.

projection(M)

Get the projection matrix (of size (d, n)). Each column of the projection matrix corresponds to an observation in the projected subspace.

neighbors(M)

The number of nearest neighbors used for approximating local coordinate structure.

eigvals(M)

The eigenvalues of the alignment matrix.

Data Transformation

One can use the transform method to perform LEM over a given dataset.

transform(LEM, X; ...)

Perform LEM over the data given in a matrix X. Each column of X is an observation.

This method returns an instance of LEM.

Keyword arguments:

name description default
k The number of nearest neighbors for determining local coordinate structure. 12
d Output dimension. 2
t The temperature parameter of the heat kernel. 1.0

Example:

using ManifoldLearning

# suppose X is a data matrix, with each observation in a column
# fit a Laplacian Eigenmaps model, then extract the embedded coordinates
M = transform(LEM, X; k = 12, d = 2, t = 1.0)
Y = projection(M)

References

[1] Belkin, M. and Niyogi, P. “Laplacian Eigenmaps for Dimensionality Reduction and Data Representation”. Neural Computation, June 2003; 15 (6):1373-1396. DOI:10.1162/089976603321780317

Locally Linear Embedding

The Locally Linear Embedding (LLE) technique builds a single global coordinate system of lower dimensionality. By exploiting the local symmetries of linear reconstructions, LLE is able to learn the global structure of nonlinear manifolds [1].
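
In outline, LLE finds, for every point, the weights that best reconstruct it from its k nearest neighbors, and then finds low-dimensional coordinates that are preserved by the same weights. A minimal sketch (lle_sketch is our name; a small regularization term is added for numerical stability; not the package's implementation):

using LinearAlgebra

function lle_sketch(X::AbstractMatrix, k::Int, d::Int; reg = 1e-3)
    n = size(X, 2)
    D2 = [sum(abs2, X[:, i] - X[:, j]) for i in 1:n, j in 1:n]
    W = zeros(n, n)
    for i in 1:n
        nbrs = sortperm(D2[:, i])[2:k+1]         # k nearest neighbors of i
        Z = X[:, nbrs] .- X[:, i]                # neighbors centered at x_i
        C = Z'Z                                  # local Gram matrix (k-by-k)
        C += reg * tr(C) * I / k                 # regularize for stability
        w = C \ ones(k)                          # solve for reconstruction weights
        W[i, nbrs] = w / sum(w)                  # weights sum to one
    end
    M = (I - W)' * (I - W)                       # embedding cost matrix
    F = eigen(Symmetric(M))                      # eigenvalues in ascending order
    return permutedims(F.vectors[:, 2:d+1])      # skip the constant eigenvector
end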

This package defines an LLE type to represent LLE results, and provides a set of methods to access its properties.

Properties

Let M be an instance of LLE, n be the number of observations, and d be the output dimension.

outdim(M)

Get the output dimension d, i.e. the dimension of the subspace.

projection(M)

Get the projection matrix (of size (d, n)). Each column of the projection matrix corresponds to an observation in the projected subspace.

neighbors(M)

The number of nearest neighbors used for approximating local coordinate structure.

eigvals(M)

The eigenvalues of the alignment matrix.

Data Transformation

One can use the transform method to perform LLE over a given dataset.

transform(LLE, X; ...)

Perform LLE over the data given in a matrix X. Each column of X is an observation.

This method returns an instance of LLE.

Keyword arguments:

name description default
k The number of nearest neighbors for determining local coordinate structure. 12
d Output dimension. 2

Example:

using ManifoldLearning

# suppose X is a data matrix, with each observation in a column
# fit an LLE model, then extract the embedded coordinates
M = transform(LLE, X; k = 12, d = 2)
Y = projection(M)

References

[1] Roweis, S. & Saul, L. “Nonlinear dimensionality reduction by locally linear embedding”, Science 290:2323 (2000). DOI:10.1126/science.290.5500.2323

Hessian Eigenmaps

The Hessian Eigenmaps (Hessian LLE, HLLE) method adapts the weights in LLE to minimize the Hessian operator. Like LLE, it requires careful setting of the nearest neighbor parameter. Its main advantage is that it is designed to recover manifolds from non-convex data sets [1].
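
The core difference from LLE is the local step: in each neighborhood an estimator of the Hessian is built from local tangent coordinates and their quadratic products, following Donoho and Grimes [1]. A sketch of that local step for a single neighborhood (illustrative only; it omits the normalization and numerical safeguards a real implementation needs):

using LinearAlgebra

# Z holds a point's k neighbors (columns), centered at their mean
function hessian_estimator(Z::AbstractMatrix, d::Int)
    k = size(Z, 2)
    U = svd(Z).V[:, 1:d]                         # local tangent coordinates (k-by-d)
    q = [U[:, i] .* U[:, j] for i in 1:d for j in i:d]   # quadratic terms
    Xi = hcat(ones(k), U, q...)                  # [constant, linear, quadratic] basis
    Q = Matrix(qr(Xi).Q)                         # orthonormalize the columns
    return Q[:, end-(d*(d+1))÷2+1:end]'          # rows estimate the local Hessian
end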

This package defines an HLLE type to represent Hessian LLE results, and provides a set of methods to access its properties.

Properties

Let M be an instance of HLLE, n be the number of observations, and d be the output dimension.

outdim(M)

Get the output dimension d, i.e. the dimension of the subspace.

projection(M)

Get the projection matrix (of size (d, n)). Each column of the projection matrix corresponds to an observation in the projected subspace.

neighbors(M)

The number of nearest neighbors used for approximating local coordinate structure.

eigvals(M)

The eigenvalues of the alignment matrix.

Data Transformation

One can use the transform method to perform HLLE over a given dataset.

transform(HLLE, X; ...)

Perform HLLE over the data given in a matrix X. Each column of X is an observation.

This method returns an instance of HLLE.

Keyword arguments:

name description default
k The number of nearest neighbors for determining local coordinate structure. 12
d Output dimension. 2

Example:

using ManifoldLearning

# suppose X is a data matrix, with each observation in a column
# fit an HLLE model, then extract the embedded coordinates
M = transform(HLLE, X; k = 12, d = 2)
Y = projection(M)

References

[1] Donoho, D. and Grimes, C. “Hessian eigenmaps: Locally linear embedding techniques for high-dimensional data”, Proc. Natl. Acad. Sci. USA. 2003 May 13; 100(10): 5591–5596. DOI:10.1073/pnas.1031596100

Local Tangent Space Alignment

Local tangent space alignment (LTSA) is a method for manifold learning, which can efficiently learn a nonlinear embedding into low-dimensional coordinates from high-dimensional data, and can also reconstruct high-dimensional coordinates from embedding coordinates [1].
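
In outline, LTSA computes local tangent coordinates for each neighborhood via an SVD and accumulates the corresponding alignment matrix, whose bottom nontrivial eigenvectors give the global embedding. A minimal dense sketch (ltsa_sketch is our name, not the package's implementation):

using LinearAlgebra

function ltsa_sketch(X::AbstractMatrix, k::Int, d::Int)
    n = size(X, 2)
    D2 = [sum(abs2, X[:, i] - X[:, j]) for i in 1:n, j in 1:n]
    B = zeros(n, n)
    for i in 1:n
        nbrs = sortperm(D2[:, i])[1:k]           # neighborhood (includes the point itself)
        Z = X[:, nbrs] .- sum(X[:, nbrs]; dims = 2) / k  # center the neighborhood
        V = svd(Z).V[:, 1:d]                     # local tangent coordinates
        G = hcat(ones(k) / sqrt(k), V)
        B[nbrs, nbrs] += I - G * G'              # accumulate the alignment matrix
    end
    F = eigen(Symmetric(B))                      # eigenvalues in ascending order
    return permutedims(F.vectors[:, 2:d+1])      # skip the constant eigenvector
end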

This package defines an LTSA type to represent LTSA results, and provides a set of methods to access its properties.

Properties

Let M be an instance of LTSA, n be the number of observations, and d be the output dimension.

outdim(M)

Get the output dimension d, i.e. the dimension of the subspace.

projection(M)

Get the projection matrix (of size (d, n)). Each column of the projection matrix corresponds to an observation in the projected subspace.

neighbors(M)

The number of nearest neighbors used for approximating local coordinate structure.

eigvals(M)

The eigenvalues of the alignment matrix.

Data Transformation

One can use the transform method to perform LTSA over a given dataset.

transform(LTSA, X; ...)

Perform LTSA over the data given in a matrix X. Each column of X is an observation.

This method returns an instance of LTSA.

Keyword arguments:

name description default
k The number of nearest neighbors for determining local coordinate structure. 12
d Output dimension. 2

Example:

using ManifoldLearning

# suppose X is a data matrix, with each observation in a column
# fit an LTSA model, then extract the embedded coordinates
M = transform(LTSA, X; k = 12, d = 2)
Y = projection(M)

References

[1] Zhang, Zhenyue; Hongyuan Zha. “Principal Manifolds and Nonlinear Dimension Reduction via Local Tangent Space Alignment”. SIAM Journal on Scientific Computing 26 (1): 313–338, 2004. DOI:10.1137/s1064827502419154

Notes:

All methods implemented in this package adopt the column-major convention of JuliaStats: in a data matrix, each column corresponds to a sample/observation, while each row corresponds to a feature (variable or attribute).
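
For example:

using ManifoldLearning

X = randn(5, 200)                    # 5 features (rows), 200 observations (columns)
M = transform(LLE, X; k = 12, d = 2)
size(projection(M))                  # (2, 200): columns again correspond to observations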