Welcome to libTLDA’s documentation!

libTLDA is a library of transfer learners and domain-adaptive classifiers. It is designed to give researchers and engineers an opportunity to quickly test a number of classifiers.

More information will be added.

Contents:

Installation

libTLDA is registered on PyPI and can be installed through:

pip install libtlda

Virtual environment

pip takes care of all dependencies, but installing these dependencies can interfere with your current Python environment. To ensure a clean install, it is recommended to set up a virtual environment using conda or virtualenv. To ease this setup, an environment file is provided, which can be used as follows:

conda env create -f environment.yml
source activate libtlda
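
If you prefer virtualenv over conda, an equivalent setup might look like this (a sketch, assuming Python and virtualenv are already installed):

virtualenv venv
source venv/bin/activate
pip install libtlda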

For more information on getting started, see the Examples section.

Classifiers

This page lists the classifier classes, including all member functions.

Importance-Weighted Classifier

class libtlda.iw.ImportanceWeightedClassifier(loss_function='logistic', l2_regularization=None, weight_estimator='lr', smoothing=True, clip_max_value=-1, kernel_type='rbf', bandwidth=1)

Class of importance-weighted classifiers.

Methods contain different importance-weight estimators and different loss functions.

Examples

>>> X = np.random.randn(10, 2)
>>> y = np.hstack((-np.ones((5,)), np.ones((5,))))
>>> Z = np.random.randn(10, 2)
>>> clf = ImportanceWeightedClassifier()
>>> clf.fit(X, y, Z)
>>> u_pred = clf.predict(Z)

Methods

fit(X, y, Z) Fit/train an importance-weighted classifier.
get_params() Get classifier parameters.
get_weights() Get estimated importance weights.
is_trained() Check whether classifier is trained.
iwe_kernel_densities(X, Z) Estimate importance weights based on kernel density estimation.
iwe_kernel_mean_matching(X, Z) Estimate importance weights based on kernel mean matching.
iwe_logistic_discrimination(X, Z) Estimate importance weights based on logistic regression.
iwe_nearest_neighbours(X, Z) Estimate importance weights based on nearest-neighbours.
iwe_ratio_gaussians(X, Z) Estimate importance weights based on a ratio of Gaussian distributions.
predict(Z) Make predictions on new dataset.
predict_proba(Z) Compute posterior probabilities on new dataset.
fit(X, y, Z)

Fit/train an importance-weighted classifier.

Parameters:
X : array

source data (N samples by D features)

y : array

source labels (N samples by 1)

Z : array

target data (M samples by D features)

Returns:
None
get_params()

Get classifier parameters.

get_weights()

Get estimated importance weights.

is_trained()

Check whether classifier is trained.

iwe_kernel_densities(X, Z)

Estimate importance weights based on kernel density estimation.

Parameters:
X : array

source data (N samples by D features)

Z : array

target data (M samples by D features)

Returns:
array

importance weights (N samples by 1)
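
For intuition, these weights are the ratio of target to source density evaluated at each source sample. A minimal sketch of that idea using scipy (an illustration of the estimator's principle, not libTLDA's internal code):

import numpy as np
from scipy.stats import gaussian_kde

# Fit kernel density estimates to source (X) and target (Z) data;
# gaussian_kde expects data with shape (D features, N samples).
pX = gaussian_kde(X.T)
pZ = gaussian_kde(Z.T)

# Importance weight per source sample: target density over source density.
iw = pZ(X.T) / pX(X.T)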

iwe_kernel_mean_matching(X, Z)

Estimate importance weights based on kernel mean matching.

Parameters:
X : array

source data (N samples by D features)

Z : array

target data (M samples by D features)

Returns:
iw : array

importance weights (N samples by 1)

iwe_logistic_discrimination(X, Z)

Estimate importance weights based on logistic regression.

Parameters:
X : array

source data (N samples by D features)

Z : array

target data (M samples by D features)

Returns:
array

importance weights (N samples by 1)

iwe_nearest_neighbours(X, Z)

Estimate importance weights based on nearest-neighbours.

Parameters:
X : array

source data (N samples by D features)

Z : array

target data (M samples by D features)

Returns:
iw : array

importance weights (N samples by 1)

iwe_ratio_gaussians(X, Z)

Estimate importance weights based on a ratio of Gaussian distributions.

Parameters:
X : array

source data (N samples by D features)

Z : array

target data (M samples by D features)

Returns:
iw : array

importance weights (N samples by 1)

predict(Z)

Make predictions on new dataset.

Parameters:
Z : array

new data set (M samples by D features)

Returns:
preds : array

label predictions (M samples by 1)

predict_proba(Z)

Compute posterior probabilities on new dataset.

Parameters:
Z : array

new data set (M samples by D features)

Returns:
probs : array

posterior probabilities (M samples by K)

Transfer Component Classifier

class libtlda.tca.TransferComponentClassifier(loss_function='logistic', l2_regularization=1.0, mu=1.0, num_components=1, kernel_type='rbf', bandwidth=1.0, order=2.0)

Class of classifiers based on Transfer Component Analysis.

Methods contain component analysis and general utilities.
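
Examples

A minimal usage sketch, mirroring the examples for the other classifiers (assuming default constructor arguments and numpy imported as np):

>>> X = np.random.randn(10, 2)
>>> y = np.hstack((-np.ones((5,)), np.ones((5,))))
>>> Z = np.random.randn(10, 2)
>>> clf = TransferComponentClassifier()
>>> clf.fit(X, y, Z)
>>> preds = clf.predict(Z)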

Methods

fit(X, y, Z) Fit/train a classifier on data mapped onto transfer components.
get_params() Get classifier parameters.
is_trained() Check whether classifier is trained.
kernel(X, Z[, type, order, bandwidth]) Compute kernel for given data set.
predict(Z) Make predictions on new dataset.
transfer_component_analysis(X, Z) Transfer Component Analysis.
fit(X, y, Z)

Fit/train a classifier on data mapped onto transfer components.

Parameters:
X : array

source data (N samples by D features)

y : array

source labels (N samples by 1)

Z : array

target data (M samples by D features)

Returns:
None
get_params()

Get classifier parameters.

is_trained()

Check whether classifier is trained.

kernel(X, Z, type='rbf', order=2, bandwidth=1.0)

Compute kernel for given data set.

Parameters:
X : array

data set (N samples by D features)

Z : array

data set (M samples by D features)

type : str

type of kernel, options: ‘linear’, ‘polynomial’, ‘rbf’, ‘sigmoid’ (def: ‘rbf’)

order : float

degree for the polynomial kernel (def: 2.0)

bandwidth : float

kernel bandwidth (def: 1.0)

Returns:
array

kernel matrix (N+M by N+M)

predict(Z)

Make predictions on new dataset.

Parameters:
Z : array

new data set (M samples by D features)

Returns:
preds : array

label predictions (M samples by 1)

transfer_component_analysis(X, Z)

Transfer Component Analysis.

Parameters:
X : array

source data set (N samples by D features)

Z : array

target data set (M samples by D features)

Returns:
C : array

transfer components (D features by num_components)

K : array

source and target data kernel distances
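
For intuition, TCA finds components that minimize the maximum mean discrepancy (MMD) between source and target in a kernel space while preserving data variance (Pan et al., 2011). A conceptual numpy sketch of the underlying eigenproblem, assuming an RBF kernel (names and dimensions are illustrative, not libTLDA's internals):

import numpy as np
from scipy.spatial.distance import cdist

def tca_sketch(X, Z, mu=1.0, num_components=2, bandwidth=1.0):
    N, M = X.shape[0], Z.shape[0]
    XZ = np.vstack((X, Z))
    # RBF kernel on the joint data set
    K = np.exp(-cdist(XZ, XZ, 'sqeuclidean') / (2 * bandwidth ** 2))
    # MMD coefficient matrix
    e = np.vstack((np.ones((N, 1)) / N, -np.ones((M, 1)) / M))
    L = e @ e.T
    # Centering matrix
    H = np.eye(N + M) - np.ones((N + M, N + M)) / (N + M)
    # Leading eigenvectors of (K L K + mu I)^{-1} K H K are the components
    J = np.linalg.solve(K @ L @ K + mu * np.eye(N + M), K @ H @ K)
    vals, vecs = np.linalg.eig(J)
    idx = np.argsort(np.real(vals))[::-1][:num_components]
    return np.real(vecs[:, idx]), K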

Subspace Aligned Classifier

class libtlda.suba.SemiSubspaceAlignedClassifier(loss_function='logistic', l2_regularization=None, subspace_dim=1)

Class of classifiers based on semi-supervised Subspace Alignment.

Methods contain the alignment itself, classifiers and general utilities.

Examples

>>> X = np.random.randn(10, 2)
>>> y = np.hstack((-np.ones((5,)), np.ones((5,))))
>>> Z = np.random.randn(10, 2)
>>> clf = SemiSubspaceAlignedClassifier()
>>> clf.fit(X, y, Z)
>>> preds = clf.predict(Z)

Methods

align_classes(X, Y, Z, u, CX, CZ, V) Project each class separately.
find_medioid(X, Y) Find point with minimal distance to all other points.
fit(X, Y, Z[, u]) Fit/train a classifier on data mapped onto transfer components.
get_params() Get classifier parameters.
is_pos_def(A) Check for positive definiteness.
predict(Z[, zscore]) Make predictions on new dataset.
predict_proba(Z[, zscore, signed_classes]) Compute posterior probabilities on new dataset.
reg_cov(X) Regularize covariance matrix until non-singular.
score(Z, U[, zscore]) Compute classification error on test set.
semi_subspace_alignment(X, Y, Z, u[, …]) Compute subspace and alignment matrix, for each class.
align_classes(X, Y, Z, u, CX, CZ, V)

Project each class separately.

Parameters:
X : array

source data set (N samples x D features)

Y : array

source labels (N samples x 1)

Z : array

target data set (M samples x D features)

u : array

target labels (m samples x 2)

CX : array

source principal components (K classes x D features x d subspaces)

CZ : array

target principal components (K classes x D features x d subspaces)

V : array

transformation matrix (K classes x d subspaces x d subspaces)

Returns:
X : array

transformed X (N samples x d features)

Z : array

transformed Z (M samples x d features)

find_medioid(X, Y)

Find point with minimal distance to all other points.

Parameters:
X : array

data set, with N samples x D features.

Y : array

labels to select for which samples to compute distances.

Returns:
x : array

medioid

ix : int

index of medioid

fit(X, Y, Z, u=None)

Fit/train a classifier on data mapped onto transfer components.

Parameters:
X : array

source data (N samples x D features).

Y : array

source labels (N samples x 1).

Z : array

target data (M samples x D features).

u : array

target labels, first column corresponds to index of Z and second column corresponds to actual label (number of labels x 2).

Returns:
None
get_params()

Get classifier parameters.

is_pos_def(A)

Check for positive definiteness.

Parameters:
A : array

square symmetric matrix.

Returns:
bool

whether matrix is positive-definite. Warning: returns False for arrays containing inf or NaN.

predict(Z, zscore=False)

Make predictions on new dataset.

Parameters:
Z : array

new data set (M samples x D features)

zscore : boolean

whether to transform the data using z-scoring (def: false)

Returns:
preds : array

label predictions (M samples x 1)

predict_proba(Z, zscore=False, signed_classes=False)

Compute posterior probabilities on new dataset.

Parameters:
Z : array

new data set (M samples x D features)

zscore : boolean

whether to transform the data using z-scoring (def: false)

Returns:
probs : array

posterior probabilities (M samples x 1)

reg_cov(X)

Regularize covariance matrix until non-singular.

Parameters:
X : array

square symmetric covariance matrix.

Returns:
C : array

regularized covariance matrix.

score(Z, U, zscore=False)

Compute classification error on test set.

Parameters:
Z : array

new data set (M samples x D features)

U : array

true labels (M samples x 1)

zscore : boolean

whether to transform the data using z-scoring (def: false)

Returns:
err : float

classification error on test set

semi_subspace_alignment(X, Y, Z, u, subspace_dim=1)

Compute subspace and alignment matrix, for each class.

Parameters:
X : array

source data set (N samples x D features)

Y : array

source labels (N samples x 1)

Z : array

target data set (M samples x D features)

u : array

target labels, first column is index in Z, second column is label (m samples x 2)

subspace_dim : int

Dimensionality of subspace to retain (def: 1)

Returns:
V : array

transformation matrix (K classes x D features x D features)

CX : array

source principal component coefficients

CZ : array

target principal component coefficients

class libtlda.suba.SubspaceAlignedClassifier(loss_function='logistic', l2_regularization=None, subspace_dim=1)

Class of classifiers based on Subspace Alignment.

Methods contain the alignment itself, classifiers and general utilities.

Examples

>>> X = np.random.randn(10, 2)
>>> y = np.hstack((-np.ones((5,)), np.ones((5,))))
>>> Z = np.random.randn(10, 2)
>>> clf = SubspaceAlignedClassifier()
>>> clf.fit(X, y, Z)
>>> preds = clf.predict(Z)

Methods

align_data(X, Z, CX, CZ, V) Align data to components and transform source.
fit(X, Y, Z) Fit/train a classifier on data mapped onto transfer components.
get_params() Get classifier parameters.
is_pos_def(A) Check for positive definiteness.
predict(Z[, zscore]) Make predictions on new dataset.
predict_proba(Z[, zscore, signed_classes]) Compute posterior probabilities on new dataset.
reg_cov(X) Regularize covariance matrix until non-singular.
score(Z, U[, zscore]) Compute classification error on test set.
subspace_alignment(X, Z[, subspace_dim]) Compute subspace and alignment matrix.
zca_whiten(X) Perform ZCA whitening (aka Mahalanobis whitening).
align_data(X, Z, CX, CZ, V)

Align data to components and transform source.

Parameters:
X : array

source data set (N samples x D features)

Z : array

target data set (M samples x D features)

CX : array

source principal components (D features x d subspaces)

CZ : array

target principal component (D features x d subspaces)

V : array

transformation matrix (d subspaces x d subspaces)

Returns:
X : array

transformed source data (N samples x d subspaces)

Z : array

projected target data (M samples x d subspaces)

fit(X, Y, Z)

Fit/train a classifier on data mapped onto transfer components.

Parameters:
X : array

source data (N samples x D features).

Y : array

source labels (N samples x 1).

Z : array

target data (M samples x D features).

Returns:
None
get_params()

Get classifier parameters.

is_pos_def(A)

Check for positive definiteness.

predict(Z, zscore=False)

Make predictions on new dataset.

Parameters:
Z : array

new data set (M samples x D features)

zscore : boolean

whether to transform the data using z-scoring (def: false)

Returns:
preds : array

label predictions (M samples x 1)

predict_proba(Z, zscore=False, signed_classes=False)

Compute posterior probabilities on new dataset.

Parameters:
Z : array

new data set (M samples x D features)

zscore : boolean

whether to transform the data using z-scoring (def: false)

Returns:
probs : array

posterior probabilities (M samples x 1)

reg_cov(X)

Regularize covariance matrix until non-singular.

Parameters:
X : array

square symmetric covariance matrix.

Returns:
C : array

regularized covariance matrix.

score(Z, U, zscore=False)

Compute classification error on test set.

Parameters:
Z : array

new data set (M samples x D features)

U : array

true labels (M samples x 1)

zscore : boolean

whether to transform the data using z-scoring (def: false)

Returns:
err : float

classification error on test set

subspace_alignment(X, Z, subspace_dim=1)

Compute subspace and alignment matrix.

Parameters:
X : array

source data set (N samples x D features)

Z : array

target data set (M samples x D features)

subspace_dim : int

Dimensionality of subspace to retain (def: 1)

Returns:
V : array

transformation matrix (D features x D features)

CX : array

source principal component coefficients

CZ : array

target principal component coefficients
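
For intuition, subspace alignment computes a PCA basis per domain and a linear map that rotates the source basis onto the target basis (Fernando et al., 2013). A minimal sketch of that idea using scikit-learn (an illustration, not libTLDA's exact implementation; d denotes the subspace dimensionality):

import numpy as np
from sklearn.decomposition import PCA

d = 2
CX = PCA(n_components=d).fit(X).components_.T   # source basis (D x d)
CZ = PCA(n_components=d).fit(Z).components_.T   # target basis (D x d)
V = CX.T @ CZ                                   # alignment matrix (d x d)
X_aligned = X @ CX @ V                          # source mapped towards target subspace
Z_projected = Z @ CZ                            # target projected onto its own basis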

zca_whiten(X)

Perform ZCA whitening (aka Mahalanobis whitening).

Parameters:
X : array (M samples x D features)

data matrix.

Returns:
X : array (M samples x D features)

whitened data.
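
For reference, ZCA whitening maps the data through the inverse square root of its covariance matrix, decorrelating the features while staying close to the original coordinates. A minimal numpy sketch (an illustration, not libTLDA's internal code; assumes a non-singular covariance):

import numpy as np

# Center the data and compute its covariance
Xc = X - X.mean(axis=0)
S = np.cov(Xc, rowvar=False)

# Inverse matrix square root via eigendecomposition
vals, vecs = np.linalg.eigh(S)
S_inv_sqrt = vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T

# Whitened data
X_white = Xc @ S_inv_sqrt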

Robust Bias-Aware Classifier

class libtlda.rba.RobustBiasAwareClassifier(l2=0.0, order='first', gamma=1.0, tau=1e-05, learning_rate=1.0, rate_decay='linear', max_iter=100, clip=1000, verbose=True)

Class of robust bias-aware classifiers.

Reference: Liu & Ziebart (2014). Robust Classification under Sample Selection Bias. NIPS.

Methods contain training and prediction functions.
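
Examples

A minimal usage sketch, mirroring the examples for the other classifiers (assuming default constructor arguments and numpy imported as np):

>>> X = np.random.randn(10, 2)
>>> y = np.hstack((-np.ones((5,)), np.ones((5,))))
>>> Z = np.random.randn(10, 2)
>>> clf = RobustBiasAwareClassifier()
>>> clf.fit(X, y, Z)
>>> preds = clf.predict(Z)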

Methods

feature_stats(X, y[, order]) Compute first-order moment feature statistics.
fit(X, y, Z) Fit/train a robust bias-aware classifier.
get_params() Get classifier parameters.
is_trained() Check whether classifier is trained.
iwe_kernel_densities(X, Z[, clip]) Estimate importance weights based on kernel density estimation.
learning_rate_t(t) Compute current learning rate after decay.
posterior(psi) Class-posterior estimation.
predict(Z) Make predictions on new dataset.
predict_proba(Z) Compute posteriors on new dataset.
psi(X, theta, w[, K]) Compute psi function.
feature_stats(X, y, order='first')

Compute first-order moment feature statistics.

Parameters:
X : array

dataset (N samples by D features)

y : array

label vector (N samples by 1)

Returns:
array

array containing label vector, feature moments and 1-augmentation.

fit(X, y, Z)

Fit/train a robust bias-aware classifier.

Parameters:
X : array

source data (N samples by D features)

y : array

source labels (N samples by 1)

Z : array

target data (M samples by D features)

Returns:
None
get_params()

Get classifier parameters.

is_trained()

Check whether classifier is trained.

iwe_kernel_densities(X, Z, clip=1000)

Estimate importance weights based on kernel density estimation.

Parameters:
X : array

source data (N samples by D features)

Z : array

target data (M samples by D features)

clip : float

maximum allowed value for individual weights (def: 1000)

Returns:
array

importance weights (N samples by 1)

learning_rate_t(t)

Compute current learning rate after decay.

Parameters:
t : int

current iteration

Returns:
alpha : float

current learning rate

posterior(psi)

Class-posterior estimation.

Parameters:
psi : array

weighted data-classifier output (N samples by K classes)

Returns:
pyx : array

class-posterior estimation (N samples by K classes)

predict(Z)

Make predictions on new dataset.

Parameters:
Z : array

new data set (M samples by D features)

Returns:
preds : array

label predictions (M samples by 1)

predict_proba(Z)

Compute posteriors on new dataset.

Parameters:
Z : array

new data set (M samples by D features)

Returns:
probs : array

class posteriors (M samples by K classes)

psi(X, theta, w, K=2)

Compute psi function.

Parameters:
X : array

data set (N samples by D features)

theta : array

classifier parameters (D features by 1)

w : array

importance-weights (N samples by 1)

K : int

number of classes (def: 2)

Returns:
psi : array

array with psi function values (N samples by K classes)

Structural Correspondence Learner

class libtlda.scl.StructuralCorrespondenceClassifier(loss='logistic', l2=1.0, num_pivots=1, num_components=1)

Class of classifiers based on structural correspondence learning.

Methods consist of a feature-augmentation procedure and a Huber loss function plus its gradient.
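
Examples

A minimal usage sketch, mirroring the examples for the other classifiers (assuming default constructor arguments and numpy imported as np):

>>> X = np.random.randn(10, 2)
>>> y = np.hstack((-np.ones((5,)), np.ones((5,))))
>>> Z = np.random.randn(10, 2)
>>> clf = StructuralCorrespondenceClassifier()
>>> clf.fit(X, y, Z)
>>> preds = clf.predict(Z)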

Methods

Huber_grad(theta, X, y[, l2]) Huber gradient computation.
Huber_loss(theta, X, y[, l2]) Huber loss function.
augment_features(X, Z[, l2]) Find a set of pivot features, train predictors and extract bases.
fit(X, y, Z) Fit/train a structural correspondence classifier.
get_params() Get classifier parameters.
is_trained() Check whether classifier is trained.
predict(Z) Make predictions on new dataset.
Huber_grad(theta, X, y, l2=0.0)

Huber gradient computation.

Reference: Ando & Zhang (2005a). A framework for learning predictive structures from multiple tasks and unlabeled data. JMLR.

Parameters:
theta : array

classifier parameters (D features by 1)

X : array

data (N samples by D features)

y : array

label vector (N samples by 1)

l2 : float

l2-regularization parameter (def: 0.0)

Returns:
array

Gradient with respect to classifier parameters

Huber_loss(theta, X, y, l2=0.0)

Huber loss function.

Reference: Ando & Zhang (2005a). A framework for learning predictive structures from multiple tasks and unlabeled data. JMLR.

Parameters:
theta : array

classifier parameters (D features by 1)

X : array

data (N samples by D features)

y : array

label vector (N samples by 1)

l2 : float

l2-regularization parameter (def: 0.0)

Returns:
array

Objective function value.
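
For reference, one common form of the modified Huber loss from Ando & Zhang (2005a) is quadratic near the margin and linear for badly misclassified points. A minimal numpy sketch under that assumption (an illustration, not necessarily libTLDA's exact formulation):

import numpy as np

def modified_huber_loss(theta, X, y, l2=0.0):
    """Modified Huber loss for labels y in {-1, +1}."""
    m = y * (X @ theta)                            # margins
    loss = np.where(m >= -1,
                    np.maximum(0.0, 1.0 - m) ** 2,  # quadratic region
                    -4.0 * m)                       # linear region
    return np.sum(loss) + l2 * np.sum(theta ** 2)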

augment_features(X, Z, l2=0.0)

Find a set of pivot features, train predictors and extract bases.

Parameters:
X : array

source data array (N samples by D features)

Z : array

target data array (M samples by D features)

l2 : float

regularization parameter value (def: 0.0)

Returns:
None
fit(X, y, Z)

Fit/train a structural correspondence classifier.

Parameters:
X : array

source data (N samples by D features)

y : array

source labels (N samples by 1)

Z : array

target data (M samples by D features)

Returns:
None
get_params()

Get classifier parameters.

is_trained()

Check whether classifier is trained.

predict(Z)

Make predictions on new dataset.

Parameters:
Z : array

new data set (M samples by D features)

Returns:
preds : array

label predictions (M samples by 1)

Feature-Level Domain-Adaptive Classifier

class libtlda.flda.FeatureLevelDomainAdaptiveClassifier(l2=0.0, loss='logistic', transfer_model='blankout', max_iter=100, tolerance=1e-05, verbose=True)

Class of feature-level domain-adaptive classifiers.

Reference: Kouw, Krijthe, Loog & Van der Maaten (2016). Feature-level domain adaptation. JMLR.

Methods contain training and prediction functions.
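
Examples

A minimal usage sketch, mirroring the examples for the other classifiers (assuming default constructor arguments and numpy imported as np):

>>> X = np.random.randn(10, 2)
>>> y = np.hstack((-np.ones((5,)), np.ones((5,))))
>>> Z = np.random.randn(10, 2)
>>> clf = FeatureLevelDomainAdaptiveClassifier()
>>> clf.fit(X, y, Z)
>>> preds = clf.predict(Z)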

Methods

fit(X, y, Z) Fit/train a feature-level domain-adaptive classifier.
flda_log_grad(theta, X, y, E, V[, l2]) Compute gradient with respect to theta for flda-log.
flda_log_loss(theta, X, y, E, V[, l2]) Compute average loss for flda-log.
get_params() Get classifier parameters.
is_trained() Check whether classifier is trained.
mle_transfer_dist(X, Z[, dist]) Maximum likelihood estimation of transfer model parameters.
moments_transfer_model(X, iota[, dist]) Moments of the transfer model.
predict(Z_) Make predictions on new dataset.
fit(X, y, Z)

Fit/train a feature-level domain-adaptive classifier.

Parameters:
X : array

source data (N samples by D features)

y : array

source labels (N samples by 1)

Z : array

target data (M samples by D features)

Returns:
None
flda_log_grad(theta, X, y, E, V, l2=0.0)

Compute gradient with respect to theta for flda-log.

Parameters:
theta : array

classifier parameters (D features by 1)

X : array

source data set (N samples by D features)

y : array

label vector (N samples by 1)

E : array

expected value with respect to transfer model (N samples by D features)

V : array

variance with respect to transfer model (D features by D features by N samples)

l2 : float

regularization parameter (def: 0.0)

Returns:
dR : array

Value of gradient.

flda_log_loss(theta, X, y, E, V, l2=0.0)

Compute average loss for flda-log.

Parameters:
theta : array

classifier parameters (D features by 1)

X : array

source data set (N samples by D features)

y : array

label vector (N samples by 1)

E : array

expected value with respect to transfer model (N samples by D features)

V : array

variance with respect to transfer model (D features by D features by N samples)

l2 : float

regularization parameter (def: 0.0)

Returns:
dL : array

Value of loss function.

get_params()

Get classifier parameters.

is_trained()

Check whether classifier is trained.

mle_transfer_dist(X, Z, dist='blankout')

Maximum likelihood estimation of transfer model parameters.

Parameters:
X : array

source data set (N samples by D features)

Z : array

target data set (M samples by D features)

dist : str

distribution of transfer model, options are ‘blankout’ or ‘dropout’ (def: ‘blankout’)

Returns:
iota : array

estimated transfer model parameters (D features by 1)

moments_transfer_model(X, iota, dist='blankout')

Moments of the transfer model.

Parameters:
X : array

data set (N samples by D features)

iota : array

transfer model parameters (D features by 1)

dist : str

transfer model, options are ‘dropout’ and ‘blankout’ (def: ‘blankout’)

Returns:
E : array

expected value of transfer model (N samples by D features)

V : array

variance of transfer model (D features by D features by N samples)

predict(Z_)

Make predictions on new dataset.

Parameters:
Z : array

new data set (M samples by D features)

Returns:
preds : array

label predictions (M samples by 1)

Target Contrastive Pessimistic Classifier

class libtlda.tcpr.TargetContrastivePessimisticClassifier(loss='lda', l2=1.0, max_iter=500, tolerance=1e-12, learning_rate=1.0, rate_decay='linear', verbosity=0)

Classifiers based on Target Contrastive Pessimistic Risk minimization.

Methods contain models, risk functions, parameter estimation, etc.
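
Examples

A minimal usage sketch, mirroring the examples for the other classifiers (assuming default constructor arguments and numpy imported as np):

>>> X = np.random.randn(10, 2)
>>> y = np.hstack((-np.ones((5,)), np.ones((5,))))
>>> Z = np.random.randn(10, 2)
>>> clf = TargetContrastivePessimisticClassifier()
>>> clf.fit(X, y, Z)
>>> preds = clf.predict(Z)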

Methods

add_intercept(X) Add 1’s to data as last features.
combine_class_covariances(Si, pi) Linear combination of class covariance matrices.
discriminant_parameters(X, Y) Estimate parameters of Gaussian distribution for discriminant analysis.
error_rate(preds, u_) Compute classification error rate.
fit(X, y, Z) Fit/train a target contrastive pessimistic classifier.
get_params() Return classifier parameters.
learning_rate_t(t) Compute current learning rate after decay.
neg_log_likelihood(X, theta) Compute negative log-likelihood under Gaussian distributions.
predict(Z_) Make predictions on new dataset.
predict_proba(Z) Compute posteriors on new dataset.
project_simplex(v[, z]) Project vector onto simplex using sorting.
remove_intercept(X) Remove 1’s from data as last features.
risk(Z, theta, q) Compute target contrastive pessimistic risk.
tcpr_da(X, y, Z) Target Contrastive Pessimistic Risk - discriminant analysis.
add_intercept(X)

Add 1’s to data as last features.

combine_class_covariances(Si, pi)

Linear combination of class covariance matrices.

Parameters:
Si : array

Covariance matrix (D features by D features by K classes)

pi : array

class proportions (1 by K classes)

Returns:
Si : array

Combined covariance matrix (D by D)

discriminant_parameters(X, Y)

Estimate parameters of Gaussian distribution for discriminant analysis.

Parameters:
X : array

data array (N samples by D features)

Y : array

label array (N samples by K classes)

Returns:
pi : array

class proportions (1 by K classes)

mu : array

class means (K classes by D features)

Si : array

class covariances (D features by D features by K classes)

error_rate(preds, u_)

Compute classification error rate.

fit(X, y, Z)

Fit/train a target contrastive pessimistic classifier.

Parameters:
X : array

source data (N samples by D features)

y : array

source labels (N samples by 1)

Z : array

target data (M samples by D features)

Returns:
None
get_params()

Return classifier parameters.

learning_rate_t(t)

Compute current learning rate after decay.

Parameters:
t : int

current iteration

Returns:
alpha : float

current learning rate

neg_log_likelihood(X, theta)

Compute negative log-likelihood under Gaussian distributions.

Parameters:
X : array

data (N samples by D features)

theta : tuple(array, array, array)

tuple containing class proportions ‘pi’, class means ‘mu’, and class-covariances ‘Si’

Returns:
L : array

loss (N samples by K classes)

predict(Z_)

Make predictions on new dataset.

Parameters:
Z : array

new data set (M samples by D features)

Returns:
preds : array

label predictions (M samples by 1)

predict_proba(Z)

Compute posteriors on new dataset.

Parameters:
Z : array

new data set (M samples by D features)

Returns:
preds : array

label predictions (M samples by 1)

project_simplex(v, z=1.0)

Project vector onto simplex using sorting.

Reference: Duchi, Shalev-Shwartz, Singer & Chandra (2008). Efficient Projections onto the L1-Ball for Learning in High Dimensions. ICML.

Parameters:
v : array

vector to be projected (n dimensions by 1)

z : float

constant (def: 1.0)

Returns:
w : array

projected vector (n dimensions by 1)
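
For reference, a minimal numpy sketch of the sorting-based projection from that reference (an illustration, not necessarily libTLDA's exact implementation):

import numpy as np

def project_simplex_sketch(v, z=1.0):
    """Project vector v onto the simplex {w : w >= 0, sum(w) = z}."""
    u = np.sort(v)[::-1]                     # sort descending
    css = np.cumsum(u)
    # Largest index where the running threshold is still positive
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > (css - z))[0][-1]
    theta = (css[rho] - z) / (rho + 1.0)
    return np.maximum(v - theta, 0.0)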

remove_intercept(X)

Remove 1’s from data as last features.

risk(Z, theta, q)

Compute target contrastive pessimistic risk.

Parameters:
Z : array

target samples (M samples by D features)

theta : array

classifier parameters (D features by K classes)

q : array

soft labels (M samples by K classes)

Returns:
float

Value of risk function.

tcpr_da(X, y, Z)

Target Contrastive Pessimistic Risk - discriminant analysis.

Parameters:
X : array

source data (N samples by D features)

y : array

source labels (N samples by 1)

Z : array

target data (M samples by D features)

Returns:
theta : array

classifier parameters (D features by K classes)

Examples

In the /demos folder, there are a number of example scripts that show potential use cases on synthetic data.

Here we walk through a simple version.

First, we import a number of modules and generate a synthetic data set:

import numpy as np
import numpy.random as rnd

from sklearn.linear_model import LogisticRegression
from libtlda.iw import ImportanceWeightedClassifier

"""Generate synthetic data set"""

# Sample sizes
N = 100
M = 50

# Class properties
labels = [0, 1]
nK = 2

# Dimensionality
D = 2

# Source domain
pi_S = [1./2, 1./2]
si_S = 1.0
N0 = int(np.round(N*pi_S[0]))
N1 = N - N0
X0 = rnd.randn(N0, D)*si_S + (-2, 0)
X1 = rnd.randn(N1, D)*si_S + (+2, 0)
X = np.concatenate((X0, X1), axis=0)
y = np.concatenate((labels[0]*np.ones((N0,), dtype='int'),
                    labels[1]*np.ones((N1,), dtype='int')), axis=0)

# Target domain
pi_T = [1./2, 1./2]
si_T = 3.0
M0 = int(np.round(M*pi_T[0]))
M1 = M - M0
Z0 = rnd.randn(M0, D)*si_T + (-2, -2)
Z1 = rnd.randn(M1, D)*si_T + (+2, +2)
Z = np.concatenate((Z0, Z1), axis=0)
u = np.concatenate((labels[0]*np.ones((M0,), dtype='int'),
                    labels[1]*np.ones((M1,), dtype='int')), axis=0)

Next, we create an adaptive classifier:

# Call an importance-weighted classifier, using the documented constructor arguments
clf = ImportanceWeightedClassifier(weight_estimator='lr', loss_function='logistic')

# Train classifier
clf.fit(X, y, Z)

# Make predictions
pred_adapt = clf.predict(Z)

We can compare this with a non-adaptive classifier:

# Train a naive logistic regressor
lr = LogisticRegression().fit(X, y)

# Make predictions
pred_naive = lr.predict(Z)

And compute error rates:

# Compute error rates
print('Error naive: ' + str(np.mean(pred_naive != u, axis=0)))
print('Error adapt: ' + str(np.mean(pred_adapt != u, axis=0)))
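
After fitting, the estimated importance weights themselves can be inspected through get_weights(), documented under the classifier's methods above. For instance (a small addition to the walkthrough):

# Inspect the estimated importance weights (one per source sample)
iw = clf.get_weights()
print('Weight range: [' + str(np.min(iw)) + ', ' + str(np.max(iw)) + ']')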

Contact

Any comments, questions, or general feedback can be submitted to the repository’s issues tracker.

If you would like to see a particular classifier / model / algorithm / technique / method for transfer learning or domain adaptation, please submit this to the issues tracker as well.