Welcome to TNNF’s documentation!

Contents:

How to Install and Run

Python installation

We use Python 2.7.4+. Some (or all) of these libraries may work with Python 3, but we haven’t tested that case.

  • Linux
Python is included in most recent distributions, so you can easily run it and check its version.

    In console:

    user$ python -V
    

If you don’t have Python, you can easily install it on most distributions by executing:

    • For Ubuntu/Debian:

      user$ sudo apt-get install python
      
    • For Red Hat/CentOS:

      user$ sudo yum install python
      

or the equivalent for your UNIX-based OS.

  • Windows:
Choose and download the correct version for your system from here.
    Then run and proceed with installation.

To check Python’s version, run:

    C:\Users\user> python -V
    

Dependencies

These libraries should be installed before running the code:

  • Numpy
  • Theano
  • PIL
  • cPickle
  • matplotlib
  • h5py

Each of these libraries has its own dependencies (which may overlap). All dependencies should be satisfied.

The best approach is to follow the installation instructions for each particular library on its official site:

Caution

Installing Theano on a Windows machine may be difficult, buggy and frustrating.

An obvious advantage of Theano is that properly written code can run on a CPU or a GPU without editing.

There is nothing special you have to do to run it on a CPU: all dependencies are installed during Theano installation. For the GPU, there are a few constraints.

Theano’s GPU limitations:

  • nVidia GPU only
  • supports only CUDA (OpenCL support is expected in the near future)
  • While computing on a GPU, use float32 only (float64 is requested, but not implemented yet)

To perform GPU calculations you need:

  • Supported nVidia GPU
  • Appropriate nVidia driver installed
  • Appropriate CUDA toolkit installed (both 5.5 and 5.0 are supported)
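Once the driver and CUDA toolkit are in place, you can quickly check which device and float type Theano is configured to use. This is plain Theano configuration, nothing TNNF-specific:

import theano

#Print the device and float type Theano will use
print 'Device:', theano.config.device   # expect 'gpu' (or 'gpu0', 'gpu1', ...) for GPU runs
print 'floatX:', theano.config.floatX   # should be 'float32' for GPU computation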

TNNF installation

There is no “installation” needed.

TNNF is a set of Python modules.

The only thing you need:

  1. Download latest version from GitHub

  2. Import it using standard Python syntax:

    import fTheanoNNclassCORE
    
  3. You are ready to go!

Examples

Simplest example

I’ll try to describe how to use TNNF to solve the simplest artificial task I was able to design.

Let’s create a simple classifier that assigns class 0 or 1 to its input data. And since we don’t want random classification, let’s train our classifier on previously (manually) labeled data.

Data

Let’s imagine we have some amount of manually labeled data (to label the data we will use a particular “decision rule”). This data will be used to train our classifier.

In this particular task I assume we have to classify pairs of numbers \((X_1, X_2)\), where \(X_1, X_2 \in [0, 1)\), and assign each pair class 0 or 1. To label this data manually, let’s use the following rule:

\[\begin{split}Label = \begin{cases} 0 & \text{if: } -X_1 + 1 < X_2\\ 1 & \text{if: } -X_1 + 1 \geq X_2 \end{cases}\end{split}\]
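In numpy this labeling is a one-liner; the snippet below is taken from the full script listing at the end of this example:

import numpy as np

dataSize = 1000
dataFeatures = 2

#Create random data [0, 1)
data = np.random.rand(dataFeatures, dataSize)

#Create array for labels and fill it using the decision rule above
labels = np.zeros((1, dataSize))
labels[0, :] = -data[0, :] + 1 > data[1, :]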

Here is how our labeled data looks on a 2D plot:

Labeled raw data
Neural Network

As mentioned, let’s use TNNF to solve this task.

What we have:

  • Randomly generated data for training
  • Randomly generated data for cross-validation
  • Labels

What we want to achieve:

  • Given the generated data, predict the correct labels using TNNF

To do this, we will use a one-layer Neural Network with the simplest (Linear) activation function and the following architecture:

  • Input layer: 2 neurons
  • Output layer: 1 neuron

To define the predicted label, we will round the activation of the output layer:

\[\begin{split}Output = \begin{cases} 0 & activation \leq 0.5\\ 1 & activation > 0.5 \end{cases}\end{split}\]
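In numpy terms this is just rounding of the output activation; this is exactly how accuracy is computed in the full listing below, where NN.out holds the output layer’s activations:

#Round output activation to get the predicted class (0 or 1)
predicted_labels = np.round(NN.out)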

Here is how the network itself looks in code.

Set general options for the whole network, such as:

  • Learning step
  • Size of mini-batch we will use (in this case we use full batch)
  • Size of cross-validation set

#Common options for whole NN
options = OptionsStore(learnStep=0.05,
                       minibatch_size=dataSize,
                       CV_size=dataSize)

Describe the per-layer architecture. Set:

  • Number of neurons on layer’s input
  • Number of neurons on layer’s output
  • Specify activation function to use

#Layer architecture
#We will use only one layer with 2 neurons on input and 1 on output
L1 = LayerNN(size_in=dataFeatures,
             size_out=1,
             activation=FunctionModel.Linear)

Put everything together and create the network object:


#Compile NN
NN = TheanoNNclass(options, (L1, ))
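The remaining steps, taken from the full listing below, are to compile the train/predict functions and to run the training loop:

#Compile NN train and predict functions
NN.trainCompile()
NN.predictCompile()

#Main cycle: train NN using given data and labels
for i in xrange(1000):
    NN.trainCalc(data, labels, iteration=1, debug=True)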

How it performs

Instead of talking about how well it performs, it’s better to show it. To visualise the predicted boundary, we will use the network’s weights and bias learned during training.

As we use the Linear activation function, we can write a formula to calculate the output activation:

\[activation = w_1X_1 + w_2X_2 + b\]

Using our previous formula to define predicted label, we can rewrite this as follows:

\[w_1X_1 + w_2X_2 + b \gtreqless 0.5\]

To be able to draw the predicted decision boundary, we need to express \(X_2\):

\[X_2 \gtreqless -\frac{w_1}{w_2}X_1 + \frac{0.5 - b}{w_2}\]

Here is how it looks in code:


        #Recalculate predicted decision boundary
        y_predicted = -w1 * x / w2 + (0.5 - b) / w2

If you enable drawing and set a reasonable drawEveryStep, you will get a number of pictures that show how the Neural Net evolves. To spare you the routine, we have already done this with a small drawEveryStep and prepared a gif to show the progress:

_images/How_it_performs.gif

As you can see, with each iteration the Neural Net moves closer to the original decision boundary. That is what we wanted to show you: how a Neural Network adapts and learns on real data.

Here is one more informative graph. We can track how the accuracy and the network error evolve vs iterations.

_images/NN_error.png

It almost reaches 100% accuracy!

Full script listing:

import numpy as np
import unittest
import os
import sys
sys.path.append('../../../CORE')
from fTheanoNNclassCORE import OptionsStore, TheanoNNclass, NNsupport, FunctionModel, LayerNN
from fDataWorkerCORE import csvDataLoader
from fGraphBuilderCORE import Graph
from matplotlib.pylab import plot, title, xlabel, ylabel, legend, grid, margins, savefig, close, xlim, ylim

dataSize = 1000
dataFeatures = 2

#Supposed boundary line
#Where x - [0] row, f(x) - [1] row in data
# if f(x) < -x + 1 - then label = 1
#                    else label = 0

#Create random data [0, 1)
data = np.random.rand(dataFeatures, dataSize)

#Create random cross-validation [0, 1)
CV = np.random.rand(dataFeatures, dataSize)

#Create array for labels
labels = np.zeros((1, dataSize))

#Create array for cross-validation labels
CV_labels = np.zeros((1, dataSize))

#Calc labels (and cross-validation) based on supposed boundary decision function analytically
labels[0, :] = -data[0, :] + 1 > data[1, :]
CV_labels[0, :] = -CV[0, :] + 1 > CV[1, :]

#Let's draw our data and decision boundary we use to divide it
#Calc decision boundary
x = np.arange(0, 1.0, 0.02)
y = -x + 1

#Draw labeled data
#Uncomment next part if you want to visualise input data
'''
ones = np.array([[], []])
zeros = np.array([[], []])
for i in xrange(dataSize):
    if labels[0, i] == 0:
        zeros = np.append(zeros, data[:, i].reshape(-1, 1), axis=1)
    else:
        ones = np.append(ones, data[:, i].reshape(-1, 1), axis=1)

xlim(0, 1)
ylim(0, 1)

plot(ones[0, :], ones[1, :], 'gx', markeredgewidth=1, label='Ones')
plot(zeros[0, :], zeros[1, :], 'bv', markeredgewidth=1, label='Zeros')
plot(x, y, 'r.', markeredgewidth=0, label='Decision boundary')
xlabel('X_1')
ylabel('X_2')

legend(loc='upper right', fontsize=10, numpoints=3, shadow=True, fancybox=True)
grid()
savefig('data_visualisation.png', dpi=120)
close()
'''

#Check average
avgLabel = np.average(labels)

print data.shape
print 'Data:\n', data[:, :6]
print labels.shape
print 'Average label (should be ~ 0.5):', avgLabel

#For now we have labeled and checked data.
#Let's try to train NN to see, how it solves such task
#NN part

#Common options for whole NN
options = OptionsStore(learnStep=0.05,
                       minibatch_size=dataSize,
                       CV_size=dataSize)

#Layer architecture
#We will use only one layer with 2 neurons on input and 1 on output
L1 = LayerNN(size_in=dataFeatures,
             size_out=1,
             activation=FunctionModel.Linear)

#Compile NN
NN = TheanoNNclass(options, (L1, ))

#Compile NN train
NN.trainCompile()

#Compile NN predict
NN.predictCompile()

#Number of iterations (cycles of training)
iterations = 1000

#Set step to draw
drawEveryStep = 100
draw = False

#CV error accumulator (for network estimation)
cv_err = []

#Accuracy accumulator (for network estimation)
acc = []

#Main cycle
for i in xrange(iterations):

    #Train NN using given data and labels
    NN.trainCalc(data, labels, iteration=1, debug=True)

    #Draw data, original and current decision boundary every drawEveryStep's step
    if draw and i % drawEveryStep == 0:

        #Get current coefficient for our network
        b = NN.varWeights[0]['b'].get_value()[0]
        w1 = NN.varWeights[0]['w'].get_value()[0][0]
        w2 = NN.varWeights[0]['w'].get_value()[0][1]

        #Recalculate predicted decision boundary
        y_predicted = -w1 * x / w2 + (0.5 - b) / w2

        #Limit our plot by axes
        xlim(0, 1)
        ylim(0, 1)

        #Plot predicted decision boundary
        plot(x, y_predicted, 'g.', markeredgewidth=0, label='Predicted boundary')

        #Plot original decision boundary
        plot(x, y, 'r.', markeredgewidth=0, label='Original boundary')

        #Plot raw data
        plot(data[0, :], data[1, :], 'b,', label='data')

        #Draw legend
        legend(loc='upper right', fontsize=10, numpoints=3, shadow=True, fancybox=True)

        #Enable grid
        grid()

        #Save plot to file
        savefig('data' + str(i) + '.png', dpi=120)

        #Close and clear current plot
        close()

        #Estimate Neural Network error (square error, "distance" between real and predicted value) on cross-validation
        cv_err.append(NNsupport.crossV(CV_labels, CV, NN))

        #Estimate network's accuracy
        accuracy = np.mean(CV_labels == np.round(NN.out))
        acc.append(accuracy)

        #Draw how error and accuracy evolves vs iterations
        Graph.Builder(name='NN_error.png', error=NN.errorArray, cv=cv_err, accuracy=acc, legend_on=True)

Simple AutoEncoder

Here I’ll describe the second step in understanding what TNNF can do for you. Using MNIST data, let’s create a simple (one hidden layer) sparse AutoEncoder (AE), train it and visualise its weights. This will give an understanding of how to compose slightly more complicated networks in TNNF (two layers) and how a sparse AE works.

Data

To train our AE, let’s use the widely known MNIST data set.

You can download it in .csv format here: http://www.pjreddie.com/projects/mnist-in-csv

As the .csv format is comparatively slow and increases training time significantly, we recommend using HDF to store the data on your drive.

You need to install the h5py package to start using it. Here is more about it.

Note

It’s better to download their package instead of trying to install it through pip.

Once you’ve installed h5py, you need to convert the .csv files to HDF. I’ve prepared a short Python script for you to do this.

You can download it directly from GitHub. You need to run it against both .csv files: the train and the test set.

Here is what you need to change there:


srcFolder = './src/'
csv_type = '.csv'
hdf_type = '.hdf5'
target_csv = 'mnist_test'
target_hdf = 'mnist_test'

Where:
  • srcFolder - directory where the .csv file is located
  • csv_type - .csv file extension. You usually don’t need to change it.
  • hdf_type - HDF file extension. You usually don’t need to change it.
  • target_csv - .csv file name, without extension.
  • target_hdf - HDF file name. The file will be created and filled with data from the .csv file.

After that you will have two .hdf5 files with the train and test MNIST data.
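If you prefer not to download the script, the conversion itself takes only a few lines. Below is a minimal sketch of what it might look like; the 'hdfDataSet' dataset name is an assumption based on what the example script reads later, so check it against the actual conversion script:

import numpy as np
import h5py

srcFolder = './src/'
target_csv = 'mnist_test'
target_hdf = 'mnist_test'

#Read the whole .csv (label in the first column, pixels in the rest)
data = np.loadtxt(srcFolder + target_csv + '.csv', delimiter=',')

#Create the HDF5 file and store the data under the 'hdfDataSet' key
f = h5py.File(srcFolder + target_hdf + '.hdf5', 'w')
f.create_dataset('hdfDataSet', data=data)
f.close()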

Mini batches

Using all the data we have at once is usually computationally ineffective. Moreover, in modern tasks it is often impossible: we can gather millions (or billions) of examples for training, so we face memory issues just storing the data. To address this, people usually use mini-batches. In other words, instead of training the algorithm on all the available data, we train it on a number of examples randomly sampled from the whole data set. We call such a set of examples a miniBatch. Be aware that examples should be sampled randomly from the whole data set into the miniBatch.

To speed up training, we will use miniBatches. To sample each batch we will use a small function included in the example code:



#Sampling minibatch from whole data set
def getBatch(d, n, i):
    idx = random.sample(i, n)
    idx = np.sort(idx)
    #Remove labels and read data
    res = d[idx, 1:]
    #Normalise output from 0..255 to 0..1
    return res.T / 255.0

Where:
  • d - HDF data set
  • n - number of examples to sample into the miniBatch
  • i - list of valid data’s indexes to sample from
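For example, the full script below calls it once per training iteration (DATA, batchSize and validDataIndexes are defined earlier in that script):

#Sample a miniBatch of batchSize examples from the whole HDF data set
X = getBatch(DATA, batchSize, validDataIndexes)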
Theory

A little theory about the sparsity constraint and what it looks like.

When we introduce a sparsity constraint to our network, we expect the average activation of a particular neuron over one batch of data to be equal to the value we specified. To achieve this, we add a penalty to the network error each time the real average activation differs from the specified one. To estimate how big this penalty should be, we use the Kullback-Leibler divergence. Here is how it looks:

\[penalty=\sum\limits_{j=1}^{hiddenUnits}\rho\log\frac{\rho}{\hat\rho_{j}}+(1-\rho)\log\frac{1-\rho}{1-\hat\rho_{j}}\]

To make it more intuitive, let’s visualise its graph.

On the following graph we assume we want an average activation of 0.2. Then, for the range of average activations on the X-axis, we can observe the penalty value on the Y-axis.

_images/KL.png

As you can see, the penalty is very close to zero only when our average activation is close to 0.2, and vice versa: the further the average activation gets from the desired 0.2, the bigger the penalty becomes.
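If you want to get a feel for the numbers, here is a tiny numpy sketch of this penalty. The helper name is ours, not part of TNNF; rho is the desired average activation and rho_hat holds the measured average activations of the hidden units:

import numpy as np

def kl_penalty(rho, rho_hat):
    #Kullback-Leibler penalty summed over hidden units
    return np.sum(rho * np.log(rho / rho_hat) +
                  (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))

#Three hidden units with average activations 0.2, 0.5 and 0.05 vs the desired 0.2
print kl_penalty(0.2, np.array([0.2, 0.5, 0.05]))  # first unit adds ~0, the other two add a positive penalty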

We will restrict the hidden layer’s average activation to force the neural network to detect common patterns in the input data. In effect, we restrict the number of neurons available to describe the data, which means that every neuron that activates represents a pattern (feature) in the input data.

Neural Network

We’re done with theory, let’s program the AE using TNNF!

So our AE consists of two layers and has Input, Hidden and Output abstractions. Don’t be confused by the three abstractions: the Input abstraction is always there and doesn’t take a separate layer.

As input we have 28x28 images, which gives us 28 x 28 = 784 values for each example on the input. As we train an AE, a neural network that tries to replicate on its output what it gets on its input, we also have 784 values on the output.

Let’s use 196 neurons on the hidden layer with an average activation of 0.1 (the architecture is taken from the UFLDL tutorial).

Everything above gives us the following network architecture:

  • Input: 784 neurons
  • Hidden: 196 neurons (with average activation restricted to 0.1)
  • Output: 784 neurons

To represent this in code we need two layers. First layer:

L1 = LayerNN(size_in=inputSize,
             size_out=numberOfFeatures,
             sparsity=0.1,
             beta=3,
             weightDecay=3e-3,
             activation=FunctionModel.Sigmoid)

Where:
  • size_in - 784 (number of neurons on the input)
  • size_out - 196 (number of neurons on hidden layer)
  • sparsity - 0.1, average hidden layer activation we want to achieve
  • beta - sparsity coefficient (how strong we amplify sparsity penalty)
  • weightDecay - weight penalty (just to improve training time, you can try to run without it)
  • activation - activation function to use (calculating sparsity constraint makes sense only for Sigmoid)

Second layer:

L2 = LayerNN(size_in=numberOfFeatures,
             size_out=inputSize,
             weightDecay=3e-3,
             activation=FunctionModel.Sigmoid)

Where:
  • size_in - 196 (number of neurons on hidden layer, we use first layer output as input to the second)
  • size_out - 784 (number of neurons on output equal to the input)
  • weightDecay - weight penalty (again, just to improve training time, you can try to run without it)
  • activation - activation function to use (calculating sparsity constraint makes sense only for Sigmoid)

By default, the script will do the following:

  • assume that HDF data files are located in: script directory/Data/src (you can change this in the script)
  • train set file name: mnist_train.hdf5
  • test set file name: mnist_test.hdf5
  • perform 10,000 iterations
  • use miniBatch size of 200 examples per batch
  • check cross validation (CV) error every 500 iterations and print it
  • print the square error and the square error + all penalties each iteration
  • save error vs iterations graph every 500 iterations in the script’s directory
  • save hidden layer weights visualisation in the script’s directory

On a laptop GPU (GeForce GT 635M) it takes about 3 minutes to finish. On a laptop CPU (i5-3210M @ 2.50GHz, with numpy 1.9.1 and OpenBLAS recompiled to use all cores) it takes about 8 minutes.

Parameters we use to run tests:

THEANO_FLAGS=mode=FAST_RUN,floatX=float32,device=gpu python ScriptName.py
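To run the same script on the CPU, only the device flag has to change (this is standard Theano configuration, not something TNNF-specific):

THEANO_FLAGS=mode=FAST_RUN,floatX=float32,device=cpu python ScriptName.py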

How AE error evolves with iterations:

_images/AE_error.png

To inspire you and add some clarity about what an AE is and how its weights evolve over time, we prepared a cool gif visualisation. It is ~5 MB, so it takes some time to load.

_images/multi_weights.gif

Final features (hidden layer’s weights) in the original size:

_images/multi_weights.png

Full SimpleAutoEncoder script listing:

import numpy as np
import unittest
import os
import sys
import h5py
import random
sys.path.append('../../../CORE')
from fTheanoNNclassCORE import OptionsStore, TheanoNNclass, NNsupport, FunctionModel, LayerNN
from fGraphBuilderCORE import Graph


#Sampling minibatch from whole data set
def getBatch(d, n, i):
    idx = random.sample(i, n)
    idx = np.sort(idx)
    #Remove labels and read data
    res = d[idx, 1:]
    #Normalise output from 0..255 to 0..1
    return res.T / 255.0

#We use HDF because of its speed and convenience
#Set data's file names and path
srcFolder = './Data/src/'
hdf_type = '.hdf5'
train_set = 'mnist_train'
test_set = 'mnist_test'

#Read train data
f_train = h5py.File(srcFolder + train_set + hdf_type, 'r+')
DATA = f_train['/hdfDataSet']

#Read CV data
f_test = h5py.File(srcFolder + test_set + hdf_type, 'r+')
DATA_CV = f_test['/hdfDataSet']

#Print out shapes of loaded data
print 'Data shape:', DATA.shape, '\n', 'CV shape:', DATA_CV.shape

#Extract some useful data
dataSize = DATA.shape[0]
cvSize = DATA_CV.shape[0]
validDataIndexes = xrange(0, dataSize)

# As we have all data we need for Auto Encoder (AE),
# let's create an appropriate NN

# Set few additional options
numberOfFeatures = 196
batchSize = 200
inputSize = DATA.shape[1] - 1   # Subtract label
iterations = 10000
checkCvEvery = 500

#Common options for whole NN
options = OptionsStore(learnStep=0.005,
                       rmsProp=0.9,
                       mmsmin=1e-20,
                       minibatch_size=batchSize,
                       CV_size=cvSize)

#First layer
L1 = LayerNN(size_in=inputSize,
             size_out=numberOfFeatures,
             sparsity=0.1,
             beta=3,
             weightDecay=3e-3,
             activation=FunctionModel.Sigmoid)

#Second layer
L2 = LayerNN(size_in=numberOfFeatures,
             size_out=inputSize,
             weightDecay=3e-3,
             activation=FunctionModel.Sigmoid)

#Compile all together
AE = TheanoNNclass(options, (L1, L2))

#Compile train and predict functions
AE.trainCompile()
AE.predictCompile()

#Normalise CV data from 0..255 to 0..1
X_CV = DATA_CV[:, 1:].T / 255.0

#Empty list to collect CV errors
CV_error = []

#Let's iterate!
for i in xrange(iterations):

    #Get miniBatch of defined size from whole DATA
    X = getBatch(DATA, batchSize, validDataIndexes)

    #Train on given data/labels
    AE.trainCalc(X, X, iteration=1, debug=True, errorCollect=True)

    #Check CV error every *checkCvEvery* cycles
    if i % checkCvEvery == 0:

        #Calculate CV error given CV data/labels
        CV_error.append(NNsupport.crossV(X_CV, X_CV, AE))

        #Print current CV error
        print 'CV error: ', CV_error[-1]

        #Draw how error and accuracy evolves vs iterations
        Graph.Builder(name='AE_error.png', error=AE.errorArray, cv=CV_error, legend_on=True)

        #Visualise hidden layers weights
        AE.weightsVisualizer(folder='.', size=(28, 28))

fTheanoNNclassCORE module

Inheritance diagram of fTheanoNNclassCORE

class fTheanoNNclassCORE.FunctionModel

Bases: object

Collection of activation functions we support.

static LReLU(z, *args)

Leaky Rectified Linear Unit. More info.

\[\begin{split}activation= \begin{cases} z, & \text{if } z > 0\\ 0.01z, & \text{otherwise} \end{cases}\end{split}\]
Parameters:
  • z – array, raw activation, usually calculated as \(z=W^Tx\) that will be used for further calculation.
  • args – array, additional parameters. Currently used only for MaxOut.
Returns:

array, same size as z.

static Linear(z, *args)

Linear activation function. Returns input as-is.

Parameters:
  • z – array, raw activation, usually calculated as \(z=W^Tx\) that will be used for further calculation.
  • args – array, additional parameters. Currently used only for MaxOut.
Returns:

array, same size as z.

static MaxOut(z, *args)

MaxOut activation function.

Original paper: http://arxiv.org/pdf/1302.4389.pdf

\[activation_{i} = \max_{j \in [1,k]} z_{i,j}\]
Parameters:
  • z – array, raw activation, usually calculated as \(z=W^Tx\) that will be used for further calculation.
  • args – [0] is the number of “lines” used to emulate MaxOut in each pool. For example, if it is 3, each output neuron is emulated as 3 linear functions.
Returns:

array, with its size along the [0] axis reduced by a factor of “lines”.

static ReLU(z, *args)

Rectified Linear Unit. More info.

\[activation = \max(0, z)\]
Parameters:
  • z – array, raw activation, usually calculated as \(z=W^Tx\) that will be used for further calculation.
  • args – array, additional parameters. Currently used only for MaxOut.
Returns:

array, same size as z.

static Sigmoid(z, *args)

Standard sigmoid.

\[activation = \frac{1}{1 + e^{-z}}\]
Parameters:
  • z – array, raw activation, usually calculated as \(z=W^Tx\) that will be used for further calculation.
  • args – array, additional parameters. Currently used only for MaxOut.
Returns:

array, same size as z.

static SoftMax(z, *args)

SoftMax activation function with several updates to avoid NaN.

It is useful for output layer only.

\[\begin{split}activation = \frac{1}{\sum\limits_{j=1}^k e^{\theta_j^T x^{(i)}}} \left[\begin{aligned} e&^{\theta_1^Tx^{(i)}}\\ e&^{\theta_2^Tx^{(i)}}\\ &\vdots\\ e&^{\theta_k^Tx^{(i)}} \end{aligned}\right]\end{split}\]

Some hacks for fixing float32 GPU problem:

a = T.clip(a, float(np.finfo(np.float32).tiny), float(np.finfo(np.float32).max))
a = T.clip(a, 1e-20, 1e20)

Proof links:

Links about possible approaches to fix NaN:

Parameters:
  • z – array, raw activation, usually calculated as \(z=W^Tx\) that will be used for further calculation.
  • args – array, additional parameters. Currently used only for MaxOut.
Returns:

array, same size as z.

static Tanh(z, *args)

Hyperbolic tangent.

\[activation = \frac{e^z - e^{-z}}{e^z + e^{-z}}\]
Parameters:
  • z – array, raw activation, usually calculated as \(z=W^Tx\) that will be used for further calculation.
  • args – array, additional parameters. Currently used only for MaxOut.
Returns:

array, same size as z.

class fTheanoNNclassCORE.LayerCNN(kernel_shape=None, stride=1, pooling=False, pooling_shape=None, optimized=False, validConvolution=True, **kwargs)

Bases: fTheanoNNclassCORE.LayerNN

Layer class that extends the standard LayerNN class and implements a CNN (convolutional, not fully connected) type of network. Among NN algorithms, it is the most useful type of network for image processing. It implements the most brain-like way to process data (it applies the same weights to small parts of the input data). Read more about convolution here:

Parameters:
  • kernel_shape – tuple of int, kernels to use (number of kernels, colors, shape X, shape Y)
  • stride – int, step between windows in pixels
  • pooling – boolean, whether to use pooling after convolution or not
  • pooling_shape – int, pooling window’s shape. Stride will be the same, so only standard non-overlapping pooling is available.
  • optimized – boolean, whether to use the highly optimized version or not. If True, it can run only on a GPU.
  • validConvolution – whether to use valid (convolve only fully overlapping parts) or full (convolve partially overlapping parts) convolution.
  • kwargs – other parameters are inherited from LayerNN.__init__()

Note

If optimized = True, there are a number of restrictions you have to take into account:

  • The number of channels must be even, or less than or equal to 3. If you want to compute the gradient, it should be divisible by 4. Valid numbers of input channels are: 1, 2, 3, 4, 8, 12, 16, ...
  • Filters must be square.
  • The number of filters must be a multiple of 16.
  • All minibatch sizes are supported, but the best performance is achieved when the minibatch size is a multiple of 128.
  • Only “valid” convolutions are supported. If you want to perform a “full” convolution, you will need to use zero-padding (more on this later).
  • Only works on the GPU; you cannot run your Theano code on the CPU if you use it. It is still possible, though, to train on the GPU and then load & run the model on the CPU.
compileActivation(net, layerNum)
compileDropout(net, R)

Compile necessary mask matrix for dropout regularisation.

Parameters:
  • net – TheanoNNclass object
  • R – Theano’s RandomGenerator object
compilePredictActivation(net, layerNum)
compileSparsity(net, layerNum, num)

In general, this method does the same as LayerNN.compileSparsity(). It can be used in combination with Sigmoid() only.

For CNN, though, it was slightly modified to be able to calculate average activations from the bc01 format.

Note

bc01 - mean: batch x color x size_X x size_Y

Parameters:
  • net – TheanoNNclass object
  • layerNum – int, layer’s index
  • num – batch size
compileWeight(net, layerNum)

Allocates weights to be used as shared variables in Theano. It is not possible to use MaxOut as the activation function yet. If you experience training issues, try changing the initial random values.

Parameters:
  • net – TheanoNNclass object
  • layerNum – layer’s index.
class fTheanoNNclassCORE.LayerNN(size_in=1, size_out=1, activation=<function Sigmoid>, weightDecay=False, sparsity=False, beta=False, dropout=False, dropConnect=False, pool_size=False)

Bases: object

Basic layer class. By default - standard NeuralNet fully-connected network.

Parameters:
  • size_in – int, number of neurons on input
  • size_out – int, number neurons on out
  • activation – FunctionModel, activation function to use
  • weightDecay – float or False, weight decay regularization and its coefficient
  • sparsity – float or False, sparsity constraint. Makes sense only with the Sigmoid activation function
  • beta – float, sparse weight coefficient
  • dropout – float or False, dropout regularisation with defined coefficient
  • dropConnect – TBD
  • pool_size – int, should be specified only for the MaxOut activation function. Number of lines to emulate each neuron.
Returns:

layer object.

Printer()

Prints layer properties.

compileActivation(net, layerNum)

Compile layer’s activation taking into account dropout and specified activation function. Used during network’s training to calculate activations.

Parameters:
  • net – TheanoNNclass object
  • layerNum – int, layer’s index
Returns:

compileDropout(net, R)

Compile necessary mask matrix for dropout regularisation.

Parameters:
  • net – TheanoNNclass object
  • R – Theano’s RandomGenerator object
Returns:

compilePredictActivation(net, layerNum)

Compile layer’s activation taking into account dropout and specified activation function. Used to calculate predictions without training.

Parameters:
  • net – TheanoNNclass object
  • layerNum – int, layer’s index
Returns:

compileSparsity(net, layerNum, num)

Compile necessary sparsity constraint calculations.

Average activation of hidden unit j (averaged over the training set):

\[\hat{\rho} = \frac{1}{m}\sum\limits_{i=1}^{m}\left[a_j(x^{(i)})\right]\]

Then penalty (using Kullback-Leibler):

\[penalty = \sum\limits_{j=1}^{hiddenUnits}\rho\log\frac{\rho}{\hat\rho_{j}} + (1 - \rho)\log\frac{1 - \rho}{1 - \hat\rho_{j}}\]

where \(\rho\) is the sparsity parameter, i.e. the level of average activation we want to achieve.

Parameters:
  • net – TheanoNNclass object
  • layerNum – int, layer’s index
  • num – batch size
Returns:

compileWeight(net, layerNum)

Allocates weights to be used as shared variable in Theano

Parameters:
  • net – TheanoNNclass object
  • layerNum – layer’s index.
Returns:

compileWeightDecayPenalty(net, layerNum)

Adds a weight decay penalty to the network’s error. Useful to decrease the absolute values of the weights.

\[\begin{split}penalty = \frac{1}{2}\sum W_{target\>layer}^2\end{split}\]
Parameters:
  • net – TheanoNNclass object
  • layerNum – int, layer’s index
Returns:

class fTheanoNNclassCORE.LayerRNN(blocks=1, peeholes=False, **kwargs)

Bases: fTheanoNNclassCORE.LayerNN

Layer class that extends the standard LayerNN class and implements an RNN (recurrent) type of network. In particular, here we implement an LSTM (Long Short-Term Memory).

You can find more info about it on:

Parameters:
  • blocks – int, number of blocks to create. Should be equal to size_out
  • peeholes – boolean, whether to use peephole connections or not (send Acc to the input gate).
  • kwargs – needed for compatibility.
Returns:

LayerRNN object

compileActivation(net, layerNum)

Compiles the layer’s activation, taking into account dropout. It is meaningful to use the Sigmoid activation function (or possibly the hyperbolic tangent).

Activation calculated as follows:

  1. \(Input\>activation\)
  2. \(Input\>gate\)
  3. \(Forget\>gate\)
  4. \(Output\>gate\)

Note

All of the above are calculated in one step

  1. \(Pi = {Input\>activation} \times {Input\>gate}\)
  2. \(Pr = {Forget\>gate} \times {Cell\>state}\)
  3. \({Cell\>state} = Pi + Pr\)
  4. \(output = {Output\>gate} \times {Cell\>state}\)
Parameters:
  • net – TheanoNNclass object
  • layerNum – layer’s index.
compilePredictActivation(net, layerNum)

Compile layer’s activation taking into account dropout and specified activation function. Used to calculate predictions without training. Uses separate Accumulator to store cell’s state independently from training.

Parameters:
  • net – TheanoNNclass object
  • layerNum – layer’s index
compileWeight(net, layerNum)

Allocates weights to be used as shared variable in Theano.

To initialise bias we use values advised here:

  • Input gate: 0.0
  • Forget gate: -2.0
  • Output gate: +2.0
Parameters:
  • net – TheanoNNclass object
  • layerNum – layer’s index.
class fTheanoNNclassCORE.NNsupport

Bases: object

static crossV(number, y, x, modelObj)
static errorG(errorArray, folder, plotsize=50)
class fTheanoNNclassCORE.OptionsStore(learnStep=0.01, rmsProp=False, mmsmin=1e-10, rProp=False, minibatch_size=1, CV_size=1)

Bases: object

Container for global network’s options.

Parameters:
  • learnStep – float, learn step to use in gradient descent or RMSprop
  • rmsProp – False or float, whether to use RMSprop or not. If yes - rate of RootMeanSquare. Usually 0.9
  • mmsmin – float, clip RootMeanSquare to avoid NaN. Default: 1e-10. Reasonable: down to 1e-20
  • rProp – False or float, use only for full batch. If yes - rate to increase next weight’s change.
  • minibatch_size – int, size of the batch you use. Can’t be changed after compiling.
  • CV_size – int, size of the cross-validation set. Can’t be changed after compiling.
Returns:

OptionStore object.

Printer()

Prints the current options to stdout. Useful for debugging.

Returns:nothing
class fTheanoNNclassCORE.TheanoNNclass(opt, architecture)

Bases: object

The most important class. Here everything combines together.

Using the info defined in OptionsStore and the Layers, compiles the Network object.

Parameters:
  • opt – OptionStore, general network’s options.
  • architecture – list, list of layers to build a network.
Returns:

TheanoNNclass object

getStatus()
modelLoader(folder)
modelSaver(folder)
paramGetter()
paramSetter(loaded)
predictCalc(X, debug=False)
predictCompile(layerNum=-1)
roll(a)
trainCalc(X, Y, iteration=10, debug=False, errorCollect=False)

Standard method to train network using labeled data.

Parameters:
  • X – array, data to train network on.
  • Y – array, data’s labels.
  • iteration – number of cycles you want network to train on current X
  • debug – boolean, whether to print some useful info.
  • errorCollect – boolean, whether to collect network’s error in self.errorArray field
Returns:

trainCalcExternal(model, X, Y)

Call this method in case you want to use external optimizer.

Parameters:
  • model – vector, new weights for network.
  • X – array, data to train on.
  • Y – array, labels for data.
Returns:

(float, array), network’s error and weight’s gradients

trainCompile()

Using OptionsStore and the Layers, creates shared variables and Theano’s function to train the network. Usually it should be called only once for each network.

Returns:links self.train with the appropriate Theano function
trainCompileExternal()

It is possible to use an external optimizer.

In case you decide to use something external, this method will prepare the necessary functions, so afterwards you will be able to use the returned gradients and load the updated weights.

Returns:
unroll()
weightsVisualizer(folder, size=(100, 100), color='L', second=False, name='weights')

fImageWorkerCORE module

class fImageWorkerCORE.Graphic

Bases: object

static PicSaver(img, folder, name, color='L')
class fImageWorkerCORE.MultiWeights(path='./', name='multi_weights.png')

Bases: object

add(p)
defineOptimalPicLocation(n)
draw()

fCutClassCORE module

class fCutClassCORE.CutClass(img=0, array=<Mock name='mock.array()' id='140444730637648'>, xwindow=25, ywindow=25)

Bases: object

REPORT = 'OK'
cutter(conv=False, step=25)
getter()
getter2()
status()
class fCutClassCORE.CutClassWindow(img=0, array=<Mock name='mock.array()' id='140444730637648'>, xy1=(25, 25), xy2=(29, 29))

Bases: fCutClassCORE.CutClass

cutter()
class fCutClassCORE.RandomCutClass(img=0, array=<Mock name='mock.array()' id='140444730637648'>, xwindow=25, ywindow=25)

Bases: fCutClassCORE.CutClass

cutter(winNum)
class fCutClassCORE.SaveClass(obj)

Bases: object

picleSaver(folderName)
pictureSaver(folderName)

fDataWorkerCORE module

class fDataWorkerCORE.BatchMixin

Bases: object

REPORT = 'OK'
miniBatch(number)
class fDataWorkerCORE.DataMutate

Bases: object

static Normalizer(ia)
static PCAW(X, epsilon=0.01)
static deNormalizer(ia, afterzero=20)
fDataWorkerCORE.binarizer(arr, base)
class fDataWorkerCORE.csvDataLoader(folder, startColumn=1, skip=1)

Bases: fDataWorkerCORE.BatchMixin

class fDataWorkerCORE.multiData(*objs)

Bases: fDataWorkerCORE.BatchMixin

fDataWorkerCORE.noisedSinGen(number=10000, phase=0)
fDataWorkerCORE.rollOut(l)
fDataWorkerCORE.sparser(arr, base)

fGraphBuilderCORE module

class fGraphBuilderCORE.Graph

Bases: object

static Builder(error=False, cv=False, accuracy=False, name='./GraphBuilder_default_name.png', legend_on=True, **kwargs)
