pyrenn: A recurrent neural network toolbox for python and matlab

Maintainer: Dennis Atabay, <dennis.atabay@tum.de>
Organization: Institute for Energy Economy and Application Technology, Technische Universität München
Version: 0.1
Date: Jun 30, 2018
Copyright: This documentation is licensed under a Creative Commons Attribution 4.0 International license.


Create a neural network

This chapter describes how to create a feed forward or recurrent neural network in pyrenn.

Feed forward neural networks in pyrenn

pyrenn allows creating multilayer perceptron (MLP) neural networks. An MLP is a feed forward artificial neural network that is defined by:

  • an input layer with \(R\) inputs
  • \(M-1\) hidden layers, where each layer \(m\) has an arbitrary number of neurons \(S^\text{m}\)
  • and an output layer with \(S^\text{M}\) neurons, which corresponds to the number of outputs of the neural network

The following notation allows a short description of an MLP, giving the number of inputs \(R\), the number of layers \(M\) and the number of neurons \(S^\text{m}\) in each layer \(m\):

\[nn = [R\; S^\text{1}\; S^\text{2}\; ...\; S^\text{M}]\]

In an MLP each layer has a full connection to the next layer, which means that each neuron output in layer \(m\) is an input to each neuron in layer \(m+1\) (and the inputs are connected to all neurons in the first layer).

Figure 1 shows an example of an MLP with two inputs, two hidden layers with two neurons each, and an output layer with one neuron (and therefore one output \(\hat{y}\)). The MLP can be described with

\[nn = [2\; 2\; 2\; 1]\]
_images/MLP2221_detailed.png

Figure 1: A \([2\; 2\; 2\; 1]\) MLP

The MLP in figure 1 consists of 5 neurons. The structure of a neuron is shown in figure 2.

_images/neuron.png

Figure 2: Structure of a neuron

Neurons are the constitutive units of an artificial neural network. A neuron can have several inputs \(p\), which are multiplied by the connection weights \(w\) and summed together with the bias weight \(b\) to give the summation output \(n\). The neuron output \(a\) is then calculated by applying the transfer function (also called activation function) \(f(n)\).

\[\begin{split}\begin{gather} n = \sum_{i} \left( w_{i} \: p_{i}\right) + b \\ a = f(n) \end{gather}\end{split}\]
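
For illustration, this computation can be written in a few lines of numpy (the input and weight values are made up):

import numpy as np

p = np.array([0.5, -1.0])   # example inputs (hypothetical values)
w = np.array([0.2, 0.4])    # connection weights
b = 0.1                     # bias weight
n = np.dot(w, p) + b        # summation output
a = np.tanh(n)              # neuron output (tanh transfer function, see note below)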

Note

Generally, different transfer functions could be used in a neural network. In pyrenn the transfer function is defined as:

  • the hyperbolic tangent \(a=tanh(n)\) for all neurons in hidden layers
  • the linear function \(y=a=n\) for all neurons in the output layer

The array-matrix illustration allows a clearer description of an MLP. Therefore the inputs \(p\), the neural network outputs \(\hat{y}\), the summation outputs \(n\), the layer outputs \(a\), the transfer functions \(f(n)\) and the bias weights \(b\) of one layer \(m\) are represented by the arrays \(\underline{p},\:\underline{\hat{y}},\:\underline{n}^m,\:\underline{a}^m,\:\underline{f}^m,\:\underline{b}^m\) (the upper index represents the layer \(m\) and the lower index the number of the neuron or input).

\[\begin{split}\underline{p} = \begin{bmatrix} p_1\\ ...\\ p_R\\ \end{bmatrix}\: \underline{\hat{y}}= \begin{bmatrix} {\hat{y}}_1\\ ...\\ {\hat{y}}_{S^M}\\ \end{bmatrix}\: \underline{n}^m= \begin{bmatrix} {n}^m_1\\ ...\\ {n}^m_{S^m}\\ \end{bmatrix}\: \underline{a}^m= \begin{bmatrix} {a}^m_1\\ ...\\ {a}^m_{S^m}\\ \end{bmatrix}\: \underline{f}^m= \begin{bmatrix} {f}^m_1\\ ...\\ {f}^m_{S^m}\\ \end{bmatrix}\: \underline{b}^m= \begin{bmatrix} {b}^m_1\\ ...\\ {b}^m_{S^m}\\ \end{bmatrix}\end{split}\]

The connection weights \(w\) are represented by the matrix \(\widetilde{IW}^{1,1}\) which contains the connection weights of the first layer and the matrices \(\widetilde{LW}^{m,l}\), which contain the weights that connect the outputs of layer \(l\) with layer \(m\). For the example in figure 1 the connection matrices are:

\[\begin{split}\widetilde{IW}^{1,1}= \begin{bmatrix} w^1_{1,1} & w^1_{1,2} \\ w^1_{2,1} & w^1_{2,2} \end{bmatrix}\; \widetilde{LW}^{2,1}= \begin{bmatrix} w^2_{1,1} & w^2_{1,2} \\ w^2_{2,1} & w^2_{2,2} \end{bmatrix}\; \widetilde{LW}^{3,2}= \begin{bmatrix} w^3_{1,1} & w^3_{1,2} \end{bmatrix}\end{split}\]
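
As a sketch, the weight matrices of the \([2\; 2\; 2\; 1]\) MLP of figure 1 could be set up with numpy like this (CreateNN initializes the weights with uniform random values between -0.5 and 0.5, see below; pyrenn's internal storage may differ):

import numpy as np

rng = np.random.default_rng()
IW_11 = rng.uniform(-0.5, 0.5, size=(2, 2))   # inputs -> layer 1
LW_21 = rng.uniform(-0.5, 0.5, size=(2, 2))   # layer 1 -> layer 2
LW_32 = rng.uniform(-0.5, 0.5, size=(1, 2))   # layer 2 -> output layer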

Figure 3 shows the array-matrix illustration of the MLP of figure 1:

_images/MLP2221_array-matrix.png

Figure 3: Array-matrix illustration of a MLP with two hidden layers

Recurrent neural networks in pyrenn

pyrenn also allows defining different topologies of recurrent neural networks, i.e. networks where connections between units form a directed cycle. In pyrenn this is implemented by connecting the output of a layer \(m\) with the input of previous layers \(<m\) or with its own layer input. Since an un-delayed cycle would lead to an infeasible system, a time delay has to be applied to the recurrent connections. This is done by so-called Tapped Delay Lines (TDL). A TDL contains delay operators \(z^{-d}\), which delay time-discrete signals by \(d\) time steps. To describe the delay elements in a TDL, the sets \({DI}^{m,l}\) and \({DL}^{m,l}\) are introduced. They contain all delays \(d_i\) of the connections from the output of layer \(l\) to the input of layer \(m\). Consequently, for every \(d_i \in {DI}^{m,l}\) or \(d_i \in {DL}^{m,l}\) there has to be a connection matrix \(\widetilde{IW}^{m,l}[d_i]\) or \(\widetilde{LW}^{m,l}[d_i]\). Figure 4 shows the detailed and the simplified illustration of a TDL example.

_images/TDL.png

Figure 4: TDL in detailed (a) and simplified (b) illustration
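
To make the TDL mechanism concrete, here is a minimal Python sketch of a tapped delay line (illustrative only, not pyrenn's internal implementation):

from collections import deque

class TDL:
    # Tapped delay line with delay set `delays`: step(x) returns the input
    # signal delayed by each d in `delays` (zeros before the first time step).
    def __init__(self, delays):
        self.delays = delays
        self.buffer = deque([0.0]*(max(delays)+1), maxlen=max(delays)+1)
    def step(self, x):
        self.buffer.appendleft(x)   # buffer[d] is the input from d steps ago
        return [self.buffer[d] for d in self.delays]

tdl = TDL([1, 2])
for x in [1.0, 2.0, 3.0, 4.0]:
    print(tdl.step(x))   # [0,0], [1,0], [2,1], [3,2]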

With pyrenn it is possible to define three different types of TDLs, which add (recurrent) time-delayed connections with their weight matrices to the MLP. The influence of these settings on the neural network structure is explained with the help of figure 5.

  • Input delays \(dIn \in [0,1,2,...]\) (blue; default for MLP \(dIn=[0]\)):

    This allows delaying the inputs \(\underline{p}\) of the neural network by any time step \(d \geq 0\). Thereby the neural network can be used for systems where the output depends not only on the current input, but also on previous inputs. \(dIn\) has to be non-empty, otherwise no inputs are connected with the neural network! When the current input should be used in the neural network, \(dIn\) has to contain 0. Since this only delays the inputs, it does not lead to a recurrent network.

    \[{DI}^{1,1} = dIn\]
  • Output delays \(dOut \in [1,2,...]\) (green; default for MLP \(dOut=[]\)):

    This allows adding a recurrent connection of the outputs \(\underline{\hat{y}}\) of the neural network to its first layer (which is similar to a recurrent connection of the output of the network to its input). Thereby the neural network can be used for systems where the output depends not only on the inputs, but also on previous outputs (states). Since this adds a recurrent connection if \(dOut\) is non-empty, the delays have to be greater than zero \(d>0\). A neural network with such a connection is a recurrent neural network.

    \[{DL}^{1,M} = dOut\]
  • Internal delays \(dIntern \in [1,2,...]\) (red; default for MLP \(dIntern=[]\)):

    This allows adding recurrent connections from every layer to all previous layers and to itself (except from the output layer to the first layer, which is covered by \(dOut\)). Thereby the neural network can be used for systems where the output depends on previous internal states. Since this adds recurrent connections if \(dIntern\) is non-empty, the delays have to be greater than zero \(d>0\). A neural network with such connections is a recurrent neural network.

    \[{DL}^{m,l} = dIntern \;\;\;\;\forall (m \leq l|\; {DL}^{m,l} \neq {DL}^{1,M})\]

In pyrenn, all forward connections (except the inputs) are un-delayed direct connections!

_images/recurrent_nn.png

Figure 5: Possible delayed and recurrent connections that can be created with pyrenn for a neural network with two hidden layers.

Note

With the described definitions, every neural network in pyrenn can be defined by only four parameters:

  • The short notation \(nn\) which describes the number of inputs, layers, neurons and outputs
  • the input delays \(dIn\) of the neural network
  • the output delays \(dOut\) of the neural network
  • the internal delays \(dIntern\) of the neural network

Creating a neural network with CreateNN()

The function CreateNN creates a pyrenn neural network object that can be trained and used. When only the short notation \(nn\) is given as input, the created neural network will be an MLP with no delayed connections. If a (recurrent) neural network with delays should be created, the parameters \(dIn\), \(dIntern\) and/or \(dOut\) have to be specified as described above. In the Examples, different neural network topologies are created.

Python
pyrenn.CreateNN(nn, [dIn=[0], dIntern=[ ], dOut=[ ]])

Creates a neural network object with random values between -0.5 and 0.5 for the weights.

Parameters:
  • nn (list) – short notation of the neural network \([R\; S^\text{1}\; S^\text{2}\; ...\; S^\text{M}]\)
  • dIn (list) – Set of input delays of the neural network
  • dIntern (list) – Set of internal delays of the neural network
  • dOut (list) – Set of output delays of the neural network
Returns:

a pyrenn neural network object

Return type:

dict
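
For example, a network with two inputs, two hidden layers of three neurons each, one output, a delayed input and a delayed output feedback could be created like this (the delay values are illustrative):

import pyrenn as prn

net = prn.CreateNN([2, 3, 3, 1], dIn=[0, 1], dIntern=[], dOut=[1])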

Matlab
CreateNN(nn, [dIn=[0], dIntern=[ ], dOut=[ ]])

Creates a neural network object with random values between -0.5 and 0.5 for the weights.

Parameters:
  • nn (array) – short notation of the neural network \([R\; S^\text{1}\; S^\text{2}\; ...\; S^\text{M}]\); size [1 x M+1]
  • dIn (array) – Set of input delays of the neural network; size [1 x X]
  • dIntern (array) – Set of internal delays of the neural network; size [1 x X]
  • dOut (array) – Set of output delays of the neural network; size [1 x X]
Returns:

a pyrenn neural network object

Return type:

struct

Train a neural network

Once a neural network is created, it can be trained. To train a neural network, training data is required. If you have a system that produces the output \(\underline{y}\) when given the input \(\underline{p}\) (see figure 6), then \((\underline{p},\underline{y})\) represents one sample of training data.

_images/training_sample.png

Figure 6: A system with input \(\underline{p}\) and output \(\underline{y}\)

For training neural networks, usually more than one data sample is required to obtain good results. Therefore the training data is defined by an input matrix \(\widetilde{P}\) and an output (or target) matrix \(\widetilde{Y}\) containing \(Q\) samples of training data. For static systems (feed forward neural networks) it is only important that element \(q\) of the input matrix corresponds to element \(q\) of the output matrix; the samples can be in any order. For dynamic systems (recurrent neural networks) the samples have to be in the correct time order. For both kinds of systems, the training data should represent the system as well as possible.

_images/training_matrix.png

Figure 7: Generated training data set \(\widetilde{P}\) and \(\widetilde{Y}\) of a system

\[\begin{split}\widetilde{P} = \begin{bmatrix} \underline{p}[1] & \underline{p}[2] & ... &\underline{p}[q] & ... &\underline{p}[Q] \end{bmatrix}\\ \widetilde{Y} = \begin{bmatrix} \underline{y}[1] & \underline{y}[2] & ... &\underline{y}[q] & ... &\underline{y}[Q] \end{bmatrix}\\\end{split}\]
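
As a small illustration, a training data set with \(R=2\) inputs, \(S^M=1\) output and \(Q=4\) samples could look like this in Python (the values are made up):

import numpy as np

P = np.array([[0.0, 0.5, 1.0, 1.5],
              [1.0, 0.8, 0.6, 0.4]])   # shape (R, Q): column q is sample q
Y = np.array([[0.2, 0.4, 0.6, 0.8]])   # shape (S^M, Q)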

With the training data, the neural network can be trained. Training a neural network means that all weights in the weight vector \(\underline{w}\), which contains all connection weights \(\widetilde{IW}\) and \(\widetilde{LW}\) and all bias weights \(\underline{b}\), are updated step by step, such that the neural network output \(\hat{\widetilde{Y}}\) matches the training data output (target) \(\widetilde{Y}\). The objective of this optimization is to minimize the error \(E\) (cost function) between neural network and system outputs.

_images/training.png

Figure 8: Training a neural network

Note

Generally, there are different methods to calculate the error \(E\) (cost function) for neural network training. pyrenn always uses the mean squared error, which is necessary to apply the Levenberg-Marquardt algorithm.

The training repeatedly adapts the weights of the weight vector \(\underline{w}\) until one of the two termination conditions is met:

  • the maximal number of iterations (epochs) \(k_{max}\) is reached
  • the error is minimized to the goal \(E \leq E_{stop}\)

train_LM(): train with Levenberg-Marquardt Algorithm

The function train_LM() is an implementation of the Levenberg–Marquardt algorithm (LM) based on:

Levenberg, K.: A Method for the Solution of Certain Problems in Least Squares. Quarterly of Applied Mathematics, 2:164-168, 1944.

and

Marquardt, D.: An Algorithm for Least-Squares Estimation of Nonlinear Parameters. SIAM Journal, 11:431-441, 1963.

The LM algorithm is a second order optimization method that uses the Jacobian matrix \(\widetilde{J}\) to approximate the Hessian matrix \(\widetilde{H}\). In pyrenn the Jacobian matrix is calculated using the Real-Time Recurrent Learning (RTRL) algorithm based on:

Williams, Ronald J.; Zipser, David: A Learning Algorithm for Continually Running Fully Recurrent Neural Networks. Neural Computation, 1(2):270-280, 1989.

Python
pyrenn.train_LM(P, Y, net[, k_max=100, E_stop=1e-10, dampfac=3.0, dampconst=10.0, verbose=False])

Trains the given neural network net with the training data inputs P and outputs (targets) Y using the Levenberg–Marquardt algorithm.

Parameters:
  • P (numpy.array) – Training input data set \(\widetilde{P}\), 2d-array of shape \((R,Q)\) with \(R\) rows (=number of inputs) and \(Q\) columns (=number of training samples)
  • Y (numpy.array) – Training output (target) data set \(\widetilde{Y}\), 2d-array of shape \((S^M,Q)\) with \(S^M\) rows (=number of outputs) and \(Q\) columns (=number of training samples)
  • net (dict) – a pyrenn neural network object created by pyrenn.CreateNN()
  • k_max (int) – maximum number of training iterations (epochs)
  • E_stop (float) – termination error (error goal), training stops when \(E \leq E_{stop}\)
  • dampfac (float) – damping factor of the LM algorithm
  • dampconst (float) – constant to adapt damping factor of LM
  • verbose (bool) – activates console outputs during training if True
Returns:

a trained pyrenn neural network object

Return type:

dict
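
A typical call, assuming the training data P and Y and the network net are defined as described above:

import pyrenn as prn

net = prn.train_LM(P, Y, net, verbose=True, k_max=100, E_stop=1e-5)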

Matlab
train_LM(P, Y, net, [k_max=100, E_stop=1e-10])

Trains the given neural network net with the training data inputs P and outputs (targets) Y using the Levenberg–Marquardt algorithm.

Parameters:
  • P (array) – Training input data set \(\widetilde{P}\), 2d-array with size \((R,Q)\) with \(R\) rows (=number of inputs) and \(Q\) columns (=number of training samples)
  • Y (array) – Training output (target) data set \(\widetilde{Y}\), 2d-array with size \((S^M,Q)\) with \(S^M\) rows (=number of outputs) and \(Q\) columns (=number of training samples)
  • net (struct) – a pyrenn neural network object created by CreateNN()
  • k_max (int) – maximum number of training iterations (epochs)
  • E_stop (double) – termination error (error goal), training stops when \(E \leq E_{stop}\)
Returns:

a trained pyrenn neural network object

Return type:

struct

train_BFGS(): train with Broyden–Fletcher–Goldfarb–Shanno Algorithm (Matlab only)

The function train_BFGS() is an implementation of the Broyden–Fletcher–Goldfarb–Shanno algorithm (BFGS). The BFGS algorithm is a second order optimization method that uses updates specified by evaluations of the gradient \(\underline{g}\) to approximate the Hessian matrix \(\widetilde{H}\). In pyrenn the gradient \(\underline{g}\) for BFGS is calculated using the Backpropagation Through Time (BPTT) algorithm based on:

Werbos, Paul: Backpropagation through time: what it does and how to do it. Proceedings of the IEEE, 78(10):1550-1560, 1990.

Matlab
train_BFGS(P, Y, net, [k_max=100, E_stop=1e-10])

Trains the given neural network net with the training data inputs P and outputs (targets) Y using the Broyden–Fletcher–Goldfarb–Shanno algorithm.

Parameters:
  • P (array) – Training input data set \(\widetilde{P}\), 2d-array with size \((R,Q)\) with \(R\) rows (=number of inputs) and \(Q\) columns (=number of training samples)
  • Y (array) – Training output (target) data set \(\widetilde{Y}\), 2d-array with size \((S^M,Q)\) with \(S^M\) rows (=number of outputs) and \(Q\) columns (=number of training samples)
  • net (struct) – a pyrenn neural network object created by CreateNN()
  • k_max (int) – maximum number of training iterations (epochs)
  • E_stop (double) – termination error (error goal), training stops when \(E \leq E_{stop}\)
Returns:

a trained pyrenn neural network object

Return type:

struct

Use a trained neural network

Once a neural network is trained successfully, it can be used to calculate the neural network outputs for new input data (different from the training data). The input data \(\widetilde{P}\) for using the neural network has the same structure as for training. The neural network calculates the output data \(\hat{\widetilde{Y}}\), which has the same structure as the training output data \(\widetilde{Y}\). Any number of data samples \(Q\) can be used, resulting in the same number of output samples.

_images/apply.png
\[\begin{split}\widetilde{P} = \begin{bmatrix} \underline{p}[1] & \underline{p}[2] & ... &\underline{p}[q] & ... &\underline{p}[Q] \end{bmatrix}\\ \hat{\widetilde{Y}} = \begin{bmatrix} \underline{\hat{y}}[1] & \underline{\hat{y}}[2] & ... &\underline{\hat{y}}[q] & ... &\underline{\hat{y}}[Q] \end{bmatrix}\\\end{split}\]

Using previous inputs and outputs for recurrent networks or networks with delayed inputs

Neural networks with delayed recurrent connections between their output and the input layer (green) and networks with delayed inputs \(d>0\) (blue) need outputs or inputs of previous time steps \(t-d\) to calculate the output for time step \(t\). When the neural network is applied to the input data \(\widetilde{P}\), these previous inputs and outputs are not known yet for the first time step(s). pyrenn sets all unknown previous inputs and outputs to zero, which will probably lead to an error in the first time steps.

_images/recurrent_nn.png

But pyrenn allows passing previous inputs \(\widetilde{P0}\) and previous outputs \(\widetilde{Y0}\) to the neural network, if they are known by the user. \(\widetilde{P0}\) and \(\widetilde{Y0}\) have the same structure as \(\widetilde{P}\) and \(\widetilde{Y}\). Both must have the same number of previous data samples \(Q0\), even if one of them is irrelevant for the neural network. The neural network output \(\underline{\hat{y}}[q]\) at time \(q\) is then calculated using these previous inputs and outputs at time \(q-d\), where \(\underline{{p}}[0]\) and \(\underline{\hat{y}}[0]\) are the last elements of \(\widetilde{P0}\) and \(\widetilde{Y0}\), respectively.

\[\begin{split}\begin{gather} &\widetilde{P0} &\widetilde{P}\\ &\overbrace{\begin{bmatrix} \underline{p}[-Q0+1] & ... & \underline{p}[-1] & \underline{p}[0] \end{bmatrix}} \; &\overbrace{\begin{bmatrix} \underline{p}[1] & \underline{p}[2] & ... &\underline{p}[q] & ... &\underline{p}[Q] \end{bmatrix}}\\\\ &\underbrace{\begin{bmatrix} \underline{\hat{y}}[-Q0+1] & ... &\underline{\hat{y}}[-1] &\underline{\hat{y}}[0] \end{bmatrix}} \; &\underbrace{ \underline{\hat{y}}[q] = f(\underline{p}[q],\underline{p}[q-d],\underline{\hat{y}}[q-d]) }\\ &\widetilde{Y0} &\widetilde{Y} \\ \end{gather}\end{split}\]

Setting previous values for the outputs of hidden layers (red connections) is not possible. If a neural network has internal recurrent connections, the previous outputs of the hidden layers are set to zero when they are not known yet.

Calculate neural network outputs with NNOut()

Python
pyrenn.NNOut(P, net[, P0=None, Y0=None])

Calculates the output of a trained neural network net given the inputs P.

Parameters:
  • P (numpy.array) – Input data set \(\widetilde{P}\), 2d-array of shape \((R,Q)\) with \(R\) rows (=number of inputs) and \(Q\) columns (=number of data samples)
  • net (dict) – a pyrenn neural network object
  • P0 (numpy.array) – Previous input data set \(\widetilde{P0}\), 2d-array of shape \((R,Q0)\) with \(R\) rows (=number of inputs) and \(Q0\) columns (=number of previous data samples)
  • Y0 (numpy.array) – previous output data set \(\widetilde{Y0}\), 2d-array of shape \((S^M,Q0)\) with \(S^M\) rows (=number of outputs) and \(Q0\) columns (=number of previous data samples)
Returns:

Neural network output \(\hat{\widetilde{Y}}\), 2d-array of shape \((S^M,Q)\) with \(S^M\) rows (=number of outputs) and \(Q\) columns (=number of input data samples)

Return type:

numpy.array
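
A typical call, with and without known previous data (P0test and Y0test are previous input and output arrays of matching shape, as used in the examples below):

import pyrenn as prn

y = prn.NNOut(Ptest, net)                          # unknown previous values are set to zero
y0 = prn.NNOut(Ptest, net, P0=P0test, Y0=Y0test)   # with known previous inputs and outputs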

Matlab
NNOut(P, net, [P0=[ ], Y0=[ ]])

Calculates the output of a trained neural network net given the inputs P.

Parameters:
  • P (array) – Input data set \(\widetilde{P}\), 2d-array of size \((R,Q)\) with \(R\) rows (=number of inputs) and \(Q\) columns (=number of data samples)
  • net (struct) – a pyrenn neural network object
  • P0 (array) – Previous input data set \(\widetilde{P0}\), 2d-array of size \((R,Q0)\) with \(R\) rows (=number of inputs) and \(Q0\) columns (=number of previous data samples)
  • Y0 (array) – previous output data set \(\widetilde{Y0}\), 2d-array of size \((S^M,Q0)\) with \(S^M\) rows (=number of outputs) and \(Q0\) columns (=number of previous data samples)
Returns:

Neural network output \(\hat{\widetilde{Y}}\), 2d-array of size \((S^M,Q)\) with \(S^M\) rows (=number of outputs) and \(Q\) columns (=number of input data samples)

Return type:

array

Save and load a neural network

The function saveNN allows saving the structure and the trained weights of a neural network to a csv file. The function loadNN allows loading a saved neural network. This also makes it possible to exchange neural network objects between python and matlab.

Save a neural network with saveNN()

Python
pyrenn.saveNN(net, filename)

Saves a neural network object to a csv file

Parameters:
  • net (dict) – a pyrenn neural network object
  • filename (string) – full or relative path of a csv file to save the neural network (filename = ‘\folder\file.csv’)

Example: Saving the neural network object net to ‘C:\nn\mynetwork.csv’

pyrenn.saveNN(net,'C:\nn\mynetwork.csv')
Matlab
saveNN(net, filename)

Saves a neural network object to a csv file

Parameters:
  • net (struct) – a pyrenn neural network object
  • filename (string) – full or relative path of a csv file to save the neural network (filename = ‘\folder\file.csv’)

Example: Saving the neural network object net to ‘C:\nn\mynetwork.csv’

saveNN(net,'C:\nn\mynetwork.csv')

Load a neural network with loadNN()

Python
pyrenn.loadNN(filename)

Load a neural network object from a csv file

Parameters:filename (string) – full or relative path of a csv file which contains a saved pyrenn neural network (filename = ‘\folder\file.csv’)
Returns:a pyrenn neural network object
Return type:dict

Example: Load a neural network saved in ‘C:\nn\mynetwork.csv’ into net

net = pyrenn.loadNN('C:\nn\mynetwork.csv')
Matlab
loadNN(filename)

Load a neural network object from a csv file

Parameters:
  • filename (string) – full or relative path of a csv file which contains a saved pyrenn neural network (filename = ‘\folder\file.csv’)
Returns:

a pyrenn neural network object

Return type:

struct

Example: Load a neural network saved in ‘C:\nn\mynetwork.csv’ into net

net = loadNN('C:\nn\mynetwork.csv')

Examples

The examples given in this chapter show how to create, train and use neural networks in pyrenn for different systems. All examples can be found in the folder python\examples or matlab\examples.

Friction Curve

In this example a neural network is used to learn the friction curve of a system, which is given by the following function, where \(F\) is the friction force and \(v\) is the velocity:

\[F = 0.5 * (tanh(25*v) - tanh(v)) + 0.2*tanh(v) + 0.03*v\]

For training 41 data samples of the velocity and the resulting friction force are given.
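
For reference, the curve can be evaluated directly with numpy, e.g. to reproduce such samples (the velocity range is an assumption for illustration; the actual data ships with the examples):

import numpy as np

v = np.linspace(-2.0, 2.0, 41)   # 41 velocity samples (range assumed)
F = 0.5*(np.tanh(25*v) - np.tanh(v)) + 0.2*np.tanh(v) + 0.03*v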

This is an example of a static system with one output and one input and can be found in python\examples\example_friction.py and matlab\examples\example_friction.m.

Python

At first the needed packages are imported: pandas for reading the Excel file, matplotlib for plotting the results and pyrenn for the neural network.

import pandas as pd
import matplotlib.pyplot as plt
import pyrenn as prn

Then the training input data P and output (target) data Y as well as the test input data Ptest and output data Ytest are read from the given Excel file using pandas. Because we have only one input and one output here, the input and output data can be either a 1d numpy array, where the number of elements represents the number of data samples (this is the case here), or a 2d numpy array with shape (1,Q), where Q is the number of data samples.

df = pd.ExcelFile('example_data.xlsx').parse('friction')
P = df.loc[0:40]['P']
Y = df.loc[0:40]['Y']
Ptest = df['Ptest'].values
Ytest = df['Ytest'].values

Then the neural network is created. Since we have a system with 1 input and 1 output, we need a neural network with the same number of inputs and outputs. For this system we choose a neural network with two hidden layers, each with 3 neurons. Since the friction curve is a static system, we need neither a recurrent network nor delayed inputs, so we do not have to change the default delay settings.

net = prn.CreateNN([1,3,3,1])

Now we can train the created neural network net with the training data P and Y. verbose=True activates displaying the error during training. We set the number of iterations (epochs) to 100 and the termination error to 1e-5. The training will stop after 100 iterations or when Error <= E_stop.

net = prn.train_LM(P,Y,net,verbose=True,k_max=100,E_stop=1e-5)

After the training is finished, we can use the neural network. We calculate the neural network output y, using the training data P as input, as well as the output ytest, using the test data Ptest as input.

y = prn.NNOut(P,net)
ytest = prn.NNOut(Ptest,net)

Now we can plot the results, comparing the output of the neural network with the training and the test data of the system.

fig = plt.figure(figsize=(11,7))
ax0 = fig.add_subplot(211)
ax1 = fig.add_subplot(212)
fs=18

#Train Data
ax0.set_title('Train Data',fontsize=fs)
ax0.plot(P,y,color='b',lw=2,label='NN Output')
ax0.plot(P,Y,color='r',marker='None',linestyle=':',lw=3,markersize=8,label='Train Data')
ax0.tick_params(labelsize=fs-2)
ax0.legend(fontsize=fs-2,loc='upper left')
ax0.grid()

#Test Data
ax1.set_title('Test Data',fontsize=fs)
ax1.plot(Ptest,ytest,color='b',lw=2,label='NN Output')
ax1.plot(Ptest,Ytest,color='r',marker='None',linestyle=':',lw=3,markersize=8,label='Test Data')
ax1.tick_params(labelsize=fs-2)
ax1.legend(fontsize=fs-2,loc='upper left')
ax1.grid()

fig.tight_layout()
plt.show()
_images/example_python_friction.png
Matlab

At first the training input data P and output (target) data Y as well as the test input data Ptest and output data Ytest are read from the given Excel file using xlsread. Because we have only one input and one output here, the input and output data has to be an array with size (1,Q), where Q is the number of data samples.

file = 'example_data.xlsx';
num = xlsread(file,'friction');
P = num(1:41,2).';
Y = num(1:41,3).';
Ptest = num(:,4).';
Ytest = num(:,5).';

Then the neural network is created. Since we have a system with 1 input and 1 output, we need a neural network with the same number of inputs and outputs. For this system we choose a neural network with two hidden layers, each with 3 neurons. Since the friction curve is a static system, we need neither a recurrent network nor delayed inputs, so we do not have to change the default delay settings.

net = CreateNN([1,3,3,1])

Now we can train the created neural network net with the training data P and Y. In matlab we have the option to train the network with the LM algorithm or the BFGS algorithm.

We start with the LM algorithm. We set the number of iterations (epochs) to 100 and the termination error to 1e-5. The training will stop after 100 iterations or when Error <= E_stop.

netLM = train_LM(P,Y,net,100,1e-5);

After the training is finished, we can use the neural network. We calculate the neural network output y_LM, using the training data P as input, as well as the output ytest_LM, using the test data Ptest as input.

y_LM = NNOut(P,netLM);
ytest_LM = NNOut(Ptest,netLM);

Now we can do the same using the BFGS algorithm. The BFGS algorithm usually takes less time for one iteration, but needs more iterations to reach the same error as the LM algorithm. Therefore we set the number of iterations (epochs) to 200 and the termination error to 1e-5.

netBFGS = train_BFGS(P,Y,net,200,1e-5);
y_BFGS = NNOut(P,netBFGS);
ytest_BFGS = NNOut(Ptest,netBFGS);

Now we can plot the results, comparing the output of the two different neural networks with each other and with the training and the test data of the system.

fig = figure();
set(fig, 'Units', 'normalized', 'Position', [0.2, 0.1, 0.6, 0.6])
axis tight

subplot(311)
set(gca,'FontSize',16)
plot(P,Y,'r:','LineWidth',2)
hold on
grid on
plot(P,y_LM,'b','LineWidth',2)
plot(P,y_BFGS,'g','LineWidth',2)
l1 = legend('Train Data','LM output','BFGS output','Location','northwest');
set(l1,'FontSize',14)

subplot(312)
set(gca,'FontSize',16)
plot(Ptest,Ytest,'r:','LineWidth',2)
hold on
grid on
plot(Ptest,ytest_LM,'b','LineWidth',2)
plot(Ptest,ytest_BFGS,'g','LineWidth',2)
l2 = legend('Test Data','LM output','BFGS output','Location','northwest');
set(l2,'FontSize',14)

subplot(313)
set(gca,'FontSize',16)
plot(netLM.ErrorHistory,'b','LineWidth',2)
hold on
grid on
plot(netBFGS.ErrorHistory,'g','LineWidth',2)
ylim([0,0.1])
l3 = legend('LM Error','BFGS Error','Location','northeast');
set(l3,'FontSize',14)
_images/example_matlab_friction.png

Second order transfer function (PT2)

In this example a neural network is used to learn the response Y to an input P of a second order transfer function described by:

\[G(s) = \frac{Y(s)}{P(s)} = \frac{10}{0.1*s^2 + s + 100}\]

For training and testing the system an Amplitude Modulated Pseudo Random Binary Sequence (APRBS) is used as input P and Ptest. An APRBS is a pseudorandom binary sequence (PRBS) which is generated by varying the amplitude levels in addition to the hold times of a PRBS.
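
A minimal sketch of how such an APRBS signal could be generated (the amplitude range and hold times are assumptions for illustration; the actual test signals are provided in the example data file):

import numpy as np

def aprbs(n_steps, hold_min=5, hold_max=20, seed=0):
    # random amplitude levels, each held for a random number of time steps
    rng = np.random.default_rng(seed)
    signal = []
    while len(signal) < n_steps:
        amp = rng.uniform(-1.0, 1.0)                  # random amplitude level
        hold = int(rng.integers(hold_min, hold_max))  # random hold time
        signal.extend([amp]*hold)
    return np.array(signal[:n_steps])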

This is an example of a dynamic system with one input and one output and can be found in python\examples\example_pt2.py and matlab\examples\example_pt2.m.

Python

At first the needed packages are imported: pandas for reading the Excel file, matplotlib for plotting the results and pyrenn for the neural network.

import pandas as pd
import matplotlib.pyplot as plt
import pyrenn as prn

Then the training input data P and output (target) data Y as well as the test input data Ptest and output data Ytest are read from the given Excel file using pandas. Because we have only one input and one output here, the input and output data can be either a 1d numpy array, where the number of elements represents the number of data samples (this is the case here), or a 2d numpy array with shape (1,Q), where Q is the number of data samples.

df = pd.ExcelFile('example_data.xlsx').parse('pt2')
P = df['P'].values
Y = df['Y'].values
Ptest = df['Ptest'].values
Ytest = df['Ytest'].values

Then the neural network is created. Since we have a system with 1 input and 1 output, we need a neural network with the same number of inputs and outputs. For this system we choose a neural network with two hidden layers, each with 2 neurons. The system has no input delays, so we only use the current input dIn = [0]. Since we have a dynamic system, we want to create a recurrent neural network. We create a neural network with recurrent connections delayed by 1 time step within the hidden layers (dIntern = [1]) and a recurrent connection delayed by 1 and 2 time steps from the output to the first layer (dOut = [1,2]).

net = prn.CreateNN([1,2,2,1],dIn=[0],dIntern=[1],dOut=[1,2])

Now we can train the created neural network net with the training data P and Y. verbose=True activates displaying the error during training. We set the number of iterations (epochs) to 100 and the termination error to 1e-3. The training will stop after 100 iterations or when Error <= E_stop.

net = prn.train_LM(P,Y,net,verbose=True,k_max=100,E_stop=1e-3)

After the training is finished, we can use the neural network. We calculate the neural network output y, using the training data P as input, as well as the output ytest, using the test data Ptest as input.

y = prn.NNOut(P,net)
ytest = prn.NNOut(Ptest,net)

Now we can plot the results, comparing the output of the neural network with the training and the test data of the system.

fig = plt.figure(figsize=(11,7))
ax0 = fig.add_subplot(211)
ax1 = fig.add_subplot(212)
fs=18

#Train Data
ax0.set_title('Train Data',fontsize=fs)
ax0.plot(y,color='b',lw=2,label='NN Output')
ax0.plot(Y,color='r',marker='None',linestyle=':',lw=3,markersize=8,label='Train Data')
ax0.tick_params(labelsize=fs-2)
ax0.legend(fontsize=fs-2,loc='upper left')
ax0.grid()

#Test Data
ax1.set_title('Test Data',fontsize=fs)
ax1.plot(ytest,color='b',lw=2,label='NN Output')
ax1.plot(Ytest,color='r',marker='None',linestyle=':',lw=3,markersize=8,label='Test Data')
ax1.tick_params(labelsize=fs-2)
ax1.legend(fontsize=fs-2,loc='upper left')
ax1.grid()

fig.tight_layout()
plt.show()
_images/example_python_pt2.png
Matlab

At first the training input data P and output (target) data Y as well as the test input data Ptest and output data Ytest are read from the given Excel file using xlsread. Because we have only one input and one output here, the input and output data has to be an array with size (1,Q), where Q is the number of data samples.

file = 'example_data.xlsx';
num = xlsread(file,'pt2');
P = num(:,2).';
Y = num(:,3).';
Ptest = num(:,4).';
Ytest = num(:,5).';

Then the neural network is created. Since we have a system with 1 input and 1 output, we need a neural network with the same number of inputs and outputs. For this system we choose a neural network with two hidden layers, each with 2 neurons. The system has no input delays, so we only use the current input dIn = [0]. Since we have a dynamic system, we want to create a recurrent neural network. We create a neural network with recurrent connections delayed by 1 time step within the hidden layers (dIntern = [1]) and a recurrent connection delayed by 1 and 2 time steps from the output to the first layer (dOut = [1,2]).

nn = [1 2 2 1];
dIn = [0];
dIntern=[1];
dOut=[1,2];
net = CreateNN(nn,dIn,dIntern,dOut);

Now we can train the created neural network net with the training data P and Y. In matlab we have the option to train the network with the LM algorithm or the BFGS algorithm.

We start with the LM algorithm. We set the number of iterations (epochs) to 100 and the termination error to 1e-3. The training will stop after 100 iterations or when Error <= E_stop.

netLM = train_LM(P,Y,net,100,1e-3);

After the training is finished, we can use the neural network. We calculate the neural network output y_LM, using the training data P as input, as well as the output ytest_LM, using the test data Ptest as input.

y_LM = NNOut(P,netLM);
ytest_LM = NNOut(Ptest,netLM);

Now we can do the same using the BFGS algorithm. The BFGS algorithm usually takes less time for one iteration, but needs more iterations to reach the same error as the LM algorithm. Therefore we set the number of iterations (epochs) to 200 and the termination error to 1e-3.

netBFGS = train_BFGS(P,Y,net,200,1e-3);
y_BFGS = NNOut(P,netBFGS);
ytest_BFGS = NNOut(Ptest,netBFGS);

Now we can plot the results, comparing the output of the two different neural networks with each other and with the training and the test data of the system.

fig = figure();
set(fig, 'Units', 'normalized', 'Position', [0.2, 0.1, 0.6, 0.6])
axis tight

subplot(311)
set(gca,'FontSize',16)
plot(Y,'r:','LineWidth',2)
hold on
grid on
plot(y_LM,'b','LineWidth',2)
plot(y_BFGS,'g','LineWidth',2)
l1 = legend('Train Data','LM output','BFGS output','Location','northwest');
set(l1,'FontSize',14)

subplot(312)
set(gca,'FontSize',16)
plot(Ytest,'r:','LineWidth',2)
hold on
grid on
plot(ytest_LM,'b','LineWidth',2)
plot(ytest_BFGS,'g','LineWidth',2)
l2 = legend('Test Data','LM output','BFGS output','Location','northwest');
set(l2,'FontSize',14)

subplot(313)
set(gca,'FontSize',16)
plot(netLM.ErrorHistory,'b','LineWidth',2)
hold on
grid on
plot(netBFGS.ErrorHistory,'g','LineWidth',2)
ylim([0,0.1])
l3 = legend('LM Error','BFGS Error','Location','northeast');
set(l3,'FontSize',14)
_images/example_matlab_pt2.png

narendra4

In this example a neural network is used to learn the narendra4 function introduced in

Narendra, K.S. and K. Parthasarathy: Identification and control of dynamical systems using neural networks. IEEE Transactions on Neural Networks, 1(1):4-27, March 1990.

given by:

\[y[k+1] = \frac{y[k]*y[k-1]*y[k-2]*p[k-1]*(y[k-2]-1)+p[k]}{1+(y[k-1])^2 + (y[k-2])^2}\]
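
For reference, the difference equation can be simulated directly to generate data (zero initial conditions are an assumption here):

import numpy as np

def narendra4(p):
    # simulate the difference equation above for an input sequence p,
    # starting from y = 0 (assumed initial conditions)
    y = np.zeros(len(p)+1)
    for k in range(2, len(p)):
        y[k+1] = (y[k]*y[k-1]*y[k-2]*p[k-1]*(y[k-2]-1) + p[k]) \
                 / (1 + y[k-1]**2 + y[k-2]**2)
    return y[1:]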

For training and testing the system an Amplitude Modulated Pseudo Random Binary Sequence (APRBS) is used as input P and Ptest (see the sketch in the PT2 example above). An APRBS is a pseudorandom binary sequence (PRBS) which is generated by varying the amplitude levels in addition to the hold times of a PRBS.

This is an example of a dynamic system with one output and one delayed input and can be found in python\examples\example_narendra4.py and matlab\examples\example_narendra4.m.

Python

At first the needed packages are imported: pandas for reading the Excel file, matplotlib for plotting the results and pyrenn for the neural network.

import pandas as pd
import matplotlib.pyplot as plt
import pyrenn as prn

Then the training input data P and output (target) data Y as well as the test input data Ptest and output data Ytest are read from the given Excel file using pandas. Because we have only one input and one output here, the input and output data can be either a 1d numpy array, where the number of elements represents the number of data samples (this is the case here), or a 2d numpy array with shape (1,Q), where Q is the number of data samples.

df = pd.ExcelFile('example_data.xlsx').parse('narendra4')
P = df['P'].values
Y = df['Y'].values
Ptest = df['Ptest'].values
Ytest = df['Ytest'].values

Then the neural network is created. Since we have a system with 1 input and 1 output, we need a neural network with the same number of inputs and outputs. For this system we choose a neural network with two hidden layers, each with 3 neurons. To calculate the current output y[k], this system needs the inputs delayed by 1 and 2 time steps, p[k-1] and p[k-2], but not the current one p[k]. So we set the input delays to dIn = [1,2]. Since we have a dynamic system, we want to create a recurrent neural network. We know that the system has no internal states, so we do not need internal recurrent connections (dIntern = []). But we need a recurrent connection with delays of 1, 2 and 3 time steps from the output to the first layer (dOut = [1,2,3]) to calculate the current output y[k].

net = prn.CreateNN([1,3,3,1],dIn=[1,2],dIntern=[],dOut=[1,2,3])

Now we can train the created neural network net with the training data P and Y. verbose=True activates displaying the error during training. We set the number of iterations (epochs) to 200 and the termination error to 1e-3. The training will stop after 200 iterations or when Error <= E_stop.

net = prn.train_LM(P,Y,net,verbose=True,k_max=200,E_stop=1e-3)

After the training is finished, we can use the neural network. We calculate the neural network output y, using the training data P as input, as well as the output ytest, using the test data Ptest as input.

y = prn.NNOut(P,net)
ytest = prn.NNOut(Ptest,net)

Now we can plot the results, comparing the output of the neural network with the training and the test data of the system.

fig = plt.figure(figsize=(11,7))
ax0 = fig.add_subplot(211)
ax1 = fig.add_subplot(212)
fs=18

#Train Data
ax0.set_title('Train Data',fontsize=fs)
ax0.plot(y,color='b',lw=2,label='NN Output')
ax0.plot(Y,color='r',marker='None',linestyle=':',lw=3,markersize=8,label='Train Data')
ax0.tick_params(labelsize=fs-2)
ax0.legend(fontsize=fs-2,loc='upper left')
ax0.grid()

#Test Data
ax1.set_title('Test Data',fontsize=fs)
ax1.plot(ytest,color='b',lw=2,label='NN Output')
ax1.plot(Ytest,color='r',marker='None',linestyle=':',lw=3,markersize=8,label='Test Data')
ax1.tick_params(labelsize=fs-2)
ax1.legend(fontsize=fs-2,loc='upper left')
ax1.grid()

fig.tight_layout()
plt.show()
_images/example_python_narendra4.png
Matlab

At first the training input data P and output (target) data Y as well as the test input data Ptest and output data Ytest are read from the given Excel file using xlsread. Because we have only one input and one output here, the input and output data has to be an array with size (1,Q), where Q is the number of data samples.

file = 'example_data.xlsx';
num = xlsread(file,'narendra4');
P = num(:,2).';
Y = num(:,3).';
Ptest = num(:,4).';
Ytest = num(:,5).';

Then the neural network is created. Since we have a system with 1 input and 1 output, we need a neural network with the same number of inputs and outputs. For this system we choose a neural network with two hidden layers, each with 3 neurons. To calculate the current output y[k], this system needs the inputs delayed by 1 and 2 time steps, p[k-1] and p[k-2], but not the current one p[k]. So we set the input delays to dIn = [1,2]. Since we have a dynamic system, we want to create a recurrent neural network. We know that the system has no internal states, so we do not need internal recurrent connections (dIntern = []). But we need a recurrent connection with delays of 1, 2 and 3 time steps from the output to the first layer (dOut = [1,2,3]) to calculate the current output y[k].

nn = [1 3 3 1];
dIn = [1,2];
dIntern=[];
dOut=[1,2,3];
net = CreateNN(nn,dIn,dIntern,dOut);

Now we can train the created neural network net with the training data P and Y. In matlab we have the option to train the network with the LM algorithm or the BFGS algorithm.

We start with the LM algorithm. We set the number of iterations (epochs) to 200 and the termination error to 1e-3. The training will stop after 200 iterations or when Error <= E_stop.

netLM = train_LM(P,Y,net,200,1e-3);

After the training is finished, we can use the neural network. We calculate the neural network output y_LM, using the training data P as input, as well as the output ytest_LM, using the test data Ptest as input.

y_LM = NNOut(P,netLM);
ytest_LM = NNOut(Ptest,netLM);

Now we can do the same using the BFGS algorithm. The BFGS algorithm usually takes less time for one iteration, but needs more iterations to reach the same error as the LM algorithm. Therefore we set the number of iterations (epochs) to 400 and the termination error to 1e-3.

netBFGS = train_BFGS(P,Y,net,400,1e-3);
y_BFGS = NNOut(P,netBFGS);
ytest_BFGS = NNOut(Ptest,netBFGS);

Now we can plot the results, comparing the output of the two different neural networks with each other and with the training and the test data of the system.

fig = figure();
set(fig, 'Units', 'normalized', 'Position', [0.2, 0.1, 0.6, 0.6])
axis tight

subplot(311)
set(gca,'FontSize',16)
plot(Y,'r:','LineWidth',2)
hold on
grid on
plot(y_LM,'b','LineWidth',2)
plot(y_BFGS,'g','LineWidth',2)
l1 = legend('Train Data','LM output','BFGS output','Location','northwest');
set(l1,'FontSize',14)

subplot(312)
set(gca,'FontSize',16)
plot(Ytest,'r:','LineWidth',2)
hold on
grid on
plot(ytest_LM,'b','LineWidth',2)
plot(ytest_BFGS,'g','LineWidth',2)
l2 = legend('Test Data','LM output','BFGS output','Location','northwest');
set(l2,'FontSize',14)

subplot(313)
set(gca,'FontSize',16)
plot(netLM.ErrorHistory,'b','LineWidth',2)
hold on
grid on
plot(netBFGS.ErrorHistory,'g','LineWidth',2)
ylim([0,5])
l3 = legend('LM Error','BFGS Error','Location','northeast');
set(l3,'FontSize',14)
_images/example_matlab_narendra4.png

Compressed air storage system

In this example a neural network is used to learn the behavior of a compressed air storage test system.

_images/DLA.png

The system consists of a compressor, a booster, and a high pressure storage. The system has 3 inputs. If the binary input P1 (charge) is 1, the booster is running and charges the storage (to a maximum pressure of 38 bar). If the binary input P2 (discharge) is 1, the booster and the compressor are off and the air demand (input P3) is covered by the air in the storage, which is discharged (to a minimum pressure of 7 bar). If the storage is neither charged nor discharged, the compressor covers the air demand. The neural network is trained to calculate the state of the pressure in the storage (Y1) and the electric power consumption of the booster and the compressor (Y2) based on the given current inputs and the previous outputs.

This is an example of a dynamic system with 2 outputs and 3 inputs (see python\examples\example_compair.py and matlab\examples\example_compair.m).

Python

At first the needed packages are imported: numpy for handling the data arrays, pandas for reading the Excel file, matplotlib for plotting the results and pyrenn for the neural network.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import pyrenn as prn

Then the training input data P and output (target) data Y as well as the test input data Ptest and output data Ytest are read from the given Excel file using pandas. We have 3 inputs here, so the input P is a 2d array with shape (3,Q), where Q is the number of data samples. Since we have 2 outputs, the output Y is a 2d array with shape (2,Q).

df = pd.ExcelFile('example_data.xlsx').parse('compressed_air')
P = np.array([df['P1'].values,df['P2'].values,df['P3'].values])
Y = np.array([df['Y1'].values,df['Y2'].values])
Ptest = np.array([df['P1test'].values,df['P2test'].values,df['P3test'].values])
Ytest = np.array([df['Y1test'].values,df['Y2test'].values])

Then the neural network is created. Since we have a system with 3 inputs and 2 outputs, we need a neural network with the same number of inputs and outputs. For this system we choose a neural network with two hidden layers, each with 5 neurons. The system has no input delays, so we only use the current input dIn = [0]. Since we have a dynamic system, we want to create a recurrent neural network. We know that the system has no internal states, so we do not need internal recurrent connections (dIntern = []). But we need a recurrent connection with a delay of 1 time step from the output to the first layer (dOut = [1]), because the pressure of the current time step depends on the pressure in the previous one.

net = prn.CreateNN([3,5,5,2],dIn=[0],dIntern=[],dOut=[1])

Now we can train the created neural network net with the training data P and Y. verbose=True activates displaying the error during training. We set the number of iterations (epochs) to 500 and the termination error to 1e-5. The training will stop after 500 iterations or when Error <= E_stop.

net = prn.train_LM(P,Y,net,verbose=True,k_max=500,E_stop=1e-5)

After the training is finished, we can use the neural network. We calculate the neural network output y, using the training data P as input, as well as the output ytest, using the test data Ptest as input. Since we have 2 outputs, y and ytest are 2d arrays of shape (2,Q).

y = prn.NNOut(P,net)
ytest = prn.NNOut(Ptest,net)

Now we can plot the results, comparing the output of the neural network with the training and the test data of the system.

fig = plt.figure(figsize=(15,10))
ax0 = fig.add_subplot(221)
ax1 = fig.add_subplot(222,sharey=ax0)
ax2 = fig.add_subplot(223)
ax3 = fig.add_subplot(224,sharey=ax2)
fs=18

t = np.arange(0,480.0)/4 # 480 time steps in 15 minute resolution
#Train Data
ax0.set_title('Train Data',fontsize=fs)
ax0.plot(t,y[0],color='b',lw=2,label='NN Output')
ax0.plot(t,Y[0],color='r',marker='None',linestyle=':',lw=3,markersize=8,label='Data')
ax0.tick_params(labelsize=fs-2)
ax0.legend(fontsize=fs-2,loc='upper left')
ax0.grid()
ax0.set_ylabel('Storage Pressure [bar]',fontsize=fs)
plt.setp(ax0.get_xticklabels(), visible=False)

ax2.plot(t,y[1],color='b',lw=2,label='NN Output')
ax2.plot(t,Y[1],color='r',marker='None',linestyle=':',lw=3,markersize=8,label='Train Data')
ax2.tick_params(labelsize=fs-2)
ax2.grid()
ax2.set_xlabel('Time [h]',fontsize=fs)
ax2.set_ylabel('el. Power [kW]',fontsize=fs)

#Test Data
ax1.set_title('Test Data',fontsize=fs)
ax1.plot(t,ytest[0],color='b',lw=2,label='NN Output')
ax1.plot(t,Ytest[0],color='r',marker='None',linestyle=':',lw=3,markersize=8,label='Test Data')
ax1.tick_params(labelsize=fs-2)
# ax1.legend(fontsize=fs-2,loc='upper left')
ax1.grid()
plt.setp(ax1.get_xticklabels(), visible=False)
plt.setp(ax1.get_yticklabels(), visible=False)

ax3.plot(t,ytest[1],color='b',lw=2,label='NN Output')
ax3.plot(t,Ytest[1],color='r',marker='None',linestyle=':',lw=3,markersize=8,label='Test Data')
ax3.tick_params(labelsize=fs-2)
ax3.grid()
ax3.set_xlabel('Time [h]',fontsize=fs)
plt.setp(ax3.get_yticklabels(), visible=False)

fig.tight_layout()
plt.show()
_images/example_python_compair.png
Matlab

At first the training input data P and output (target) data Y as well as the test input data Ptest and output data Ytest are read from the given Excel file using xlsread. We have 3 inputs here, so the input P is a 2d array with size (3,Q), where Q is the number of data samples. Since we have 2 outputs, the output Y is a 2d array with size (2,Q).

file = 'example_data.xlsx';
num = xlsread(file,'compressed_air');
P = num(:,2:4).';
Y = num(:,5:6).';
Ptest = num(:,7:9).';
Ytest = num(:,10:11).';

Then the neural network is created. Since we have a system with 3 inputs and 2 outputs, we need a neural network with the same number of inputs and outputs. For this system we choose a neural network with two hidden layers, each with 5 neurons. The system has no input delays, so we only use the current input dIn = [0]. Since we have a dynamic system, we want to create a recurrent neural network. We know that the system has no internal states, so we do not need internal recurrent connections (dIntern = []). But we need a recurrent connection with a delay of 1 time step from the output to the first layer (dOut = [1]), because the pressure of the current time step depends on the pressure in the previous one.

nn = [3 5 5 2];
dIn = [0];
dIntern=[];
dOut=[1];
net = CreateNN(nn,dIn,dIntern,dOut);

Now we can train the created neural network net with the training data P and Y. In matlab we have the option to train the network with the LM algorithm or the BFGS algorithm.

We start with the LM algorithm. We set the number of iterations (epochs) to 500 and the termination error to 1e-5. The training will stop after 500 iterations or when Error <= E_stop.

netLM = train_LM(P,Y,net,500,1e-5);

After the training is finished, we can use the neural network. We calculate the neural network output y_LM, using the training data P as input, as well as the output ytest_LM, using the test data Ptest as input. Since we have 2 outputs, y_LM and ytest_LM are 2d arrays of size (2,Q).

y_LM = NNOut(P,netLM);
ytest_LM = NNOut(Ptest,netLM);

Now we can do the same using the BFGS algorithm. The BFGS algorithm usually takes less time for one iteration, but needs more iterations to reach the same error as the LM algorithm. Therefore we set the number of iterations (epochs) to 1000 and the termination error to 1e-5.

netBFGS = train_BFGS(P,Y,net,1000,1e-5);
y_BFGS = NNOut(P,netBFGS);
ytest_BFGS = NNOut(Ptest,netBFGS);

Now we can plot the results, comparing the output of the two different neural networks with each other and with the training and the test data of the system.

t = (1:480)./4; % 480 time steps in 15 minute resolution

fig = figure();
set(fig, 'Units', 'normalized', 'Position', [0.2, 0.1, 0.6, 0.6])

subplot(221)
title('Train Data')
set(gca,'FontSize',16)
plot(t,Y(1,:),'r:','LineWidth',2)
hold on
grid on
plot(t,y_LM(1,:),'b','LineWidth',2)
plot(t,y_BFGS(1,:),'g','LineWidth',2)
l1 = legend('Train Data','LM output','BFGS output','Location','northoutside','Orientation','horizontal');
set(l1,'FontSize',14)
ylabel('Storage Pressure [bar]')
axis tight

subplot(223)
set(gca,'FontSize',16)
plot(t,Y(2,:),'r:','LineWidth',2)
hold on
grid on
plot(t,y_LM(2,:),'b','LineWidth',2)
plot(t,y_BFGS(2,:),'g','LineWidth',2)
ylabel('el. Power [kW]')
xlabel('time [h]')
axis tight

subplot(222)
title('Test Data')
set(gca,'FontSize',16)
plot(t,Ytest(1,:),'r:','LineWidth',2)
hold on
grid on
plot(t,ytest_LM(1,:),'b','LineWidth',2)
plot(t,ytest_BFGS(1,:),'g','LineWidth',2)
l2 = legend('Test Data','LM output','BFGS output','Location','northoutside','Orientation','horizontal');
set(l2,'FontSize',14)
axis tight

subplot(224)
set(gca,'FontSize',16)
plot(t,Ytest(2,:),'r:','LineWidth',2)
hold on
grid on
plot(t,ytest_LM(2,:),'b','LineWidth',2)
plot(t,ytest_BFGS(2,:),'g','LineWidth',2)
xlabel('time [h]')
axis tight
_images/example_matlab_compair.png

Using previous data P0 and Y0 in a recurrent neural network

This example shows how to use known previous data P0 and Y0 when using a trained neural network to calculate outputs (see Using previous inputs and outputs for recurrent networks or networks with delayed inputs).

It is based on the narendra4 example and can be found in python\examples\example_using_P0Y0_narendra4.py and matlab\examples\example_using_P0Y0_narendra4.m. Another example that shows how to use previous inputs and outputs P0 and Y0 can be found in example_using_P0Y0_compair.py and example_using_P0Y0_compair.m.

Python

At first the needed packages are imported, and the training and test data are read from the Excel file.

import pandas as pd
import matplotlib.pyplot as plt
import pyrenn as prn

df = pd.ExcelFile('example_data.xlsx').parse('narendra4')
P = df['P'].values
Y = df['Y'].values
Ptest_ = df['Ptest'].values
Ytest_ = df['Ytest'].values

Now we assume that we already know the first 3 data samples of the input and output of the test data and define them as P0test and Y0test (we choose 3 time steps because the neural network has a maximum delay of 3). Then we define the applied test data Ptest and the system output Ytest, which the network output can be compared with; both no longer contain the data samples we already know.

#define the first 3 timesteps t=[0,1,2] of Test Data as previous (known) data P0test and Y0test
P0test = Ptest_[0:3]
Y0test = Ytest_[0:3]
#Use the timesteps t = [3..99] as Test Data
Ptest = Ptest_[3:100]
Ytest = Ytest_[3:100]

Then we create and train the network

net = prn.CreateNN([1,3,3,1],dIn=[1,2],dIntern=[],dOut=[1,2,3])
net = prn.train_LM(P,Y,net,verbose=True,k_max=200,E_stop=1e-3)

Now we can use the trained network. To investigate the influence of using the previous data P0test and Y0test, we calculate the neural network output y0test with and the output ytest without using them.

ytest = prn.NNOut(Ptest,net)
y0test = prn.NNOut(Ptest,net,P0=P0test,Y0=Y0test)

Now we plot the results

fig = plt.figure(figsize=(11,7))
ax1 = fig.add_subplot(111)
fs=18

#Test Data
ax1.set_title('Test Data',fontsize=fs)
ax1.plot(ytest,color='b',lw=2,label='NN Output without P0,Y0')
ax1.plot(y0test,color='g',lw=2,label='NN Output with P0,Y0')
ax1.plot(Ytest,color='r',marker='None',linestyle=':',lw=3,markersize=8,label='Test Data')
ax1.tick_params(labelsize=fs-2)
ax1.legend(fontsize=fs-2,loc='lower right')
ax1.grid()

fig.tight_layout()
plt.show()

Looking at the results, we can see the difference between the neural network outputs. If we do not use previous data P0 and Y0, the neural network sets the unknown delayed inputs and outputs to zero, which leads to a poor result of the neural network output for the first time steps. In contrast, when the previous data P0 and Y0 are used, the network shows good results from the beginning.

_images/example_python_narendra4_P0Y0.png
Matlab

At first the training and test data are read from the Excel file.

file = 'example_data.xlsx';
num = xlsread(file,'narendra4');
P = num(:,2).';
Y = num(:,3).';
Ptest_ = num(:,4).';
Ytest_ = num(:,5).';

Now we assume that we already know the first 3 data samples of the input and output of the test data and define them as P0test and Y0test (we choose 3 time steps because the neural network has a maximum delay of 3). Then we define the applied test data Ptest and the system output Ytest, which the network output can be compared with; both no longer contain the data samples we already know.

%define the first 3 timestep t=[1,2,3] of Test Data as previous (known)
%data P0test and Y0test
P0test = Ptest_(:,1:3);
Y0test = Ytest_(:,1:3);
%Use the timesteps t = [4..100] as Test Data
Ptest = Ptest_(:,4:100);
Ytest = Ytest_(:,4:100);

Then we create and train the network with the LM algorithm

nn = [1 3 3 1];
dIn = [1,2];
dIntern=[];
dOut=[1,2,3];
net = CreateNN(nn,dIn,dIntern,dOut);
net = train_LM(P,Y,net,200,1e-3);

Now we can use the trained network. To investigate the influence of using the previous data P0test and Y0test, we calculate the neural network output y0test with and ytest without using them.

ytest = NNOut(Ptest,net);
y0test = NNOut(Ptest,net,P0test,Y0test);

Now we plot the results

figure()
hold on

%Test Data
title('Test Data')
plot(ytest,'b','LineWidth',2)
plot(y0test,'g','LineWidth',2)
plot(Ytest,'r:','LineWidth',3)
legend('NN Output without P0,Y0','NN Output with P0,Y0','Test Data','Location','southeast')
grid on
hold off

Looking at the results, we can see the difference between the two neural network outputs. If we do not use the previous data P0 and Y0, the neural network sets the unknown delayed inputs and outputs to zero, which leads to a poor neural network output for the first time-steps. In contrast, when the previous inputs and outputs P0 and Y0 are used, the network shows good results from the beginning.

_images/example_matlab_narendra4_P0Y0.png

Calculate the gradient of a neural network

This example shows how to calculate the gradient vector (or derivative vector) \(\underline{g}\) of the error \(E\) of a neural network with respect to the weight vector \(\underline{w}\). The gradient vector can be calculated either with the RTRL or the BPTT algorithm. The gradient \(\underline{g}\) or the Jacobian \(\widetilde{J}\) can be used to implement training algorithms other than the ones given in pyrenn. The examples can be found in python\examples\gradient_calculation.py and matlab\examples\gradient_calculation.m.

Calculating the gradient vector \(\underline{g}\) with the RTRL algorithm takes more time than with the BPTT algorithm. But for calculating the Jacobian matrix \(\widetilde{J}\), the RTRL algorithm is faster than the BPTT algorithm (calculating \(\widetilde{J}\) with BPTT is not implemented in pyrenn). So if the training algorithm uses the Jacobian matrix \(\widetilde{J}\) (like the LM algorithm), RTRL is preferred, and if the training algorithm uses the gradient vector \(\underline{g}\) (like the BFGS algorithm), BPTT is preferred.
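To check this run-time trade-off in practice, both gradient computations can be timed in Python. The following is a self-contained sketch; the random training data and the network size are arbitrary assumptions chosen only for the comparison:

import time
import numpy as np
import pyrenn as prn

#arbitrary small system, only used for the timing comparison (assumption)
P = np.random.rand(100)
Y = np.random.rand(100)
net = prn.CreateNN([1,2,2,1],dIn=[0],dIntern=[1],dOut=[1,2])
data,net = prn.prepare_data(P,Y,net)

t0 = time.time()
J,E,e = prn.RTRL(net,data)
g_rtrl = 2*np.dot(J.transpose(),e) #gradient from the Jacobian
print('RTRL gradient: ',time.time()-t0,' s')

t0 = time.time()
g_bptt,E = prn.BPTT(net,data)
print('BPTT gradient: ',time.time()-t0,' s')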

Python

First, the needed packages are imported, the training data is read from the Excel file and a neural network is created.

import numpy as np
import pandas as pd
import time

import pyrenn as prn

df = pd.ExcelFile('example_data.xlsx').parse('pt2')
P = df['P'].values
Y = df['Y'].values
net = prn.CreateNN([1,2,2,1],dIn=[0],dIntern=[1],dOut=[1,2])

Then the dict data, which contains the training data P and Y, has to be created using prn.prepare_data().

data,net = prn.prepare_data(P,Y,net)

Now the gradient can be calculated. The function prn.RTRL() uses the Real Time Recurrent Learning algorithm and returns the Jacobian matrix \(\widetilde{J}\), the error vector \(\underline{e}\) and the error \(E\) for the current weight vector \(\underline{w}=\) net['w']. The gradient vector \(\underline{g}\) can then be calculated with \(\underline{g} = 2 * \widetilde{J}^T * \underline{e}\).

J,E,e = prn.RTRL(net,data)
g_rtrl = 2 * np.dot(J.transpose(),e)

The function prn.BPTT() uses the Back Propagation Through Time algorithm and returns the gradient vector \(\underline{g}\) and the error \(E\) for the current weight vector \(\underline{w}=\) net['w'].

g_bptt,E = prn.BPTT(net,data)
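Since prn.BPTT() returns the gradient for the current weight vector net['w'], it can serve as a building block for a custom training algorithm. The following is a minimal sketch of plain gradient descent, reusing net and data from above; the learning rate and the number of iterations are arbitrary assumptions, not pyrenn defaults:

learning_rate = 0.01 #arbitrary assumption, not tuned
for k in range(100): #fixed number of descent steps (assumption)
    g,E = prn.BPTT(net,data) #gradient and error at the current weights
    net['w'] = net['w'] - learning_rate*g #simple gradient descent update
print('Error after gradient descent: ',E)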
Matlab

First, the training data is read from the Excel file and a neural network is created.

file = 'example_data.xlsx';
num = xlsread(file,'pt2');
P = num(:,2).';
Y = num(:,3).';

nn = [1 2 2 1];
dIn = [0];
dIntern=[1];
dOut=[1,2];
net = CreateNN(nn,dIn,dIntern,dOut);

Then the struct data, which contains the training data P and Y, has to be created using prepare_data().

data = prepare_data(P,Y,net,{},0);

Now the gradient can be calculated. The function RTRL() uses the Real Time Recurrent Learning algorithm and returns the Jacobian matrix \(\widetilde{J}\), the error vector \(\underline{e}\) and the error \(E\) for the current weight vector \(\underline{w}=\) net.w. The gradient vector \(\underline{g}\) can then be calculated with \(\underline{g} = 2 * \widetilde{J}^T * \underline{e}\).

[J,E,e] = RTRL(net,data);
g_rtrl = 2.*J'*e;

The function BPTT() uses the Back Propagation Through Time algorithm and returns the gradient vector \(\underline{g}\) and the error \(E\) for the current weight vector \(\underline{w}=\) net.w.

[g_bptt,E] = BPTT(net,data);

Classification (MNIST Data)

In this example a neural network is used to learn to recognize handwritten digits. For this, the MNIST dataset hosted on Yann LeCun's website is used. The data set consists of 60,000 data points for training and 10,000 data points for testing. To reduce the size of the data file, here only 25,000 data points for training and 5,000 for testing are used. Each data point is defined by a 28x28 pixel image (784 numbers) and the corresponding number represented by a 10 element vector (one element for each digit 0,1,2,3,4,5,6,7,8,9). For the digit n, only the element at position n (counting from zero) is 1; all others are zero. So the vector [0 0 0 0 0 1 0 0 0 0] represents the number 5. A more detailed explanation of the MNIST data can be found in the Tensorflow tutorial.
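As a quick illustration of this one-hot encoding (a sketch for illustration only, not part of the pyrenn example files):

import numpy as np

def one_hot(digit):
    #10 element vector with a 1 at the position of the digit (counting from zero)
    y = np.zeros(10)
    y[digit] = 1
    return y

print(one_hot(5)) #[0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]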

Python

First, the needed packages are imported: pickle for reading the data, matplotlib for plotting the results, numpy for its random function and pyrenn for the neural network.

import matplotlib as mpl
import matplotlib.pyplot as plt
import pickle
import numpy as np
import pyrenn as prn

Then the training input data P and output (target) data Y as well as the test input data Ptest and output data Ytest are read from the given pickle file. Each image is defined by the values of its 784 pixels, so P is a 2d array of size (784,Q), where Q is the number of data samples (25,000 for training, 5,000 for testing). Each output is a 10 element vector, so Y is a 2d array of size (10,Q).

mnist = pickle.load( open( "MNIST_data.pkl", "rb" ) )
P = mnist['P']
Y = mnist['Y']
Ptest = mnist['Ptest']
Ytest = mnist['Ytest']

Then the neural network is created. Since we have a system with 28*28 inputs and 10 outputs, we need a neural network with the same number of inputs and outputs. For this system we choose a neural network with one hidden layer with 10 neurons. Since there is no interconnection between the images, we need neither a recurrent network nor delayed inputs, so we keep the default (feed forward) delay settings.

net = prn.CreateNN([28*28,10,10])

Because training the network with all the available training data would need a lot of memory and time, we randomly extract a batch of 1000 data samples and use it to train the network. Because we want to use as much information from our data as possible, we only perform one iteration (k_max=1) and then extract a new batch. In this example we do this 20 times, so we train the net for 20 iterations, but each iteration with new training data. verbose=True activates displaying the error during training.

batch_size = 1000
number_of_batches=20

for i in range(number_of_batches):
    r = np.random.randint(0,25000-batch_size)
    Ptrain = P[:,r:r+batch_size]
    Ytrain = Y[:,r:r+batch_size]

    #Train NN with training data Ptrain=input and Ytrain=target
    #Set maximum number of iterations k_max
    #Set termination condition for Error E_stop
    #The Training will stop after k_max iterations or when the Error <=E_stop
    net = prn.train_LM(Ptrain,Ytrain,net,
                           verbose=True,k_max=1,E_stop=1e-5)
    print('Batch No. ',i,' of ',number_of_batches)

After the training is finished, we can use the neural network. We choose 9 random samples of the test data set and use their inputs to calculate the NN outputs. Then we can plot the results, comparing the output of the neural network (the number above each image) with the test image itself.

idx = np.random.randint(0,5000-9)
P_ = Ptest[:,idx:idx+9]
Y_ = prn.NNOut(P_,net)

fig = plt.figure(figsize=[11,7])
gs = mpl.gridspec.GridSpec(3,3)

for i in range(9):

    ax = fig.add_subplot(gs[i])

    y_ = np.argmax(Y_[:,i]) #find index with highest value in NN output
    p_ = P_[:,i].reshape(28,28) #Convert input data for plotting

    ax.imshow(p_) #plot input data
    ax.set_xticks([])
    ax.set_yticks([])
    ax.set_title(str(y_), fontsize=18)

plt.show()
_images/example_python_classification.png
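Beyond a visual spot check, the overall classification accuracy on the full test set can be estimated with a few lines. A sketch continuing the example above (taking the argmax over the 10 output elements as the predicted digit):

#calculate the NN output for the whole test set
ytest_out = prn.NNOut(Ptest,net)
predicted = np.argmax(ytest_out,axis=0) #predicted digit per sample
actual = np.argmax(Ytest,axis=0) #true digit per sample
print('Test accuracy: ',np.mean(predicted==actual))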

Features

Get Started

  1. Download or clone (with git) this repository to a directory of your choice.
    • Python: Copy the pyrenn.py file in the python folder to a directory which is already in Python's search path, or add the python folder to Python's search path (sys.path) (how to); see the sketch after this list.
    • Matlab: Add the matlab folder to Matlab’s search path (how to)
  2. Run the given examples in the examples folder.
  3. Create your own neural network.
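For the Python option in step 1, the folder can also be added to the search path at run time. A minimal sketch (the path below is a hypothetical example; replace it with the location of your local copy):

import sys
sys.path.append(r'C:\path\to\pyrenn\python') #hypothetical path to your local copy

import pyrenn as prn
net = prn.CreateNN([1,2,1]) #quick check that the import works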

Dependencies (Python)

  • numpy for mathematical operations
  • pandas, only needed for running the examples
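Both packages can be installed with pip if they are not already available (assuming a standard Python environment):

pip install numpy pandas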