prodyn: Implementation of the dymaic programming algorithm for optimal system control

Maintainer:Dennis Atabay, <dennis.atabay@tum.de>
Organization:Institute for Energy Economy and Application Technology, Technische Universität München
Version:0.1
Date:Mar 04, 2018
Copyright:This documentation is licensed under a Creative Commons Attribution 4.0 International license.

Contents

This documentation contains the following pages:

Dynamic Programming

Dynamic Programming (DP) is an optimization technique introduced by Richard Bellman, that can be applied to multistage decision problems requiring a sequence of interrelated decisions. It is based on Bellman’s principle of optimality which states that a sub-policy of an optimal policy for a given problem must itself be an optimal policy of the sub-problem.

In DP the decision problem is usually described as follows. The variable x describes the state of the system. x is an element of a set of possible system states \(x \in X = [x^1, x^2, x^3, ...]\). If x not already a discrete variable, it has to be discretized in order to use DP. The possible (discrete) decisions (or control signals) that can be applied to the system are defined as a set \(u \in U = [u^1, u^2, u^3, ...]\). The function \(f_t(x_t,u_t)\) calculates the following state of the system \(x_{t+1}\) for the current state \(x_{t}\) when the decision \(u_t\) is applied at timestep \(t\).

\[x_{t+1} = f_t(x_t,u_t)\]

The function \(g_t(x_t,u_t)\) calculates the costs \(c_{t,x_t,u_t}\) for going from \(x_{t}\) to \(x_{t+1}\) when the decision \(u_t\) is applied at timestep \(t\).

\[c_{t,x_t,u_t} = g_t(x_t,u_t)\]

The DP algorithm finds the control sequence \(\pi = [u_1, u_2, ..., u_N]\) that minimizes the total costs \(J\) over all timesteps \(T = [t_1, t_2, ..., t_N]\).

\[J_\pi =\sum_{t=1}^{N}g_t(x_t,u_t^\pi)\]

To find the optimal solution, the DP algorithms proceeds forward or backward in time solving the subproblem of every timestep under consideration of the already solved subproblems.

In the following the general idea of both, the forward and the backward implementation of DP will be explained based on a simple storage example.

Simple storage example

To explain the DP algorithm a simple storage example is inroduced. A battery storage with an energy capacity of 1kWh is conected to the grid. The electricity price for buying from or selling to the grid for every timestep is given as a time-series \(d_t\) (\(d_1 = 1€\), \(d_2 = 2€\), \(d_3 = 3€\)). The energy content of the battery, which is the state variable of this system is defined by the discrete set X = [0kWh, 1kWh], which means the storage can be either full or empty. The set of possible control signals (decisions) consists of the three signals: charge with 1 kWh (+1kWh), wait (+-0kWh) and discharge with 1 kWh (-1kWh).

_images/storage_example.png

Simple storage example

So, when the battery is charged, the new state is always 1kWh (full), even if was already full before. When it is discharged, it is allways 0kWh (full) and if it is neither charged nor discharged, the state stays unchanged.

\[\begin{split}x_{t+1} = \begin{cases} 1kWh & \qquad \text{if u = +1kWh}\\ x_t & \qquad \text{if u = +-0kWh}\\ 0kWh & \qquad \text{if u = -1kWh} \end{cases}\end{split}\]

If the storage is charged, we have to pay for the electricity from the grid, so in this case the costs are defined by the price at the current timestep \(d_t\). If the storage is already full (\(x_t\) =1kWh), the control signal charge (u=+1kWh) is an invalid signal. To prevent that this signal will be part of the solution, we set the (penalty) costs for this case very high (99 €). We do the same, if storage is already empty (\(x_t\) =0kWh) and the signal discharge (u=-1kWh) is applied. If we discharge a full storage, we sell the electricity to the grid which results in negative costs \(d_t\) . If we neither charge nor discharge, we don’t make or spend any money.

\[\begin{split}c_{t,x_t,u_t} = \begin{cases} 99 € & \qquad \text{if u=+1kWh and x_t=1kWh} \\ d_t & \qquad \text{if u=+1kWh and x_t=0kWh} \\ 0 € & \qquad \text{if u=+-0kWh}\\ -d_t & \qquad \text{if u=-1kWh and x_t=1kWh} \\ 99 € & \qquad \text{if u=-1kWh and x_t=0kWh} \end{cases}\end{split}\]

The following figure shows the possible states x and decisions u of the storage system for the 3 timesteps. Note, that because we need a defined initial and end state, the system has 4 states!

_images/DP_general.png

Possible states x and decisions u of the storage system

An implementation of the example can be found in the examples folder.

Backward Dynamic Programming

The Backward Dynamic Programming starts with the last timestep and goes backward in time to solve the optimization problem step by step. the backward algorithm is easier to implement (and in prodyn faster) than the forward one.

So based on the possibilities shown in following figure, we want to find the path which leads to the cheapest total costs. Therefore we need to calculate our total costs \(J_t\) to get to a certain state in a certain timestep t. We can also define initial end costs (here \(J_3\)) for the last timestep, which is necessary in Backward DP if we want our solution to have a defined end state (see Define initial costs). In this example, we set the initial end costs \(J_3\) (end of last timestep in Backward DP) to 0 € for both possible states.

_images/DP_bw_01.png

Now we go one timestep backwards (t=2) and calculate the total costs \(J_2\) for every possible state \(x_{2}\) and every possible decision u. The total costs \(J_2\) are defined as the sum of the costs c_{t,x_t,u_t} for going from \(x_{t}\) to \(x_{t+1}\) when the decision \(u_t\) is applied at timestep \(t\) and the total costs \(J_3\) of the following sate \(x_{t+1}\). Then we select the decision, which leads to least total costs \(J_2\) as solution of our sub-problem.

\[J_{t,x_t,u_t} = c_{t,x_t,u_t} + J_{t+1,x_t,u_t}\]

In our example we start with state \(x_{2}=1kWh\) . We try to charge (u=+1kWh) the full storage, the costs are very high because of the penalty costs (c=99€). The following state is here defined as x_{3}=1kWh . If we wait, we have no costs (c=0€) and the foloowing state stays the same x_{3}=1kWh . If we discharge, we sell the electricity to the grid for the current price \(d_3 = 3€\), which leads to costs of c=-3€. Since \(J_3=0€\) for every following state, here discharging leads to the minimal total costs.

_images/DP_bw_02.png

If we do the same for state \(x_{2}=0kWh\), we see that the costs for waiting (c=0€) is cheaper than for charging (c=3€) and discharging (c=99€). Since \(J_3=0€\) the go-to costs c already represent the total costs J.

_images/DP_bw_03.png

Now for every possible state \(x_{2}\) we select the decision (path) which leads to the lowest total costs \(J_2\). We discard all other options, whcih is basically the “trick” in DP. If we are in timestep 2, we now the best solution to go from here for every possible state \(x_{2}\). So all other states are not relevant for the solution. In our example, we can see, that a full storage at the end of timetsep 3 will not be part of the solution. If we wanted to find a solution, where the storage is full at the end, we should have defined proper initial total costs \(J_2\) .

_images/DP_bw_04.png

Now we go one more timestep backwards (t=1) and calculate the total costs \(J_1\) for every possible state \(x_{1}\) and every possible decision u. Now we have to consider the total following costs \(J_2\) dependent on the following state \(x_{2}\) . Here, the decisions leading to the cheapest total costs are waiting for \(x_{1}=1kWh\) and charging for \(x_{1}=0kWh\) .

_images/DP_bw_05.png
_images/DP_bw_06.png

Once again, we go one more timestep backwards (t=0) and calculate the total costs \(J_0\) for every possible initial state \(x_{0}\) and every possible decision u. Now we have to consider the total following costs \(J_1\) dependent on the following state \(x_{1}\) . Here, the decisions leading to the cheapest total costs are waiting for \(x_{0}=1kWh\) and charging for \(x_{0}=0kWh\) .

_images/DP_bw_07.png
_images/DP_bw_08.png

As we can see, the Backwards Dynamic Programming algorithm (in prodyn) gives us the solution for every possible initial state \(x_{0}\) . If we want the solution to have a specific end state, we have to define initial costs \(J_end\) . How to do this is prodyn see Define initial costs

Forward Dynamic Programming

The Forward Dynamic Programming starts with the first timestep and goes forward in time to solve the optimization problem step by step. The forward algorithm is more difficult to implement (and in prodyn slower) than the backward one. The advantage of the forward algorithm is that it allows one to use the results and states of all previous timesteps to calculate the next state and the costs.

Again we want to find the path which leads to the cheapest total costs, based on the possibilities shown in the following figure.

_images/DP_fw_01.png

Therefore we need to calculate our total costs \(J_t\) to get to a certain state in a certain timestep t. We can also define initial costs (here \(J_0\)) for the first timestep, which is necessary in forward DP if we want our solution to have a defined starting state (see Define initial costs). In this example, we set the initial costs \(J_0\) to 0 € for both possible states.

_images/DP_fw_02.png

Now we calculate the total costs \(J_1\) for every possible initial state \(x_{0}\) and every possible decision u. The total costs \(J_1\) are defined as the sum of the costs c_{t,x_t,u_t} for going from \(x_{t}\) to \(x_{t+1}\) when the decision \(u_t\) is applied at timestep \(t\) and the total costs \(J_0\) of the previous sate \(x_{t}\). Then for each possible following state \(x_{t+1}\) , we select the decision, which leads to least total costs \(J_1\) as solution of our sub-problem.

\[J_{t,x_t,u_t} = c_{t,x_t,u_t} + J_{t+1,x_t,u_t}\]

In our example we start with state \(x_{0}=1kWh\) . If we try to charge (u=+1kWh) the full storage, the costs are very high because of the penalty costs (c=99€). The following state is here defined as \(x_{1}=1kWh\) . If we wait, we have no costs (c=0€) and the foloowing state stays the same \(x_{1}=1kWh\) . If we discharge, we sell the electricity to the grid for the current price \(d_1 = 1€\), which leads to costs of c=-1€. If we do the same for state \(x_{0}=0kWh\), we can caculate the costs for waiting (c=0€), charging (c=1€) and discharging (c=99€).

_images/DP_fw_03.png

To choose the cheapest path, in forward DP we want to find the decision, which leads to the minimal total costs J for getting to each possible following state. So at first we have a look at the following state \(x_{1}=1kWh\) . From the 3 options to get to this state (blue), we choose the one with the cheapest total costs J, which is from state \(x_{0}=1kWh\) by waiting (u=+-0kWh).

Now we look at the following state \(x_{1}=0kWh\) . To get there we have again 3 options (green). We choose the option discharging (u=-1kWh) from previous state x_{0}=1kWh , which is the option with the least total costs J.

_images/DP_fw_04.png

Now we so the same for the following timesteps, as shown in the following figures.

_images/DP_fw_05.png
_images/DP_fw_06.png
_images/DP_fw_07.png

After finishing with the last timestep, we have an optomal solution for every possible end state \(x_{3}\).

As we cann see, starting from an empty storage \(x_{0}=0kWh\) was already excluded as part of optimal solution after the first step. If we want to have a specific initial state in forward DP, we have to define initial costs \(J_end\) . How to do this is prodyn see Define initial costs

Using prodyn

Define states and decisions in prodyn

States in prodyn

In prodyn states are defined using a pandas DataFrame. The index of the DataFrame represents the names of the defined state variables. The DateFrame has three columns:

  • min: The minimum possible value of the state variable
  • max: The maximum possible value of the state variable
  • xstpeps: The number of discrete states for the state variable

For internal calculations and as an input for the system function, prodyn creates an numpy state array X.

When only one state variable is defined, X is a 1-D numpy array of shape (xstpeps,) with xstep values between xmin and xmax. For internal calculations, prodyn also calculates an index vector Xidx and an array that contains all state vectors of each variable XX. For one state variable X and XX have the same values. Xidx is the index array of X. Every possible state value has a coresponding index from zero to the number of discrete states xsteps .

Example of states Dataframe for one state variable
  xmin xmax xsteps
state      
battery 0 12 5
\[\begin{split}X = \begin{bmatrix} 0\\ 3\\ 6\\ 9\\ 12\\ \end{bmatrix}\:\:\:\:\:\:\:\: Xidx = \begin{bmatrix} 0\\ 1\\ 2\\ 3\\ 4\\ \end{bmatrix}\:\:\:\:\:\:\:\: XX = \begin{bmatrix} 0\\ 3\\ 6\\ 9\\ 12\\ \end{bmatrix}\end{split}\]

For N state variables, X becomes a 2-D array of shape (N,xstpeps1*xstpeps2*…*xstpepsN). XX is an array that contains all state vectors of each variable. X is the cartesian product of XX with all possible state combinations.

Xidx is the index array of X. Every row (possible state combination) gets a coresponding index from zero to the number of states (xstpeps1*xstpeps2*…*xstpepsN).

Example of states Dataframe for two state variables
  xmin xmax xsteps
state      
battery 0 3 4
heat-storage 0 5 2
\[\begin{split} X = \begin{bmatrix} 0 & 0\\ 0 & 5\\ 1 & 0\\ 1 & 5\\ 2 & 0\\ 2 & 5\\ 3 & 0\\ 3 & 5\\ \end{bmatrix}\:\:\:\:\:\:\:\: Xidx = \begin{bmatrix} 0\\ 1\\ 2\\ 3\\ 4\\ 5\\ 6\\ 7\\ \end{bmatrix}\:\:\:\:\:\:\:\: XX = \begin{bmatrix} \begin{bmatrix} 0\\ 1\\ 2\\ 3\\ \end{bmatrix} \begin{bmatrix} 0\\ 5\\ \end{bmatrix} \end{bmatrix}\end{split}\]

If you want to see how the state vector X, which is an input for the system funtion, or the index vector Xidx looks like, this is possible by calling the function prepare_DP() .

X, Xidx, XX, xsteps, columns, columns_u = prodyn.prepare_DP(states)
prodyn.prepare_DP(states)
Parameters:
  • states – pandas DataFrame where each index represents a state variable
  • xmin – minimum value of the state variable
  • xmax – maximum value of the state variable
  • xstpes – number of discretization steps for this state variable
Returns:

the state vector X and other parameters needed for internal calculations.

  • X: state vector with all possible state combinations
  • Xidx: index vector
  • XX: array that contains all state vectors of each variable
  • xsteps: arry containig the number of steps for each variable
  • columns: column names needed to create the Data Dataframe
  • columns_u: columns with dditional element ‘U’

Decisions in prodyn

All possible decisions in prodyn are simply defined within a list (usually called U) and can be numbers or string or any single element data type. The elements of the list are passed one by one to the system function within a loop. The system funtion must be able to calculate an valid ouputs for every decision u within the list of all decisions U.

Here is one example of defining U where the possible decisions are [-1,0.+1]

U = [-1,0,1]

Here is another example of defining U where the possible decisions are [charge,normal,discharge]

U = ['charge','normal','discharge']

The system function in prodyn

To use prodyn, a system function that calculates the following state vector x_j and the cost to go vector cost for each element of the state vector X. The function has to be defined with 6 inputs and has to return 3 outputs. How the outputs are calculated inside the function can be chosen freely. It is also possible to call external functions or programs. Also the names of the funtion itself, and its inputs and outputs can be different than in the following description. But they have to be in the right order and the right data type.

prodyn.sytem_function(u, X, t, cst, srs, Data)
Parameters:
  • u – current decision, one element of the defined decisions U
  • X – state vector with all possible state combinations
  • t (int) – current timestep
  • cst – any type of variable that is directly passed to the system function and can be used inside the system function
  • srs – any type of variable that is directly passed to the system function and can be used inside the system function
  • Data – pandas DataFrame that contains the results from the previously calculated timesteps. It can be accessed in the same way as the result data (see Accessing the results)
Returns:

cost for getting to following state x_j and additional data

  • cost: cost to go vector, containg the costs gor every possible state combination when applying decision u at the current timestep t, 1-D numpy array with the same number of elements as possible state combinations in X
  • x_j: calculated following states of each state in X, numpy array with the same shape as X.
  • data: pandas DataFrame allowing to store additional parameters in the result DataFrame Data, which will also be an input for the system function in the following timestep.

The inputs cst and srs are parameters that are an input to the DP funtion and directly passed to the system funtion. They are not neccesary for DP and can have any data type. They are ment to be used inside the system funtion representing constant (cst) and time dependent (srs) data. You can also just use one of them or none, but they have to be defines as inputs in your system funtion.

The funtion output data allows on to pass other caöcuöated variables than the state variables to the result DataFrame Data, which is also an input to the system function and can also be accesed at every call. The index of data has to be a range from zero to M-1, where M is the number of state combinations. data has to be defined even if no additional parameters should be passed to Data. For each parameter that should be passed, simply a new colums with its name and a valid value for each index has to be created. Make sure, that the column names are neither ‘U’, ‘J’ nor one of the state variable names.

Backward Dynamic Programming DP_backward()

If you have defined your states, decisions, timesteps and the system function, you can simply run the funtion DP_backward() to find the optimal solution of your system using backward DP. The timesteps are simply a numpy array containg all timesteps of your optimization horizon. The parameters cst and srs are ment to be used in the system function and can be set to None if not needed.

prodyn.DP_backward(states,U,timesteps,cst,srs,system,[J0=None,verbose=False,t_verbose=1)
Parameters:
  • states – pandas dataframe where each index represents a state variable see States in prodyn
  • U – List of possible decision, see Decisions in prodyn
  • timesteps – numpy array containing all timesteps
  • cst – (Constant) Parameters for system simulation
  • srs – (Time-series) parameters for system simulation
  • system – function to simulate the system that is optimized using DP, see The system function in prodyn
  • J0 – Cost vector for first time-step, allows to define initial start costs and therby select a certain end state for backward DP. JT has to be a 1D numpy array with the same nuber of elements as defined state combinations in X. See Define initial costs . Each element defines the initial end cost of the coresponding state.
  • verbose – Bool, turn shell output on (True) or off (False)
  • t_verbose – Show output every t_verbose timesteps
Returns:

Data: pandas DataFrame with results, see Access backward DP results

Forward Dynamic Programming DP_forward()

If you have defined your states, decisions, timesteps and the system function, you can simply run the funtion DP_forward() to find the optimal solution of your system using forward DP. The timesteps are simply a numpy array containg all timesteps of your optimization horizon. The parameters cst and srs are ment to be used in the system function and can be set to None if not needed.

prodyn.DP_forward(states,U,timesteps,cst,srs,system,[JT=None,verbose=False,t_verbose=1)
Parameters:
  • states – pandas dataframe where each index represents a state variable see States in prodyn
  • U – List of possible decision, see Decisions in prodyn
  • timesteps – numpy array containing all timesteps
  • cst – (Constant) Parameters for system simulation
  • srs – (Time-series) parameters for system simulation
  • system – function to simulate the system that is optimized using DP, see The system function in prodyn
  • JT – Cost vector for last time-step, allows to define initial end costs and therby select a certain end state for backward DP. JT has to be a 1D numpy array with the same nuber of elements as defined state combinations in X. See Define initial costs . Each element defines the initial end cost of the coresponding state.
  • verbose – Bool, turn shell output on (True) or off (False)
  • t_verbose – Show output every t_verbose timesteps
Returns:

Data: pandas DataFrame with results, see Access forward DP results

Accessing the results

The funtions DP_forward() and DP_backward() return a pandas DataFrame with the results. The result DataFrames of the two functions differ from each other. How to access the results is explained based on the Simple storage example .

Access backward DP results

The result of the function DP_backward() is a pandas DataFrame (called Data here) with two indices. t represents the timestep and Xidx_start represents the index of the system state after finishing the optimization, which is the system state at the first timestep for backward DP (see Backward Dynamic Programming).

The following table and figure shows the result of the Simple storage example solved with backward DP.

DP_backward() always calculates a solution for every possible starting state. For this example we have two results. One if we want to start with an empty storage, and one if we want to start with full storage.

For each result we have the optimal decisions U, the total costs J, and the values of the state variables, here battery for every timestep. If we pass additional values using the data output of the system function (in this example cost), this is also part of the result DataFrame.

While U, J, and the additional values are defined during timestep t, the state variables are defined as before timestep t. This is why in this example, where we have the defined optimizatiom timesteps [1,2,3], we have a value for the state variable battery for t=4, which is the state before t=4 (or after t=3).

So index t=1 and Xidx_start=0 means results for timestep t=1 (states before t=1) and when starting with the state with index 0.

Index t=1 and Xidx_start=1 means results for timestep t=1 (states before t=1) and when starting with the state with index 1.

Index t=2 and Xidx_start=0 means results for timestep t=2 (states before t=2) and when starting with the state with index 0, and so on.

Note that Xidx_start represents the index of state vector X, not the state value (although in our example there is no difference). So to find the results for any state combination in X, we have to know the coresponding value in Xidx, which is the index array (see States in prodyn). The function find_index() finds the (closest) corresponding index for any given state value (combination). See Using find_index() .

result for the simple storage example solved with backward DP
    J U cost battery
t Xidx_start        
1 0 -2 1 1 0
  1 -3 0 0 1
2 0 -3 0 0 1
  1 -3 0 0 1
3 0 -3 -1 -3 1
  1 -3 -1 -3 1
4 0 NaN NaN NaN 0
  1 NaN NaN NaN 0
_images/DP_bw_08.png

Result for the simple storage example solved with backward DP

To acess the results from DataFrame Data (which of course can have any name you choose) for any state index xidx, simply use the pandas.DataFrame.xs() function with level=’Xidx_start’

result_xidx = Data.xs(xidx,level='Xidx_start')

The result will be a pandas DataFrame with index t and the same colums as before

For our example accessing the results for starting with an empty storage (Xidx_start=0) will give us the following result

result_0 = Data.xs(0,level='Xidx_start')
result for the simple storage example solved with backward DP for starting with an empty storage
  J U cost battery
t        
1 -2 1 1 0
2 -3 0 0 1
3 -3 -1 -3 1
4 NaN NaN NaN 0
Access forward DP results

The result of the function DP_forward() is a pandas DataFrame (called Data here) with two indices. t represents the timestep and Xidx_end represents the index of the system state after finishing the optimization, which is the system state after the last timestep for forward DP (see Backward Dynamic Programming).

The following table and figure shows the result of the Simple storage example solved with forward DP.

DP_forward() always calculates a solution for every possible ending state. For this example we have two results. One if we want to end with an empty storage, and one if we want to end with a full storage.

For each result we have the optimal decisions U, the total costs J, and the values of the state variables, here battery for every timestep. If we pass additional values using the data output of the system function (in this example cost), this is also part of the result DataFrame.

While U, J, and the additional values are defined during timestep t, the state variables are defined as before timestep t. This is why in this example, where we have the defined optimizatiom timesteps [1,2,3], we have a value for the state variable battery for t=4, which is the state before t=4 (or after t=3).

So index t=1 and Xidx_end=0 means results for timestep t=1 (states before t=1) and when ending with the state with index 0.

Index t=1 and Xidx_end=1 means results for timestep t=1 (states before t=1) and when ending with the state with index 1.

Index t=2 and Xidx_end=0 means results for timestep t=2 (states before t=2) and when ending with the state with index 0, and so on.

Note that Xidx_end represents the index of state vector X, not the state value (although in our example there is no difference). So to find the results for any state combination in X, we have to know the coresponding value in Xidx, which is the index array (see States in prodyn). The function find_index() finds the (closest) corresponding index for any given state value (combination). See Using find_index() .

result for the simple storage example solved with forward DP
    J U cost battery
t Xidx_end        
1 0 0 0 0 1
  1 0 0 0 1
2 0 0 0 0 1
  1 0 0 0 1
3 0 -3 -1 -3 1
  1 0 0 0 1
4 0 NaN NaN NaN 0
  1 NaN NaN NaN 1
_images/DP_fw_07.png

Result for the simple storage example solved with forward DP

To acess the results from DataFrame Data (which of course can have any name you choose) for any state index xidx, simply use the pandas.DataFrame.xs() function with level=’Xidx_end’

result_xidx = Data.xs(xidx,level='Xidx_end')

The result will be a pandas DataFrame with index t and the same colums as before

For our example accessing the results for ending with an empty storage (Xidx_endt=0) will give us the following result

result_0 = Data.xs(0,level='Xidx_end')
result for the simple storage example solved with backward DP for starting with an empty storage
  J U cost battery
t        
1 0 0 0 1
2 0 0 0 1
3 -3 -1 -3 1
4 NaN NaN NaN 0

Using find_index()

For accessing the results or creating initial costs, it is usefull to know the corresponding index of a state value (combination). The function find_index() returns the index of X (the corresponding entry in Xidx) for the state which is closest to a given input state value.

prodyn.find_index(xvalues, states)
Parameters:
  • xvalues – array of states, for which the index is needed
  • sates – pandas dataframe where each index represents a state variable * xmin: minimum value of the state variable * xmax: maximum value of the state variable * xstpes: number of discretization steps for this state variable
Returns:

idx: vector that contains the index of X, which value is the nearest to xvalues

Here is an example of using find_index() for one state variable

Example of states Dataframe for one state variable
  xmin xmax xsteps
state      
battery 0 12 5
\[\begin{split} X = \begin{bmatrix} 0\\ 3\\ 6\\ 9\\ 12\\ \end{bmatrix}\:\:\:\:\:\:\:\: Xidx = \begin{bmatrix} 0\\ 1\\ 2\\ 3\\ 4\\ \end{bmatrix}\:\:\:\:\:\:\:\:\end{split}\]
find_index([0],states)
>> 0

find_index([9],states)
>> 3

find_index([7],states)
>> 2

And another example of using find_index() for one two state variables

Example of states Dataframe for two state variables
  xmin xmax xsteps
state      
battery 0 3 4
heat-storage 0 5 2
\[\begin{split} X = \begin{bmatrix} 0 & 0\\ 0 & 5\\ 1 & 0\\ 1 & 5\\ 2 & 0\\ 2 & 5\\ 3 & 0\\ 3 & 5\\ \end{bmatrix}\:\:\:\:\:\:\:\: Xidx = \begin{bmatrix} 0\\ 1\\ 2\\ 3\\ 4\\ 5\\ 6\\ 7\\ \end{bmatrix}\:\:\:\:\:\:\:\:\end{split}\]
find_index([0,5],states)
>> 1

find_index([3,0],states)
>> 6

find_index([1.1,2],states)
>> 2

Define initial costs

To define intial start costs J0 for forward DP or intial end costs JT for backward DP, create a 1D numpy array with the same size as Xidx (the same size as X for one state variable or the same number of rows like X for multiple state variables).

The following example shows how to define initial starting costs J0, where the initial costs for state x=3 (xidx=1) is set to -9999. (numpy is imported as np)

J0 = np.zeros(5)
J0[1]=-9999
\[\begin{split} X = \begin{bmatrix} 0\\ 3\\ 6\\ 9\\ 12\\ \end{bmatrix}\:\:\:\:\:\:\:\: Xidx = \begin{bmatrix} 0\\ 1\\ 2\\ 3\\ 4\\ \end{bmatrix}\:\:\:\:\:\:\:\: J0 = \begin{bmatrix} 0\\ -9999\\ 0\\ 0\\ 0\\ \end{bmatrix}\:\:\:\:\:\:\:\:\end{split}\]

The following example shows how to define initial end costs JT, where the initial costs for state x=(1,5) (xidx=3) is set to -9999. (numpy is imported as np)

J0 = np.zeros(5)
J0[3]=-9999
\[\begin{split} X = \begin{bmatrix} 0 & 0\\ 0 & 5\\ 1 & 0\\ 1 & 5\\ 2 & 0\\ 2 & 5\\ 3 & 0\\ 3 & 5\\ \end{bmatrix}\:\:\:\:\:\:\:\: Xidx = \begin{bmatrix} 0\\ 1\\ 2\\ 3\\ 4\\ 5\\ 6\\ 7\\ \end{bmatrix}\:\:\:\:\:\:\:\: JT = \begin{bmatrix} 0\\ 0\\ 0\\ -9999\\ 0\\ 0\\ 0\\ 0\\ \end{bmatrix}\:\:\:\:\:\:\:\:\end{split}\]

Examples

The examples given in this chapter show how to implement dynamic programming algorithm. The first example is presented very detailed. For other systems only brief description is given. An implementaion of each example can be found in the example folder.

Simple storage example

Description

In this example a battery storage with an energy capacity of 1kWh is conected to the grid. The electricity price for buying from or selling to the grid for every timestep is given as a time-series \(d_t\) (\(d_1 = 1€\), \(d_2 = 2€\), \(d_3 = 3€\)). The energy content of the battery, which is the state variable of this system is defined by the discrete set X = [0kWh, 1kWh], which means the storage can be either full or empty. The set of possible control signals (decisions) consists of the three signals: charge with 1 kWh (+1kWh), wait (+-0kWh) and discharge with 1 kWh (-1kWh).

_images/storage_example.png

Simple storage example

So, when the battery is charged, the new state is always 1kWh (full), even if was already full before. When it is discharged, it is allways 0kWh (full) and if it is neither charged nor discharged, the state stays unchanged.

\[\begin{split}x_{t+1} = \begin{cases} 1kWh & \qquad \text{if u = +1kWh}\\ x_t & \qquad \text{if u = +-0kWh}\\ 0kWh & \qquad \text{if u = -1kWh} \end{cases}\end{split}\]

If the storage is charged, we have to pay for the electricity from the grid, so in this case the costs are defined by the price at the current timestep \(d_t\). If the storage is already full (\(x_t\) =1kWh), the control signal charge (u=+1kWh) is an invalid signal. To prevent that this signal will be part of the solution, we set the (penalty) costs for this case very high (99 €). We do the same, if storage is already empty (\(x_t\) =0kWh) and the signal discharge (u=-1kWh) is applied. If we discharge a full storage, we sell the electricity to the grid which results in negative costs \(d_t\) . If we neither charge nor discharge, we don’t make or spend any money.

\[\begin{split}c_{t,x_t,u_t} = \begin{cases} 99 € & \qquad \text{if u=+1kWh and x_t=1kWh} \\ d_t & \qquad \text{if u=+1kWh and x_t=0kWh} \\ 0 € & \qquad \text{if u=+-0kWh}\\ -d_t & \qquad \text{if u=-1kWh and x_t=1kWh} \\ 99 € & \qquad \text{if u=-1kWh and x_t=0kWh} \end{cases}\end{split}\]

The following figure shows the possible states x and decisions u of the storage system for the 3 timesteps. Note, that because we need a defined initial and end state, the system has 4 states!

_images/DP_general.png

Possible states x and decisions u of the storage system

Implementation
Imports

Import pandas and numpy for data handling and operation, prodyn for using DP and matplotllib for plotting.

import pandas as pd
import numpy as np
import prodyn as prd
import matplotlib.pyplot as plt
States

At first, we describe the state variable “battery” of our system by defining its minimum and maximum value and the number of dicretization steps therefore we create a pandas DataFrame where the index represents the states (here only “battery”) and the columns [‘xmin’,’xmax’,’xsteps’] states = pd.DataFrame(index=[‘battery’], columns=[‘xmin’,’xmax’,’xsteps’])

we define a battery with 1 kWh capacity and define our steps to be 2 (0kWh, 1kWh)

states.loc['battery','xmin'] = 0
states.loc['battery','xmax'] = 1
states.loc['battery','xsteps'] = 2
Controls (decisions)

Now we define the possible control signals (decisions) of our system. Therefore we simply create a list containing all possible control signals. This can be numbers, or strings or any other type of variable, as long as the system model is defined such that it can handle the defined control signals. In our case we choose integers to represent our control signal. 1 for charging the battery, 0 for doing nothing, -1 for dischraging it.

controls = [-1,0,1] #The set of control signals/decisions (often called "U")
timesteps

We define the timesteps for our optimization. In this case we want to optimize for 3 hours, so we define a numpy array [0,1,2,3]

timesteps=np.arange(1,4)
Constant Parameters

We can define two variables that contain constant parameters we can access within our system function. Both variables can be defined as any type (dict, pandas DataFrame, list, array, constant,….) For a better readability I usually define one variable containing all the time independent constants and one variable containing a timeseries For this simple example, we define a constant variable as a dict, containing only the maximum capacity of the battery and a pandas DataFrame containing the electricity price curve (with timestpes as index)

constants = {'max_cap': 1}
timeseries = pd.DataFrame(index=timesteps, columns=['el. price'])

As an incentive to use the storage we define a price curve with low costs in the beginning and high costs at the end of the day

timeseries['el. price'] = np.array([1,2,3])
System model

Now we define the function describing our system. The function calculates for every possible state x and a given control signal u the following state x_j and the costs for going from x to x_j. Allowing prodyn to use a function, it has to be defined following some rules for the inputs and outputs. The function has to be defined with 6 inputs (u,x,t,cst,Srs,Data), where:

  • u is the current control signal (decision) within possible decisions U in our case u is one element of our defined controls, either -1, 1 or 0
  • x is a numpy array with dicretized steps of the state in our case x = [0, 1, 2, 3, 4, 5]
  • t is the current timestep
  • cst is one of our constant variables we defined, in our case constants
  • Srs is the other constant variables we defined, in our case timeseries
  • Data is a Dataframe created by prodyn containing previous results of the optimization. It allows us to access results from already solved timesteps. For this example, Data will not be used. How to use it is explained in another example.

The function has to return 3 outputs (cost, x_j, data), where:

  • cost is a numpy array of the same size as x containing the cost for going from x to x_j when using control signal u
  • x_j is the following state of x (numpy array of the same size as x)
  • data contains additional variables that will be saved in the result DataFrame “Data”, which also can be used in the following timesteps. data has to be defined as a pandas DataFrame with index x and a column for each variable that should be saved. It is not necessary to use data, but it has to be defined as a (empty) pandas DataFrame with index=x

Of course, the names of the inputs and outputs can be changed, but the order and typ of the variables have be to as described

def storage_model(u,x,t,cst,Srs,Data):

    #for our simple model we initialize the arrays cost and x_j with zeros
    #the arrays have to be the same size as x
    l = len(x) #number discretization steps of x
    cost = np.zeros(l)
    x_j = np.zeros(l)

    #we inizialize an additional array penelty_cost, which we use in our model
    penalty_cost = np.zeros(l)

    #We have to calculate the following state x_j and the cost using the control
    #signal u for every possible state beginning state x_i (every element of x)
    #In this example we use a loop to do this:

    for i,x_i in enumerate(x):

        #prodyn always minimizes the total costs over all timesteps
        #here we define the costs to be positive when we charge the battery
        #because we have to buy electricity and negative when we discharge it,
        #because then we sell electricity. The costs are zero, if we do nothing.
        #In this simple example we can charge or discharge with 1 kW per
        # timestep (1 hour), so we increase or decrease the storage content by
        # 1kWh. So the costs are defined by the electricity price in the current
        #timestep. By muliplying the price with the control signal u, which is
        #-1 for discharging, 0 for doing nothing and 1 for discharging, the cost
        #becomes negative, zero or positive, as we defined it.
        cost[i] = (u * Srs.loc[t]['el. price'])
        #as just described, the battery energy content, which is our state
        #variable increase by 1 kWh when charging the battery (u=1), decreases
        #by 1 kWh when discharging it (u=-1) and stays the same when doing
        #nothing (u=0)
        x_j[i] = x_i + u

        #With our definition above, it would be psooible that an already fully
        #charged battery with 5kWh energy content could be charged up to 6kWh
        #If this happens, change the energy content back to 5 kWh ('max_cap')
        #and we set the penalty costs to a high value (999)
        if x_j[i] > cst['max_cap']:
            x_j[i] = cst['max_cap']
            penalty_cost[i] = 999
        #we do the same, if the energy content goes below 0 kWh.
        elif x_j[i] < 0:
            x_j[i] = 0
            penalty_cost[i] = 999

        #Now we add the penalty costs to the cost. High costs prevent that this
        #decision (control signal) will be chosen by the algorithm to be part of
        #the solution with minimal costs at the end
        cost[i] = cost[i] + penalty_cost[i]

    #We have to create a pandas DataFrame with index=x for the output (even
    #if we don't use it)
    data = pd.DataFrame(index = x)
    #We create the row 'cost', where we save the costs for every x when applying
    #control signal u at the current timestep t. Tzhe costs will be saved in the
    # results and we cann acess them after the problem is solved.
    data['cost'] = cost

    return cost,x_j,data
Solving the problem and accessing the solution using prodyn

Prodyn can be used to find the control sequence that leads to the minimum total costs over the optimization horizon (timesteps) using the dynamic programming algorithm. prodyn has two different implementations of the DP algorithm. DP_forward solves the problem “forward in time”, which means the algorithm starts at the first timestep, while the DP_backward starts at the last timestep and solves the problem “backward in time”. The forward algorithm has the advantage, that states and calculated parameters from previous timesteps can be used in the model, which might be necessary to model the problem. The backward algorithm only can access “future” states and parameters, but because it is usually faster and therefore should be chosen if past information is not necessary to model the problem. Both algorithms are described more detailed in the chapter “how prodyn works”.

DP_forward
result_fw = prd.DP_forward(states,controls,timesteps,constants,timeseries,
                           storage_model,verbose=True,t_verbose=1)

Here we want to get the results where the storage is empty at the end (Xidx_end=0)

result0_fw = result_fw.xs(0,level='Xidx_end')
DP_backward
result_bw = prd.DP_backward(states,controls,timesteps,constants,timeseries,
                            storage_model,verbose=True,t_verbose=1)

Here we want to get the results where the storage is empty at the beginning (Xidx_start=0)

result0_bw = result_bw.xs(0,level='Xidx_start')

And compare it to the results where the storage is full at the beginning (Xidx_start=1)

result5_bw = result_bw.xs(1,level='Xidx_start')

Now we can plot the results

plt.plot(result0_fw['battery'],'b')
plt.plot(result5_bw['battery'],'g-.')
plt.plot(result0_bw['battery'],'r:')
plt.show()

CHP

Grid, gas-boiler, chp power plant, battery and heat storage are components of the system, which should cover given heat and electical demand. Energy contents of the battery and heat storage are 2 states of the system. When chp is on, it covers the demand. Surplus of electricity is stored in the battery and sold to the grid. Surplus of the heat is stored in the heat storage. When chp is off, at first both demands are covered by storages, then by the grid and gas boiler. The goal of optimization is to find the path, where both storages will be empty at the final timestep. Figure 14 shows simplified scheme of the chp system.

_images/chp.png

Figure 14: Illustration of the chp example

PV Storage

Photovoltaic system with storage form the system for covering given electrical demand. Energy content of the storage is the only state of the system. List U contains three possible decisions. With normal system operates without participation of the storage. Possible surplus of the produced by pv power can be saved in the storage with charge decision. With discharge system tries to cover the residual demand by stored energy. After each possible system’s decision grid load is checked. This residual power is covered by or fed into the grid. The main goal is to find the result, where the storage is empty at the end. Illustration of the current example is presented in the Figure 15.

_images/pv_storage.png

Figure 15: Illustration of the pv_storage example

Pv_storage_model, which describes the transition from i to j according to each possible decision u, is written in two ways. In first case the transition is applied for the whole array X, which characterizes the system. In the second case - for each possible condition of X. Calculation for each condition and jump from one to another are realized inside the loop.

Building

Description

A system in building example contains a model of the real building (pre-trained Neural Network) and a heat pump. The goal of the optimization is to keep room temperature Troom inside the range of allowed values [Tmin; Tmax] in a cost-efficient way. Simulation covers one day (19-44 hours) with 15 min time resolution. The picture in the Figure 11 visualizes current system.

_images/building.png

Figure 11: Illustration of the building example

Dynamic Programming algorithm for optimal control of the building is realized with using four following files:

  • building_data.xlsx - stores information about the system.
  • building_model.py - reads system’s data and describes transition from one timestep to another.
  • prodyn.py - realizes dynamic programming algorithm.
  • run_building_forward.py - runs the simulation and finds the optimal system’s control.

Run_building_forward.py and building_model.py are described in detail below.

run_building_forward.py

There is a script of the run_building_forward.py (one from four dynamic programming files) is explained step by step for better understanding.

import numpy as np
import matplotlib.pyplot as plt
import pyrenn as prn

Three packages are included:

  • numpy is the fundamental package for scientific computing with Python;
  • matplotlib.pyplot is a plotting library which allows present results in a diagram form quite easily;
  • pyrenn is a recurrent neural network toolbox for Python.
import building_model as model
import prodyn as prd

Then building_model and prodyn (two other files of dynamic programming) are imported. They assigned as model and prd respectively.

file = 'building_data.xlsx'

Gives the path to the excel-file building_data containing data about the current system. This is the last file of dynamic programming.

cst,srs,U,states = model.read_data(file)
srs['massflow'] = 0
srs['P_th'] = 0
srs['T_room'] = 20

Defines constants cst, timeseries srs, list of possible decisions U and parameters states, which characterize each possible building’s state, by reading the building_data file. Process of reading is realized due to read_data function hidden in the building_model (model) file. To timeseries srs written from building_data some extra data is added.

timesteps=np.arange(cst['t_start'],cst['t_end'])

Sets a timeframe on which optimization will be realized.

net = prn.loadNN('NN_building.csv')
cst['net'] = net

Defines a model net of the real building (pre-trained Neural Network) and saves it to the constants cst.

xsteps=np.prod(states['xsteps'].values)
J0 = np.zeros(xsteps)
idx = prd.find_index(np.array([20]),states)
J0[idx] = -9999.9

Creates an array J0 of initial terminal costs. J0 will be changed from transition to transition according to list of possible decisions U and will keep all costs. Due to stored infromation in J0 optimal control of the building can be found.

idx = prd.find_index(np.array([20]),states)
J0[idx] = -9999.9

Shifts the initial postition to index with temperature equaled to 20 degrees.

system=model.building

Defines function building from building_model for characterization the transition from one timestep to another.

result = prd.DP_forward(states,U,timesteps,cst,srs,system,J0=J0,verbose=True,t_verbose=5)
i_mincost = result.loc[cst['t_end']-1]['J'].idxmin()
opt_result = result.xs(i_mincost,level='Xidx_end')

Implements dynamic programming algorithm for the chosen timeframe and saves all data to the result. Then finds index for cost-minimal path, extracts it from result and saves to opt_result.

best_massflow=opt_result['massflow'].values[:-1]
Troom=opt_result['T_room'].values[:-1]
Pel=opt_result['P_el'].values[:-1]

Chooses parameters, which characterize cost-efficient building control system, and extracts them from opt_result. Best_massflow is a schedule, which shows at which timestep heat pump is switched on and at which switched off. Pel defines consumed electrical power, Troom - room temperature inside the house, which shouldn’t be out of the comfort zone [Tmin; Tmax].

Troom=np.concatenate((srs.loc[timesteps[0]-4:timesteps[0]-1]['T_room'],Troom))
Pel=np.concatenate((srs.loc[timesteps[0]-4:timesteps[0]-1]['P_th'],Pel))

Sums values for timesteps, which were not involved in the optimization, with those, which were extracted from opt_result. The remaining part of the code is responsible for plotting chosen and additional parameters. They are presented in the Figure 12.

_images/building_results.png

Figure 12: Cost-minimal control of the building for keeping Troom inside [Tmin; Tmax].

building_model.py

The script of the building_model.py is explained step by step for better understanding.

import pandas as pd
import numpy as np
import pyrenn as prn
import pdb

Three packages are included:

  • pandas is a source helping to work with data structure and data analysis;
  • numpy is the fundamental package for scientific computing with Python;
  • pyrenn is a recurrent neural network toolbox for Python;
  • pdb is a specific module, which allows to debug Python codes.
def read_data(file):
        xls = pd.ExcelFile(file)
        states = xls.parse('DP-States',index_col=[0])
        cst = xls.parse('Constants',index_col=[0])['Value']
        srs = xls.parse('Time-Series',index_col=[0])
        U = xls.parse('DP-Decisions',index_col=[0])['Decisions'].values
        return cst,srs,U,states

Read_data reads data about the building system from the excel-file and assigns it to different parameters.

def building(u,x,t,cst,Srs,Data):
        l = len(x)
        delay=4
        net = cst['net']

Opens function building responsible for the system transition. Also identifies the length l of the array with possible system states x, gives a name to the pre-trained Neural Network (NN) net and chooses number of timesteps delay for the initial input P0 and output Y0 needed for the NN’s usage.

hour = Srs.loc[t]['hour']
solar = Srs.loc[t]['solar']
T_amb = Srs.loc[t]['T_amb']
user  = Srs.loc[t]['use_room']
T_inlet = Srs.loc[t]['T_inlet']

Creates 5 inputs for the input array P required for the NN’s usage.

if u=='heating on':
        massflow = cst['massflow']
elif u=='heating off':
        massflow = 0

Defines the 6th and the last input of P in dependance of the current decision u.

P = np.array([[hour],[solar],[T_amb],[user],[massflow],[T_inlet]],dtype = np.float)

Builds the input array P from six inputs for the current timestep t.

hour0 = Srs.loc[t-delay:t-1]['hour'].values.copy()
solar0 = Srs.loc[t-delay:t-1]['solar'].values.copy()
T_amb0 = Srs.loc[t-delay:t-1]['T_amb'].values.copy()
user0  = Srs.loc[t-delay:t-1]['use_room'].values.copy()
T_inlet0 = Srs.loc[t-delay:t-1]['T_inlet'].values.copy()

Creates 5 inputs for the initial input array P0, which is also needed for the NN’s usage. The length of each input is equaled to the chosen delay at the beginning of the function.

x_j = np.zeros(l)
P_th = np.zeros(l)
costx = np.zeros(l)

Defines array x_j for the building states after the transition, array P_th for thermal power given to the building from heat pump and array costx, which will contain penalty costs for transition from each building state in x to x_j according to current decision u.

for i,xi in enumerate(x):
        #prepare 6th input for P0 and 2 outputs for Y0
        if t-delay<cst['t_start']:

                #take all values for P0 and Y0 from timeseries
                if Data is None or t==cst['t_start']:
                        T_room0 = Srs.loc[t-delay:t-1]['T_room'].values.copy()
                        P_th0 = Srs.loc[t-delay:t-1]['P_th'].values.copy()
                        massflow0 = Srs.loc[t-delay:t-1]['massflow'].values.copy()

                #take part of values from timeseries and part from big Data
                else:
                        tx = t-cst['t_start']
                        T_room0 = np.concatenate([Srs.loc[t-delay:t-tx-1]['T_room'].values.copy(),Data.loc[t-tx-1:t1].xs(i,level='Xidx_end')['T_room'].values.copy()])
                        P_th0 = np.concatenate([Srs.loc[t-delay:t-tx-1]['P_th'].values.copy(),Data.loc[t-tx-1:t-1].xs(i,level='Xidx_end')['P_th'].values.copy()])
                        massflow0 = np.concatenate([Srs.loc[t-delay:t-tx-1]['massflow'].values.copy(),Data.loc[t-tx-1:t-1].xs(i,level='Xidx_end')['massflow'].values.copy()])

        #take all values for P0 and Y0 from big Data
        else:
                T_room0 =Data.loc[t-delay:t-1].xs(i,level='Xidx_end')['T_room'].values.copy()
                P_th0 = Data.loc[t-delay:t-1].xs(i,level='Xidx_end')['P_th'].values.copy()
                massflow0 = Data.loc[t-delay:t-1].xs(i,level='Xidx_end')['massflow'].values.copy()

Loop for every possible state of the building from x opens. All other strings are responsible for prepairing the 6th input massflow0 for the input array P0 and two outputs T_room0, P_th0 for the initial output array Y0. In dependance of relation between current timestep t and t_start (initial timestep, from which optimal builidng control should be found) these three parameters are created with values from the timeseries srs and Data, which keeps all information about the previous transitions. There are three cases for the massflow0, T_room0 and P_th0 creation. Supporting commentaries in this part split these cases.

T_room0[-1] = xi
P0 = np.array([hour0,solar0,T_amb0,user0,massflow0,T_inlet0],dtype = np.float)
Y0 = np.array([T_room0,P_th0],dtype = np.float)

Corrects last value of T_room0 and builds initial input P0 and initial output Y0 arrays.

if np.any(P0!=P0) or np.any(Y0!=Y0):
        #if P0 or Y0 not valid use valid values and apply penalty costs
        costx[i] = 1000*10
        x_j[i] = xi
        P_th[i] = 0
else:
        x_j[i],P_th[i] = prn.NNOut(P,net,P0=P0,Y0=Y0)

if x_j[i] != x_j[i] or P_th[i] != P_th[i]:
        pdb.set_trace()

Runs NN for one timestep. Checks if P0 and Y0 are valid. Two outputs of the NN usage are array x_j, which keeps all possible states of the building after transition, and array P_th, which stores data about delivered thermal power from the pump to the building. In the case of mistake a Python debugger will be open. Here the loop for every possible state of the building from x closes.

Tmax = Srs.loc[t]['Tmax']
Tmin = Srs.loc[t]['Tmin']

costx = (x_j>Tmax)*(x_j-Tmax)**2*1000 + (x_j<Tmin)*(Tmin-x_j)**2*1000+costx

Selects borders for the allowed T_room and calculates penalty costs costx if any state of x_j is out of chosen borders.

x_j=np.clip(x_j,x[0],x[-1])

Corrects x_j. Values smaller than x[0] become x[0], and values larger than x[-1] become x[-1].

P_el = P_th*T_inlet/(T_inlet-T_amb)
cost = P_el * Srs.loc[t]['price_elec']*0.25 + costx

Calculates cost of the transition by summing electricity and penalty costs.

data = pd.DataFrame(index = np.arange(l))
data['P_th'] = P_th
data['P_el'] = P_el
data['T_room'] = x_j
data['massflow'] = massflow
data['cost'] = cost
data['costx'] = costx

return cost, x_j, data

Defines parameters, which will be put in data used in prodyn file. Returns of the building function are costs of the transition cost, new array with building states x_j and data.

Building with Storage

The presence of the heat storage makes this system different to the building. Due to this building_with_storage has 4 decisions and 2 states (the room temperature Troom and energy content of the storage E). The goal and period of simulation are identical to the building example. In the Figure 13 schematic picture the building_with_storage system is given.

By reason of long-time simulation the results are already given in the folder related to this example.

_images/building_with_storage.png

Figure 13: Illustration of the building_with_storage example

Note

Building and building_with_storage examples can be simulated only in forward direction.

Features

  • prodyn uses the dynamic programming algorithm to calculate the decision seuqence which leads to minimal total costs
  • prodyn can be apllied to any system function which follows the required structure for inputs and outputs
  • Forward and backward implementation of the dynamic programming algorithm
  • Supports systems with multiple state variables

Get Started

  1. download or clone (with git) this repository to a directory of your choice.
  2. Copy the prodyn.py file in the prodyn folder to a directory which is already in python’s search path or add the prodyn folder to python’s search path (sys.path) (how to)
  3. Run the given examples in the examples folder.
  4. Implement your own system function.

Dependencies (Python)

  • numpy for mathematical operations
  • pandas for data handling