Operator Discretization Library Documentation

Operator Discretization Library (ODL) is a Python library for fast prototyping focusing on (but not restricted to) inverse problems.

The main intent of ODL is to enable mathematicians and applied scientists to use different numerical methods on real-world problems without having to implement all necessary parts from the bottom up. ODL provides some of the most heavily used building blocks for numerical algorithms out of the box, which enables users to focus on real scientific issues.

Getting Started

Welcome to the “Getting Started” section of the documentation. Here you can find an overview of the basics of ODL, a step-by-step installation guide and some in-depth code examples that show the capabilities of the framework.

About ODL

Operator Discretization Library (ODL) is a Python library for fast prototyping focusing on (but not restricted to) inverse problems. ODL is being developed at KTH Royal Institute of Technology, Stockholm, and Centrum Wiskunde & Informatica (CWI), Amsterdam.

The main intent of ODL is to enable mathematicians and applied scientists to use different numerical methods on real-world problems without having to implement all necessary parts from the bottom up. This is achieved by an Operator structure which encapsulates all application-specific parts, and a high-level formulation of solvers which usually expect an operator, data and additional parameters. The main advantages of this approach are that

  1. Different problems can be solved with the same method (e.g. TV regularization) by simply switching operator and data.
  2. The same problem can be solved with different methods by simply calling into different solvers.
  3. Solvers and application-specific code need to be written only once, in one place, and can be tested individually.
  4. Adding new applications or solution methods becomes a much easier task.

ODL implements many abstract mathematical notions such as sets, vector spaces and operators. In the following, a few are shown by example.

Set

A Set is the fundamental building block of ODL objects. It mirrors the mathematical concept of a set in that it can tell if an object belongs to it or not:

>>> interv = odl.IntervalProd(0, 1)
>>> 0.5 in interv
True
>>> 2.0 in interv
False

The most commonly used sets in ODL are RealNumbers (set of all real numbers) and IntervalProd (“Interval product”, rectangular boxes of arbitrary dimension).
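
For instance, a two-dimensional box can be built by passing vectors of lower and upper bounds. A small sketch, assuming the same IntervalProd constructor as above:

>>> square = odl.IntervalProd([0, 0], [1, 1])  # the unit square
>>> [0.5, 0.5] in square
True
>>> [0.5, 2.0] in square
False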

LinearSpace

The LinearSpace class is the most important subclass of Set. It is a general (abstract) implementation of a mathematical vector space and has a couple of widely used concrete realizations.

Spaces of n-tuples

Large parts of basic functionality, e.g. arithmetic or inner products, rest on array computations, i.e. computations on tuples of elements of the same kind. Typically, these vector spaces are of the type \mathbb{F}^n, where \mathbb{F} is a field (usually \mathbb{R} or \mathbb{C}), and n a positive integer. Example:

>>> c3 = odl.cn(3)
>>> u = c3.element([1 + 1j, 2 - 2j, 3])
>>> v = c3.one()  # vector of all ones
>>> u.inner(v)  # sum of the elements
(6-1j)
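
Real spaces work the same way. As a short sketch of the arithmetic mentioned above:

>>> r3 = odl.rn(3)
>>> x = r3.element([1.0, 2.0, 3.0])
>>> y = r3.element([4.0, 5.0, 6.0])
>>> 2 * x + y  # arithmetic is delegated to the space
rn(3).element([6.0, 9.0, 12.0])
>>> x.norm()  # Euclidean norm by default
3.7416573867739413
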
Function spaces

A function space is a set of functions f: \mathcal{X} \to \mathcal{Y} with fixed domain and range (more accurately: codomain), where \mathcal{Y} is a vector space. The ODL implementation FunctionSpace covers only the cases \mathcal{Y} = \mathbb{R} or \mathbb{C} since the general case has large overlaps with Operator. Note that we do not make a distinction between different types of function spaces with respect to regularity, integrability etc. on an abstract level since there is no obvious way to check it.

As linear spaces, function spaces support some interesting operations:

>>> import numpy as np
>>> space = odl.FunctionSpace(odl.IntervalProd(0, 2))
>>> exp = space.element(np.exp)
>>> exp(np.log(2))
2.0
>>> exp_plus_one = exp + space.one()
>>> exp_plus_one(np.log(2))
3.0
>>> ratio_func = exp_plus_one / exp  # x -> (exp(x) + 1) / exp(x)
>>> ratio_func(np.log(2))  # 3 / 2
1.5

A big advantage of the function space implementation in ODL is that the evaluation of functions is vectorized, i.e. the values of a function can be computed from an array of input data “at once”, without looping in Python (which is generally slow). What follows is a simple example; see the Vectorized functions guide for instructions on how to write vectorization-compatible functions.

>>> import numpy as np
>>> space = odl.FunctionSpace(odl.IntervalProd(0, 2))
>>> exp = space.element(np.exp)
>>> exp([0, 1, 2])
array([ 1.        ,  2.71828183,  7.3890561 ])
>>> x = np.linspace(0, 2, 1000)
>>> y = exp(x)  # works

Discretizations

A discretization typically represents the finite-dimensional, concrete counterpart of an infinite-dimensional, abstract vector space, which makes it accessible to computations. In ODL, a Discretization instance encompasses both continuous and discrete spaces as well as the mappings that take one into the other. The canonical example is the space L^2(\Omega) of real-valued square-integrable functions on a rectangular domain (we take an interval for simplicity). It is the default in the convenience function uniform_discr:

>>> l2_discr = odl.uniform_discr(0, 1, 5)  # Omega = [0, 1], 5 subintervals
>>> type(l2_discr)
<class 'odl.discr.lp_discr.DiscreteLp'>
>>> l2_discr.exponent
2.0
>>> l2_discr.domain
IntervalProd(0.0, 1.0)

Discretizations offer a large amount of useful functionality, for example direct and vectorized sampling of continuously defined functions. If we, for example, want to discretize the function f(x) = exp(-x), we can simply pass it to the element() method:

>>> exp_discr = l2_discr.element(lambda x: np.exp(-x))
>>> type(exp_discr)
<class 'odl.discr.lp_discr.DiscreteLpElement'>
>>> print(exp_discr)
[0.904837418036, 0.740818220682, 0.606530659713, 0.496585303791, 0.406569659741]
>>> exp_discr.shape
(5,)
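
Discretized elements behave like members of the surrounding space, so the vector space operations from above carry over. A small sketch, reusing l2_discr and exp_discr from above:

>>> one = l2_discr.one()
>>> diff = exp_discr - one  # pointwise difference, stays in the space
>>> val = exp_discr.inner(one)  # approximates the integral of exp(-x) over [0, 1]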

Operators

This is the central class and general notion in ODL. The concept is derived from the mathematical theory of operators and implements many of its core properties. Any functionality that is implemented as an Operator has access to the full machinery of operator arithmetic, composition, differentiation and much more. It is the universal interface between application-specific code (e.g. line projectors in tomography for a given geometry) and other parts of the library that are written in an abstract mathematical language. The large benefit of this approach is that once an operator is fully implemented and functional, it can be used seamlessly by, e.g., optimization routines that expect an operator and data (among others) as input.

As a small example, we study the problem of solving a linear system with 2 equations and 3 unknowns. We use Landweber’s method to get a least-squares solution and print the intermediate residual norm. The method needs a relaxation parameter \lambda < 2 / \|A\|^2 to converge; in our case, the right-hand side is approximately 0.14, so we choose 0.1.

>>> matrix = np.array([[1.0, 3.0, 2.0],
...                    [2.0, -1.0, 1.0]])
>>> matrix_op = odl.MatVecOperator(matrix)  # operator defined by the matrix
>>> matrix_op.domain
rn(3)
>>> matrix_op.range
rn(2)
>>> data = np.array([1.0, -1.0])
>>> niter = 5
>>> reco = matrix_op.domain.zero()  # starting with the zero vector
>>> for i in range(niter):
...     residual = matrix_op(reco) - data
...     reco -= 0.1 * matrix_op.adjoint(residual)
...     print('{:.3}'.format(residual.norm()))
1.41
0.583
0.24
0.0991
0.0409

If we now exchange matrix_op and data with a tomographic projector and line integral data, not a single line of code in the reconstruction method changes since the operator interface is exactly the same.
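
To make this concrete, here is a minimal sketch of how the loop above could be factored into a reusable, operator-agnostic function (the function name is ours, not part of ODL):

def landweber_reconstruct(op, data, omega, niter):
    """Least-squares reconstruction by Landweber iteration.

    Works unchanged for any linear ODL operator with an adjoint,
    be it a matrix, a convolution or a tomographic projector.
    """
    reco = op.domain.zero()  # start from the zero element
    for _ in range(niter):
        residual = op(reco) - data
        reco -= omega * op.adjoint(residual)
    return reco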

Further features

  • A unified structure Geometry for representing tomographic acquisition geometries
  • Interfaces to fast external libraries, e.g. ASTRA for X-ray tomography, STIR for emission tomography (preliminary), pyFFTW for fast Fourier transforms, ...
  • A growing number of “must-have” operators like Gradient, FourierTransform, WaveletTransform
  • Several solvers for variational inverse problems, ranging from simple gradient methods to state-of-the-art non-smooth primal-dual splitting methods like Douglas-Rachford
  • Standardized tests for the correctness of implementations of operators and spaces, e.g. does the adjoint operator fulfill its defining relation?
  • CUDA-accelerated data containers as a replacement for Numpy

Installing ODL

This guide will go through all steps necessary for a full ODL installation, starting from nothing more than a working operating system (Linux, macOS or Windows).

TL;DR

If you already have a working Python environment, ODL and some basic dependencies can be installed using either pip:

$ pip install odl[testing,show]

or conda:

$ conda install -c odlgroup odl matplotlib pytest scikit-image spyder

After installation, you can verify that everything works by running the tests:

$ python -c "import odl; odl.test()"

Introduction

Installing ODL is intended to be straightforward, and this guide is meant for new users. For a working installation you should perform the following steps:

  1. Install a Python interpreter
  2. Install ODL and its dependencies
  3. (optional) Install extensions for more functionality
  4. (optional) Run the tests

Consider using Anaconda

We currently recommend using Anaconda on all platforms since it offers the best out-of-the-box installation and run-time experience. Anaconda also has other benefits, for example the possibility to work in completely isolated Python environments with their own installed packages, thereby avoiding conflicts with system-wide installed packages. Furthermore, Anaconda cooperates with pip (see below), i.e. packages can be installed with both Anaconda’s internal mechanism and pip without conflicts.

Alternatively, packages can be installed with pip in a user’s location, which should also avoid conflicts. We will provide instructions for this alternative.

Another possibility is to use virtualenv, which can be seen as a predecessor to Anaconda. Following the pip installation instructions in a virtualenv without the --user option works very well in our experience, but we do not provide explicit instructions for this variant.

Which Python version to use?

Any modern Python distribution supporting NumPy and SciPy should work for the core library, but some extensions require CPython (the standard Python implementation).

ODL fully supports both Python 2 and Python 3. If you choose to use your system Python interpreter (the “pip install as user” variant), it may be a good idea to stick with the default one, i.e. the one invoked by python. Otherwise, we recommend using Python 3, since Python 2 support will be discontinued in 2020.

Development environment

Since ODL is object-oriented, using an Integrated Development Environment (IDE) is recommended, but not required. The most popular ones are Spyder, which works on all major platforms and can be installed through both conda and pip, and PyCharm, which offers keybinding emulation for editors such as Emacs or Vim.

In depth guides

If you are a new user or need a more detailed installation guide, we provide support for the following installation methods:

  1. Installing ODL using conda (recommended for users)
  2. Installing ODL using pip
  3. Installing ODL from source (recommended for developers)

To further extend ODL’s capability, a large set of extensions can also be installed.

Issues

If you have any problems during installation, consult the FAQ. If that does not help, open an issue on GitHub or send us an email (odl@math.kth.se) and we’ll try to assist you promptly.

First steps

This guide is intended to give you a simple introduction to ODL and how to work with it. If you need help with a specific function you should look at the ODL API reference.

The best way to get started with ODL as a user is generally to find one (or more) examples that are relevant to whichever problem you are studying. These are available in the examples folder on GitHub. They are mostly written to be copy-paste friendly and show how to use the respective operators, solvers and spaces in a correct manner.

Example: Solving an inverse problem

In what follows, we will give an example of the workflow one might have when solving an inverse problem as it is encountered “in real life”. The problem we want to solve is

Af = g

where A is the convolution operator

(Af)(x) = \int f(y) k(x - y) dy

where k is the convolution kernel, f is the unknown solution and g is the known data. As is typical in applications, the convolution operator may not be available in ODL (we’ll pretend it’s not), so we will need to implement it.

We start by finding a nice implementation of the convolution operator – SciPy happens to have one – and create a wrapping Operator for it in ODL.

import odl
import scipy.signal

class Convolution(odl.Operator):
    """Operator calculating the convolution of a kernel with a function.

    The operator inherits from ``odl.Operator`` to be able to be used with ODL.
    """

    def __init__(self, kernel):
        """Initialize a convolution operator with a known kernel."""

        # Store the kernel
        self.kernel = kernel

        # Initialize the Operator class by calling its __init__ method.
        # This sets properties such as domain and range and allows the other
        # operator convenience functions to work.
        odl.Operator.__init__(self, domain=kernel.space, range=kernel.space,
                              linear=True)

    def _call(self, x):
        """Implement calling the operator by calling scipy."""
        return scipy.signal.fftconvolve(self.kernel, x, mode='same')

We can verify that our operator works by calling it on some data. This can either come from an outside source, or from simulations. ODL also provides a nice range of standard phantoms such as the cuboid and shepp_logan phantoms:

# Define the space the problem should be solved on.
# Here the square [-1, 1] x [-1, 1] discretized on a 100x100 grid.
space = odl.uniform_discr([-1, -1], [1, 1], [100, 100])

# Convolution kernel, a small centered rectangle.
kernel = odl.phantom.cuboid(space, [-0.05, -0.05], [0.05, 0.05])

# Create convolution operator
A = Convolution(kernel)

# Create phantom (the "unknown" solution)
phantom = odl.phantom.shepp_logan(space, modified=True)

# Apply convolution to phantom to create data
g = A(phantom)

# Display the results using the show method
kernel.show('kernel')
phantom.show('phantom')
g.show('convolved phantom')
[Images: kernel, phantom, convolved phantom]

We can use this as the right-hand side in our inverse problem. We try one of the simplest solvers, the Landweber solver. The Landweber solver is an iterative method that performs the update

f_{i+1} = f_i - \omega A^* (A(f_i) - g)

where \omega < 2/\|A\|^2 is a constant and A^* is the adjoint operator associated with A. The adjoint is a generalization of the transpose of a matrix and defined as the (unique) operator such that

\langle Ax, y \rangle = \langle x, A^*y \rangle

where \langle x, y \rangle is the inner product. It is implemented in ODL as Operator.adjoint. Luckily, the convolution operator is self-adjoint if the kernel is symmetric, so we can add:

class Convolution(odl.Operator):
    ...  # old code

    @property  # making the adjoint a property lets users access it as conv.adjoint
    def adjoint(self):
        return self  # the adjoint is the same as this operator

With this addition we are ready to try solving the inverse problem using the Landweber solver:

# Need operator norm for step length (omega)
opnorm = odl.power_method_opnorm(A)

f = space.zero()
odl.solvers.landweber(A, f, g, niter=100, omega=1/opnorm**2)
f.show('landweber')
[Image: Landweber reconstruction]

This solution is not very good, mostly due to the ill-posedness of the convolution operator. Other solvers like conjugate gradient on the normal equations (conjugate_gradient_normal) give similar results:

f = space.zero()
odl.solvers.conjugate_gradient_normal(A, f, g, niter=100)
f.show('conjugate gradient')
[Image: conjugate gradient reconstruction]

A method to remedy this problem is to instead consider a regularized problem. One of the classic regularization methods is Tikhonov regularization, where we add a penalty term to the problem formulation, i.e. slightly change the problem such that the obtained solutions have better regularity properties. We instead study the problem

\min_f \|Af - g\|_2^2 + a \|Bf\|_2^2,

where B is a “roughening” operator and a is a regularization parameter that determines how strong the regularization should be. Basically, one wants B to magnify the non-smooth parts of f, so that penalizing \|Bf\| favors smoother solutions. To solve this with the above solvers, we can derive the first-order optimality conditions

2 A^* (Af - g) + 2 a B^* B f = 0

This can be rewritten in the form Tf = b:

\underbrace{(A^* A + a B^* B)}_T f = \underbrace{A^* g}_b

We first use a multiple of the IdentityOperator in ODL as B, which is also known as ‘classical’ Tikhonov regularization. Note that since the operator T above is self-adjoint we can use the classical conjugate_gradient method instead of conjugate_gradient_normal. This improves both computation time and numerical stability.

B = odl.IdentityOperator(space)
a = 0.1
T = A.adjoint * A + a * B.adjoint * B
b = A.adjoint(g)

f = space.zero()
odl.solvers.conjugate_gradient(T, f, b, niter=100)
f.show('Tikhonov identity conjugate gradient')
[Image: Tikhonov identity conjugate gradient reconstruction]

Slightly better, but no major improvement. What about letting B be the Gradient?

B = odl.Gradient(space)
a = 0.0001
T = A.adjoint * A + a * B.adjoint * B
b = A.adjoint(g)

f = space.zero()
odl.solvers.conjugate_gradient(T, f, b, niter=100)
f.show('Tikhonov gradient conjugate gradient')
[Image: Tikhonov gradient conjugate gradient reconstruction]

Perhaps a bit better, but far from excellent.

Let’s try more modern methods, like TV regularization. Here we want to solve the problem

\min_{0 \leq f \leq 1} \|Af - g\|_2^2 + a \|\nabla f\|_1

Since this is a non-differentiable problem, we need more advanced solvers. One of the stronger solvers in ODL is the Douglas-Rachford Primal-Dual method (douglas_rachford_pd), which uses proximal operators to solve the optimization problem. However, as a new user you do not need to consider the specifics; you only need to assemble the functionals involved in the problem you wish to solve.

Consulting the douglas_rachford_pd documentation we see that it solves problems of the form

\min_x f(x) + \sum_{i=1}^n g_i(L_i x),

where f and g_i are convex functions and L_i are linear Operator's. By identification, we see that the above problem can be written in this form if we let f be the indicator function on [0, 1], g_1 be the squared L^2 distance \| \cdot - g\|_2^2, g_2 be the norm \| \cdot \|_1, L_1 be the convolution operator and L_2 be the gradient operator.

There are several examples using this solver, as well as similar optimization methods such as forward_backward_pd and chambolle_pock_solver, in the ODL examples/solvers folder.

# Assemble all operators into a list.
grad = odl.Gradient(space)
lin_ops = [A, grad]
a = 0.001

# Create functionals for the l2 distance and l1 norm.
g_funcs = [odl.solvers.L2NormSquared(space).translated(g),
           a * odl.solvers.L1Norm(grad.range)]

# Functional of the bound constraint 0 <= x <= 1
f = odl.solvers.IndicatorBox(space, 0, 1)

# Find scaling constants so that the solver converges.
# See the douglas_rachford_pd documentation for more information.
opnorm_A = odl.power_method_opnorm(A, xstart=g)
opnorm_grad = odl.power_method_opnorm(grad, xstart=g)
sigma = [1 / opnorm_A ** 2, 1 / opnorm_grad ** 2]
tau = 1.0

# Solve using the Douglas-Rachford Primal-Dual method
x = space.zero()
odl.solvers.douglas_rachford_pd(x, f, g_funcs, lin_ops,
                                tau=tau, sigma=sigma, niter=100)
x.show('TV Douglas-Rachford', force_show=True)
[Image: TV Douglas-Rachford reconstruction]

This solution is almost perfect, and we can happily go on to solving more advanced problems!

The full code in this example is available below.

"""Source code for the getting started example."""

import odl
import scipy
import scipy.signal


class Convolution(odl.Operator):
    """Operator calculating the convolution of a kernel with a function.

    The operator inherits from ``odl.Operator`` to be able to be used with ODL.
    """

    def __init__(self, kernel):
        """Initialize a convolution operator with a known kernel."""

        # Store the kernel
        self.kernel = kernel

        # Initialize the Operator class by calling its __init__ method.
        # This sets properties such as domain and range and allows the other
        # operator convenience functions to work.
        odl.Operator.__init__(self, domain=kernel.space, range=kernel.space,
                              linear=True)

    def _call(self, x):
        """Implement calling the operator by calling scipy."""
        return scipy.signal.fftconvolve(self.kernel, x, mode='same')

    @property  # making adjoint a property lets users access it as A.adjoint
    def adjoint(self):
        return self  # the adjoint is the same as this operator


# Define the space the problem should be solved on.
# Here the square [-1, 1] x [-1, 1] discretized on a 100x100 grid.
space = odl.uniform_discr([-1, -1], [1, 1], [100, 100])

# Convolution kernel, a small centered rectangle.
kernel = odl.phantom.cuboid(space, [-0.05, -0.05], [0.05, 0.05])

# Create convolution operator
A = Convolution(kernel)

# Create phantom (the "unknown" solution)
phantom = odl.phantom.shepp_logan(space, modified=True)

# Apply convolution to phantom to create data
g = A(phantom)

# Display the results using the show method
kernel.show('kernel')
phantom.show('phantom')
g.show('convolved phantom')

# Landweber

# Need operator norm for step length (omega)
opnorm = odl.power_method_opnorm(A)

f = space.zero()
odl.solvers.landweber(A, f, g, niter=100, omega=1/opnorm**2)
f.show('landweber')

# Conjugate gradient

f = space.zero()
odl.solvers.conjugate_gradient_normal(A, f, g, niter=100)
f.show('conjugate gradient')

# Tikhonov with identity

B = odl.IdentityOperator(space)
a = 0.1
T = A.adjoint * A + a * B.adjoint * B
b = A.adjoint(g)

f = space.zero()
odl.solvers.conjugate_gradient(T, f, b, niter=100)
f.show('Tikhonov identity conjugate gradient')

# Tikhonov with gradient

B = odl.Gradient(space)
a = 0.0001
T = A.adjoint * A + a * B.adjoint * B
b = A.adjoint(g)

f = space.zero()
odl.solvers.conjugate_gradient(T, f, b, niter=100)
f.show('Tikhonov gradient conjugate gradient')

# Douglas-Rachford

# Assemble all operators into a list.
grad = odl.Gradient(space)
lin_ops = [A, grad]
a = 0.001

# Create functionals for the l2 distance and l1 norm.
g_funcs = [odl.solvers.L2NormSquared(space).translated(g),
           a * odl.solvers.L1Norm(grad.range)]

# Functional of the bound constraint 0 <= f <= 1
f = odl.solvers.IndicatorBox(space, 0, 1)

# Find scaling constants so that the solver converges.
# See the douglas_rachford_pd documentation for more information.
opnorm_A = odl.power_method_opnorm(A, xstart=g)
opnorm_grad = odl.power_method_opnorm(grad, xstart=g)
sigma = [1 / opnorm_A**2, 1 / opnorm_grad**2]
tau = 1.0

# Solve using the Douglas-Rachford Primal-Dual method
x = space.zero()
odl.solvers.douglas_rachford_pd(x, f, g_funcs, lin_ops,
                                tau=tau, sigma=sigma, niter=100)
x.show('TV Douglas-Rachford', force_show=True)

User’s guide – selected topics

Welcome to the ODL user’s guide. This section contains in-depth explanations of selected topics in ODL. It is intended to familiarize you with important concepts that can be hard to infer from the API documentation and the overall introduction only.

Operators

Operators in ODL are represented by the abstract Operator class. As an abstract class, it cannot be used directly but must be subclassed for concrete implementation. To define your own operator, you start by writing:

class MyOperator(odl.Operator):
    ...

Operator has a couple of abstract properties which need to be explicitly overridden by any subclass, namely

domain : Set
    Set of elements to which the operator can be applied
range : Set
    Set in which the operator takes values

As a simple example, you can implement the matrix multiplication operator

\mathcal{A}: \mathbb{R}^m \to \mathbb{R}^n, \quad \mathcal{A}(x) = Ax

for a matrix A\in \mathbb{R}^{n\times m} as follows:

class MatVecOperator(odl.Operator):
    def __init__(self, matrix):
        self.matrix = matrix
        dom = odl.rn(matrix.shape[1])
        ran = odl.rn(matrix.shape[0])
        odl.Operator.__init__(self, dom, ran)

In addition, an Operator needs at least one way of evaluation, in-place or out-of-place.

In place evaluation

In-place evaluation means that the operator is evaluated on an Operator.domain element, and the result is written to an already existing Operator.range element. To implement this behavior, create the (private) Operator._call method with the following signature, here given for the above example:

class MatVecOperator(odl.Operator):
    ...
    def _call(self, x, out):
        self.matrix.dot(x, out=out.asarray())

In-place evaluation is usually more efficient and should be used whenever possible.

Out-of-place evaluation

Out-of-place evaluation means that the operator is evaluated on a domain element, and the result is written to a newly allocated range element. To implement this behavior, use the following signature for Operator._call (again given for the above example):

class MatVecOperator(odl.Operator):
    ...
    def _call(self, x):
        return self.matrix.dot(x)

Out-of-place evaluation is usually less efficient since it requires allocation of an array and a full copy, and should generally be avoided.

Important: Do not call these methods directly. Use the call pattern operator(x) or operator(x, out=y), e.g.:

matrix = np.array([[1, 0], [0, 1], [1, 1]])
operator = MatVecOperator(matrix)
x = odl.rn(2).one()
y = odl.rn(3).element()

# Out-of-place evaluation
y = operator(x)

# In-place evaluation
operator(x, out=y)

This public calling interface is (duck-)type-checked, so the private methods can safely assume that their input data is of the operator domain element type.

Operator arithmetic

It is common in applications to perform arithmetic with operators, for example the addition of matrices

[A+B]x = Ax + Bx

or multiplication of a functional by a scalar

[\alpha x^*](x) = \alpha x^* (x)

Another example is matrix multiplication, which corresponds to operator composition

[AB](x) = A(Bx)

All available operator arithmetic is shown below. A, B represent arbitrary Operator's, f is an Operator whose Operator.range is a Field (sometimes called a functional), and a is a scalar.

Code           Meaning       Class
(A + B)(x)     A(x) + B(x)   OperatorSum
(A * B)(x)     A(B(x))       OperatorComp
(a * A)(x)     a * A(x)      OperatorLeftScalarMult
(A * a)(x)     A(a * x)      OperatorRightScalarMult
(v * f)(x)     v * f(x)      FunctionalLeftVectorMult
(v * A)(x)     v * A(x)      OperatorLeftVectorMult
(A * v)(x)     A(v * x)      OperatorRightVectorMult
not available  A(x) * B(x)   OperatorPointwiseProduct

There are also a few derived expressions using the above:

Code         Meaning
(+A)(x)      A(x)
(-A)(x)      (-1) * A(x)
(A - B)(x)   A(x) + (-1) * B(x)
(A**n)(x)    A(A**(n-1)(x)), A**1(x) = A(x)
(A / a)(x)   A((1/a) * x)
(A @ B)(x)   (A * B)(x)

Except for composition, operator arithmetic is generally only defined when Operator.domain and Operator.range are either instances of LinearSpace or Field.
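
Operator arithmetic can be tried out directly with MatVecOperator. A small sketch:

>>> matrix = np.array([[1.0, 0.0],
...                    [0.0, 1.0],
...                    [1.0, 1.0]])
>>> A = odl.MatVecOperator(matrix)
>>> x = A.domain.one()
>>> (2 * A)(x)  # OperatorLeftScalarMult
rn(3).element([2.0, 2.0, 4.0])
>>> (A - A)(x)  # OperatorSum combined with scalar multiplication
rn(3).element([0.0, 0.0, 0.0])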

Linear spaces

The LinearSpace class represents the abstract mathematical concept of a vector space. It cannot be used directly but is rather intended to be subclassed by concrete space implementations. The class provides default implementations of the most important vector space operations. See the documentation of the respective classes for more details.

The concept of linear vector spaces in ODL is largely inspired by the Rice Vector Library (RVL).

The abstract LinearSpace class is intended for quick prototyping. It has a number of abstract methods which must be overridden by a subclass. On the other hand, it provides automatic error checking and numerous attributes and methods for convenience.

Abstract methods

In the following, the abstract methods are explained in detail.

Element creation

element(inp=None)

This public method is the factory for the inner LinearSpaceElement class. It creates a new element of the space, either from scratch or from an existing data container. In the simplest possible case, it just delegates the construction to the LinearSpaceElement class.

If no data is provided, the new element is merely allocated, not initialized, thus it can contain any value.

Parameters:
    inp : object, optional
        A container for values for the element initialization.

Returns:
    element : LinearSpaceElement
        The new element.

Linear combination

_lincomb(a, x1, b, x2, out)

This private method is the raw implementation (i.e. without error checking) of the linear combination out = a * x1 + b * x2. LinearSpace._lincomb and its public counterpart LinearSpace.lincomb are used to cover a range of convenience functions, see below.

Parameters:
    a, b : scalars, must be members of the space’s field
        Multiplicative scalar factors for input elements x1 and x2, respectively.
    x1, x2 : LinearSpaceElement
        Input elements.
    out : LinearSpaceElement
        Element to which the result of the computation is written.

Returns: None

Requirements:
  • Aliasing of x1, x2 and out must be allowed.
  • The input elements x1 and x2 must not be modified.
  • The initial state of the output element out must not influence the result.
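
As a short sketch of the public counterpart, using one of the concrete spaces:

>>> r3 = odl.rn(3)
>>> x = r3.element([1.0, 2.0, 3.0])
>>> y = r3.element([4.0, 5.0, 6.0])
>>> out = r3.element()
>>> result = r3.lincomb(2, x, -1, y, out)  # out <- 2 * x - 1 * y
>>> out
rn(3).element([-2.0, -1.0, 0.0])
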
Underlying scalar field

field

The public attribute determining the type of scalars which underlie the space. It can be an instance of either RealNumbers or ComplexNumbers (see Field).

Should be implemented as a @property to make it immutable.

Equality check

__eq__(other)

LinearSpace inherits this abstract method from Set. Its purpose is to check two LinearSpace instances for equality.

Parameters:
    other : object
        The object to compare to.

Returns:
    equals : bool
        True if other is the same LinearSpace, False otherwise.

Distance (optional)

_dist(x1, x2)

A raw (not type-checking) private method measuring the distance between two elements x1 and x2.

A space with a distance is called a metric space.

Parameters:
    x1, x2 : LinearSpaceElement
        Elements whose mutual distance to calculate.

Returns:
    distance : float
        The distance between x1 and x2, measured in the space’s metric.
Requirements:
  • _dist(x, y) == _dist(y, x)
  • _dist(x, y) <= _dist(x, z) + _dist(z, y)
  • _dist(x, y) >= 0
  • _dist(x, y) == 0 (approx.) if and only if x == y (approx.)

Norm (optional)

_norm(x)

A raw (not type-checking) private method measuring the length of a space element x.

A space with a norm is called a normed space.

Parameters:
    x : LinearSpaceElement
        The element to measure.

Returns:
    norm : float
        The length of x as measured in the space’s norm.
Requirements:
  • _norm(s * x) = |s| * _norm(x) for any scalar s
  • _norm(x + y) <= _norm(x) + _norm(y)
  • _norm(x) >= 0
  • _norm(x) == 0 (approx.) if and only if x == 0 (approx.)

Inner product (optional)

_inner(x, y)

A raw (not type-checking) private method calculating the inner product of two space elements x and y.

Parameters:
    x, y : LinearSpaceElement
        Elements whose inner product to calculate.

Returns:
    inner : float or complex
        The inner product of x and y. If LinearSpace.field is the set of real numbers, inner is a float, otherwise complex.
Requirements:
  • _inner(x, y) == _inner(y, x)^* with ‘*’ = complex conjugation
  • _inner(s * x, y) == s * _inner(x, y) for s scalar
  • _inner(x + z, y) == _inner(x, y) + _inner(z, y)
  • _inner(x, x) == 0 (approx.) if and only if x == 0 (approx.)

Pointwise multiplication (optional)

_multiply(x1, x2, out)

A raw (not type-checking) private method multiplying two elements x1 and x2 point-wise and storing the result in out.

Parameters:
    x1, x2 : LinearSpaceElement
        Elements whose point-wise product to calculate.
    out : LinearSpaceElement
        Element to store the result.

Returns: None

Requirements:
  • _multiply(x, y, out) <==> _multiply(y, x, out)
  • _multiply(s * x, y, out) <==> _multiply(x, y, out); out *= s  <==>
    _multiply(x, s * y, out) for any scalar s
  • There is a space element one such that out equals x after _multiply(one, x, out) or _multiply(x, one, out).

Notes

  • A normed space is automatically a metric space with the distance function _dist(x, y) = _norm(x - y).
  • A Hilbert space (inner product space) is automatically a normed space with the norm function _norm(x) = sqrt(_inner(x, x)).
  • The conditions on the pointwise multiplication constitute a unital commutative algebra in the mathematical sense.
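
As a quick sketch of the first note, using odl.rn where both norm and distance are available:

>>> r2 = odl.rn(2)
>>> x = r2.element([3.0, 0.0])
>>> y = r2.element([0.0, 4.0])
>>> r2.dist(x, y)  # computed as (x - y).norm()
5.0
>>> (x - y).norm()
5.0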

References

See Wikipedia’s mathematical overview articles Vector space, Algebra.

Using ODL with NumPy and SciPy

NumPy is the ubiquitous library for array computations in Python, and is used by almost all major numerical packages. It provides optimized array objects that allow efficient storage of large arrays. It also provides several optimized algorithms for many of the functions used in numerical programming, such as taking the cosine or adding two arrays.

SciPy is a library built on top of NumPy providing more advanced algorithms such as linear solvers, statistics, signal and image processing etc.

Many operations are more naturally performed using NumPy/SciPy than with ODL, and with that in mind ODL has been designed so that interfacing with them is as easy and fast as possible.

Casting vectors to and from arrays

ODL vectors are stored in an abstract way, enabling storage on the CPU, GPU, or perhaps on a cluster on the other side of the world. This allows algorithms to be written in a generalized and storage-agnostic manner. Still, it is often convenient to be able to access the data and look at it, perhaps to initialize a vector, or to call an external function.

To cast a NumPy array to an element of an ODL vector space, you simply need to call the LinearSpace.element method in an appropriate space:

>>> r3 = odl.rn(3)
>>> arr = np.array([1, 2, 3])
>>> x = r3.element(arr)

If the data type and storage methods allow it, the element simply wraps the underlying array using a view:

>>> float_arr = np.array([1.0, 2.0, 3.0])
>>> x = r3.element(float_arr)
>>> x.data is float_arr
True

Casting ODL vector space elements to NumPy arrays can be done in two ways, either through the member function NtuplesBaseVector.asarray, or using numpy.asarray. These are both optimized and, if possible, return a view:

>>> x.asarray()
array([ 1.,  2.,  3.])
>>> np.asarray(x)
array([ 1.,  2.,  3.])

These methods work with any ODL object represented by an array. For example, in discretizations, a two-dimensional array can be used:

>>> space = odl.uniform_discr([0, 0], [1, 1], [3, 3])
>>> arr = np.array([[1, 2, 3],
...                 [4, 5, 6],
...                 [7, 8, 9]])
>>> x = space.element(arr)
>>> x.asarray()
array([[ 1.,  2.,  3.],
       [ 4.,  5.,  6.],
       [ 7.,  8.,  9.]])

Using ODL vectors with NumPy functions

A very convenient feature of ODL is its seamless interaction with NumPy functions. For universal functions (ufuncs) this is supported both via methods of the ODL object and by direct application of the NumPy functions. For example, using NumPy:

>>> r3 = odl.rn(3)
>>> x = r3.element([1, 2, 3])
>>> np.negative(x)
rn(3).element([-1.0, -2.0, -3.0])

This method always uses the NumPy implementation, which can involve overhead in case the data is not stored in a CPU space. To always enable optimized code, users can call the member NtuplesBaseVector.ufuncs:

>>> x.ufuncs.negative()
rn(3).element([-1.0, -2.0, -3.0])

For other arbitrary functions, ODL vector space elements are generally accepted as input, but the output is often of numpy.ndarray type:

>>> np.convolve(x, x, mode='same')
array([  4.,  10.,  12.])

Implementation notes

The fact that the x.ufuncs.negative() interface is needed is a known issue with NumPy and a fix is underway. As of April 2016, this is not yet available.

NumPy functions as Operators

To solve the above issue, it is often useful to write an Operator wrapping NumPy functions, thus allowing full access to the ODL ecosystem. To wrap the convolution operation, you could write a new class:

>>> class MyConvolution(odl.Operator):
...     """Operator for convolving with a given vector."""
...
...     def __init__(self, vector):
...         """Initialize the convolution."""
...         self.vector = vector
...
...         # Initialize operator base class.
...         # This operator maps from the space of the given vector to
...         # the same space, and it is linear.
...         odl.Operator.__init__(self, domain=vector.space, range=vector.space,
...                               linear=True)
...
...     def _call(self, x):
...         # The output of an Operator is automatically cast to an ODL vector
...         return np.convolve(x, self.vector, mode='same')

This could then be called as an ODL Operator:

>>> op = MyConvolution(x)
>>> op(x)
rn(3).element([4.0, 10.0, 12.0])

Since this is an ODL Operator, it can be used with any of the ODL functionalities such as multiplication with scalar, composition, etc:

>>> scaled_op = 2 * op  # scale by scalar
>>> scaled_op(x)
rn(3).element([8.0, 20.0, 24.0])
>>> y = r3.element([1, 1, 1])
>>> inner_product_op = odl.InnerProductOperator(y)
>>> composed_op = inner_product_op * op  # create composition with inner product with vector [1, 1, 1]
>>> composed_op(x)
26.0

For more information on ODL Operators, how to implement them and their features, see the guide on Operators.

Using ODL with SciPy linear solvers

SciPy includes a series of very competent solvers that may be useful in solving some linear problems. If you have invested some effort into writing an ODL operator, or perhaps wish to use a pre-existing operator, then the function as_scipy_operator creates a Python object that can be used in SciPy’s linear solvers. Here is a simple example of solving Poisson’s equation on an interval (-\Delta x = rhs):

>>> import scipy.sparse.linalg
>>> space = odl.uniform_discr(0, 1, 5)
>>> op = -odl.Laplacian(space)
>>> rhs = space.element(lambda x: (x>0.4) & (x<0.6))  # indicator function on [0.4, 0.6]
>>> result, status = scipy.sparse.linalg.cg(odl.as_scipy_operator(op), rhs)
>>> result
array([ 0.02,  0.04,  0.06,  0.04,  0.02])

Vectorized functions

This section is intended as a small guideline on how to write functions that work with the vectorization machinery of NumPy, which is used internally in ODL.

What is vectorization?

In general, vectorization means that a function can be evaluated on a whole array of values at once instead of looping over individual entries. This is very important for performance in an interpreted language like Python, since loops are usually very slow compared to compiled languages.

Technically, vectorization in Numpy works through the Universal functions (ufunc) interface. It is fast because all loops over data are implemented in C, and the resulting implementations are exposed to Python for each function individually.

How to use Numpy’s ufuncs?

The easiest way to write fast custom mathematical functions in Python is to use the available ufuncs and compose them to a new function:

def gaussian(x):
    # Negation, powers and scaling are vectorized, of course.
    return np.exp(-x ** 2 / 2)

def step(x):
    # np.where checks the condition in the first argument and
    # returns the second for `True`, otherwise the third. The
    # last two arguments can be arrays, too.
    # Note that also the comparison operation is vectorized.
    return np.where(x[0] <= 0, 0, 1)

This should already cover a very large range of useful functions (basic arithmetic is vectorized, too!). An even larger list of special functions is available in the SciPy package.

Usage in ODL

Python functions are in most cases used as input to a discretization process. For example, we may want to discretize a two-dimensional Gaussian function:

>>> def gaussian2(x):
...     return np.exp(-(x[0] ** 2 + x[1] ** 2) / 2)

on the rectangle [-5, 5] x [-5, 5] with 100 pixels in each dimension. The code for this is simply

>>> # Note that the minimum and maximum coordinates are given as
>>> # vectors, not one interval at a time.
>>> discr = odl.uniform_discr([-5, -5], [5, 5], (100, 100))
>>> # This creates an element in the discretized space ``discr``
>>> gaussian_discr = discr.element(gaussian2)

What happens behind the scenes is that discr creates a discretization object which has a built-in method element to turn continuous functions into discrete arrays by evaluating them at a set of grid points. In the example above, this grid is a uniform sampling of the rectangle by 100 points per dimension.

To make this process fast, element assumes that the function is written in a way that not only supports vectorization, but also guarantees that the output has the correct shape. The function receives a meshgrid tuple as input, in the above case consisting of two vectors:

>>> mesh = discr.meshgrid
>>> mesh[0].shape
(100, 1)
>>> mesh[1].shape
(1, 100)

When inserted into the function, the final shape of the output is determined by Numpy’s broadcasting rules. For the Gaussian function, Numpy will conclude that the output shape must be (100, 100) since the arrays in mesh are added after squaring. This size is the same as expected by the discretization.

If a function does not use all components of the input, ODL tries to broadcast the result to the shape of the discretized space:

>>> def gaussian_const_x0(x):
...     return np.exp(-x[1] ** 2 / 2)  # no x[0] -> broadcasting

>>> gaussian_const_x0(mesh).shape
(1, 100)
>>> discr.element(gaussian_const_x0).shape
(100, 100)

Functional

A functional is an operator S: X \to \mathbb{F} that maps from some vector space X to the field of scalars \mathbb{F}.

In the ODL Solvers package, functionals are implemented in the Functional class, a subclass of Operator.

From a mathematical perspective, the above is a valid definition of a functional. However, since these functionals are primarily to be used for solving optimization problems, the following assumptions are made:

  • the vector space X is a Hilbert space.
  • the field of scalars \mathbb{F} is the real numbers.

The first assumption is made in order to simplify the concept of convex conjugate functional, see convex_conj under implementation for more details, or the Wikipedia articles on convex conjugate and Legendre transformation.

The second assumption is made in order to guarantee that we use a well-ordered set (in contrast to e.g. the complex numbers) over which optimization problems can be meaningfully defined, and that optimal solutions are in fact obtained. See, for example, the Wikipedia articles on field, ordered field and least-upper-bound property.

Note that these conditions are not explicitly checked. However, using the class in violation of the above assumptions might lead to unknown behavior, since some of the mathematical results might not hold. Also note that most of the theory, and most solvers, require the functional to be convex. However, this property is not stored or checked anywhere in the class. It is therefore the user’s responsibility to ensure that a functional has the properties required for a given optimization method.

The intended use of the Functional class is, as mentioned above, to formulate and solve optimization problems. One main difference to the Operator class is thus that it contains notions specially intended for optimization, such as the convex conjugate functional and the proximal operator. For more information on these concepts, see convex_conj and proximal under Implementation of functionals. There is also a certain type of arithmetic associated with functionals; for more on this see Functional arithmetic.

Implementation of functionals

To define your own functional, start by writing:

class MyFunctional(odl.solvers.Functional):

    """Docstring goes here."""

    def __init__(self, space):
        # Sets `Operator.domain` to `space` and `Operator.range` to `space.field`
        odl.solvers.Functional.__init__(self, space)

    ...

Functional needs to be provided with a space, i.e., the domain on which it is defined, from which it infers the range.

  • space: LinearSpace
    The domain of this functional, i.e., the set of elements to which this functional can be applied.

Moreover, there are two optional parameters that can be provided in the initializer. These are linear, which indicates whether the functional is linear or not, and grad_lipschitz, which is the Lipschitz constant of the gradient.

  • linear: bool, optional
    If True, the functional is considered as linear.
  • grad_lipschitz: float, optional
    The Lipschitz constant of the gradient.

A functional also has three optional properties and one optional method associated with it. The properties are:

  • functional.gradient. This returns the gradient operator of the functional, i.e., the operator that corresponds to the mapping

    x \to \nabla S(x),

    where \nabla S(x) is the space element representing the Fréchet derivative (directional derivative) at the point x

    S'(x)(g) = \langle \nabla S(x), g \rangle \quad \text{for all } g \in X.

    See also Functional.derivative.

  • functional.convex_conj. This is the convex conjugate of the functional, itself again a functional, which is also known as the Legendre transform or Fenchel conjugate. It is defined as

    S^*(x^*) = \sup_{x \in X} \{ \langle x^*,x \rangle - S(x)  \},

    where x^* is an element in X and \langle x^*,x \rangle is the inner product. (Note that x^* should live in the space X^*, the (continuous/normed) dual space of X, however since we assume that X is a Hilbert space we have X^* = X).

  • proximal. This returns a proximal factory for the proximal operator of the functional. The proximal operator is defined as

    \text{prox}_{\sigma S}(x) = \text{arg min}_{y \in X} \{ S(y) + \frac{1}{2\sigma} \|y - x\|_2^2 \}.

The default behavior of these is to raise a NotImplementedError.

The Functional class also contains default implementations of two helper functions:

  • derivative(point). Given an implementation of the gradient, this method returns the (directional) derivative operator in point. This is the linear operator

    x \to \langle x, \nabla S(p) \rangle,

    where \nabla S(p) is the gradient of the functional in the point p.

  • translated(shift). Given a functional S and a shift y, this method creates the functional S(\cdot - y).
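
These pieces can be sketched with one of the built-in functionals, here L2NormSquared (used elsewhere in this guide):

>>> space = odl.rn(3)
>>> l2sq = odl.solvers.L2NormSquared(space)  # S(x) = ||x||_2^2
>>> x = space.element([1.0, 2.0, 3.0])
>>> l2sq(x)
14.0
>>> l2sq.gradient(x)  # the gradient of ||x||^2 is 2 * x
rn(3).element([2.0, 4.0, 6.0])
>>> l2sq.translated(x)(space.zero())  # S(0 - x) = ||x||^2
14.0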

Functional arithmetic

It is common in applications to perform arithmetic operations with functionals, for example adding two functionals S and T:

[S+T](x) = S(x) + T(x),

or multiplication of a functional by a scalar:

[\alpha S](x) = \alpha S (x).

Another example is translating a functional with a vector y:

translate(S(x), y) = S(x - y),

or given an Operator A whose range is the same as the domain of the functional we also have composition:

[S * A](x) = S(A(x)).

In some of these cases, properties and methods such as gradient, convex_conj and proximal can be computed automatically, given a default implementation of the corresponding property in S and T.

All available functional arithmetic, including which properties and methods can be computed automatically, is shown below. S, T represent Functional's with common domain and range, A is an Operator whose range is the same as the domain of the functional, a is a scalar in the field of the domain of S and T, and y is a vector in the domain of S and T.

Code             Meaning       Class
(S + T)(x)       S(x) + T(x)   FunctionalSum (retains Functional.gradient)
(S + a)(x)       S(x) + a      FunctionalScalarSum (retains all properties; never scales the argument)
(S * A)(x)       S(A(x))       FunctionalComp (retains Functional.gradient)
(a * S)(x)       a * S(x)      FunctionalLeftScalarMult (retains all properties if a is positive, otherwise only Functional.gradient and Functional.derivative)
(S * a)(x)       S(a * x)      FunctionalRightScalarMult (retains all properties)
(v * S)(x)       v * S(x)      FunctionalLeftVectorMult (results in an operator rather than a functional)
(S * v)(x)       S(v * x)      FunctionalRightVectorMult (retains gradient and convex conjugate)
S.translated(y)  S(. - y)      FunctionalTranslation (retains all properties)
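
As a short sketch of how retained properties work in practice (assuming L2NormSquared as above):

>>> S = odl.solvers.L2NormSquared(odl.rn(2))
>>> T = 2 * S  # FunctionalLeftScalarMult with a positive scalar
>>> x = T.domain.element([1.0, 1.0])
>>> T(x)
4.0
>>> T.gradient(x)  # gradient of 2 * ||x||^2 is 4 * x
rn(2).element([4.0, 4.0])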

Code example

This section contains an example of an implementation of a functional, namely the functional \|x\|_2^2 + \langle x, y \rangle. Another example can be found in functional_basic_example.py, and more implementations of other functionals can be found in default_functionals.py.

"""Example of how to implement and use functionals."""

import odl


# Here we define the functional
class MyFunctional(odl.solvers.Functional):

    """This is my functional: ``||x||_2^2 + <x, y>``."""

    def __init__(self, space, y):
        """Initialize a new instance."""
        # This command calls the __init__ of Functional and sets a number of
        # parameters associated with a functional. All but domain have
        # default values if not set.
        odl.solvers.Functional.__init__(self, space=space, linear=False,
                                        grad_lipschitz=2)

        # We need to check that y is in the domain. Then we store the
        # value of y for future use.
        if y not in space:
            raise TypeError('linear_term is not in the domain!')
        self.y = y

    # Defining the _call function. This method is used for evaluation of
    # the functional and always needs to be implemented.
    def _call(self, x):
        """Evaluate the functional."""
        return x.norm()**2 + x.inner(self.y)

    # Next we define the gradient. Note that this is a property.
    @property
    def gradient(self):
        """The gradient operator."""

        # First we store the functional in a variable
        functional = self

        # The class corresponding to the gradient operator.
        class MyGradientOperator(odl.Operator):

            """Class implementing the gradient operator."""

            def __init__(self):
                """Initialize a new instance."""
                super().__init__(domain=functional.domain,
                                 range=functional.domain)

            def _call(self, x):
                """Evaluate the gradient."""
                # Here we can access the stored functional from a few
                # lines above
                return 2.0 * x + functional.y

        return MyGradientOperator()

    # Next we define the convex conjugate functional.
    @property
    def convex_conj(self):
        """The convex conjugate functional."""
        # This functional is implemented below.
        return MyFunctionalConjugate(space=self.domain, y=self.y)


# Here is the conjugate functional. Note that this is a separate class, in
# contrast to the gradient, which was implemented as an inner class. One
# advantage of the inner class is that we don't have to pass as many
# parameters when initializing; on the other hand, having separate classes
# normally improves readability of the code. Both methods are used
# throughout the odl package.
class MyFunctionalConjugate(odl.solvers.Functional):

    """Conjugate functional to ``||x||_2^2 + <x,y>``.

    This functional has the analytic expression

        ``f^*(x) = ||x-y||^2/4``.
    """

    def __init__(self, space, y):
        """initialize a new instance."""
        odl.solvers.Functional.__init__(self, space=space, linear=False,
                                        grad_lipschitz=2)

        if y not in space:
            raise TypeError('y is not in the domain!')
        self.y = y

    def _call(self, x):
        """Evaluate the functional."""
        return (x - self.y).norm()**2 / 4.0


# Create a functional
space = odl.uniform_discr(0, 1, 3)
linear_term = space.element([1, -4, 7])
my_func = MyFunctional(space=space, y=linear_term)

# Now we evaluate the functional in a random point
point = odl.util.testutils.noise_element(space)
print('Value of the functional in a random point: {}'
      ''.format(my_func(point)))

# Now we use the steepest descent solver and a backtracking line search
# in order to find the minimum of the functional.

# Create a starting guess. Also used by the solver to update in-place.
x = space.one()

# Create the linesearch object
line_search = odl.solvers.BacktrackingLineSearch(my_func, max_num_iter=10)

# Call the solver
odl.solvers.steepest_descent(my_func, x, maxiter=10, line_search=line_search)

print('Expected value: {}'.format((-1.0 / 2) * linear_term))
print('Found value: {}'.format(x))

# Create the convex conjugate functional of a scaled and translated functional
scalar = 3.2
translation = space.one()
scal_trans_cc_func = (scalar * my_func).translated(translation).convex_conj

# Evaluating the new functional in the random point.
print('Value of the new functional in a random point: {}'
      ''.format(scal_trans_cc_func(point)))

Using ODL with ProxImaL

ProxImaL is a Python-embedded modeling language for image optimization problems and can be used with ODL to solve typical inverse problems phrased as optimization problems. The package is especially suited for non-differentiable problems such as total variation denoising.

Here is a minimal example of solving Poisson’s equation on an interval with a TV type regularizer (\min_x \, 10\|-\Delta x - rhs\|_2^2 + \|\nabla x\|_1):

>>> import proximal
>>> space = odl.uniform_discr(0, 1, 5)
>>> op = -odl.Laplacian(space)
>>> proximal_lang_op = odl.as_proximal_lang_operator(op)
>>> rhs = space.element(lambda x: (x>0.4) & (x<0.6))  # indicator function on [0.4, 0.6]
>>> x = proximal.Variable(space.shape)
>>> prob = proximal.Problem([10 * proximal.sum_squares(x - rhs.asarray()),
...                          proximal.norm1(proximal.grad(x))])
>>> opt_val = prob.solve()
>>> print(opt_val)
36.082836566
>>> x.value
array([ 0.02352054,  0.02647946,  0.9       ,  0.02647946,  0.02352054])

Note that this requires the latest version of ProxImaL (version>0.1.4).

Notable differences between ODL and ProxImaL

It may be tempting to try to convert an arbitrary problem from ODL into ProxImaL, but some differences exist.

Norms

Norms in ODL are scaled according to the underlying function space. Hence a sequence of converging discretizations gives rise to converging norms:

>>> for n in [2, 10, 100, 10000]:
...     space = odl.uniform_discr(0, 1, n)
...     print('{:.10}'.format(space.element(lambda x: x).norm()))
0.5590169944
0.5766281297
0.5773430523
0.5773502685
>>> 1 / np.sqrt(3)  # exact result
0.57735026918962584

This is not the case in ProxImaL, where the norm depends on the number of discretization points. Hence a scaling that is correct for a problem in ODL need not be correct in ProxImaL.

This also has the added effect of changing the definition of derived features, like the spectral norm of operators.

Spaces

ODL can represent some complicated spaces, like \mathbb{R}^3 \times \mathcal{L}^2(0, 1) through the ProductSpace class:

>>> space = odl.ProductSpace(odl.rn(3), odl.uniform_discr(0, 1, 5))

This can then be used in solvers and other structures. ProxImaL currently lacks an equivalent structure.

Chambolle-Pock solver

The chambolle_pock_solver was introduced in 2011 by Chambolle and Pock in the paper A first-order primal-dual algorithm for convex problems with applications to imaging. It is a method for solving convex non-smooth problems of the form

\min_{x \in X} f(L x) + g(x),

where L : X \to Y is a linear Operator, X and Y are (discretized) function spaces, and g : X \to [0, +\infty] and f : Y \to [0, +\infty] are proper, convex, lower semi-continuous functionals. For more information on the mathematics, please see the mathematical background article on this method.

Using the Chambolle-Pock solver

There are several examples in the examples folder of ODL, including denoising, deblurring and tomography. Here, we will walk through the solution of a typical problem using the Chambolle-Pock solver.

Mathematical problem setup

The problem we’ll be looking at is the TV regularized denoising problem

\min_{x \in X} \left[ d(x) + r(x) + \iota_{[0, \infty]}(x) \right]

with L^2 data discrepancy term for given data y \in X,

d(x) = \frac{1}{2} \|x - y\|_2^2,

TV regularization term

r(x) = \lambda \|\nabla x\|_1

and positivity constraint enforced by the indicator function

\iota_{[0, \infty]}(x) =
\begin{cases}
  0,         & \text{ if } x \geq 0 \text{ everywhere}, \\
  \infty,    & \text{ else }.
\end{cases}

Here, \|\cdot\|_q is the L^q norm (q = 1,2), \nabla the spatial gradient, and \lambda a regularization parameter.

The standard way of fitting this problem into the Chambolle-Pock framework is to summarize both data fit and regularization terms into the composition part f \circ L of the solver, and to set g to the positivity constraint \iota_{[0, \infty]}. By setting L = (I, \nabla): X \to X \times X^d, where I is the identity mapping on X, we can write

d(x) + r(x)
= \left \|
\begin{pmatrix}
  d(x) \\
  r(x)
\end{pmatrix}
\right \|_1
= \left \|
\begin{pmatrix}
  \|x - y\|_2^2 / 2 \\
  \lambda \|\nabla x\|_1
\end{pmatrix}
\right \|_1
= \big[ f \circ L \big](x)

with the functional f: X \times X^d \to \mathbb{R} defined by

f(x, u) = \left \|
\begin{pmatrix}
  \|x - y\|_2^2 / 2 \\
  \lambda \|u\|_1
\end{pmatrix}
\right \|_1
= \frac{1}{2} \|x - y\|_2^2 + \lambda \|u\|_1.

Note that the arguments x, u of f are independent, i.e. the sum of the two functionals is a SeparableSum.

Note

The operator L maps X to the ProductSpace X \times X^d. Such a “one-to-many” type of mapping is also called BroadcastOperator.

Numerical solution using ODL

Now we implement a numerical solution to the above defined problem using the Chambolle-Pock solver in ODL.

Problem setup

The first step in the problem setup is the definition of the spaces in which we want to solve the problem. In this case, we use an L^2 space on the square [0, 100] \times [0, 100]. We choose 256 discretization points per axis:

>>> space = odl.uniform_discr(min_pt=[0, 0], max_pt=[100, 100], shape=[256, 256])

In real problems, the data y would be given by some measurement, but for the purpose of testing the solver, we generate data by creating a modified Shepp-Logan phantom and adding 10% Gaussian noise:

>>> phantom = odl.phantom.shepp_logan(space, modified=True)
>>> data = phantom + odl.phantom.white_noise(space) * 0.1

We now need to define the forward operator L, which we do one constituent at a time:

>>> ident = odl.IdentityOperator(space)
>>> grad = odl.Gradient(space)

To create L, we use the BroadcastOperator class as mentioned above:

>>> L = odl.BroadcastOperator(ident, grad)

We can now proceed to the problem specification. This step requires us to specify the functionals f and g, where the former is the SeparableSum of the squared L^2 distance to y and the (vectorial) L^1 norm scaled by the regularization parameter \lambda. These functionals are available in ODL as L2NormSquared and L1Norm, respectively; here we choose \lambda = 0.0003:

>>> l2_norm_squared = odl.solvers.L2NormSquared(space).translated(data)
>>> l1_norm = 0.0003 * odl.solvers.L1Norm(grad.range)
>>> f = odl.solvers.SeparableSum(l2_norm_squared, l1_norm)

Note

We don’t need to take extra care of the L^1 norm being a vectorial norm since L1Norm also works on product spaces.

Finally, we define the functional for the nonnegativity constraint, available as the functional IndicatorNonnegativity:

>>> g = odl.solvers.IndicatorNonnegativity(space)

Calling the solver

Now that the problem is set up, we need to select some optimization parameters. For the Chambolle-Pock method, there is one main rule that we can use: The product of the primal step \tau, the dual step \sigma and the squared operator norm \|L\|^2 has to be smaller than 1, \tau \sigma \|L\|^2 < 1. Apart from this, there are no clear rules on how to select \tau and \sigma – basically we’re left with trial and error. We decide to pick them both equal to 1 / \|L\|. To calculate an estimate of the operator norm, we have the tool power_method_opnorm which performs the simple power iteration to approximate the largest singular value of L:

>>> op_norm = 1.1 * odl.power_method_opnorm(L, maxiter=4, xstart=phantom)
>>> tau = sigma = 1.0 / op_norm

Finally, we pick a starting point (zero) and run the algorithm:

>>> x = space.zero()
>>> odl.solvers.chambolle_pock_solver(
...     x, f, g, L, tau=tau, sigma=sigma, niter=100)

Now we check the result after 100 iterations and compare it to the original:

>>> fig1 = phantom.show('phantom')
>>> fig2 = data.show('noisy data')
>>> fig3 = x.show('TV denoised result')

This yields the following images:

_images/chambolle_pock_phantom.png _images/chambolle_pock_data.png _images/chambolle_pock_result.png

Mathematical Background

This section explains the mathematical concepts on which ODL is built.

Linear Spaces

Definition and basic properties

A linear space over a field \mathbb{F} is a set \mathcal{X}, endowed with the operations of vector addition “+” and scalar multiplication “\cdot” which are required to fulfill certain properties, usually called axioms. To emphasize the importance of all ingredients, vector spaces are often written as tuples (\mathcal{X}, \mathbb{F}, +, \cdot). We always assume that \mathbb{F} = \mathbb{R} or \mathbb{C}.

In the following, we list the axioms, which are required to hold for arbitrary x, y, z \in \mathcal{X} and a, b \in \mathbb{F}.

Associativity of addition (x + y) + z = x + (y + z)
Commutativity of addition x + y = y + x
Existence of a neutral element of addition 0 + x = x + 0 = x
Existence of inverse elements of addition \forall x\ \exists \bar x: \bar x + x = x + \bar x = 0
Compatibility of multiplications a \cdot (b \cdot x) = (ab) \cdot x
Neutral scalar is the neutral element of scalar multiplication 1 \cdot x = x
Distributivity with respect to vector addition a \cdot (x + y) = a \cdot x + a \cdot y
Distributivity with respect to scalar addition (a + b) \cdot x = a \cdot x + b \cdot x

Of course, the inverse element \bar x is usually denoted with -x.

Metric spaces

The vector space (\mathcal{X}, \mathbb{F}, +, \cdot) is called a metric space if it is additionally endowed with a distance function or metric

d: \mathcal{X} \times \mathcal{X} \to [0, \infty)

with the following properties for all x, y, z \in \mathcal{X}:

\begin{align*}
  & d(x, y) = 0 \quad \Leftrightarrow \quad x = y && \text{(identity of indiscernibles)} \\
  & d(x, y) = d(y, x)  && \text{(symmetry)} \\
  & d(x, y) \leq d(x, z) + d(z, y) && \text{(subadditivity)}
\end{align*}

We call the tuple (\mathcal{X}, \mathbb{F}, +, \cdot, d) a Metric space.

Normed spaces

A function on \mathcal{X} intended to measure lengths of vectors is called a norm

\lVert \cdot \rVert : \mathcal{X} \to [0, \infty)

if it fulfills the following conditions for all x, y \in \mathcal{X} and a \in \mathbb{F}:

\begin{align*}
  & \lVert x \rVert = 0 \Leftrightarrow x = 0 && \text{(positive definiteness)} \\
  & \lVert a \cdot x \rVert = \lvert a \rvert\, \lVert x \rVert && \text{(positive homogeneity)}
  \\
  & \lVert x + y \rVert \leq \lVert x \rVert + \lVert y \rVert && \text{(triangle inequality)}
\end{align*}

A tuple (\mathcal{X}, \mathbb{F}, +, \cdot, \lVert \cdot \rVert) fulfilling these conditions is called Normed vector space. Note that a norm induces a natural metric via d(x, y) = \lVert x - y \rVert.

Inner product spaces

Measuring angles and defining notions like orthogonality requires the existence of an inner product

\langle \cdot, \cdot \rangle : \mathcal{X} \times \mathcal{X} \to \mathbb{F}

with the following properties for all x, y, z \in \mathcal{X} and a \in \mathbb{F}:

\begin{align*}
  & \langle x, x \rangle \geq 0 \quad \text{and} \quad \langle x, x \rangle = 0 \Leftrightarrow
  x = 0 && \text{(positive definiteness)} \\
  & \langle a \cdot x + y, z \rangle = a \, \langle x, z \rangle + \langle y, z \rangle &&
  \text{(linearity in the first argument)} \\
  & \langle x, y \rangle = \overline{\langle y, x \rangle} && \text{(conjugate symmetry)}
\end{align*}

The tuple (\mathcal{X}, \mathbb{F}, +, \cdot, \langle \cdot, \cdot \rangle) is then called an Inner product space. Note that the inner product induces the norm \lVert x \rVert = \sqrt{\langle x, x \rangle}.

Cartesian spaces

We refer to the space \mathbb{F}^n as the n-dimensional Cartesian space over the field \mathbb{F}. We choose this notion since Euclidean spaces are usually associated with the Euclidean norm and distance, which are just (important) special cases. Vector addition and scalar multiplication in \mathbb{F}^n are, of course, realized with entry-wise addition and scalar multiplication.

The natural inner product in \mathbb{F}^n is defined as

\langle x, y \rangle_{\mathbb{F}^n} := \sum_{i=1}^n x_i\, \overline{y_i}

and reduces to the well-known dot product if \mathbb{F} = \mathbb{R}. For the norm, the most common choices are from the family of p-norms

\begin{align*}
\lVert x \rVert_p &:= \left( \sum_{i=1}^n \lvert x_i \rvert^p \right)^{\frac{1}{p}}
\quad \text{if } p \in [1, \infty), \\[1ex]
\lVert x \rVert_\infty &:= \max\big\{\lvert x_i \rvert\,|\, i \in \{1, \dots, n\} \big\}
\end{align*}

with the standard Euclidean norm for p = 2. As metric, one usually takes the norm-induced distance function, although other choices are possible.

Weighted Cartesian spaces

In the standard definition of inner products, norms and distances, all components of a vector have the same weight. This can be changed by using weighted versions of those functions as described in the following.

Let A \in \mathbb{F}^{n \times n} be a Hermitian, positive definite matrix, in short A = A^* \succ 0. Then, a weighted inner product is defined by

\langle x, y \rangle_A := \langle Ax, y \rangle_{\mathbb{F}^n}.

Weighted norms can be defined in different ways. For a general norm \lVert \cdot \rVert, a weighted version is given by

\lVert x \rVert_A := \lVert Ax \rVert

For the p-norms with p < \infty, the definition is usually changed to

\lVert x \rVert_{p, A} := \lVert A^{1/p} x \rVert,

where A^{1/p} is the p-th root of the matrix A. The reason for this definition is that for p = 2, this version is consistent with the inner product since \langle Ax, x \rangle = \langle A^{1/2} x, A^{1/2} x \rangle =
\lVert A^{1/2} x \rVert^2.
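
In ODL, weighted spaces of this kind are available through the weighting parameter of the space constructors; a small sketch for a diagonal weight A = \mathrm{diag}(1, 2, 4):

>>> space = odl.rn(3, weighting=[1.0, 2.0, 4.0])
>>> x = space.element([1.0, 1.0, 1.0])
>>> x.inner(x)  # 1*1 + 2*1 + 4*1
7.0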

Remark on matrices as operators

A matrix M \in \mathbb{F}^{m \times n} can be regarded as a linear operator

\mathcal{M} &: \mathbb{F}^n \to \mathbb{F}^m \\
\mathcal{M}(x) &:= M x

It is well known that in the standard case of a Euclidean space, the adjoint operator is simply defined with the conjugate transposed matrix:

\mathcal{M}^* &: \mathbb{F}^m \to \mathbb{F}^n \\
\mathcal{M}^*(y) &:= M^* y

However if the spaces \mathbb{F}^n and \mathbb{F}^m have weighted inner products, this identification is no longer valid. If \mathbb{F}^{n \times n} \ni A = A^* \succ 0 and \mathbb{F}^{m \times m} \ni B = B^* \succ 0 are the weighting matrices of the inner products, we get

\langle \mathcal{M}(x), y \rangle_B
&= \langle B\mathcal{M}(x), y \rangle_{\mathbb{F}^m}
= \langle M x, B y \rangle_{\mathbb{F}^m}
= \langle x, M^* B y \rangle_{\mathbb{F}^n} \\
&= \langle A^{-1} A x, M^* B y \rangle_{\mathbb{F}^n}
= \langle A x, A^{-1} M^* B y \rangle_{\mathbb{F}^n} \\
&= \langle x, A^{-1} M^* B y \rangle_A

Thus, the adjoint of the matrix operator between the weighted spaces is rather given as \mathcal{M}^*(y) = A^{-1} M^* B y.
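
The formula can be checked numerically; the following numpy sketch (with a random M and, for simplicity, diagonal positive definite weights A and B) verifies \langle \mathcal{M}(x), y \rangle_B = \langle x, A^{-1} M^* B y \rangle_A:

import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 3
M = rng.standard_normal((m, n))
A = np.diag(rng.uniform(1, 2, size=n))  # weight on F^n
B = np.diag(rng.uniform(1, 2, size=m))  # weight on F^m
x = rng.standard_normal(n)
y = rng.standard_normal(m)

lhs = np.dot(B @ (M @ x), y)             # <M(x), y>_B
M_adj = np.linalg.inv(A) @ M.T @ B       # adjoint w.r.t. the weighted inner products
rhs = np.dot(A @ x, M_adj @ y)           # <x, M^*(y)>_A
print(np.allclose(lhs, rhs))             # True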

Discretizations

Mathematical background

In mathematics, the term discretization stands for the transition from abstract, continuous, often infinite-dimensional objects to concrete, discrete, finite-dimensional counterparts. We define discretizations as tuples encompassing all necessary aspects involved in this transition. Let \mathcal{X} be an arbitrary set, \mathbb{F}^n be the set of n-tuples where each component lies in \mathbb{F}. We define two mappings

\mathcal{R}_\mathcal{X}: \mathcal{X} \to \mathbb{F}^n,

\mathcal{E}_\mathcal{X}: \mathbb{F}^n \to \mathcal{X},

which we call sampling and interpolation, respectively. Then, the discretization of \mathcal{X} with respect to \mathbb{F}^n and the above operators is defined as the tuple

\mathcal{D}(\mathcal{X}) = (\mathcal{X}, \mathbb{F}^n,
\mathcal{R}_\mathcal{X}, \mathcal{E}_\mathcal{X}).

The following abstract diagram visualizes a discretization:

_images/discr.png

TODO: write up in more detail

Example

Let \mathcal{X} = C([0, 1]) be the space of real-valued continuous functions on the interval [0, 1], and let x_1 < \dots < x_n be ordered sampling points in [0, 1].

Restriction operator:

We define the grid collocation operator as

\mathcal{C}: \mathcal{X} \to \mathbb{R}^n,

\mathcal{C}(f) := \big(f(x_1), \dots, f(x_n)\big).

The abstract object in this case is the input function f, and the operator evaluates this function at the given points, resulting in a vector in \mathbb{R}^n.

This operator is implemented as PointCollocation.

Extension operator:

Let discrete values \bar f \in \mathbb{R}^n be given. Consider the linear interpolation of those values at a point x \in [0, 1]:

I(\bar f; x) := (1 - \lambda(x)) f_i + \lambda(x) f_{i+1},

\lambda(x) = \frac{x - x_i}{x_{i+1} - x_i},

where i is the index such that x \in [x_i, x_{i+1}).

Then we can define the linear interpolation operator as

\mathcal{L} : \mathbb{R}^n \to C([0, 1]),

\mathcal{L}(\bar f) := I(\bar f; \cdot),

where I(\bar f; \cdot) stands for the function x \mapsto I(\bar f; x).

Hence, this operator maps the finite array \bar f \in \mathbb{R}^n to the abstract interpolating function I(\bar f; \cdot).

This interpolation scheme is implemented in the LinearInterpolation operator.
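
Combining the two, here is a small numpy sketch of sampling followed by linear interpolation (playing the roles of PointCollocation and LinearInterpolation, respectively):

import numpy as np

xs = np.linspace(0, 1, 5)         # ordered sampling points x_1 < ... < x_n
f_bar = np.sin(xs)                # collocation: (f(x_1), ..., f(x_n))

# Linear interpolation of the sampled values, evaluated at a new point
print(np.interp(0.3, xs, f_bar))  # close to sin(0.3)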

Resizing Operators

Introduction

In ODL, resizing of a discretized function is understood as the operation of shrinking or enlarging its domain in such a way that the size of the partition cells does not change. This “constant cell size” restriction is intentional since it ensures that the underlying operation can be implemented as array resizing without resampling, thus keeping those two functionalities separate (see Resampling).

Basic setting

Let now \mathbb{R}^n with n \in \mathbb{N} be the space of one-dimensional real vectors encoding values of a function defined on an interval [a, b] \subset \mathbb{R} (see Discretizations for details). Since values are not manipulated, the generalization to complex-valued functions is straightforward.

Restriction operator

We consider the space \mathbb{R}^m for an m < n \in \mathbb{N} and define the restriction operator

(1)R : \mathbb{R}^n \to \mathbb{R}^m, \quad R(x) := (x_p, \dots, x_{p+m-1})

with a given index 0 \leq p \leq n - m. Its adjoint with respect to the standard inner product is easy to determine:

\langle R(x), y \rangle_{\mathbb{R}^m}
&= \sum_{j=0}^{m-1} R(x)_j\, y_j
= \sum_{j=0}^{m-1} x_{p+j}\, y_j
= \sum_{j=p}^{p+m-1} x_j\, y_{j-p} \\
&= \sum_{i=0}^{n-1} x_i\, R^*(y)_i

with the zero-padding operator

(2)R^*(y)_i :=
\begin{cases}
y_{i-p} & \text{if } p \leq i \leq p + m - 1, \\
0       & \text{else.}
\end{cases}

In practice, this means that a new zero vector of size n is created, and the values y are filled in from index p onwards. It is also clear that the operator R is right-invertible by R^*, i.e. R R^* = \mathrm{Id}_{\mathbb{R}^m}. In fact, any operator of the form R^* + P, where P is linear and P(y)_i = 0 for i \in \{p, \dots, p+m-1\}, acts as a right inverse for R. On the other hand, R has no left inverse since it has a non-trivial kernel (null space) \mathrm{ker}\, R = \{x \in \mathbb{R}^n\,|\,x_i = 0 \text{ for } i = p, \dots, p+m-1\}.
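
The adjoint relation between (1) and (2) is easy to verify numerically; a small numpy sketch:

import numpy as np

n, m, p = 7, 3, 2
restrict = lambda x: x[p:p + m]  # R as in (1)

def zero_pad(y):                 # R^* as in (2)
    out = np.zeros(n)
    out[p:p + m] = y
    return out

rng = np.random.default_rng(0)
x, y = rng.standard_normal(n), rng.standard_normal(m)
print(np.isclose(np.dot(restrict(x), y), np.dot(x, zero_pad(y))))  # True
print(np.allclose(restrict(zero_pad(y)), y))                       # R R^* = Id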

Extension operators

Now we study the opposite case of resizing, namely extending a vector. We thus choose m > n and consider different cases of enlarging a given vector x \in \mathbb{R}^n to a vector in \mathbb{R}^m. The start index is again denoted by p and needs to fulfill 0 \leq p \leq m - n, such that a vector of length n “fits into” a vector of length m when starting at index p.

It should be noted that all extension operators mentioned here are of the above form R^* + P with P acting on the “outer” indices only. Hence they all act as right inverses for the restriction operator R. This property can also be read as the fact that all extension operators are left-inverted by the restriction operator R.

Moreover, the “mixed” case, i.e. the combination of restriction and extension which would occur e.g. for a constant index shift x \mapsto (0, \dots, 0, x_0, \dots, x_{n-p-1}), is not considered here. It can be represented by a combination of the two “pure” operations.

Zero padding

In this most basic padding variant, one fills the missing values in the target vector with zeros, yielding the operator

(3)E_{\mathrm{z}} : \mathbb{R}^n \to \mathbb{R}^m, \quad E_{\mathrm{z}}(x)_j :=
\begin{cases}
x_{j-p}, & \text{if } p \leq j \leq p + n - 1, \\
0      , & \text{else}.
\end{cases}

Note that this is the adjoint of the restriction operator R defined in (1). Hence, its adjoint is given by the restriction, E_{\mathrm{z}}^* = R.

Constant padding

In constant padding with constant c, the extra zeros in (3) are replaced with c. Hence, the operator performing constant padding can be written as E_{\mathrm{c}} = E_{\mathrm{z}} + P_{\mathrm{c}}, where the second summand is given by

P_{\mathrm{c}}(x)_j =
\begin{cases}
0      , & \text{if } p \leq j \leq p + n - 1, \\
c      , & \text{else}.
\end{cases}

Note that this operator is not linear, and its derivative is the zero operator, hence the derivative of the constant padding operator is E_{\mathrm{c}}' = E_{\mathrm{z}}.

Periodic padding

This padding mode continues the original vector x periodically in both directions. For reasons of practicability, at most one whole copy is allowed on both sides, which means that the numbers n, m and p need to fulfill p \leq n (“left” padding amount) and m - (p + n) \leq n (“right” padding amount). The periodic padding operator is then defined as

(4)E_{\mathrm{p}}(x)_j :=
\begin{cases}
x_{j-p + n}, & \text{if } 0 \leq j \leq p - 1,     \\
x_{j-p},     & \text{if } p \leq j \leq p + n - 1, \\
x_{j-p - n}, & \text{if } p + n \leq j \leq m - 1.
\end{cases}

Hence, one can at most get 3 full periods with m = 3n and p = n. Again, this operator can be written as E_{\mathrm{p}} = E_{\mathrm{z}} + P_{\mathrm{p}} with an operator

P_{\mathrm{p}}(x)_j :=
\begin{cases}
x_{j-p + n}, & \text{if } 0 \leq j \leq p - 1,     \\
0,           & \text{if } p \leq j \leq p + n - 1, \\
x_{j-p - n}, & \text{if } p + n \leq j \leq m - 1.
\end{cases}

For the adjoint of P_{\mathrm{p}}, we calculate

\langle P_{\mathrm{p}}(x), y \rangle_{\mathbb{R}^m}
&= \sum_{j=0}^{p-1} x_{j-p+n}\, y_j + \sum_{j=p+n}^{m-1} x_{j-p-n}\, y_j \\
&= \sum_{i=n-p}^{n-1} x_i\, y_{i+p-n} + \sum_{i=0}^{m-n-p-1} x_i\, y_{i+p+n} \\
&= \sum_{i=0}^{n-1} x_i\, \big( P_{\mathrm{p},1}^*(y)_i + P_{\mathrm{p},2}^*(y)_i \big)

with

P_{\mathrm{p},1}^*(y)_i :=
\begin{cases}
y_{i+p-n}, & \text{if } n - p \leq i \leq n - 1, \\
0,         & \text{else},
\end{cases}

and

P_{\mathrm{p},2}^*(y)_i :=
\begin{cases}
y_{i+p+n}, & \text{if } 0 \leq i \leq m - n - p - 1, \\
0,         & \text{else}.
\end{cases}

In practice, this means that, besides copying the values from the indices p, \dots, p+n-1 of a vector y \in \mathbb{R}^m to a new vector x \in \mathbb{R}^n, the values corresponding to the other indices are added to the vector x as follows. The last m - n - p entries of y (a non-positive count means none) are added to the first m - n - p entries of x, in the same ascending order. The first p entries of y are added to the last p entries of x, again keeping the order. This procedure can be interpreted as “folding back” the periodized structure of y into a single period x by adding the values from the two side periods.

Symmetric padding

In symmetric padding mode, a given vector is extended by mirroring at the outermost nodes to the desired extent. By convention, the outermost values are not repeated, and as in periodic mode, the input vector is re-used at most once on both sides. Since the outermost values are not doubled, the numbers n, m and p need to fulfill the relations p \leq n - 1 (“left” padding amount) and m - (p + n) \leq n - 1 (“right” padding amount). Now the symmetric padding operator is defined as

(5)E_{\mathrm{s}}(x)_j :=
\begin{cases}
x_{p-j},      & \text{if } 0 \leq j \leq p - 1,      \\
x_{j-p},      & \text{if } p \leq j \leq p + n - 1,  \\
x_{2n-2+p-j}, & \text{if } p + n \leq j \leq m - 1.
\end{cases}

This operator is the sum of the zero-padding operator E_{\mathrm{z}} and

P_{\mathrm{s}}(x)_j :=
\begin{cases}
x_{p-j},      & \text{if } 0 \leq j \leq p - 1,      \\
0,            & \text{if } p \leq j \leq p + n - 1,  \\
x_{2n-2+p-j}, & \text{if } p + n \leq j \leq m - 1.
\end{cases}

For its adjoint, we compute

\langle P_{\mathrm{s}}(x), y \rangle_{\mathbb{R}^m}
&= \sum_{j=0}^{p-1} x_{p-j}\, y_j + \sum_{j=p+n}^{m-1} x_{2n-2+p-j}\, y_j \\
&= \sum_{i=1}^p x_i\, y_{p-i} + \sum_{i=2n-1+p-m}^{n-2} x_i\, y_{2n-2+p-i} \\
&= \sum_{i=0}^{n-1} x_i\, \big( P_{\mathrm{s},1}^*(y)_i + P_{\mathrm{s},2}^*(y)_i \big)

with

P_{\mathrm{s},1}^*(y)_i :=
\begin{cases}
y_{p-i},   & \text{if } 1 \leq i \leq p, \\
0,         & \text{else},
\end{cases}

and

P_{\mathrm{s},2}^*(y)_i :=
\begin{cases}
y_{2n-2+p-i}, & \text{if } 2n - 1 + p - m \leq i \leq n - 2, \\
0,            & \text{else}.
\end{cases}

Note that the index condition m - (p + n) \leq n - 1 is equivalent to 2n - 1 + p - m \geq 0, hence the index range in the definition of P_{\mathrm{s},2}^* is well-defined.

Practically, the evaluation of E_{\mathrm{s}}^* consists in copying the “main” part of y \in \mathbb{R}^m corresponding to the indices p, \dots, p + n - 1 to x \in \mathbb{R}^n and updating the vector additively as follows. The values at indices 1 to p are updated with the values of y mirrored at the index position p, i.e. in reversed order. The values at the indices 2n - 1 + p - m to n - 2 are updated with the values of y mirrored at the position 2n - 2 + p, again in reversed order. This procedure can be interpreted as “mirroring back” the outer two parts of the vector y at the indices p and 2n - 2 + p, adding those parts to the “main” vector.

Order 0 padding

Padding with order 0 consistency means continuing the vector constantly beyond its boundaries, i.e.

(6)E_{\mathrm{o0}}(x)_j :=
\begin{cases}
x_0,     & \text{if } 0 \leq j \leq p - 1,      \\
x_{j-p}, & \text{if } p \leq j \leq p + n - 1,  \\
x_{n-1}, & \text{if } p + n \leq j \leq m - 1.
\end{cases}

This operator is the sum of the zero-padding operator and

P_{\mathrm{o0}}(x)_j :=
\begin{cases}
x_0,     & \text{if } 0 \leq j \leq p - 1,      \\
0,       & \text{if } p \leq j \leq p + n - 1,  \\
x_{n-1}, & \text{if } p + n \leq j \leq m - 1.
\end{cases}

We calculate the adjoint of P_{\mathrm{o0}}:

\langle P_{\mathrm{o0}}(x), y \rangle_{\mathbb{R}^m}
&= \sum_{j=0}^{p-1} x_0\, y_j + \sum_{j=p+n}^{m-1} x_{n-1}\, y_j \\
&= x_0 \sum_{j=0}^{p-1} y_j + x_{n-1} \sum_{j=p+n}^{m-1} y_j \\
&= x_0 M_{\mathrm{l},0}(y) + x_{n-1} M_{\mathrm{r},0}(y)

with the zero’th order moments

M_{\mathrm{l},0}(y) := \sum_{j=0}^{p-1} y_j, \quad M_{\mathrm{r},0}(y) := \sum_{j=p+n}^{m-1} y_j.

Hence, we get

P_{\mathrm{o0}}^*(y)_i :=
\begin{cases}
M_{\mathrm{l},0}(y), & \text{if } i = 0,     \\
M_{\mathrm{r},0}(y), & \text{if } i = n - 1, \\
0,                   & \text{else},
\end{cases}

with the convention that the sum of the two values is taken in the case n = 1, i.e. both first cases are the same. Hence, after constructing the restriction x \in \mathbb{R}^n of a vector y \in \mathbb{R}^m to the main part p, \dots, p + n - 1, the sum of the entries to the left is added to x_0, and the sum of the entries to the right is added to x_{n-1}.

Order 1 padding

In this padding mode, a given vector is continued with constant slope instead of constant value, i.e.

(7)E_{\mathrm{o1}}(x)_j :=
\begin{cases}
 x_0 + (j - p)(x_1 - x_0),                     & \text{if } 0 \leq j \leq p - 1,      \\
 x_{j-p},                                      & \text{if } p \leq j \leq p + n - 1,  \\
 x_{n-1} + (j - p - n + 1)(x_{n-1} - x_{n-2}), & \text{if } p + n \leq j \leq m - 1.
\end{cases}

We can write this operator as E_{\mathrm{o1}} = E_{\mathrm{o0}} + S_{\mathrm{o1}} with the order-1 specific part

S_{\mathrm{o1}}(x)_j :=
\begin{cases}
(j - p)(x_1 - x_0),                 & \text{if } 0 \leq j \leq p - 1,      \\
0,                                  & \text{if } p \leq j \leq p + n - 1,  \\
(j - p - n + 1)(x_{n-1} - x_{n-2}), & \text{if } p + n \leq j \leq m - 1.
\end{cases}

For its adjoint, we get

\langle S_{\mathrm{o1}}(x), y \rangle_{\mathbb{R}^m}
&= \sum_{j=0}^{p-1} (j - p)(x_1 - x_0)\, y_j +
   \sum_{j=p+n}^{m-1} (j - p - n + 1)(x_{n-1} - x_{n-2})\, y_j \\
&= x_0 (-M_{\mathrm{l},1}(y)) + x_1 M_{\mathrm{l},1}(y) +
   x_{n-2}(-M_{\mathrm{r},1}(y)) + x_{n-1} M_{\mathrm{r},1}(y)

with the first order moments

M_{\mathrm{l},1}(y) := \sum_{j=0}^{p-1} (j - p)\, y_j, \quad
M_{\mathrm{r},1}(y) := \sum_{j=p+n}^{m-1} (j - p - n + 1)\, y_j.

Hence, the order-1 specific operator has the adjoint

S_{\mathrm{o1}}^*(y)_i :=
\begin{cases}
-M_{\mathrm{l},1}(y), & \text{if } i = 0,     \\
M_{\mathrm{l},1}(y),  & \text{if } i = 1,     \\
-M_{\mathrm{r},1}(y), & \text{if } i = n - 2, \\
M_{\mathrm{r},1}(y),  & \text{if } i = n - 1, \\
0,                  & \text{else},
\end{cases}

with the convention of summing values for overlapping cases, i.e. if n \leq 3. In practice, the adjoint for the order 1 padding case is applied by computing the zero’th and first order moments of y and adding them to the outermost entries of x according to the above rules.

Generalization to arbitrary dimension

Fortunately, all operations are completely separable with respect to (coordinate) axes, i.e. resizing in higher-dimensional spaces can be written as a series of one-dimensional resizing operations. One particular issue should be mentioned with the extension operators and their adjoints, though. When extending a small, e.g., two-dimensional array to a larger size, there is an ambiguity in how the corner blocks should be handled. One possibility would be to use the small array size for the extension in both axes, which would leave the corner blocks untouched (usually initialized to 0):

_images/resize_small.svg

However, this is not the behavior one would often want in practice. Instead, it is much more reasonable to also fill the corners in the same way the “inner” parts have been extended:

_images/resize_large.svg

This latter behavior is implemented in the resizing operators in ODL.

The adjoint operators of these “corner-filling” resizing operators are given by reversing the unfolding pattern, i.e. by “folding in” the large array axis by axis according to the adjoint formula for the given padding mode. This way, the corners also contribute to the final result, which leads to the correct adjoint of the 2D resizing operator. Of course, the same principle can easily be generalized to arbitrary dimension.
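
In ODL, these resizing operations are available through ResizingOperator; a small sketch (assuming the keyword names used here):

>>> space = odl.uniform_discr([0, 0], [1, 1], (4, 4))
>>> resize = odl.ResizingOperator(space, ran_shp=(8, 8), pad_mode='periodic')
>>> y = resize(space.one())
>>> resize.adjoint(y) in space
True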

On the different notions of derivative

The concept of a derivative is one of the core concepts of mathematical analysis, and it is essential whenever a linear approximation of a function at some point is required. Since the notion of derivative has different meanings in different contexts, this guide has been written to introduce the different derivative concepts used in ODL.

In short, different notions of derivatives that will be discussed here are:

  • Derivative. When we write “derivative” in ODL code and documentation, we mean the derivative of an Operator A : \mathcal{X} \rightarrow \mathcal{Y} w.r.t. a disturbance in x, i.e. a linear approximation of A(x + h) for small h. The derivative at a point x is an Operator A'(x) : \mathcal{X} \rightarrow \mathcal{Y}.
  • Gradient. If the operator A is a functional, i.e. A : \mathcal{X} \rightarrow \mathbb{R}, then the gradient is the direction in which A increases the most. The gradient at a point x is a vector [\nabla A](x) in \mathcal{X} such that A'(x)(y) = \langle [\nabla A](x), y \rangle. The gradient operator is the operator x \rightarrow [\nabla A](x).
  • Hessian. The Hessian at a point x is the derivative operator of the gradient operator, i.e. H(x) = [\nabla A]'(x).
  • Spatial Gradient. The spatial gradient is only defined for spaces \mathcal{F}(\Omega, \mathbb{F}) whose elements are functions over some domain \Omega \subset \mathbb{R}^d taking values in \mathbb{R} or \mathbb{C}. It can be seen as a vectorized version of the usual gradient, taken in each point in \Omega.
  • Subgradient. The subgradient extends the notion of derivative to any convex functional and is used in some optimization solvers where the objective function is not differentiable.

Derivative

The derivative is usually introduced for functions f: \mathbb{R} \rightarrow \mathbb{R} via the limit

f'(x) = \lim_{h \rightarrow 0} \frac{f(x + h) - f(x)}{h}.

Here we say that the derivative of f in x is f'(x).

This limit makes sense in one dimension, but once we start considering functions in higher dimension we get into trouble. Consider f: \mathbb{R}^n \rightarrow \mathbb{R}^m – what would h mean in this case? An extension is the concept of a directional derivative. The derivative of f in x in direction d is f'(x)(d):

f'(x)(d) = \lim_{h \rightarrow 0} \frac{f(x + dh) - f(x)}{h}.

Here we see (as implied by the notation) that f'(x) is actually an operator

f'(x) : \mathbb{R}^n \rightarrow \mathbb{R}^m.

We can rewrite this using the explicit requirement that f'(x) is a linear approximation of f at x, i.e.

\lim_{\| d \| \rightarrow 0} \frac{\| f(x + d) - f(x) - f'(x)(d) \|}{\| d \|} = 0

This notion naturally extends to an Operator f : \mathcal{X} \rightarrow \mathcal{Y} between Banach spaces \mathcal{X} and \mathcal{Y} with norms \| \cdot \|_\mathcal{X} and \| \cdot \|_\mathcal{Y}, respectively. Here f'(x) is defined as the linear operator (if it exists) that satisfies

\lim_{\| d \| \rightarrow 0} \frac{\| f(x + d) - f(x) - f'(x)(d) \|_\mathcal{Y}}{\| d \|_\mathcal{X}} = 0

This definition of the derivative is called the Fréchet derivative.

The Gateaux derivative

The concept of directional derivative can also be extended to Banach spaces, giving the Gateaux derivative. The Gateaux derivative is more general than the Fréchet derivative, but is not always a linear operator. An example of a function that is Gateaux differentiable but not Fréchet differentiable is the absolute value function. For this reason, when we write “derivative” in ODL, we generally mean the Fréchet derivative, but in some cases the Gateaux derivative can be used via duck-typing.

Rules for the Fréchet derivative

Many of the usual rules for derivatives also hold for the Fréchet derivative, i.e.

  • Linearity

    (a f + b g)'(x)(y) = a f'(x)(y) + b g'(x)(y)

  • Chain rule

    (g \circ f)'(x)(y) = g'(f(x))(f'(x)(y))

  • Linear operators are their own derivatives. If f is linear, then

    f'(x)(y) = f(y)

Implementations in ODL
  • The derivative is implemented in ODL for Operator objects via the Operator.derivative method.
  • It can be numerically computed using the NumericalDerivative operator.
  • Many of the operator arithmetic classes implement the usual rules for the Fréchet derivative, such as the chain rule, distributivity over addition etc.
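
As a small illustration of the last rule above, a linear operator is its own derivative (a sketch using ScalingOperator):

>>> op = odl.ScalingOperator(odl.rn(3), 2.0)  # linear operator x -> 2 * x
>>> x = op.domain.element([1.0, 2.0, 3.0])
>>> op.derivative(x)(x) == op(x)
True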

Gradient

In the classical setting of functionals f : \mathbb{R}^n \rightarrow \mathbb{R}, the gradient is the vector

\nabla f =
\begin{bmatrix}
    \dfrac{\partial f}{\partial x_1} \\
    \vdots \\
    \dfrac{\partial f}{\partial x_n}
\end{bmatrix}

This can be generalized to the setting of functionals f : \mathcal{X} \rightarrow \mathbb{R} mapping elements in some Banach space \mathcal{X} to the real numbers by noting that the Fréchet derivative can be written as

f'(x)(y) = \langle y, [\nabla f](x) \rangle,

where [\nabla f](x) lies in the dual space of \mathcal{X}, denoted \mathcal{X}^*. Most spaces in ODL are Hilbert spaces, where \mathcal{X} = \mathcal{X}^* by the Riesz representation theorem, and hence [\nabla f](x) \in \mathcal{X}.

We call the (possibly nonlinear) operator x \rightarrow [\nabla f](x) the Gradient operator of f.

Implementations in ODL
  • The gradient is implemented in ODL for Functional objects via the Functional.gradient method.
  • It can be numerically computed using the NumericalGradient operator.
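
A small sketch for the squared L^2 norm, whose gradient is [\nabla f](x) = 2 x:

>>> func = odl.solvers.L2NormSquared(odl.rn(3))
>>> x = func.domain.element([1.0, 2.0, 3.0])
>>> func.gradient(x) == 2 * x
True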

Hessian

In the classical setting of functionals f : \mathbb{R}^n \rightarrow \mathbb{R}, the Hessian in a point x is the matrix H(x) such that

H(x) =
\begin{bmatrix}
\dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1\,\partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_1\,\partial x_n} \\
\dfrac{\partial^2 f}{\partial x_2\,\partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2} & \cdots & \dfrac{\partial^2 f}{\partial x_2\,\partial x_n} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial^2 f}{\partial x_n\,\partial x_1} & \dfrac{\partial^2 f}{\partial x_n\,\partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2}
\end{bmatrix}

with the derivatives evaluated at the point x. It has the property that the quadratic expansion of f is

f(x + d) = f(x) + \langle d, [\nabla f](x)\rangle + \frac{1}{2}\langle d, [H(x)](d)\rangle + o(\|d\|^2)

but also that the derivative of the gradient operator is

[\nabla f](x + d) = [\nabla f](x) + [H(x)](d) + o(\|d\|)

If we take this second property as the definition of the Hessian, it can easily be generalized to the setting of functionals f : \mathcal{X} \rightarrow \mathbb{R} mapping elements in some Hilbert space \mathcal{X} to the real numbers.

Implementations in ODL

The Hessian is not explicitly implemented anywhere in ODL. Instead it can be used in the form of the derivative of the gradient operator. This is however not implemented for all functionals.

  • For an example of a functional whose gradient has a derivative, see RosenbrockFunctional.
  • It can be computed by taking the NumericalDerivative of the gradient, which can in turn be computed using the NumericalGradient.
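
A small sketch along these lines (assuming RosenbrockFunctional is constructed from its domain only):

>>> func = odl.solvers.RosenbrockFunctional(odl.rn(2))
>>> x = func.domain.element([1.0, 2.0])
>>> hessian = func.gradient.derivative(x)  # the Hessian at x as a linear operator
>>> hessian.is_linear
True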

Spatial Gradient

The spatial gradient of a function f \in \mathcal{F}(\Omega, \mathbb{R}) is an element in the function space \mathcal{F}(\Omega, \mathbb{R}^d) such that for any x \in \Omega and any direction d \in \mathbb{R}^d:

\lim_{h \rightarrow 0} \frac{\| f(x + h d) - f(x) - \langle h d, [\text{grad} f](x) \rangle \|}{h} = 0

Implementations in ODL
  • The spatial gradient is implemented in ODL in the Gradient operator.
  • Several related operators such as the PartialDerivative and Laplacian are also available.
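
A small sketch:

>>> space = odl.uniform_discr([0, 0], [1, 1], (100, 100))
>>> f = space.element(lambda x: x[0] ** 2 + x[1])
>>> grad = odl.Gradient(space)
>>> grad(f) in grad.range  # an element of the product space space^2
True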

Subgradient

The Subgradient (also subderivative or subdifferential) of a convex function f : \mathcal{X} \rightarrow \mathbb{R}, mapping a Banach space \mathcal{X} to \mathbb{R}, is defined as the set-valued function \partial f : \mathcal{X} \rightarrow 2^{\mathcal{X}^*} whose values are:

[\partial f](x_0) = \{c \in \mathcal{X}^* \,:\, f(x) - f(x_0) \geq \langle c , x - x_0 \rangle \ \forall x \in \mathcal{X} \}

For functions that are differentiable in the usual sense, this reduces to the set containing only the usual gradient.

Implementations in ODL

The subgradient is not explicitly implemented in ODL, but is implicitly used in the proximal operators. See Proximal Operators for more information.

Transformations

This section contains the mathematical descriptions of (integral) transforms implemented in ODL.

Fourier Transform

Background
Definition and basic properties

The Fourier Transform (FT) of a function f belonging to the Lebesgue Space L^1(\mathbb{R}, \mathbb{C}) is defined as

(1)\widehat{f}(\xi) = \mathcal{F}(f)(\xi) = (2\pi)^{-\frac{1}{2}}
\int_{\mathbb{R}} f(x)\ e^{-i x \xi} \, \mathrm{d}x.

(Note that this definition differs from the one in the linked article by the placement of the factor 2\pi.) By unique continuation, the bounded FT operator can be extended to L^p(\mathbb{R}, \mathbb{C}) for p \in [1, 2], yielding a mapping

\mathcal{F}: L^p(\mathbb{R}, \mathbb{C}) \longrightarrow L^q(\mathbb{R}, \mathbb{C}),
\quad q = \frac{p}{p-1},

where q is the conjugate exponent of p (for p=1 one sets q=\infty). Finite exponents larger than 2 also allow the extension of the operator but require the notion of Distributions to characterize its range. See [SW1971] for further details.

The inverse of \mathcal{F} on its range is given by the formula

(2)\widetilde{\phi}(x) = \mathcal{F}^{-1}(\phi)(x) = (2\pi)^{-\frac{1}{2}}
\int_{\mathbb{R}} \phi(\xi)\ e^{i \xi x}\, \mathrm{d}\xi.

For p = 2, the conjugate exponent is q = 2, and the FT is a unitary operator on L^2(\mathbb{R}) according to Parseval’s Identity

\int_{\mathbb{R}} \lvert f(x)\rvert^2\, \mathrm{d}x =
\int_{\mathbb{R}} \lvert \widehat{f}(\xi) \rvert^2\, \mathrm{d}\xi,

which implies that its adjoint is its inverse, \mathcal{F}^* = \mathcal{F}^{-1}.

Further Properties

(3)\mathcal{F}^{-1}(\phi) = \mathcal{F}(\check\phi) = \mathcal{F}(\phi)(-\cdot)
= \overline{\mathcal{F}(\overline{\phi})} = \mathcal{F}^3(\phi),
\quad \check\phi(x) = \phi(-x),

\mathcal{F}\big(f(\cdot - b)\big)(\xi) = e^{-i b \xi} \widehat{f}(\xi),

\mathcal{F}\big(f(a \cdot)\big)(\xi) = a^{-1} \widehat{f}(a^{-1}\xi),

\frac{\mathrm{d}}{\mathrm{d} \xi} \widehat{f}(\xi) = \mathcal{F}(-i x f)(\xi)

\mathcal{F}(f')(\xi) = i \xi \widehat{f}(\xi).

The first identity implies in particular that for real-valued f, one has \overline{\mathcal{F}(f)}(\xi) = \mathcal{F}(f)(-\xi), i.e. the FT is already completely known from its values in a half-space only. This property is later exploited to reduce storage.

In d dimensions, the FT is defined as

\mathcal{F}(f)(\xi) = (2\pi)^{-\frac{d}{2}}
\int_{\mathbb{R}^d} f(x)\ e^{-i x^{\mathrm{T}}\xi} \, \mathrm{d}x

with the usual inner product x^{\mathrm{T}}\xi = \sum_{k=1}^d x_k \xi_k in \mathbb{R}^d. The identities (3) also hold in this case with obvious modifications.

Discretized Fourier Transform
General case

The approach taken in ODL for the discretization of the FT follows immediately from the way Discretizations are defined, but the original inspiration for it came from the book [Pre+2007], Section 13.9 “Computing Fourier Integrals Using the FFT”.

Discretization of the Fourier transform operator means evaluating the Fourier integral (1) on a discretized function

(4)f(x) = \sum_{k=0}^{n-1} f_k \phi_k(x)

with coefficients \bar f = (f_0, \dots, f_{n-1}) \in \mathbb{C}^n and functions \phi_0, \dots, \phi_{n-1}. We consider in particular functions generated from a single kernel \phi via

\phi_k(x) = \phi\left( \frac{x - x_k}{s_k} \right),

where x_0 < \dots < x_{n-1} are sampling points and s_k > 0 scaling factors. Using the shift and scaling properties in (3) yields

(5)\widehat{f}(\xi) = \sum_{k=0}^{n-1} f_k \widehat{\phi_k}(\xi) =
\sum_{k=0}^{n-1} f_k\, s_k \widehat{\phi}(s_k\xi) e^{-i x_k \xi}.

There exist methods for the fast approximation of such sums for a general choice of frequency samples \xi_m, e.g. NFFT.

Regular grids

For regular grids

(6)x_k = x_0 + ks, \quad \xi_j = \xi_0 + j\sigma,

the evaluation of the integral can be written in a form that uses trigonometric sums as computed in FFTW or in Numpy:

(7)\hat f_j = \sum_{k=0}^{n-1} f_k e^{-i 2\pi jk/n}.

Hence, the Fourier integral evaluation can be built around established libraries with simple pre- and post-processing steps.

With regular grids, the discretized integral (5), evaluated at \xi = \xi_j, can be expanded to

\widehat{f}(\xi_j) = s \widehat{\phi}(s\xi_j) e^{-i x_0\xi_j}
\sum_{k=0}^{n-1} f_k\, e^{-i k s \xi_0}\, e^{-i jk s\sigma}

To reach the form (7), the factor depending on both indices j and k must agree with the corresponding factor in the FFT sum. This is achieved by setting

(8)\sigma = \frac{2\pi}{ns},

finally yielding the representation

(9)\hat f_j = \widehat{f}(\xi_j) = s \widehat{\phi}(s\xi_j) e^{-i x_0\xi_j}
\sum_{k=0}^{n-1} f_k\, e^{-i k s \xi_0}\, e^{-i 2\pi jk/n}.

Choice of \xi_0

There is a certain degree of freedom in the choice of the most negative frequency \xi_0. Usually one wants to center the Fourier space grid around zero since most information is typically concentrated there. Point-symmetric grids are the standard choice, however sometimes one explicitly wants to include (for even n) or exclude (for odd n) the zero frequency from the grid, which is achieved by shifting the frequency \xi_0 by -\sigma/2. This results in two possible choices

\xi_{0, \mathrm{n}} = -\frac{\pi}{s} + \frac{\pi}{sn} \quad \text{(no shift)},

\xi_{0, \mathrm{s}} = -\frac{\pi}{s} \quad \text{(shift)}.

For the shifted frequency, the pre-processing factor in the sum in (9) can be simplified to

e^{-i k s \xi_0} = e^{i k \pi} = (-1)^k,

which is favorable for real-valued input \bar f since this first operation preserves this property. For half-complex transforms, shifting is required.

The factor \widehat{\phi}(s\xi_j)

In (9), the FT of the kernel \phi appears as post-processing factor. We give the explicit formulas for the two standard discretizations currently used in ODL, which are nearest neighbor interpolation

\phi_{\mathrm{nn}}(x) =
\begin{cases}
    1, & \text{if } -1/2 \leq x < 1/2, \\
    0, & \text{else,}
\end{cases}

and linear interpolation

\phi_{\mathrm{lin}}(x) =
\begin{cases}
    1 - \lvert x \rvert, & \text{if } -1 \leq x \leq 1, \\
    0, & \text{else.}
\end{cases}

Their Fourier transforms are given by

\widehat{\phi_{\mathrm{nn}}}(\xi) = (2\pi)^{-1/2} \mathrm{sinc}(\xi/2),

\widehat{\phi_{\mathrm{lin}}}(\xi) = (2\pi)^{-1/2} \mathrm{sinc}^2(\xi/2).

Since their arguments s\xi_j = s\xi_0 + 2\pi j/n lie between -\pi and \pi, these functions introduce only a slight taper towards higher frequencies given the fact that the first zeros lie at \pm 2\pi.

Inverse transform

According to (2), the inverse of the continuous Fourier transform is given by the same formula as the forward transform (1), except for a switched sign in the complex exponential. Hence, this operator can be viewed as a variation of the forward FT, and it is implemented via a sign parameter in FourierTransform.

The inverse of the discretized formula (9) is instead obtained directly using the identity

(10)\sum_{j=0}^{N-1} e^{i 2\pi \frac{(l-k)j}{N}}
&= \sum_{j=0}^{N-1} \Big( e^{i 2\pi \frac{(l-k)}{N}} \Big)^j =
\begin{cases}
  N, & \text{if } l = k, \\
  \frac{1 - e^{i 2\pi (l-k)}}{1 - e^{i 2\pi (l-k)/N}} = 0, & \text{else}
\end{cases}\\
&= N\, \delta_{l, k}.

By dividing (9) by the factor

\alpha_j = s\widehat{\phi}(s\xi_j)\, e^{- i x_0 \xi_j}

before the sum, abbreviating the pre-processed coefficients as \bar f_k := f_k\, e^{-i k s \xi_0}, multiplying with the exponential factor e^{i 2\pi \frac{lj}{N}} and summing over j, the coefficients f_k can be recovered:

\sum_{j=0}^{N-1} \hat f_j\, \frac{1}{\alpha_j}\, e^{i 2\pi \frac{lj}{N}}
&= \sum_{j=0}^{N-1} \sum_{k=0}^{N-1} \bar f_k\, e^{- i 2\pi \frac{jk}{N}}
e^{i 2\pi \frac{lj}{N}} \\
&= \sum_{k=0}^{N-1} \bar f_k\, N \delta_{l,k}
= N\, \bar f_l.

Hence, the inversion formula for the discretized FT reads as

(11)f_k = e^{i k s\xi_0}\, \frac{1}{N} \sum_{j=0}^{N-1} \hat f_j
\, \frac{1}{s\widehat{\phi}(s\xi_j)}\, e^{i x_0\xi_j}\, e^{i 2\pi \frac{kj}{N}},

which can be calculated in the same manner as the forward FT, basically by switching the roles of pre- and post-processing steps and flipping the sign in the complex exponentials.

Adjoint operator

If the FT is defined between the complex Hilbert spaces L^2(\mathbb{R}, \mathbb{C}), one can easily show that the operator is unitary, and therefore its adjoint is equal to the inverse.

However, if the domain is a real space, L^2(\mathbb{R}, \mathbb{R}), one cannot even speak of a linear operator since the property

\mathcal{F}(\alpha f) = \alpha \mathcal{F}(f)

cannot be tested for all \alpha \in \mathbb{C} as required by the right-hand side, since on the left-hand side, \alpha f needs to be real. This issue can be remedied by identifying the real and imaginary parts in the range with components of a product space element:

\widetilde{\mathcal{F}}: L^2(\mathbb{R}, \mathbb{R}) \longrightarrow
\big[L^2(\mathbb{R}, \mathbb{R})\big]^2,

\widetilde{\mathcal{F}}(f) = \big(\Re \big(\mathcal{F}(f)\big), \Im \big(\mathcal{F}(f)\big)\big) =
\big( \mathcal{F}_{\mathrm{c}}(f), -\mathcal{F}_{\mathrm{s}}(f) \big),

where \mathcal{F}_{\mathrm{c}} and \mathcal{F}_{\mathrm{s}} are the cosine and sine transforms, respectively. Those two operators are self-adjoint between real Hilbert spaces, and thus the adjoint of the above defined transform is given by

\widetilde{\mathcal{F}}^*: \big[L^2(\mathbb{R}, \mathbb{R})\big]^2 \longrightarrow
L^2(\mathbb{R}, \mathbb{R})

\widetilde{\mathcal{F}}^*(g_1, g_2) = \mathcal{F}_{\mathrm{c}}(g_1) -
\mathcal{F}_{\mathrm{s}}(g_2).

If we compare this result to the “naive” approach of taking the real part of the complex inverse transform, we get

\begin{align*}
    \Re\big( \mathcal{F}^*(g) \big)
    &= \Re\big( \mathcal{F}_{\mathrm{c}}(g) + i \mathcal{F}_{\mathrm{s}}(g) \big)\\
    &= \Re\big( \mathcal{F}_{\mathrm{c}}(\Re g) + i \mathcal{F}_{\mathrm{c}}(\Im g)
    + i \mathcal{F}_{\mathrm{s}}(\Re g) - \mathcal{F}_{\mathrm{s}}(\Im g) \big)\\
    &= \mathcal{F}_{\mathrm{c}}(\Re g) - \mathcal{F}_{\mathrm{s}}(\Im g).
\end{align*}

Hence, by identifying g_1 = \Re g and g_2 = \Im g, we see that the result is the same. Therefore, using the naive approach for the adjoint operator is justified by this argument.

Solvers

Section about solvers for optimization problems in ODL and related topics.

Chambolle-Pock algorithm

This page introduces the mathematics behind the Chambolle-Pock algorithm. For an applied point of view, please see the user’s guide to this method.

The general problem

The Chambolle-Pock (CP) algorithm, as proposed in [CP2011a], is a first order primal-dual hybrid-gradient method for non-smooth convex optimization problems with known saddle-point structure

\max_{y \in Y} \min_{x \in X} \big( \langle L x, y\rangle_Y + g(x) - f^*(y) \big) ,

where X and Y are Hilbert spaces with inner product \langle\cdot,\cdot\rangle and norm \|\cdot\|_2 = \langle\cdot,\cdot\rangle^{1/2}, L is a continuous linear operator L: X \to Y, g: X \to [0,+\infty] and f: Y \to [0,+\infty] are proper, convex and lower semi-continuous functionals, and f^* is the convex (or Fenchel) conjugate of f (see convex conjugate).

The saddle-point problem is a primal-dual formulation of the primal minimization problem

\min_{x \in X} \big( g(x) + f(L x) \big).

The corresponding dual maximization problem is

\max_{y \in Y} \big( -g^*(-L^* y) - f^*(y) \big)

with L^* being the adjoint of the operator L.

The algorithm

The CP algorithm basically consists in alternating a gradient-like ascent in the dual variable y and a gradient-like descent in the primal variable x. Additionally, an over-relaxation in the primal variable is performed.

Initialization

Choose \tau > 0, \sigma > 0, \theta \in [0,1], x_0 \in X, y_0 \in Y, \bar x_0 = x_0

Iteration

For n \geq 0 update x_n, y_n, and \bar x_n as follows:

y_{n+1}         &= \text{prox}_{\sigma f^*}(y_n + \sigma L \bar x_n),

x_{n+1}         &= \text{prox}_{\tau g}(x_n - \tau  L^* y_{n+1}),

\bar x_{n+1}    &= x_{n+1} + \theta (x_{n+1} - x_n),

Here, \text{prox} stands for proximal operator.
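
In code, the iteration can be sketched as follows (a minimal sketch: L is the linear operator, prox_sigma_fconj and prox_tau_g stand for \text{prox}_{\sigma f^*} and \text{prox}_{\tau g}, and x, y, x_bar have been initialized as above; all names are illustrative):

for _ in range(niter):
    y = prox_sigma_fconj(y + sigma * L(x_bar))   # gradient-like dual ascent
    x_new = prox_tau_g(x - tau * L.adjoint(y))   # gradient-like primal descent
    x_bar = x_new + theta * (x_new - x)          # over-relaxation
    x = x_new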

Step sizes

A simple choice of step size parameters is \tau = \sigma < \frac{1}{\|L\|}, since the requirement \sigma \tau \|L\|^2 < 1 guarantees convergence of the algorithm. Of course, this does not imply that this choice is anywhere near optimal, but it can serve as a good starting point.

Acceleration

If g or f^* is uniformly convex, convergence can be accelerated using variable step sizes as follows:

Replace \tau \to \tau_n, \sigma \to \sigma_n, and \theta \to \theta_n and choose \tau_0 \sigma_0 \|L\|^2 < 1 and \gamma > 0. After the update of the primal variable x_{n+1} and before the update of the relaxation variable \bar x_{n+1} use the following update scheme for relaxation and step size parameters:

\theta_n        &= \frac{1}{\sqrt{1 + 2 \gamma \tau_n}},

\tau_{n+1}      &= \theta_n \tau_n,

\sigma_{n+1}    &= \frac{\sigma_n}{\theta_n}.

Instead of choosing step size parameters, preconditioning techniques can be employed, see [CP2011b]. In this case the steps \tau and \sigma are replaced by symmetric and positive definite matrices T and \Sigma, respectively, and convergence holds for \| \Sigma^{1/2}\,L\, T^{1/2}\|^2 < 1.

For more on proximal operators and algorithms see [PB2014]. The implementation of the CP algorithm in ODL is along the lines of [Sid+2012].

Proximal Operators

Definition

Let f be a proper convex function mapping the normed space X to the extended real number line (-\infty, +\infty]. The proximal operator of the functional f is a mapping from X to X. It is denoted \mathrm{prox}_\tau[f](x) with x\in X and defined by

\mathrm{prox}_\tau[f](x) = \arg\min_{y\in X}\;f(y)+\frac{1}{2\tau} \|x-y\|_2^2

The shorter notation \mathrm{prox}_{\tau\,f}(x) is also common.
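
For example, the proximal operator of the L^1 norm is soft-thresholding. In ODL it can be obtained from the proximal property of a Functional; a small sketch:

>>> space = odl.rn(3)
>>> l1_norm = odl.solvers.L1Norm(space)
>>> prox = l1_norm.proximal(0.5)  # proximal operator with tau = 0.5
>>> prox(space.element([2.0, -0.3, 1.0])) == space.element([1.5, 0.0, 0.5])
True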

Properties

Some properties which are useful to create or compose proximal operators:

Separable sum

If f is separable across variables, i.e. f(x,y)=g(x)+h(y), then

\mathrm{prox}_\tau[f](x, y) = (\mathrm{prox}_\tau[g](x), \mathrm{prox}_\tau[h](y))

Post-composition

If g(x)=\alpha f(x)+a with \alpha > 0, then

\mathrm{prox}_\tau[g](x) = \mathrm{prox}_{\alpha\tau}[f](x)

Pre-composition

If g(x)=f(\beta x+b) with \beta\ne 0, then

\mathrm{prox}_\tau[g](x) = \frac{1}{\beta} (\mathrm{prox}_{\beta^2\tau}[f](\beta x+b)-b)

Moreau decomposition

This is also known as the Moreau identity

x = \mathrm{prox}_\tau[f](x) + \tau\,\mathrm{prox}_{1/\tau}[f^*] \left(\frac{x}{\tau}\right)

where f^* is the convex conjugate of f.
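
The identity is easy to check numerically, e.g. for f = \|\cdot\|_1: its proximal is soft-thresholding, and the proximal of f^* (the indicator of the unit maximum-norm ball) is the projection onto that ball, i.e. componentwise clipping. A numpy sketch:

import numpy as np

tau = 0.5
x = np.array([2.0, -0.3, 1.0])
soft = np.sign(x) * np.maximum(np.abs(x) - tau, 0)  # prox_tau[f](x)
proj = np.clip(x / tau, -1, 1)                      # prox_{1/tau}[f^*](x / tau)
print(np.allclose(x, soft + tau * proj))            # True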

Convex conjugate

The convex conjugate of f is defined as

f^*(y) = \sup_{x\in X} \langle y,x\rangle - f(x)

where \langle\cdot,\cdot\rangle denotes inner product. For more on convex conjugate and convex analysis see [Roc1970] or Wikipedia.

For more details on proximal operators including how to evaluate the proximal operator of a variety of functions see [PB2014].

Indicator function

Indicator functions are typically used to incorporate constraints. The indicator function for a given set S is defined as

\mathrm{ind}_{S}(x) =
\begin{cases}
0,      & x \in S, \\
\infty, & x \notin S.
\end{cases}

Special indicator functions

Indicator for a box centered at origin and with width 2 a:

\mathrm{ind}_{\mathrm{box}(a)}(x) = \begin{cases}
0 & \|x\|_\infty \le a\\
\infty & \|x\|_\infty > a
\end{cases}

where \|\cdot\|_\infty denotes the maximum-norm.
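
Since the proximal operator of an indicator function is the projection onto the corresponding set, the proximal of \mathrm{ind}_{\mathrm{box}(a)} is componentwise clipping; a numpy sketch:

import numpy as np

a = 1.0
x = np.array([2.0, -0.5, -3.0])
print(np.clip(x, -a, a))  # [ 1.  -0.5 -1. ]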

Contributing to ODL

Introduction

Great that you want to help make ODL better! There are basically two ways you can contribute: as a user or as a developer.

The best way to make contributions as a user is to play around with the software, try to use it for your purposes and get back to us if you encounter problems or miss features. When this happens, take a look at our issue tracker, and if there is no existing issue dealing with your problem, create a new one. Don’t be shy – use the issue tracker to ask questions, too.

If you are a developer and want to contribute code, you may want to read through the subsequent instructions. If you are experienced with Git, you may want to skip directly to the development workflow section.

In order to properly follow the ODL style, we recommend that you read the How to document and Testing in ODL sections.

Extending ODL

ODL is written to be easy to extend with new functionality and classes, and new content is welcome. With that said, not everything fits inside the main library, and some ideas are better realized as extension packages, i.e., packages that use the core ODL library and extend it with experimental features. This lowers the requirement on code maturity, completeness of documentation, unit tests etc. on your side and allows the core library to stay slim and develop faster.

There are several ways to extend ODL, some of which are listed below.

Adding Fn spaces

The abstract spaces FnBase and NtuplesBase are the workhorses of the ODL space machinery. They are used in both the discrete R^n case, as well as data representation for discretized function spaces such as L^2([0, 1]) in the DiscretizedSpace class. These are in general created through the rn and uniform_discr functions, which take an impl parameter allowing users to select the backend to use.

In the core ODL package, there is only a single backend available: NumpyFn/NumpyNtuples, given by impl='numpy', which is the default choice. Users can add CUDA support by installing the add-on library odlcuda, which contains the additional spaces CudaFn/CudaNtuples. By using the rn/uniform_discr functions, users can then seamlessly change the backend of their spaces.

As an advanced user, you may need to add additional spaces of this type that can be used inside ODL, perhaps to add MPI support. There are a few steps to do this:

  • Create a new library with a setuptools installer.
  • Add the spaces to the library. Each space needs to inherit from NtuplesBase or FnBase, respectively, and implement all of the abstract methods in those spaces. See the documentation of these base classes for further information on the specific methods that need to be implemented.
  • Add the methods ntuples_impls() and fn_impls() to a file odl_plugin.py in your library. These should return a dict mapping names to implementations.
  • Add the following to your library’s setup.py setup call: entry_points={'odl.space': ['odl_cuda = odlcuda.odl_plugin']}, where you replace odlcuda with the name of your plugin. A minimal sketch is shown below.
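
For illustration, a minimal setup.py for a hypothetical plugin package myplugin (all names below are placeholders) could look like this:

from setuptools import setup

setup(
    name='myplugin',
    packages=['myplugin'],
    # Expose the implementations to ODL via the 'odl.space' entry point
    entry_points={'odl.space': ['myplugin = myplugin.odl_plugin']},
)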

How to document

ODL is documented using Sphinx and a modified version of numpydoc. An example documentation is given below.

class MyClass(object):

    """Calculate important things.

    The first line summarizes the class, after that comes a blank
    line followed by a more detailed description (both optional).
    Confine the docstring to 72 characters per line. In general, try
    to follow `PEP257`_ in the docstring style.

    Docstrings can have sections with headers, signalized by a
    single-dash underline, e.g. "References". Check out
    `Numpydoc`_ for the recognized section labels.

    References
    ----------
    .. _PEP257: https://www.python.org/dev/peps/pep-0257/
    .. _Numpydoc: https://github.com/numpy/numpy/blob/master/doc/\
HOWTO_DOCUMENT.rst.txt
    """

    def __init__(self, c, parameter=None):
        """Initializer doc goes here.

        Parameters
        ----------
        c : float
            Constant to scale by.
        parameter : float, optional
            Some extra parameter.
        """
        self.c = c
        self.parameter = parameter

    def my_method(self, x, y):
        """Calculate ``c * (x + y)``.

        The first row is a summary, after that comes
        a more detailed description.

        Parameters
        ----------
        x : float
            First summand.
        y : float
            Second summand.

        Returns
        -------
        scaled_sum : float
            Result of ``c * (x + y)``.

        Examples
        --------
        Examples should be working pieces of code and are checked with
        ``doctest`` for consistent output.

        >>> obj = MyClass(5)
        >>> obj.my_method(3, 5)
        40.0
        """
        return self.c * (x + y)

    def my_other_method(self):
        """Print the parameter.

        See also
        --------
        my_method : some calculation, but not much
        """
        print(self.parameter)

Some short tips
  • Text within backticks: `some_target` will create a link to the target (e.g. `numpy.ndarray`).
  • Make sure that the first line is short and descriptive.
  • Examples are often better than long descriptions.
  • Numpy and ODL are both imported by default in doctests, so there is no need for import numpy as np or import odl.

Quick summary of PEP257
  • Write docstrings always with triple double quotes """, even one-liners.
  • Class docstrings are separated from the class definition line by a blank line, functions and methods begin directly in the next line.
  • Use imperative style (“Calculate”, not “Calculates”) in the summary (=first) line and end it with a full stop. Do not add a space after the opening triple quotes.
  • For one-liners: put the closing quotes on the same line. Otherwise: make a new line for the closing quotes.
  • Document at least all public methods and attributes.

Advanced

This section covers advanced topics for developers that need to change internals of the documentation.

Re-generating the doc

The HTML documentation is generated by running make html in the doc/ folder. Autosummary currently does not support nested modules, so to handle this, we auto-generate .rst files for each module. This is done in each invocation of make html. If results are inconsistent after changing code (or switching branches), e.g. warnings about missing modules appear, run make clean and build the docs from scratch with make html.

Modifications to numpydoc

Numpydoc has been modified in the following ways:

  • The numpy sphinx domain has been removed.
  • More extra_public_methods have been added.
  • :autoclass: summaries now link to full name, which allows subclassing between packages.

Testing in ODL

ODL is tested using pytest and has five main types of tests and verification measures that are supposed to test the code in different ways. These are listed below along with the command to run them.

  • Unit tests (run with pytest): Test “micro-features” of the code, like testing that each parameter combination works, that the correct exceptions are raised, that the code works correctly in corner cases and with a wide range of input.
  • Large-scale tests (run with pytest --largescale): Verify that the functionality works well in more complicated and realistic conditions with potentially large input data and longer running times.
  • Doctests (run with pytest): Run the code in the docstring examples and check the output against the documented output. Mainly intended to validate the examples.
  • Examples (run with pytest --examples): Run all examples in the examples folder. These are copy-paste friendly examples of how to use a function in a more complete context.
  • Documentation examples (run with pytest --documentation): Run all examples in the documentation. Examples are occasionally used in the documentation to demonstrate a concept. Here we only check that the code is valid and actually runs.
Unit tests

All unit tests in ODL are contained in the `tests`_ folder, where each ODL sub-package has a test file of its own. Any major ODL functionality should have unit tests covering all of the use cases that are implied in the documentation. In addition to this, the tests should be quick to run, preferably at most a few milliseconds per test. If the test suite takes too long to run, users and developers won’t run it as often as necessary to make sure that they didn’t break any functionality.

A short example of testing a function is given below. For more information consult the pytest documentation and look at existing tests in the test folder.

import pytest


def myfunction(x):
    """Convert ``x`` to a integer and add 1"""
    return int(x) + 1


def test_myfunction():
    # Test basic functionality
    assert myfunction(1) == 2
    assert myfunction(-3) == -2
    assert myfunction(10) == 11

    # Test when called with float
    assert myfunction(1.5) == 2

    # Test when called with string
    assert myfunction('1') == 2

    # Verify that bad input throws a proper error
    with pytest.raises(TypeError):
        myfunction([])

    with pytest.raises(ValueError):
        myfunction('non-integer')

    with pytest.raises(TypeError):
        myfunction(object())

    with pytest.raises(OverflowError):
        myfunction(float('inf'))
Large-scale

Large-scale tests verify that functions work well even in realistic conditions and with an even wider range of input than in the standard unit tests. They live in the largescale subfolder of the test folder. Not all functionality needs large-scale tests; in fact, most doesn’t. This type of test makes most sense for (1) functionality that has a complex implementation where it’s easy to make mistakes that slow the code down (regression tests) and (2) features that take too much time to be tested broadly in the standard suite. For the second type, the unit tests should include only a couple of fast cases, and the full range of inputs can be tested in the large-scale suite.

It may also be the case that some functions accept a very large number of possible input configurations. In this case, testing the most common configurations in the regular unit tests and the remaining ones in a large-scale test is acceptable.
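
Below is a minimal sketch of how such a test selection mechanism can be wired up with pytest. The option name --largescale matches the command above, but the hook code is illustrative and not necessarily identical to ODL’s actual conftest.py:

# conftest.py
import pytest


def pytest_addoption(parser):
    # Register the command line flag used to opt in to slow tests.
    parser.addoption('--largescale', action='store_true',
                     help='run large-scale tests')


def pytest_collection_modifyitems(config, items):
    # Skip tests marked as large-scale unless the flag was given.
    if config.getoption('--largescale'):
        return
    skip = pytest.mark.skip(reason='need --largescale option to run')
    for item in items:
        if 'largescale' in item.keywords:
            item.add_marker(skip)


# In a test file (assuming ``myfunction`` from above is importable):
@pytest.mark.largescale
def test_myfunction_wide_input_range():
    for x in range(-10000, 10000):
        assert myfunction(x) == int(x) + 1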

Doctests

Doctests are the simplest type of test used in ODL. They are snippets of code that document the usage of functions and classes and can be run as small tests at the same time. They can be included by using the Examples header in a usual docstring, as shown below:

def myfunction(x):
    """Convert ``x`` to a integer and add 1.

    Examples
    --------
    For integers, the function simply adds 1:

    >>> myfunction(1)
    2

    The function also works with floats:

    >>> myfunction(1.3)
    2
    """
    return int(x) + 1

Despite simply looking like documentation, doctests are actual pieces of python code and will be executed when the pytest command is invoked. See the doctest documentation for more information.

All ODL source files should also contain the lines:

if __name__ == '__main__':
    from odl.util.testutils import run_doctests
    run_doctests()

which means that if an ODL source file is executed in isolation, all the doctests in the file are run. This can be useful during development in order to quickly check whether some functionality works as expected.

Examples

Examples, while not technically tests in the traditional sense, still constitute a part of the test framework for ODL by showing how different parts of ODL work together and by ensuring that functions that depend on each other work as expected. The main purpose of the examples is, however, to show ODL from a user’s perspective, and particular care should be taken to keep them readable and working, since they are often the first thing users see when starting out with ODL.

It is even possible to run all examples as part of the test suite by running pytest --examples, but be aware that this requires all ODL dependencies to be installed and that plotting windows can be opened during execution.

Consult the examples directory for an impression of the style in which ODL examples are written.
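
To give a flavor, a minimal sketch in that style could look like this (not an actual file from the examples folder):

"""Example: create a space, apply an operator, inspect the result."""
import odl

# Discretized function space on [0, 1] with 100 points
space = odl.uniform_discr(0, 1, 100)

# The identity operator serves as the simplest possible example
op = odl.IdentityOperator(space)

x = space.element(lambda t: t ** 2)
result = op(x)

# The inner product with the constant one approximates the integral
# of t^2 over [0, 1], i.e. roughly 1/3.
print(result.inner(space.one()))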

The ODL release process

This document is intended to give precise instructions on the process of making a release. Its purpose is to avoid broken packages, broken documentation and many other things that can go wrong as a result of mistakes during the release process. Since this is not everyday work and may be done under the stress of a (self-imposed) deadline, it is clearly beneficial to have a checklist to hold on to.

Note

The instructions in this document are written from the perspective of Linux and may need adaptation for other platforms.

1. Agree on a release schedule

This involves the “what” and “when” of the release process and fixes a feature set that is supposed to be included in the new version. The steps are:

  • Open an issue on the issue tracker using the title Release X.Y.Z (insert numbers, of course).
  • Discuss and agree on a set of open PRs that should be merged and issues that should be resolved before making a release.
2. Make sure tests succeed and docs are built properly

When all required PRs are merged, ensure that the latest master branch is sane. Travis CI checks every PR, but certain things like CUDA cannot be tested there and must therefore undergo tests on a local machine, for at least Python 2.7 and one version of Python 3.

  • Make a new test conda environment and install all dependencies:

    conda create -n release36 python=3.6 nomkl numpy scipy future matplotlib pytest
    source activate release36
    pip install pyfftw pywavelets
    cd /path/to/odl_repo
    git checkout master
    git pull
    pip install -e .
    
  • Run the tests with pytest, including doctests, examples, documentation and large-scale tests:

    pytest --doctest-modules --examples --doctest-doc --largescale
    
  • Do the same for Python 2.7.

  • Make sure the tests also run on the platforms you’re currently not testing on. Ask a buddy maintainer if necessary.

  • Build the documentation. This requires sphinx and the sphinxext submodule:

    conda install sphinx sphinx_rtd_theme
    git submodule update --init --recursive
    cd doc && make clean
    cd source && python generate_doc.py
    cd ..
    make html 2>&1 |\
      grep -E "SEVERE|ERROR|WARNING" |\
      grep -E -v "more than one target found for|__eq__|document isn't included in any toctree"
    

    The last command builds the documentation and filters from the output all irrelevant warnings, letting through only the “proper” warnings and errors. If possible, fix these remaining issues.

  • Glance through the built documentation (usually in doc/_build) for obvious errors.

3. Make a release branch off master

When all tests succeed and the docs are fine, start a release branch. Do not touch any actual code on this branch other than indicated below!

  • Create a branch off current master with the name release-X.Y.Z, inserting the correct version number, of course.

    git checkout master
    git pull
    git checkout -b release-X.Y.Z
    git push -u my_fork release-X.Y.Z
    

    Like any regular branch that should result in a PR, the release branch is pushed to a fork.

4. Bump the master branch to the next development version

To ensure a higher version number for installations from the git master branch, the version number must be increased before merging the release branch.

  • On the master branch, change the version string in odl/__init__.py to the next revision larger than the upcoming release version, plus 'dev0'. For example, if the release version string is '0.5.3', use '0.5.4.dev0'.

    To make sure you don’t miss any other location (or the information here is outdated), perform a search:

    cd doc && make clean && cd ..  # remove the local HTML doc first
    grep -Ir "0\.5\.4" . | grep -E -v "\.git|release_notes\.rst|odl\.egg-info"
    
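    For example, assuming the version string lives in a __version__ attribute (an assumption; check the actual variable name in the file), the bumped line in odl/__init__.py would read:

    __version__ = '0.5.4.dev0'
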
  • In the file conda/meta.yaml, change the version string after version: to the same as above, but without the dev0 tag. In the example above, this would mean to change it from "0.5.3" to "0.5.4".

    If necessary, change the git_rev value to master, although that should already be the case.

  • Commit the changes, using a message like REL: bump version to X.Y.Z.dev0.

  • Make a PR with just this change and merge it after review. It must be merged before the release branch.

5. Compile and publish the release

Back on the release branch with a git checkout release-X.Y.Z, it is now time to prepare the release documents, increment the version number and make a release on GitHub.

Note

The release notes should actually be a running document where everybody who files a PR also makes an entry into the release notes file. If not, tough luck – it is your duty now to make up for all that missed work. Maybe you’ll remind your co-workers to do this in their next PR.

  • Compile the release notes. They should contain all user-visible changes, including performance improvements and other niceties – internal stuff like test modifications don’t belong here. The changes should be summarized in one or two sentences on top, perhaps mentioning the most notable ones in this release. Check the Release Notes file for details on sections, formatting etc.

  • Increment the version number in odl/__init__.py and conda/meta.yaml. As in Section 4, perform a search to make sure you didn’t miss a version info location.

  • Change the git_rev field in conda/meta.yaml to 'vX.Y.Z', using the upcoming version number. This is the git tag you will create when making the release on GitHub.

  • Commit the changes, using a message like REL: bump version to X.Y.Z.

  • Make a PR and fix review comments. When doing so, keep the REL: bump version to X.Y.Z commit last, for example by using git commit --amend for fixes, or by squashing the commits on the release branch.

    Note

    Getting a review is useful, but not mandatory since the only changes on the branch are the release notes and the version bump. It is important that the version information is correct, but errors in the release notes are less grave since they are not distributed with the package and can thus be fixed afterwards.

  • Push the release branch to the main repository so that it is possible to make a GitHub release from it:

    git push origin release-X.Y.Z
    
  • Go to the Releases page on GitHub. Click on Draft a new release and select the release-X.Y.Z branch from the dropdown menu, not master. Use vX.Y.Z as release tag (numbers inserted, of course).

  • Paste the short summary from the release notes file (converting from RST to Markdown) but don’t insert the details.

  • Add a link to the release notes documentation page, as in earlier releases. Later on, when the documentation with the new release notes is online, you can edit this link to point to the exact section.

Note

If you encounter an issue (like a failing test) that needs immediate fix, stop at that point, fix the issue on a branch, make a PR and merge it after review. After that, rebase the release branch(es) on the new master and continue.

6. Create packages for PyPI and Conda

The packages should be built on the release branch to make sure that the version information is correct.

  • Making the packages for PyPI is straightforward. However, make sure you delete old build directories since they can pollute new builds:

    rm build/ -rf
    python setup.py sdist
    python setup.py bdist_wheel
    

    The packages are by default stored in a dist folder.

  • To build the conda packages, you should not work in a specific environment but rather exit to the root environment. There, install the conda-build tool for building packages:

    source deactivate
    conda install conda-build
    
  • Invoke the following command to build a package for your platform and all supported Python versions:

    conda build conda/ --python 2.7
    conda build conda/ --python 3.4
    conda build conda/ --python 3.5
    conda build conda/ --python 3.6
    ...
    
  • Assuming this succeeds, enter the directory one above where the conda package was stored (as printed in the output). For example, if the package was stored as $HOME/miniconda3/conda-bld/linux-64/odl-X.Y.Z-py36_0.bz2, issue the command

    cd $HOME/miniconda3/conda-bld/
    

    In this directory, for each Python version “translate” the package to all platforms since ODL is actually platform-independent:

    conda convert --platform all <package>
    

    Replace <package> by the package file as built by the previous conda build command.

7. Test installing the local packages and check them

Before actually uploading packages to “official” servers, first install the local packages and run the unit tests.

  • Install directly from the source package (*.tar.gz) or the wheel (*.whl) into a new conda environment:

    source deactivate
    conda create -n pypi_install pytest python=X.Y  # choose Python version
    source activate pypi_install
    cd /path/to/odl_repo
    cd dist
    pip install <pkg_filename>
    python -c "import odl; odl.test()"
    
  • Install and test the local conda packages in a new conda environment:

    source deactivate
    conda create -n conda_install python=X.Y  # choose Python version
    source activate conda_install
    conda install --use-local nomkl odl
    python -c "import odl; odl.test()"
    
8. Upload the packages to the official locations

Installing the packages works, now it’s time to put them out into the wild.

  • Install the twine package for uploading packages to PyPI in your working environment:

    source deactivate
    source activate release36
    pip install twine
    
  • Upload the source package and the wheel to the PyPI server using twine:

    cd /path/to/odl_repo
    twine upload -u odlgroup dist/<pkg_filename>
    

    This requires the access credentials for the odlgroup user on PyPI – the maintainers have them.

  • Upload the conda packages to the odlgroup channel in the Anaconda cloud. The upload requires the anaconda-client package:

    conda install anaconda-client
    cd $HOME/miniconda3/conda-bld
    anaconda upload -u odlgroup `find . -name "odl-X.Y.Z*"`
    

    For this step, you need the access credentials for the odlgroup user on the Anaconda server. Talk to the maintainers to get them.

9. Merge the release branch

Now the release branch can finally be merged. The sole purpose of this step is to update the release notes on master and potentially pick up last-minute changes.

  • The release branch will have conflicts with master since both have modified the version information. Resolve them in favor of the changes made on master. In particular, make sure that the changes in Section 4 stay intact.
  • Merge the PR for the release.
Done!

Time to clean up, i.e., remove temporary conda environments, run conda build purge, remove files in dist and build generated for the PyPI packages, etc.

Working with ODL source code

Contents:

Introduction

These pages describe a Git and GitHub workflow for the ODL project.

This is not a comprehensive Git reference; it is just a workflow for our own project, tailored to the GitHub hosting service. You may well find better or quicker ways of getting stuff done with Git, but these instructions should get you started.

For general resources for learning Git, see Git resources.

Install Git

Go to https://git-scm.com/book/en/v2/Getting-Started-Installing-Git for the official and up-to-date instructions on how to install Git on your platform.

Following the latest source

These are the instructions if you just want to follow the latest ODL source, but you don’t need to do any development for now.

The steps are:

Get a local copy of the code

From the command line:

$ git clone https://github.com/odlgroup/odl.git

You now have a copy of the code tree in the new odl directory.

Updating the code

From time to time you may want to pull down the latest code. Do this with

$ cd odl
$ git pull

The tree in odl will now have the latest changes from the initial repository.

Making a patch

You’ve discovered a bug or something else you want to change in ODL — excellent!

You’ve worked out a way to fix it — even better!

You want to tell us about it — best of all!

The easiest way is to make a patch or set of patches. Here we explain how. Making a patch is simplest and quickest, but if you’re going to be doing anything more than simple quick things, please consider following the Git for development model instead.

Making patches
Overview
  # Tell Git who you are
$ git config --global user.email you@yourdomain.example.com
$ git config --global user.name "Your Name Comes Here"

  # Get the repository if you don't have it already
$ git clone https://github.com/odlgroup/odl.git

  # Make a branch for your patching
$ cd odl
$ git branch the-fix-im-thinking-of
$ git checkout the-fix-im-thinking-of

  # hack, hack, hack

  # Tell Git about any new files you've made
$ git add somewhere/tests/test_my_bug.py
  # Commit work in progress as you go
$ git commit -am "TST: add tests for Funny bug"

  # hack hack, hack

$ git commit -am "BUG: add fix for Funny bug"

  # Make the patch files
$ git format-patch -M -C master

Then, send the generated patch files to the ODL mailing list — where we will thank you warmly.

In detail
  1. Tell Git who you are so it can label the commits you’ve made:

    $ git config --global user.email you@yourdomain.example.com
    $ git config --global user.name "Your Name Comes Here"
    
  2. If you don’t already have one, clone a copy of the ODL repository:

    $ git clone https://github.com/odlgroup/odl.git
    $ cd odl
    
  3. Make a ‘feature branch’. This will be where you work on your bug fix. It’s nice and safe and leaves you with access to an unmodified copy of the code in the main branch (master).

    $ git branch the-fix-im-thinking-of
    $ git checkout the-fix-im-thinking-of
    
  4. Do some edits, and commit them as you go:

      # hack, hack, hack
    
      # Tell Git about any new files you've made
    $ git add somewhere/tests/test_my_bug.py
      # commit work in progress as you go
    $ git commit -am "TST: add tests for Funny bug"
      # hack hack, hack
    $ git commit -am "BUG: add fix for Funny bug"
    

    Note the -am options to commit. The m flag just signals that you’re going to type a message on the command line. The a flag — you can just take on faith — or see why the -a flag?.

  5. When you are finished, check you have committed all your changes:

    $ git status
    
  6. Finally, turn your commits into patches. You want all the commits since you branched off from the master branch:

    $ git format-patch -M -C master
    

    You will now have several files named after the commits:

    0001-TST-add-tests-for-Funny-bug.patch
    0002-BUG-add-fix-for-Funny-bug.patch
    

    Send these files to the ODL mailing list.

When you are done, to switch back to the main copy of the code, just return to the master branch:

$ git checkout master
Moving from patching to development

If you find you have done some patches, and you have one or more feature branches, you will probably want to switch to development mode. You can do this with the repository you have.

Fork the ODL repository on GitHub — see Making your own copy (fork) of ODL. Then:

  # checkout and refresh master branch from main repo
$ git checkout master
$ git pull origin master
  # rename pointer to main repository to 'upstream'
$ git remote rename origin upstream
  # point your repo to default read / write to your fork on GitHub
$ git remote add myfork git@github.com:your-user-name/odl.git
  # push up any branches you've made and want to keep
$ git push myfork the-fix-im-thinking-of

Now you can follow the Development workflow.

Git for development

Contents:

Making your own copy (fork) of ODL

You need to do this only once. The instructions here are very similar to the instructions at http://help.github.com/forking/ — please see that page for more detail. We’re repeating some of it here just to give the specifics for the ODL project, and to suggest some default names.

Set up and configure a GitHub account

If you don’t have a GitHub account yet, go to the GitHub page and create one.

After that, you need to configure your account to allow write access — see the “Generating SSH keys” help on the GitHub Help pages.

Create your own forked copy of ODL
  1. Log into your GitHub account.

  2. Go to the ODL repository page at ODL GitHub.

  3. Click on the fork button:

    _images/fork_button.jpg

    Now, after a short pause and some “Hardcore forking action”, you should find yourself at the home page for your own forked copy of ODL.

Set up your fork

First follow the instructions for Making your own copy (fork) of ODL.

Overview
$ git clone git@github.com:your-user-name/odl.git
$ cd odl
$ git remote add upstream https://github.com/odlgroup/odl.git
In detail
Clone your fork
  1. Clone your fork to the local computer with

    $ git clone git@github.com:your-user-name/odl.git
    
  2. Investigate. Change directory to your new repo: cd odl. Then git branch -a to show you all branches. You’ll get something like this:

    * master
    remotes/origin/master
    

    This tells you that you are currently on the master branch, and that you also have a remote connection to origin/master. What remote repository is remote/origin? Try git remote -v to see the URLs for the remote. They will point to your GitHub fork.

    Now you want to connect to the upstream ODL GitHub repository, so you can merge in changes from trunk.

Linking your repository to the upstream repo
$ cd odl
$ git remote add upstream https://github.com/odlgroup/odl.git

upstream here is just the arbitrary name we’re using to refer to the main ODL repository at ODL GitHub.

Note that we’ve used https:// for the URL rather than git@. The https:// URL is read-only. This means that we can’t accidentally (or deliberately) write to the upstream repo, and we are only going to use it to merge into our own code.

Just for your own satisfaction, show yourself that you now have a new “remote”, with git remote -v show, giving you something like:

upstream     https://github.com/odlgroup/odl.git (fetch)
upstream     https://github.com/odlgroup/odl.git (push)
origin       git@github.com:your-user-name/odl.git (fetch)
origin       git@github.com:your-user-name/odl.git (push)
Configure Git
Overview

Your personal Git configurations are saved in the .gitconfig file in your home directory.

Here is an example .gitconfig file:

[user]
        name = Your Name
        email = you@yourdomain.example.com

[alias]
        ci = commit -a
        co = checkout
        st = status
        stat = status
        br = branch
        wdiff = diff --color-words
        show-conflicts = !git --no-pager diff --name-only --diff-filter=U

[core]
        editor = nano

[merge]
        log = true

You can edit this file directly or you can use the git config --global command:

$ git config --global user.name "Your Name"
$ git config --global user.email you@yourdomain.example.com
$ git config --global alias.ci "commit -a"
$ git config --global alias.co checkout
$ git config --global alias.st "status"
$ git config --global alias.stat "status"
$ git config --global alias.br branch
$ git config --global alias.wdiff "diff --color-words"
$ git config --global alias.show-conflicts "!git --no-pager diff --name-only --diff-filter=U"
$ git config --global core.editor nano
$ git config --global merge.log true

To set up on another computer, you can copy your ~/.gitconfig file, or run the commands above.

In detail
user.name and user.email

It is good practice to tell Git who you are, for labeling any changes you make to the code. The simplest way to do this is from the command line:

$ git config --global user.name "Your Name"
$ git config --global user.email you@yourdomain.example.com

This will write the settings into your Git configuration file, which should now contain a user section with your name and email:

[user]
      name = Your Name
      email = you@yourdomain.example.com

Of course you’ll need to replace Your Name and you@yourdomain.example.com with your actual name and email address.

Aliases

You might well benefit from some aliases to common commands.

For example, you might well want to be able to shorten git checkout to git co. Or you may want to alias git diff --color-words (which gives a nicely formatted output of the diff) to git wdiff. Another useful alias is git show-conflicts which displays files that are currently in conflict.

The git config --global commands

$ git config --global alias.ci "commit -a"
$ git config --global alias.co checkout
$ git config --global alias.st "status"
$ git config --global alias.stat "status"
$ git config --global alias.br branch
$ git config --global alias.wdiff "diff --color-words"
$ git config --global alias.show-conflicts "!git --no-pager diff --name-only --diff-filter=U"

will create an alias section in your .gitconfig file with contents like this:

[alias]
        ci = commit -a
        co = checkout
        st = status
        stat = status
        br = branch
        wdiff = diff --color-words
        show-conflicts = !git --no-pager diff --name-only --diff-filter=U
Editor

You may also want to make sure that your favorite editor is used:

$ git config --global core.editor nano
Merging

To enforce summaries when doing merges (~/.gitconfig file again):

[merge]
   log = true

Or from the command line:

$ git config --global merge.log true
Fancy log output

This is a very nice alias to get a fancy log output; it should go in the alias section of your .gitconfig file:

fancylog = log --graph --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cr) %C(bold blue)[%an]%Creset' --abbrev-commit --date=relative

You use the alias with:

$ git fancylog

and it gives graph / text output something like this (but with color!):

* a105abc - (HEAD -> master, upstream/master) Revert "MAINT: replace deprecated pngmath extension by imgmath" (50 minutes ago) [Holger Kohr]
* 05168c9 - MAINT: replace deprecated pngmath extension by imgmath (53 minutes ago) [Holger Kohr]
* f654c3d - DOC: update README and description in setup.py a bit (19 hours ago) [Holger Kohr]
*   d097c7b - Merge pull request #436 from odlgroup/issue-435__parallel2d_rotation (19 hours ago) [Holger Kohr]
|\
| * 180ba96 - (upstream/issue-435__parallel2d_rotation, issue-435__parallel2d_rotation) TST: Add test for angle conventions of projectors (24 hours ago) [Jonas Adler]
| * de2ab55 - BUG: fix behaviour of show with nonuniform data (26 hours ago) [Jonas Adler]
| * a979666 - BUG: fix rotation by 90 degrees for 2d parallel (27 hours ago) [Holger Kohr]
|/
*   ecfd306 - Merge pull request #444 from odlgroup/issue-443__uniform_partition (29 hours ago) [Holger Kohr]
|\
| * 024552f - MAINT: replace 10 ** -10 with 1e-10 in domain_test.py (29 hours ago) [Holger Kohr]
| * 032b89d - ENH: allow single tuple for nodes_on_bdry in uniform_sampling for 1d (29 hours ago) [Holger Kohr]
| * 85dda52 - ENH: add atol to IntervalProd.contains_all (29 hours ago) [Holger Kohr]
| * bdaef8c - ENH: make uniform_partition more flexible (29 hours ago) [Holger Kohr]
| * 72b4bd5 - MAINT: use odl.foo instead of from odl import foo in partition_test.py (2 days ago) [Holger Kohr]
| * 11ec155 - MAINT: fix typo in grid.py (2 days ago) [Holger Kohr]
| * dabc917 - MAINT: change tol parameter in IntervalProd to atol (2 days ago) [Holger Kohr]
* |   e59662c - Merge pull request #439 from odlgroup/issue-409__element_noop (29 hours ago) [Jonas Adler]
|\ \
| |/
|/|
| * 1d41554 - API: enforce element(vec) noop (8 days ago) [Jonas Adler]
* |   34d4e74 - Merge pull request #438 from odlgroup/issue-437__discr_element_broadcast (8 days ago) [Jonas Adler]
|\ \
| |/
|/|
| * e09bfa9 - ENH: allow broadcasting in discr element (8 days ago) [Jonas Adler]

Thanks to Yury V. Zaytsev for posting it.

Development workflow

You already have your own forked copy of the ODL repository by following Making your own copy (fork) of ODL. You have Set up your fork. You have configured Git according to Configure Git. Now you are ready for some real work.

Workflow summary

In what follows we’ll refer to the upstream ODL master branch as “trunk”.

  • Don’t use your master branch for anything. Consider deleting it.
  • When you are starting a new set of changes, fetch any changes from trunk, and start a new feature branch from that.
  • Make a new branch for each separable set of changes — “one task, one branch” (see IPython Git workflow).
  • Name your branch for the purpose of the changes - e.g. issue-128__performance_tests or refactor_array_tests. Use the issue-<number>__ prefix for existing issues.
  • If you are fixing a bug or implementing a new feature, consider creating an issue on the ODL issue tracker first.
  • Never merge trunk or any other branches into your feature branch while you are working.
  • If you do find yourself merging from trunk, Rebase on trunk instead.
  • Ask on the ODL mailing list if you get stuck.
  • Ask for code review!

This way of working helps to keep the project well organized, with readable history. This in turn makes it easier for project maintainers (that might be you) to see what you’ve done, and why you did it.

See Linux Git workflow and IPython Git workflow for some explanation.

Consider deleting your master branch

It may sound strange, but deleting your own master branch can help reduce confusion about which branch you are on. See deleting master on GitHub for details.

Update the mirror of trunk

First make sure that Linking your repository to the upstream repo is done.

From time to time you should fetch the upstream (trunk) changes from GitHub:

$ git fetch upstream

This will pull down any commits you don’t have, and set the remote branches to point to the right commit. For example, “trunk” is the branch referred to by (remote/branchname) upstream/master - and if there have been commits since you last checked, upstream/master will change after you do the fetch.

Make a new feature branch

When you are ready to make some changes to the code, you should start a new branch. Branches that are for a collection of related edits are often called “feature branches”.

Making a new branch for each set of related changes will make it easier for someone reviewing your branch to see what you are doing.

Choose an informative name for the branch to remind yourself and the rest of us what the changes in the branch are for, for example add-ability-to-fly or issue-42__fix_all_bugs.

Is your feature branch mirroring an issue on the ODL issue tracker? Then prepend your branch name with the prefix issue-<number>__, where <number> is the ticket number of the issue you are going to work on. If there is no existing issue that corresponds to the code you’re about to write, consider creating a new one. In case you are fixing a bug or implementing a feature, it is best to get in contact with the maintainers as early as possible. Of course, if you are only playing around, you don’t need to create an issue and can name your branch however you like.

  # Update the mirror of trunk
$ git fetch upstream
  # Make new feature branch starting at current trunk
$ git branch my-new-feature upstream/master
$ git checkout my-new-feature

Generally, you will want to keep your feature branches on your public GitHub fork of ODL. To do this, you git push this new branch up to your GitHub repo. Generally (if you followed the instructions in these pages, and by default), Git will have a link to your GitHub repo, called origin. You push up to your own repo on GitHub with

$ git push origin my-new-feature

In git >= 1.7 you can ensure that the link is correctly set by using the --set-upstream option:

$ git push --set-upstream origin my-new-feature

From now on Git will know that my-new-feature is related to the my-new-feature branch in the GitHub repo.

The editing workflow
Overview
  # hack hack
$ git add my_new_file
$ git commit -m "BUG: fix all bugs"
$ git push
In more detail
  1. Make some changes.

  2. See which files have changed with git status (see git status). You’ll see a listing like this one:

    On branch my-new-feature
    Changed but not updated:
      (use "git add <file>..." to update what will be committed)
      (use "git checkout -- <file>..." to discard changes in working directory)
    
            modified:   README
    
    Untracked files:
      (use "git add <file>..." to include in what will be committed)
    
            INSTALL
    
    no changes added to commit (use "git add" and/or "git commit -a")
    
  3. Check what the actual changes are with git diff (see git diff).

  4. Add any new files to version control git add new_file_name (see git add).

  5. To commit all modified files into the local copy of your repo, do git commit -am "A commit message". Note the -am options to commit. The m flag just signals that you’re going to type the commit message on the command line. The a flag — you can just take on faith — or see why the -a flag? — and the helpful use-case description in the tangled working copy problem. The git commit manual page might also be useful.

  6. To push the changes up to your forked repo on GitHub, perform a git push (see git push).

The commit message

Bear in mind that the commit message will be part of the history of the repository, shown by typing git log, so good messages will make the history readable and searchable. Don’t see the commit message as an annoyance, but rather as an important part of your contribution.

We appreciate if you follow the following style:

  1. Start your commit with an acronym, e.g., BUG, TST or STY to indicate what kind of modification you make.

  2. Write a one-line summary of your modification no longer than 50 characters. If you have a hard time summarizing your changes, maybe you need to split up the commit into parts.

    Use imperative style, i.e. write add super feature or fix horrific bug rather than added, fixed .... This saves two characters for something else.

    Don’t use markdown. You can refer to issues by writing #12. You can even have GitHub automatically close an issue by writing closes #12. This happens once your commit has made its way into master (usually after merging the pull request).

  3. (optional) Write an extended summary. Describe why these changes are necessary and what the new code does better than the old one.

Ask for your changes to be reviewed or merged

When you are ready to ask for someone to review your code and consider a merge:

  1. Go to the URL of your forked repo, say http://github.com/your-user-name/odl.

  2. Use the “Switch branches/tags” dropdown menu near the top left of the page to select the branch with your changes:

    _images/branch-dropdown.png
  3. Click on the “New Pull Request” button:

    _images/new-pull-request-button.png

    Enter a title for the set of changes, and some explanation of what you’ve done. Say if there is anything you’d like particular attention for - like a complicated change or some code you are not happy with.

    If you don’t think your request is ready to be merged, just say so in your pull request message. This is still a good way of getting some preliminary code review.

    See also: https://help.github.com/articles/using-pull-requests/

Some other things you might want to do
Delete a branch on GitHub
$ git checkout master
  # delete branch locally
$ git branch -D my-unwanted-branch
  # delete the remote branch on GitHub
$ git push origin :my-unwanted-branch

Note the colon : before my-unwanted-branch.

See also: http://github.com/guides/remove-a-remote-branch

Several people sharing a single repository

If you want to work on some stuff with other people, where you are all committing into the same repository, or even the same branch, then just share it via GitHub.

First fork ODL into your account, as from Making your own copy (fork) of ODL.

Then, go to your forked repository GitHub page, say http://github.com/your-user-name/odl.

Click on “Settings” -> “Collaborators”, and invite other people to the repo as collaborators. Once they have accepted the invitation, they can do

$ git clone git@github.com:your-user-name/odl.git

Remember that links starting with git@ use the ssh protocol and are read-write; links starting with https:// are read-only.

Your collaborators can then commit directly into that repo with the usual

$ git commit -am "ENH: improve code a lot"
$ git push origin master  # pushes directly into your repo

See also: https://help.github.com/articles/inviting-collaborators-to-a-personal-repository/

Explore your repository

To see a graphical representation of the repository branches and commits, use a Git GUI like gitk shipped with Git or QGit included in KDE:

$ gitk --all

To see a linear list of commits for this branch, invoke

$ git log

You can also look at the Network graph visualizer for your GitHub repo.

Finally the Fancy log output fancylog alias will give you a reasonable text-based graph of the repository.

Rebase on trunk

Let’s say you thought of some work you’d like to do. You Update the mirror of trunk and Make a new feature branch called cool-feature. At this stage trunk is at some commit, let’s call it E. Now you make some new commits on your cool-feature branch, let’s call them A, B, C. Maybe your changes take a while, or you come back to them after a while. In the meantime, trunk has progressed from commit E to commit (say) G:

      A---B---C cool-feature
     /
D---E---F---G trunk

Now you consider merging trunk into your feature branch, and you remember that this page sternly advises you not to do that, because the history will get messy. Most of the time you can just ask for a review, and not worry that trunk has got a little ahead. But sometimes, the changes in trunk might affect your changes, and you need to harmonize them. In this situation, you may prefer to do a rebase.

Rebase takes your changes (A, B, C) and replays them as if they had been made to the current state of trunk. In other words, in this case, it takes the changes represented by A, B, C and replays them on top of G. After the rebase, your history will look like this:

              A'--B'--C' cool-feature
             /
D---E---F---G trunk

See rebase without tears for more detail.

To do a rebase on trunk:

  # Update the mirror of trunk
$ git fetch upstream

  # go to the feature branch
$ git checkout cool-feature

  # make a backup in case you mess up
$ git branch tmp cool-feature

  # rebase cool-feature onto trunk
$ git rebase --onto upstream/master upstream/master cool-feature

In this situation, where you are already on branch cool-feature, the last command can be written more succinctly as

$ git rebase upstream/master

When all looks good you can delete your backup branch:

$ git branch -D tmp

If it doesn’t look good you may need to have a look at Recovering from mess-ups.

If you have made changes to files that have also changed in trunk, this may generate merge conflicts that you need to resolve - see the git rebase manual page for some instructions at the end of the “Description” section. There is some related help on merging in the Git user manual - see resolving a merge.

Recovering from mess-ups

Sometimes, you mess up merges or rebases. Luckily, in Git it is relatively straightforward to recover from such mistakes.

If you mess up during a rebase:

$ git rebase --abort

If you notice you messed up after the rebase:

  # reset branch back to the saved point
$ git reset --hard tmp

If you forgot to make a backup branch:

  # look at the reflog of the branch
$ git reflog show cool-feature

8630830 cool-feature@{0}: commit: BUG: io: close file handles immediately
278dd2a cool-feature@{1}: rebase finished: refs/heads/cool-feature onto 11ee694744f2552d
26aa21a cool-feature@{2}: commit: BUG: lib: make seek_gzip_factory not leak gzip obj
...


  # reset the branch to where it was before the botched rebase
$ git reset --hard cool-feature@{2}
Rewriting commit history

Note

Do this only for your own feature branches.

There’s an embarrassing typo in a commit you made? Or perhaps you made several false starts you would like posterity not to see?

This can be fixed via interactive rebasing.

Suppose that the commit history looks like this:

$ git log --oneline
eadc391 Fix some remaining bugs
a815645 Modify it so that it works
2dec1ac Fix a few bugs + disable
13d7934 First implementation
6ad92e5 * masked is now an instance of a new object, MaskedConstant
29001ed Add pre-nep for a couple of structured_array_extensions.
...

and 6ad92e5 is the last commit preceding the changes made on the cool-feature branch. Suppose we want to make the following changes:

  • Rewrite the commit message for 13d7934 to something more sensible.
  • Combine the commits 2dec1ac, a815645, eadc391 into a single one.

We do as follows:

  # make a backup of the current state
$ git branch tmp HEAD
  # interactive rebase
$ git rebase -i 6ad92e5

This will open an editor with the following text in it:

pick 13d7934 First implementation
pick 2dec1ac Fix a few bugs + disable
pick a815645 Modify it so that it works
pick eadc391 Fix some remaining bugs

# Rebase 6ad92e5..eadc391 onto 6ad92e5
#
# Commands:
#  p, pick = use commit
#  r, reword = use commit, but edit the commit message
#  e, edit = use commit, but stop for amending
#  s, squash = use commit, but meld into previous commit
#  f, fixup = like "squash", but discard this commit's log message
#
# If you remove a line here THAT COMMIT WILL BE LOST.
# However, if you remove everything, the rebase will be aborted.
#

To achieve what we want, we will make the following changes to it:

r 13d7934 First implementation
pick 2dec1ac Fix a few bugs + disable
f a815645 Modify it so that it works
f eadc391 Fix some remaining bugs

This means that (i) we want to edit the commit message for 13d7934, and (ii) collapse the last three commits into one. Now we save and quit the editor.

Git will then immediately bring up an editor for editing the commit message. After revising it, we get the output:

[detached HEAD 721fc64] ENH: Sophisticated feature
 2 files changed, 199 insertions(+), 66 deletions(-)
[detached HEAD 0f22701] Fix a few bugs + disable
 1 files changed, 79 insertions(+), 61 deletions(-)
Successfully rebased and updated refs/heads/cool-feature.

and the history looks now like this:

0f22701 Fix a few bugs + disable
721fc64 ENH: Sophisticated feature
6ad92e5 * masked is now an instance of a new object, MaskedConstant

If it went wrong, recovery is again possible as explained above.

Maintainer workflow

This page is for maintainers — those of us who merge our own or other peoples’ changes into the upstream repository.

As a maintainer, you are completely on top of the basic stuff in Development workflow, of course.

The instructions in Linking your repository to the upstream repo add a remote that has read-only access to the upstream repo. Being a maintainer, you’ve got read-write access.

It’s good to have your upstream remote under a scary name, to remind you that it’s a read-write remote:

$ git remote add upstream-rw git@github.com:odlgroup/odl.git
$ git fetch upstream-rw
Integrating changes

Let’s say you have some changes that need to go into trunk (upstream-rw/master).

The changes are in some branch that you are currently on. For example, you are looking at someone’s changes like this:

$ git remote add someone https://github.com/someone/odl.git
$ git fetch someone
$ git branch cool-feature --track someone/cool-feature
$ git checkout cool-feature

So now you are on the branch with the changes to be incorporated upstream. The rest of this section assumes you are on this branch.

A few commits

If there are only a few commits, consider rebasing to upstream:

  # Fetch upstream changes
$ git fetch upstream-rw
  # rebase
$ git rebase upstream-rw/master
A long series of commits

If there is a longer series of related commits, consider a merge instead:

$ git fetch upstream-rw
$ git merge --no-ff upstream-rw/master

The merge will be detected by GitHub, and should close any related pull requests automatically.

Note the --no-ff above. This forces Git to make a merge commit, rather than doing a fast-forward, so that this set of commits branch off trunk and then rejoin the main history with a merge, rather than appearing to have been made directly on top of trunk.

Check the history

Now, in either case, you should check that the history is sensible and you have the right commits:

$ git log --oneline --graph
$ git log -p upstream-rw/master..

The first line above just shows the history in a compact way, with a text representation of the history graph. The second line shows the log of commits excluding those that can be reached from trunk (upstream-rw/master), and including those that can be reached from current HEAD (implied with the .. at the end). So, it shows the commits unique to this branch compared to trunk. The -p option shows the diff for these commits in patch form.

Push to trunk
$ git push upstream-rw my-new-feature:master

This pushes the my-new-feature branch in this repository to the master branch in the upstream-rw repository.

Git resources
Tutorials and summaries
Advanced Git workflow

There are many ways of working with Git; here are some posts on the rules of thumb that other projects have come up with:

  • Linus Torvalds on Git management
  • Linus Torvalds on Linux Git workflow. Summary: use the Git tools to make the history of your edits as clean as possible; merge from upstream edits as little as possible in branches where you are doing active development.
Manual pages online

You can get these on your own machine with (e.g) git help push or (same thing) git push --help, but, for convenience, here are the online manual pages for some common commands:

Frequently asked questions

Abbreviations: Question – Problem – Solution

General errors

  1. Q: When importing odl, the following error is shown:

    File "/path/to/odl/odl/__init__.py", line 36
    
      from . import diagnostics
    
      ImportError: cannot import diagnostics
    

    However, I did not change anything in diagnostics? Where does the error come from?

    P: Usually, this error originates from invalid code in a completely different place. You may have edited or added a module and broken the import chain in some way. Unfortunately, the error message is always as above, not specific to the invalid module.

    Another more subtle reason can be related to old bytecode files. When you import (= execute) a module or execute a script for the first time, a bytecode file is created, basically to speed up execution the next time. If you installed odl with pip -e (--editable), these files can sometimes interfere with changes to your codebase.

    S: Here are two things you can do to find the error more quickly.

    1. Delete the bytecode files. In a standard GNU/Linux shell, you can simply invoke (in your odl working directory)

      find . -name "*.pyc" | xargs rm
      
    2. Execute the modules you changed since the last working (importable) state. In most IDEs, you have the possibility to run a currently opened file. Alternatively, you can run on the command line

      python path/to/your/module.py
      

      This will yield a specific error message for an erroneous module that helps you debug your changes.

  2. Q: When adding two space elements, the following error is shown:

    TypeError: unsupported operand type(s) for +: 'DiscreteLpElement' and 'DiscreteLpElement'
    

    This seems completely illogical since it works in other situations and clearly must be supported. Why is this error shown?

    P: The elements you are trying to add are not in the same space. For example, the following code triggers the same error:

    >>> x = odl.uniform_discr(0, 1, 10).one()
    >>> y = odl.uniform_discr(0, 1, 11).one()
    >>> x + y
    

    In this case, the problem is that the elements have a different number of entries. Other possible issues include that they are discretizations of different sets, or that they have different data types (dtype) or implementations (for example CUDA/CPU).

    S: The elements need to somehow be cast to the same space. How to do this depends on the problem at hand. To find what the issue is, inspect the space properties of both elements. For the above example, we see that the issue lies in the number of discretization points:

    >>> x.space
    odl.uniform_discr(0, 1, 10)
    >>> y.space
    odl.uniform_discr(0, 1, 11)
    
    • In the case of spaces being discretizations of different underlying spaces, a transformation of some kind has to be applied (for example by using an operator). In general, errors like this indicate a conceptual issue with the code, for example that a “we identify X with Y” step has been omitted.
    • If the dtype or impl do not match, one element needs to be cast to match the other. The simplest way to do this is by using the DiscreteLpElement.astype method.
  3. Q: I have installed ODL with the pip install --editable option, but I still get an AttributeError when I try to use a function/class I just implemented. The use-without-reinstall thing does not seem to work. What am I doing wrong?

    P: You probably use an IDE like Spyder with integrated editor, console, etc. While your installation of the ODL package sees the changes immediately, the console still sees the version of the package before the changes since it was opened.

    S: Simply close the current console and open a new one.

Usage

  1. Q: I want to write an Operator with two input arguments, for example

    op(x, y) := x + y

    However, ODL only supports single arguments. How do I do this?

    P: Mathematically, such an operator is defined as

    \mathcal{A}: \mathcal{X}_1 \times \mathcal{X}_2 \rightarrow \mathcal{Z}

    ODL adheres to the strict definition of this and hence takes only one parameter x \in \mathcal{X}_1 \times \mathcal{X}_2. This product space element x is then a tuple of elements x = (x_1, x_2), x_1 \in \mathcal{X}_1, x_2 \in \mathcal{X}_2.

    S: Make the domain of the operator a ProductSpace if \mathcal{X}_1 and \mathcal{X}_2 are LinearSpace instances, or a CartesianProduct if they are mere Set instances. Mathematically, this corresponds to

    op([x, y]) := x + y

    Of course, a number of input arguments larger than 2 can be treated analogously. A minimal sketch of such an operator is given below.
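
    A sketch of a sum operator on a product space (the class and variable names are illustrative, and odl.rn(3) is just a stand-in for the factor spaces):

    import odl

    class SumOperator(odl.Operator):

        """Operator implementing op([x, y]) = x + y."""

        def __init__(self, space):
            super(SumOperator, self).__init__(
                domain=odl.ProductSpace(space, space),
                range=space, linear=True)

        def _call(self, x):
            # x is a product space element, i.e. a pair (x[0], x[1])
            return x[0] + x[1]

    space = odl.rn(3)
    op = SumOperator(space)
    xy = op.domain.element([[1, 2, 3], [4, 5, 6]])
    print(op(xy))  # expected: rn(3).element([5.0, 7.0, 9.0])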

Glossary

array-like
Any data structure which can be converted into a numpy.ndarray by the numpy.array constructor. Includes all NtuplesBaseVector-based classes.
convex conjugate

The convex conjugate (also called Fenchel conjugate) is an important tool in convex optimization. For a functional f, the convex conjugate f^* is the functional

f^*(x^*) = \sup_x \big( \langle x, x^* \rangle - f(x) \big).

discretization
Structure to handle the mapping between abstract objects (e.g. functions) and concrete, finite realizations. It encompasses an abstract Set, a finite data container (NtuplesBaseVector in general) and the mappings between them, sampling and interpolation.
domain
Set of elements to which an operator can be applied.
dtype
Short for data type, indicates the way data is represented internally. For example float32 means 32-bit floating point numbers. See numpy dtype for more details.
element
Saying that x is an element of a given Set my_set means that x in my_set evaluates to True. The term is typically used as “element of <set>” or “<set> element”. When referring to a LinearSpace like, e.g., DiscreteLp, an element is of the corresponding type LinearSpaceElement, i.e. DiscreteLpElement in the above example. Elements of a set can be created by the Set.element method.
element-like

Any data structure which can be converted into an element of a Set by the Set.element method. For example, an rn(3) element-like is any array-like object with 3 real entries.

Example: `DiscreteLp` element-like means that DiscreteLp.element can create a DiscreteLpElement from the input.

in-place evaluation
Operator evaluation method which uses an existing data container to store the result. Usually more efficient than out-of-place evaluation since no new memory is allocated and no data is copied.
interpolation
Operator in a discretization mapping a concrete (finite-dimensional) object to an abstract (infinite-dimensional) one. Example: LinearInterpolation.
meshgrid
Tuple of arrays defining a tensor grid by all possible combinations of entries, one from each array. In 2 dimensions, for example, the arrays [1, 2] and [-1, 0, 1] define the grid points (1, -1), (1, 0), (1, 1), (2, -1), (2, 0), (2, 1).
operator
Mathematical notion for a mapping between arbitrary vector spaces. This includes the important special case of an operator taking a (discretized) function as an input and returning another function. For example, the Fourier Transform maps a function to its transformed version. Operators of this type are the most prominent use case in ODL. See the in-depth guide on operators for details on their implementation.
order
Ordering of the axes in a multi-dimensional array with linear (one-dimensional) storage. For C ordering ('C'), the last axis has smallest stride (varies fastest), and the first axis has largest stride (varies slowest). Fortran ordering ('F') is the exact opposite.
out-of-place evaluation
Operator evaluation method which creates a new data container to store the result. Usually less efficient than in-place evaluation since new memory is allocated and data needs to be copied.
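
A small sketch contrasting the two evaluation styles, with the identity operator standing in for an arbitrary operator:

import odl

op = odl.IdentityOperator(odl.rn(3))
x = op.domain.element([1, 2, 3])

out = op.range.element()  # pre-allocated, uninitialized container
op(x, out=out)            # in-place: the result is written into out

y = op(x)                 # out-of-place: a new element is allocated
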
proximal

Given a proper convex functional S, the proximal operator is defined by

\text{prox}_S(v) = \arg\min_x \big( S(x) + \frac{1}{2}||x - v||_2^2 \big)

The term “proximal” is also occasionally used instead of ProxImaL, then referring to the proximal modelling language for the solution of convex optimization problems.

proximal factory
A proximal factory associated with a functional S is a callable, which returns the proximal of the scaled functional \sigma S when called with a scalar \sigma. This is used due to the fact that optimization methods often use \text{prox}_{\sigma S} for varying \sigma.
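
For instance, a proximal factory for the functional S(x) = \frac{1}{2}||x||_2^2 can be sketched as follows (ODL’s solvers subpackage provides ready-made factories for common functionals; this is only an illustration):

import odl


def proximal_factory(space):
    """Return a proximal factory for S(x) = ||x||^2 / 2 on ``space``."""
    def proximal(sigma):
        # prox of sigma * S maps v to v / (1 + sigma), i.e. a scaling
        return odl.ScalingOperator(space, 1.0 / (1.0 + sigma))
    return proximal
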
range
Set of elements to which an operator maps, i.e. in which the result of an operator evaluation lies.
sampling
Operator in a discretization mapping an abstract (infinite-dimensional) object to a concrete (finite-dimensional) one. Example: PointCollocation.
vectorization

Ability of a function to be evaluated on a grid in a single call rather than looping over the grid points. Vectorized evaluation gives a huge performance boost compared to Python loops (at least if there is no JIT) since loops are implemented in optimized C code.

The vectorization concept in ODL differs slightly from the one in NumPy in that arguments have to be passed as a single tuple rather than a number of (positional) arguments. See numpy vectorization for more details.
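
As an illustration, the following function is vectorized in the ODL sense: it takes its argument as a single tuple of coordinate arrays and can hence be evaluated on a whole meshgrid in one call (a sketch):

import numpy as np
import odl


def gaussian(x):
    # x is a tuple of coordinate arrays, one per axis, so the
    # expression below broadcasts over the whole meshgrid at once.
    return np.exp(-(x[0] ** 2 + x[1] ** 2))


space = odl.uniform_discr([-1, -1], [1, 1], (100, 100))
elem = space.element(gaussian)  # sampled on the grid in a single call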

Release Notes

Upcoming release

ODL 0.6.0 Release Notes (2017-04-20)

Besides many small improvements and additions, this release is the first one under the new Mozilla Public License 2.0 (MPL-2.0).

New features

  • The Kaczmarz method has been added to the solvers (PR 840).
  • Most immutable types now have a __hash__ method (PR 840).
  • A variant of the Conjugate Gradient solver for non-linear problems has been added (PR 554).
  • There is now an example for tomographic reconstruction using Total Generalized Variation (TGV) (PR 883).
  • Power spaces can now be created using the ** operator, e.g., odl.rn(3) ** 4. Likewise, product spaces can be created using multiplication *, i.e., odl.rn(3) * odl.rn(4) (PR 882). See the example after this list.
  • A SamplingOperator for the extraction of values at given indices from arrays has been added, along with its adjoint WeightedSumSamplingOperator (PR 940).
  • Callbacks can now be composed with operators, which can be useful, e.g., for transforming the current iterate before displaying it (PR 954).
  • RayTransform (and thus also fbp_op) can now be directly used on spaces of complex functions (PR 970).
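A short illustration of the new space-creation syntax (a sketch, assuming this release or later):

>>> pspace = odl.rn(3) ** 4        # power space: 4 copies of rn(3)
>>> prod = odl.rn(3) * odl.rn(4)   # product space with different factors
>>> x = pspace.element([[1, 2, 3]] * 4)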

Improvements

  • In CallbackPrintIteration, a step number between displays can now be specified (PR 871).
  • OperatorPointwiseProduct got its missing derivative (PR 877).
  • SeparableSum functionals can now be indexed to retrieve the constituents (PR 898).
  • Better self-printing of callbacks (PR 881).
  • ProductSpaceOperator and subclasses now have size and __len__, and the parent also has shape. Also self-printing of these operators is now better (PR 901).
  • Arithmetic methods of LinearSpace have become more permissive in the sense that operations like space_element + raw_array now work if the array can be cast to an element of the same space (PR 902).
  • There is now a (work-in-progress) document on the release process with the aim to avoid errors (PR 872).
  • The MRC extended header implementation is now much simpler (PR 917).
  • The show_discrete_data workhorse is now more robust against inf and nan entries in arrays when determining colorbar settings (PR 921).
  • Titles in CallbackShow are now interpreted as format strings with the iteration number inserted, which enables updating the figure title in real time (PR 923).
  • Installation instructions have been rearranged and grouped by installation method (PR 884).
  • A performance comparison example of pure ASTRA versus ODL with the ASTRA back-end for 3D cone beam has been added (PR 912).
  • OperatorComp avoids an operator evaluation in derivative in the case when the left operator is linear (PR 957).
  • FunctionalComp now has a default implementation of gradient.derivative if the operator in the composition is linear (PR 956).
  • The saveto parameter of CallbackShow can now be a callable that returns the file name to save to when called on the current iteration number (PR 955).

Changes

  • The sphinxext submodule has been updated from upstream (PR 846).
  • The renames TensorGrid -> RectGrid and uniform_sampling -> uniform_grid have been made, and the separate class RegularGrid has been removed in favor of treating regular grids as a special case of RectGrid. Instances of RectGrid have a new property is_uniform for this purpose. Furthermore, uniformity of RectPartition and RectGrid is exposed as a property per axis using is_uniform_byaxis (PR 841).
  • extent of grids and partitions is now a property instead of a method (PR 889).
  • The number of iterations in solvers is no longer optional since the old default 1 didn’t make much sense (PR 888).
  • The nlevels argument of WaveletTransform is now optional, and the default is the maximum number of levels as determined by the new function pywt_max_nlevels (PR 880).
  • MatVecOperator is now called MatrixOperator and has been moved to the tensor_ops module. This solves a circular dependency issue with ODL subpackages (PR 911).
  • All step parameters of callbacks are now called just step (PR 929).
  • The impl name for the scikit-image back-end in RayTransform has been changed from scikit to skimage (PR 970).
  • ODL is now licensed under the Mozilla Public License 2.0 (PR 977).

Bugfixes

  • Fix an argument order error in the gradient of QuadraticForm (PR 868).
  • Lots of small documentation fixes where “, optional” was forgotten in the Parameters section (PR 554).
  • Fix an indexing bug in the indicate_proj_axis phantom (PR 878).
  • Fix wrong inheritance order in FileReaderRawBinaryWithHeader that led to a wrong header_size (PR 893).
  • Comparison of arbitrary objects in Python 2 is now disabled for some ODL classes where it doesn’t make sense (PR 933).
  • Fix a bug in the angle calculation of the scikit-image back-end for Ray transforms (PR 947).
  • Fix issue with wrong integer type in as_scipy_operator (PR 960).
  • Fix wrong scaling in RayTransform and adjoint with unweighted spaces (PR 958).
  • Fix normalization bug of min_pt and max_pt parameters in RectPartition (PR 971).
  • Fix an issue with *args in CallbackShow that led to the title argument being provided twice (PR 981).
  • Fix an unconditional pytest import that led to an ImportError if pytest was not installed (PR 982).

ODL 0.5.3 Release Notes (2017-01-17)

Lots of small improvements and feature additions in this release. Most notable are the remarkable performance improvements to the ASTRA bindings (up to 10x), the addition of fbp_op to create filtered back-projection operators with several filter and windowing options, as well as further performance improvements to operator compositions and the show methods.

New features

  • Add the SeparableSum(func, n) syntax for n-times repetition of the same summand (PR 685).
  • Add the Ordered Subsets MLEM solver odl.solvers.osmlem for faster EM reconstruction (PR 647).
  • Add GroupL1Norm and IndicatorGroupL1UnitBall for mixed L1-Lp norm regularization (PR 620).
  • Add fbp_op helper to create filtered back-projection operators for a range of geometries (PR 703); see the sketch after this list.
  • Add 2-dimensional FORBILD phantom (PR 694, PR 804, PR 820).
  • Add IndicatorZero functional in favor of ConstantFunctionalConvexConj (PR 707).
  • Add reader for MRC data files and for custom binary formats with fixed header (PR 716).
  • Add NuclearNorm functional for multi-channel regularization (PR 691).
  • Add CallbackPrint for printing of intermediate results in iterative solvers (PR 691).
  • Expose Numpy ufuncs as operators in the new ufunc_ops subpackage (PR 576).
  • Add ScalingFunctional and IdentityFunctional (PR 576).
  • Add RealPart, ImagPart and ComplexEmbedding operators (PR 706).
  • Add PointwiseSum operator for vector fields (PR 754).
  • Add LineSearchFromIterNum for using a pre-defined mapping from iteration number to step size (PR 752).
  • Add axis_labels option to DiscreteLp for custom labels in plots (PR 770).
  • Add Defrise phantom for cone beam geometry testing (PR 756).
  • Add filter option to fbp_op and tam_danielson_window and parker_weighting helpers for helical/cone geometries (PR 756, PR 806, PR 825).
  • Add ISTA (proximal_gradient) and FISTA (accelerated_proximal_gradient) algorithms, among others useful for L1 regularization (PR 758).
  • Add salt_pepper_noise helper function (PR 758).
  • Expose FBP filtering as operator fbp_filter_op (PR 780).
  • Add parallel_beam_geometry helper for creation of simple test geometries (PR 775).
  • Add MoreauEnvelope functional for smoothed regularization (PR 763).
  • Add saveto option to CallbackShow to store plots of iterates (PR 708).
  • Add CallbackSaveToDisk and CallbackSleep (PR 798).
  • Add a utility signature_string for robust generation of strings for repr or str (PR 808).
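As an illustration of the fbp_op and parallel_beam_geometry helpers added in this release, here is a sketch of a complete filtered back-projection pipeline; it assumes that a tomography back-end such as ASTRA or scikit-image is installed:

>>> space = odl.uniform_discr([-20, -20], [20, 20], [100, 100])
>>> geometry = odl.tomo.parallel_beam_geometry(space)
>>> ray_trafo = odl.tomo.RayTransform(space, geometry)
>>> fbp = odl.tomo.fbp_op(ray_trafo, filter_type='Hann')  # windowed filter
>>> phantom = odl.phantom.shepp_logan(space, modified=True)
>>> reco = fbp(ray_trafo(phantom))  # reconstruction from noise-free data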

Improvements

  • New documentation on the operator derivative notion in ODL (PR 668).
  • Add largescale tests for the convex conjugates of functionals (PR 744).
  • Add domain parameter to LinDeformFixedTempl for better extensibility (PR 748).
  • Add example for sparse tomography with TV regularization using the Douglas-Rachford solver (PR 746).
  • Add support for 1/r^2 scaling in cone beam backprojection with ASTRA 1.8 using a helper function for rescaling (PR 749).
  • Improve performance of operator scaling in certain cases (PR 576).
  • Add documentation on testing in ODL (PR 704).
  • Replace occurrences of numpy.matrix objects (PR 778).
  • Implement Numpy-style indexing for ProductSpaceElement objects (PR 774).
  • Greatly improve efficiency of show by updating the figure in place instead of re-creating (PR 789).
  • Improve efficiency of operator derivatives by short-circuiting in case of a linear operator (PR 796).
  • Implement simple indexing for ProductSpaceOperator (PR 815).
  • Add caching to ASTRA projectors, thus making algorithms run much faster (PR 802).

Changes

  • Rename vector_field_space to tangent_bundle in vector spaces (more adequate for complex spaces) (PR 702).
  • Rename show parameter of show methods to force_show (PR 771).
  • Rename elem.ufunc to elem.ufuncs where implemented (PR 809).
  • Remove “Base” from weighting base classes and rename weight parameter to weighting for consistency (PR 810).
  • Move tensor_ops module from odl.discr to odl.operator for more general application (PR 813).
  • Rename ellipse to ellipsoid in names intended for 3D cases (PR 816).
  • Pick the fastest available implementation in RayTransform by default instead of astra_cpu (PR 826).

Bugfixes

  • Prevent ASTRA cubic voxel check from failing due to numerical rounding errors (PR 721).
  • Implement the missing __ne__ in RectPartition (PR 748).
  • Correct adjoint of WaveletTransform (PR 758).
  • Fix issue with creation of phantoms in a space with degenerate shape (PR 777).
  • Fix issue with Windows paths in collect_ignore.
  • Fix bad dict lookup with RayTransform.adjoint.adjoint.
  • Fix rounding issue in a couple of indicator functionals.
  • Several bugfixes in show methods.
  • Fixes to outdated example code.

ODL 0.5.2 Release Notes (2016-11-02)

Another maintenance release that fixes a number of issues with installation and testing, see issue 674, issue 679, and PR 692 and PR 696.

ODL 0.5.1 Release Notes (2016-10-24)

This is a maintenance release: contrary to intention, the test suite was not bundled with the PyPI and Conda packages of 0.5.0. From this version on, users can run python -c "import odl; odl.test()" with all types of installations (from PyPI, Conda or from source).

ODL 0.5.0 Release Notes (2016-10-21)

This release features a new important top level class Functional that is intended to be used in optimization methods. Beyond its parent Operator, it provides special methods and properties like gradient or proximal which are useful in advanced smooth or non-smooth optimization schemes. The interfaces of all solvers in odl.solvers have been updated to make use of functionals instead of their proximals, gradients etc. directly.

Further notable changes are the implementation of an as_writable_array context manager that exposes arbitrary array storage as writable Numpy arrays, and the generalization of the wavelet transform to arbitrary dimensions.

See below for a complete list of changes.

New features

  • Add Functional class to the solvers package (PR 498). Functional is a subclass of odl Operator and intended to help in formulating and solving optimization problems. It contains optimization-specific features like proximal and convex_conj, and built-in intelligence for handling things like translation, scaling of argument or scaling of functional. See the sketch after this list.
  • Migrate all solvers to work with Functional instances instead of raw proximals etc. (PR 587)
  • Add ResizingOperator for shrinking and extending (padding) of discretized functions, including a variety of padding methods. (PR 499)
  • Add as_writable_array, a context manager that exposes arbitrary array-like storage as a writable numpy array and writes the result back afterwards. This is intended to be used with odl vectors that may not be stored in numpy format (like cuda vectors), but can be used with other types like lists. (PR 524)
  • Allow ASTRA backend to be used with arbitrary dtypes. (PR 524)
  • Add reset to SolverCallback that resets the callback to its initial state. (issue 552)
  • Add nonuniform_partition utility that creates a partition with non-uniformly spaced points. This is useful e.g. when the angles of a tomography problem are not exactly uniform. (PR 558)
  • Add FunctionalProduct and FunctionalQuotient which allow evaluation of the product/quotient of functions and also provides a gradient through the Leibniz/quotient rules. (PR 586)
  • Add FunctionalDefaultConvexConjugate which acts as a default for Functional.convex_conj, providing it with a proximal property. (PR 588)
  • Add IndicatorBox and IndicatorNonnegativity which are indicator functions on a box shaped set and the set of nonnegative numbers, respectively. They return 0 if all points in a vector are inside the box, and infinity otherwise. (PR 589)
  • Add proximal operator for the convex conjugate of cross entropy Kullback-Leibler divergence, called proximal_cconj_kl_cross_entropy (PR 561)
  • Add Functional implementations for KullbackLeibler and KullbackLeiblerCrossEntropy, together with corresponding convex conjugates (PR 627)
  • Add tutorial style example. (PR 521)
  • Add MLEM solver. (PR 497)
  • Add MatVecOperator.inverse. (PR 608)
  • Add the Rosenbrock standard test functional. (PR 602)
  • Add broadcasting of vector arithmetic involving ProductSpace vectors. (PR 555)
  • Add phantoms.poisson_noise. (PR 630)
  • Add NumericalGradient and NumericalDerivative that numerically compute the gradient and derivative of Operator and Functional instances. (PR 624)
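A brief sketch of the new Functional interface, using the L1 norm as an example:

>>> space = odl.rn(3)
>>> l1 = odl.solvers.L1Norm(space)
>>> cconj = l1.convex_conj   # the convex conjugate, again a Functional
>>> prox = l1.proximal(0.5)  # proximal factory evaluated at sigma = 0.5
>>> x = prox(space.element([1.0, -2.0, 0.2]))  # soft-thresholding by 0.5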

Improvements

  • Add intelligence to power_method_opnorm so it can terminate early by checking if consecutive iterates are close. (PR 527)
  • Add BroadcastOperator(op, n), ReductionOperator(op, n) and DiagonalOperator(op, n) syntax. This is equivalent to BroadcastOperator(*([op] * n)) etc., i.e. creating n copies of the operator (see the sketch after this list). (PR 532)
  • Allow showing subsets of the whole volume in DiscreteLpElement.show. Previously this allowed slices to be shown, but the new version allows subsets such as 0 < x < 3 to be shown as well. (PR 574)
  • Add SolverCallback.reset() which allows users to reset a callback to its initial state. Applicable if users want to reuse a callback in another solver. (PR 553)
  • WaveletTransform and related operators now work in arbitrary dimensions. (PR 547)
  • Several documentation improvements, including:
    • Move documentation from _call to __init__. (PR 549)
    • Major review of minor style issues. (PR 534)
    • Typeset math in proximals. (PR 580)
  • Improved installation docs and update of Chambolle-Pock documentation. (PR 121)
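The BroadcastOperator(op, n) shorthand mentioned above in action, as a minimal sketch:

>>> op = odl.IdentityOperator(odl.rn(3))
>>> bcast = odl.BroadcastOperator(op, 2)  # same as BroadcastOperator(op, op)
>>> x = odl.rn(3).element([1, 2, 3])
>>> y = bcast(x)  # element of the product space of two copies of rn(3)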

Changes

  • Change definition of LinearSpaceVector.multiply to match the definition used by Numpy. (PR 509)
  • Rename the parameters padding_method in diff_ops.py and mode in wavelet.py to pad_mode. The parameter padding_value is now called pad_const. (PR 511)
  • Expose ellipse_phantom and shepp_logan_ellipses to odl.phantom. (PR 529)
  • Unify the names of minimum (min_pt), maximum (max_pt) and middle (mid_pt) points as well as number of points (shape) in grids, interval products and factory functions for discretized spaces. (PR 541)
  • Remove simple_operator since it was never used and did not follow the ODL style. (PR 543)
  • Remove Interval, Rectangle and Cuboid since they were confusing (Capitalized name but not a class) and barely ever used. Users should instead use IntervalProd in all cases. (PR 537)
  • The following classes have been renamed (PR 560):
    • LinearSpaceVector -> LinearSpaceElement
    • DiscreteLpVector -> DiscreteLpElement
    • ProductSpaceVector -> ProductSpaceElement
    • DiscretizedSetVector -> DiscretizedSetElement
    • DiscretizedSpaceVector -> DiscretizedSpaceElement
    • FunctionSetVector -> FunctionSetElement
    • FunctionSpaceVector -> FunctionSpaceElement
  • Change parameter style of differential operators from having a pad_mode and a separate edge_order argument that were mutually exclusive to a single pad_mode that covers all cases. Also added several new pad modes to the differential operators. (PR 548)
  • Switch from RTD documentation hosting to gh-pages and let Travis CI build and deploy the documentation. (PR 536)
  • Update name of proximal_zero to proximal_const_func. (PR 582)
  • Move unit tests from top level test/ to odl/test/ folder and distribute them with the source. (PR 638)
  • Update pytest dependency to [>3.0] and use new features. (PR 653)
  • Add pytest option --documentation to test all doctest examples in the online documentation.
  • Remove the pip install odl[all] option since it fails by default.

Bugfixes

  • Fix python -c "import odl; odl.test()" not working on Windows. (PR 508)
  • Fix a TypeError being raised in OperatorTest when running optest.adjoint() without specifying an operator norm. (PR 525)
  • Fix scaling of the scikit ray transform for non-full scans. (PR 523)
  • Fix bug causing classes to not be vectorizable. (PR 604)
  • Fix rounding problem in some proximals (PR 661)

ODL 0.4.0 Release Notes (2016-08-17)

This release marks the addition of the deform package to ODL, adding functionality for the deformation of DiscreteLp elements.

New features

  • Add deform package with linearized deformations (PR 488)
  • Add option to interface with ProxImaL solvers using ODL operators. (PR 494)

ODL 0.3.1 Release Notes (2016-08-15)

This release mainly fixes an issue that made it impossible to pip install odl with version 0.3.0. It also adds the first really advanced solvers based on forward-backward and Douglas-Rachford splitting.

New features

  • New solvers based on the Douglas-Rachford and forward-backward splitting schemes. (PR 478, PR 480)
  • NormOperator and DistOperator added. (PR 487)
  • Single-element NtuplesBase vectors can now be converted to float, complex etc. (PR 493)

Improvements

  • DiscreteLp.element() now allows non-vectorized and 1D scalar functions as input. (PR 476)
  • Speed improvements in the unit tests. (PR 479)
  • Uniformization of __init__() docstrings and many further documentation and naming improvements. (PR 489, PR 482, PR 491)
  • Clearer separation between attributes that are intended as part of the subclassing API and those that are not. (PR 471)
  • Chambolle-Pock solver accepts also non-linear operators and has better documentation now. (PR 490)
  • Clean-up of imports. (PR 492)
  • All solvers now check that the given start value x is in op.domain. (PR 502)
  • Added test for in-place evaluation of the ray transform. (PR 500)

Bugfixes

  • Axes in show() methods of several classes now use the correct corner coordinates; the old ones were off by half a grid cell in some situations. (PR 477)
  • Catch case in power_method_opnorm() when iteration goes to zero. (PR 495)

ODL 0.3.0 Release Notes (2016-06-29)

This release marks the removal of odlpp from the core library. It has instead been moved to a separate library, odlcuda.

New features

  • To enable cuda backends for the odl spaces, an entry point 'odl.space' has been added where external libraries can hook in to add FnBase and NtuplesBase type spaces.
  • Add pytest fixtures 'fn_impl' and 'ntuple_impl' to the test configuration file conftest.py. These can now be accessed from any test.
  • Allow creation of general spaces using the fn, cn and rn factories. These functions now take an impl parameter which defaults to 'numpy' but with odlcuda installed it may also be set to 'cuda'. The old numpy-specific Fn, Cn and Rn functions have been removed. See the sketch below.
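A short sketch of the factory usage; the 'cuda' variant requires odlcuda to be installed:

>>> r3 = odl.rn(3)                    # impl defaults to 'numpy'
>>> r3_cuda = odl.rn(3, impl='cuda')  # available only with odlcuda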

Changes

  • Moved all CUDA-specific code out of the library into odlcuda. This means that cu_ntuples.py and related files have been removed.
  • Rename ntuples.py to npy_ntuples.py.
  • Prepended Numpy to the names of the numpy-based spaces. They are now named NumpyFn and NumpyNtuples.
  • Prepended npy_ to all methods specific to ntuples such as weightings.

ODL 0.2.4 Release Notes (2016-06-28)

Bugfixes

  • Fix bug in submarine phantom with non-centered space (PR 469).
  • Fix crash when plotting in 1d (commit 3255fa3).

Changes

  • Move phantoms to new module odl.phantom (PR 469).
  • Rename RectPartition.is_uniform to RectPartition.is_uniform (PR 468).

ODL 0.2.3 Release Notes (2016-06-12)

New features

  • uniform_sampling now supports the nodes_on_bdry option introduced in RectPartition (PR 308).
  • DiscreteLpVector.show has a new coords option that allows slicing by coordinate instead of by index (PR 309).
  • New uniform_discr_fromintv to discretize an existing IntervalProd instance (PR 318).
  • The operator.oputils module has a new function as_scipy_operator which exposes a linear ODL operator as a scipy.sparse.linalg.LinearOperator. This way, an ODL operator can be used seamlessly in SciPy’s sparse solvers (PR 324). See the sketch after this list.
  • New Resampling operator to resample data between different discretizations (PR 328).
  • New PowerOperator taking the power of an input function (PR 338).
  • First pointwise operators acting on vector fields: PointwiseInner and PointwiseNorm (PR 346).
  • Examples for FBP reconstruction (PR 364) and TV regularization using the Chambolle-Pock method (PR 352).
  • New scikit-image based implementation of RayTransform for 2D parallel beam tomography (PR 352).
  • RectPartition has a new method append for simple extension (PR 370).
  • The ODL unit tests can now be run with odl.test() (PR 373).
  • Proximal of the Kullback-Leibler data discrepancy functional (PR 289).
  • Support for SPECT using ParallelHoleCollimatorGeometry (PR 304).
  • A range of new proximal operators (PR 401) and some calculus rules (PR 422) have been added, e.g. the proximal of the convex conjugate or of a translated functional.
  • Functions with parameters can now be sampled by passing the parameter values to the sampling operator. The same is true for the element method of a discrete function space (PR 406).
  • ProductSpaceOperator can now be indexed directly, returning the operator component(s) corresponding to the index (PR 407).
  • RectPartition now supports “almost-fancy” indexing, i.e. indexing via integer, slice, tuple or list in the style of NumPy (PR 386).
  • When evaluating a FunctionSetVector, the result is broadcast if necessary (PR 438).
  • uniform_partition now has a more flexible way of initialization using begin, end, num_nodes and cell_sides (3 of 4 required) (PR 444).
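A minimal sketch of as_scipy_operator as described above, wrapping a linear ODL operator for use with SciPy’s conjugate gradient solver:

>>> import numpy as np
>>> import scipy.sparse.linalg
>>> op = odl.IdentityOperator(odl.rn(3))
>>> scipy_op = odl.operator.oputils.as_scipy_operator(op)
>>> x, info = scipy.sparse.linalg.cg(scipy_op, np.array([1.0, 2.0, 3.0]))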

Improvements

  • Product spaces now utilize the same weighting class hierarchy as Rn type spaces, which makes the weight handling much more transparent and robust (PR 320).
  • Major refactor of the diagnostics module, with better output, improved derivative test and a simpler and more extensible way to generate example vectors in spaces (PR 338).
  • 3D Shepp-Logan phantom sliced in the middle is now exactly the same as the 2D Shepp-Logan phantom (PR 368).
  • Improved usage of test parametrization, making decoration of each test function obsolete. Also the printed messages are better (PR 371).
  • OperatorLeftScalarMult and OperatorRightScalarMult now have proper inverses (PR 388).
  • Better behavior of display methods if arrays contain inf or NaN (PR 376).
  • Adjoints of Fourier transform operators are now correctly handled (PR 396).
  • Differential operators now have consistent boundary behavior (PR 405).
  • Repeated scalar multiplication with an operator accumulates the scalars instead of creating a new operator each time (PR 429).
  • Examples have undergone a major cleanup (PR 431).
  • Addition of __len__ at several places where it was missing (PR 425).

Bugfixes

  • The result of the evaluation of a FunctionSpaceVector is now automatically cast to the correct output data type (PR 331).
  • inf values are now properly treated in BacktrackingLineSearch (PR 348).
  • Fix for result not being written to a CUDA array in interpolation (PR 361).
  • Evaluation of FunctionSpaceVector now works properly in the one-dimensional case (PR 362).
  • Fix wrong orientation (rotation by 90 degrees) of 2D parallel and fan beam projectors and back-projectors (PR 436).

Changes

  • odl.set.pspace was moved to odl.space.pspace (PR 320)
  • Parameter ord in norms etc. has been renamed to exponent (PR 320)
  • restriction and extension operators and parameters have been renamed to sampling and interpolation, respectively (PR 337).
  • Differential operators like Gradient and Laplacian have been moved from odl.discr.discr_ops to odl.discr.diff_ops (PR 377)
  • The initialization patterns of Gradient and Divergence were unified to allow specification of domain or range or both (PR 377).
  • RawDiscretization and Discretization were renamed to DiscretizedSet and DiscretizedSpace, resp. (PR 406).
  • Diagonal “operator matrices” are now implemented with a class DiagonalOperator instead of the factory function diagonal_operator (PR 407).
  • The ...Partial classes have been renamed to Callback.... Parameters of solvers are now callback instead of partial (PR 430).
  • Occurrences of dom and ran as initialization parameters of operators have been changed to domain and range throughout (PR 433).
  • Assignments x = x.space.element(x) are now required to be no-ops (PR 439)

ODL 0.2.2 Release Notes (2016-03-11)

From this release on, ODL can be installed through pip directly from the Python package index.

ODL 0.2.1 Release Notes (2016-03-11)

Fix for the version number in setup.py.

ODL 0.2 Release Notes (2016-03-11)

This release features the Fourier transform as major addition, along with some minor improvements and fixes.

New Features

  • Add FourierTransform and DiscreteFourierTransform, where the latter is the fully discrete version not accounting for shift and scaling, and the former approximates the integral transform by taking shifted and scaled grids into account. (PR 120)
  • The weighting attribute in FnBase is now public and can be used to initialize a new space.
  • The FnBase classes now have a default_dtype static method.
  • A discr_sequence_space has been added as a simple implementation of finite sequences with multi-indexing.
  • DiscreteLp and FunctionSpace elements now have real and imag with setters as well as a conj() method.
  • FunctionSpace explicitly handles output data type and allows this attribute to be chosen during initialization.
  • FunctionSpace, FnBase and DiscreteLp spaces support creation of a copy with different data type via the astype() method.
  • New conj_exponent() utility to get the conjugate of a given exponent.

Improvements

  • Handle some not-so-unlikely corner cases where vectorized functions don’t behave as they should. In particular, make 1D functions work when expressions like t[t > 0] are used.
  • x ** 0 evaluates to the one() space element if implemented.

Changes

  • Move fast_1d_tensor_mult to the numerics.py module.

ODL 0.1 Release Notes (2016-03-08)

First official release.
