powerbox

https://img.shields.io/pypi/v/powerbox.svg https://travis-ci.org/steven-murray/powerbox.svg?branch=master https://coveralls.io/repos/github/steven-murray/powerbox/badge.svg?branch=master https://api.codacy.com/project/badge/Grade/5853411c78444a5a9c6ec4058c6dbda9

Make arbitrarily structured, arbitrary-dimension boxes and log-normal mocks.

powerbox is a pure-python code for creating density grids (or boxes) that have an arbitrary two-point distribution (i.e. power spectrum). Primary motivations for creating the code were the simple creation of log-normal mock galaxy distributions, but the methodology can be used for other applications.

Features

  • Works in any number of dimensions.
  • Really simple.
  • Arbitrary isotropic power-spectra.
  • Create Gaussian or Log-Normal fields
  • Create discrete samples following the field, assuming it describes an over-density.
  • Measure power spectra of output fields to ensure consistency.
  • Seamlessly uses pyFFTW if available for ~double the speed.

Installation

Clone/Download then python setup.py install. Or just pip install powerbox.

Acknowledgment

If you find powerbox useful in your research, please cite http://ascl.net/1805.001

Contents

Examples

To help get you started using powerbox, we’ve compiled a few simple examples. Other examples can be found in the API documentation for each object or by looking at some of the tests.

Getting Started with Powerbox

In [3]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np

There are two useful classes in powerbox: the basic PowerBox, and one for log-normal fields: LogNormalPowerBox. You can import them like this:

In [1]:
from powerbox import PowerBox, LogNormalPowerBox

Once imported, to see all the options, just use help(PowerBox).

Create a 2D Gaussian field with power-law power-spectrum

For a basic 2D Gaussian field with a power-law power-spectrum, one can use the following:

In [10]:
pb = PowerBox(N=512,                     # Number of grid-points in the box
              dim=2,                     # 2D box
              pk = lambda k: 0.1*k**-2., # The power-spectrum
              boxlength = 1.0,           # Size of the box (sets the units of k in pk)
              seed = 1010)               # Set a seed to ensure the box looks the same every time (optional)

plt.imshow(pb.delta_x,extent=(0,1,0,1))
plt.colorbar()
plt.show()
_images/demos_getting_started_7_0.png

The delta_x output is always zero-mean, so it can be interpreted as an over-density field, \(\rho(x)/\bar{\rho} -1\). The caveat to this is that an overdensity field is physically invalid below -1. To ensure the physical validity of the field, the option ensure_physical can be set, which clips the field:

In [11]:
pb = PowerBox(N=512,                     # Number of grid-points in the box
              dim=2,                     # 2D box
              pk = lambda k: 0.1*k**-2., # The power-spectrum
              boxlength = 1.0,           # Size of the box (sets the units of k in pk)
              seed = 1010,               # Set a seed to ensure the box looks the same every time (optional)
              ensure_physical=True)      # ** Ensure the delta_x is a physically valid over-density **

plt.imshow(pb.delta_x,extent=(0,1,0,1))
plt.colorbar()
plt.show()
_images/demos_getting_started_9_0.png

If you are actually dealing with over-densities, then this clipping solution is probably a bit hacky. What you want is a log-normal field…

Create a 2D Log-Normal field with power-law power spectrum

The LogNormalPowerBox class is called in exactly the same way, but the resulting field has a log-normal pdf with the same power spectrum.

In [12]:
lnpb = LogNormalPowerBox(N=512,                     # Number of grid-points in the box
                         dim=2,                     # 2D box
                         pk = lambda k: 0.1*k**-2., # The power-spectrum
                         boxlength = 1.0,           # Size of the box (sets the units of k in pk)
                         seed = 1010)               # Use the same seed as our powerbox

plt.imshow(lnpb.delta_x,extent=(0,1,0,1))
plt.colorbar()
plt.show()
_images/demos_getting_started_13_0.png

Again, the delta_x is zero-mean, but has a longer positive tail due to the log-normal nature of the distribution. This means it is always greater than -1, so that the over-density field is always physical.

Create some discrete samples on the field

powerbox lets you easily create samples that follow the field:

In [18]:
fig, ax = plt.subplots(1,2, sharex=True,sharey=True,gridspec_kw={"hspace":0}, subplot_kw={"ylim":(0,1),"xlim":(0,1)}, figsize=(10,5))

# Create a discrete sample using the PowerBox instance.
samples = pb.create_discrete_sample(nbar=50000,      # nbar specifies the number density
                                    min_at_zero=True  # by default the samples are centred at 0. This shifts them to be positive.
                                   )
ln_samples = lnpb.create_discrete_sample(nbar=50000, min_at_zero=True)

# Plot the samples
ax[0].scatter(samples[:,0],samples[:,1], alpha=0.2,s=1)
ax[1].scatter(ln_samples[:,0],ln_samples[:,1],alpha=0.2,s=1)
plt.show()
_images/demos_getting_started_17_0.png

Within each grid-cell, the placement of the samples is uniformly random. The samples can instead be placed on the cell edge by setting randomise_in_cell to False.

Check the power-spectrum of the field

powerbox also contains a function for computing the (isotropic) power-spectrum of a field. This function accepts either a box defining the field values at every co-ordinate, or a set of discrete samples. In the latter case, the routine returns the power spectrum of over-densities, which matches the field that produced them. Let’s go ahead and compute the power spectrum of our boxes, both from the samples and from the fields themselves:

In [19]:
from powerbox import get_power
In [24]:
# Only two arguments required when passing a field
p_k_field, bins_field = get_power(pb.delta_x, pb.boxlength)
p_k_lnfield, bins_lnfield = get_power(lnpb.delta_x, lnpb.boxlength)

# The number of grid points are also required when passing the samples
p_k_samples, bins_samples = get_power(samples, pb.boxlength,N=pb.N)
p_k_lnsamples, bins_lnsamples = get_power(ln_samples, lnpb.boxlength,N=lnpb.N)

Now we can plot them all together to ensure they line up:

In [26]:
plt.plot(bins_field, 0.1*bins_field**-2., label="Input Power")

plt.plot(bins_field, p_k_field,label="Normal Field Power")
plt.plot(bins_samples, p_k_samples,label="Normal Sample Power")
plt.plot(bins_lnfield, p_k_lnfield,label="Log-Normal Field Power")
plt.plot(bins_lnsamples, p_k_lnsamples,label="Log-Normal Sample Power")

plt.legend()
plt.xscale('log')
plt.yscale('log')

_images/demos_getting_started_24_0.png

Create a log-normal mock dark-matter distribution

In [3]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np

In this demo, we create a mock dark-matter distribution, based on the cosmological power spectrum. To generate the power-spectrum we use the hmf code (https://github.com/steven-murray/hmf).

The box can be set up like this:

In [5]:
from hmf import MassFunction
from scipy.interpolate import InterpolatedUnivariateSpline as spline
import numpy as np
from powerbox import LogNormalPowerBox

# Set up a MassFunction instance to access its cosmological power-spectrum
mf = MassFunction(z=0)

# Generate a callable function that returns the cosmological power spectrum.
spl = spline(np.log(mf.k),np.log(mf.power),k=2)
power = lambda k : np.exp(spl(np.log(k)))

# Create the power-box instance. The boxlength is in inverse units of the k of which pk is a function, i.e.
# Mpc/h in this case.
pb = LogNormalPowerBox(N=256, dim=3, pk = power, boxlength= 100.)

Now we can make a plot of a slice of the density field:

In [6]:
plt.imshow(np.mean(pb.delta_x[:100,:,:],axis=0),extent=(0,100,0,100))
plt.colorbar()
plt.show()
_images/demos_cosmological_fields_5_0.png

And we can also compare the power-spectrum of the output field to the input power:

In [7]:
from powerbox import get_power

p_k, kbins = get_power(pb.delta_x,pb.boxlength)
plt.plot(mf.k,mf.power,label="Input Power")
plt.plot(kbins,p_k,label="Sampled Power")
plt.xscale('log')
plt.yscale('log')
plt.legend()
plt.show()
_images/demos_cosmological_fields_7_0.png

Furthermore, we can sample a set of discrete particles on the field and plot them:

In [10]:
particles = pb.create_discrete_sample(nbar=0.1,min_at_zero=True)

plt.figure(figsize=(8,8))
plt.scatter(particles[:,0],particles[:,1],s=np.sqrt(100./particles[:,2]),alpha=0.2)
plt.xlim(0,100)
plt.ylim(0,100)
plt.show()
_images/demos_cosmological_fields_9_0.png

Or plot them in 3D!

In [17]:
from mpl_toolkits.mplot3d import Axes3D

fig = plt.figure(figsize=(10,10))
ax = fig.add_subplot(111, projection='3d')

ax.scatter(particles[:,0], particles[:,1], particles[:,2],s=1,alpha=0.2)
plt.show()
_images/demos_cosmological_fields_11_0.png

Then check that the power-spectrum of the sample matches the input:

In [19]:
p_k_sample, kbins_sample = get_power(particles, pb.boxlength,N=pb.N)

plt.plot(mf.k,mf.power,label="Input Power")
plt.plot(kbins_sample,p_k_sample,label="Sampled Power Discrete")
plt.xscale('log')
plt.yscale('log')
plt.legend()
plt.show()
_images/demos_cosmological_fields_13_0.png

Changing Fourier Conventions

The powerbox package allows for arbitrary Fourier conventions. Since (continuous) Fourier Transforms can be defined using different definitions of the frequency term, and varying normalisations, we allow any consistent combination of these, using the same parameterisation that Mathematica uses, i.e.:

\[F(k) = \left(\frac{|b|}{(2\pi)^{1-a}}\right)^{n/2} \int f(r) e^{-i b\mathbf{k}\cdot\mathbf{r}} d^n\mathbf{r}\]

for the forward-transform and

\[f(r) = \left(\frac{|b|}{(2\pi)^{1+a}}\right)^{n/2} \int F(k) e^{+i b\mathbf{k}\cdot\mathbf{r}} d^n \mathbf{k}\]

for its inverse. Here \(n\) is the number of dimensions in the transform, and \(a\) and \(b\) are free to be any real number. Within powerbox, \(b\) is taken to be positive.

The most common choice of parameters is \((a,b) = (0,2\pi)\), which are the parameters that for example numpy uses. In cosmology (a field which powerbox was written in the context of), a more usual choice is \((a,b)=(1,1)\), and so these are the defaults within the PowerBox classes.

In this notebook we provide some examples on how to deal with these conventions.

Doing the DFT right.

In many fields, we are concerned primarily with the continuous FT, as defined above. However, typically we must evaluate this numerically, and therefore use a DFT or FFT. While the conversion between the two is simple, it is all too easy to forget which factors to normalise by to get back the analogue of the continuous transform.

That’s why in powerbox we provide some fast fourier transform functions that do all the hard work for you. They not only normalise everything correctly to produce the continuous transform, they also return the associated fourier-dual co-ordinates. And they do all this for arbitrary conventions, as defined above. And they work for any number of dimensions.

Let’s take a look at an example, using a simple Gaussian field in 2D:

\[f(x) = e^{-\pi r^2},\]

where \(r^2 = x^2 + y^2.\)

The Fourier transform of this field, using the standard mathematical convention is:

\[\int e^{-\pi r^2} e^{-2\pi i k\cdot x} d^2x = e^{-\pi k^2},\]

where \(k^2 = k_x^2 + k_y^2\).

In [2]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np

from powerbox import fft,ifft
from powerbox.powerbox import _magnitude_grid
In [7]:
# Parameters of the field
L = 10.
N = 512
dx = L/N

x = np.arange(-L/2,L/2,dx)[:N] # The 1D field grid
r = _magnitude_grid(x,dim=2)   # The magnitude of the co-ordinates on a 2D grid
field = np.exp(-np.pi*r**2)    # Create the field

# Generate the k-space field, the 1D k-space grid, and the 2D magnitude grid.
k_field, k, rk = fft(field,L=L,          # Pass the field to transform, and its size
                     ret_cubegrid=True   # Tell it to return the grid of magnitudes.
                    )

# Plot the field minus the analytic result
plt.imshow(np.abs(k_field)-np.exp(-np.pi*rk**2),extent=(k.min(),k.max(),k.min(),k.max()))
plt.colorbar()
Out[7]:
<matplotlib.colorbar.Colorbar at 0x7fcedc018690>
_images/demos_dft_6_1.png

We can now of course do the inverse transform, to ensure that we return the original:

In [10]:
x_field, x_, rx = ifft(k_field, L = L,   # Note we can pass L=L, or Lk as the extent of the k-space grid.
                       ret_cubegrid=True)

plt.imshow(np.abs(x_field)-field,extent=(x.min(),x.max(),x.min(),x.max()))
plt.colorbar()
plt.show()
_images/demos_dft_8_0.png

We can also check that the xgrid returned is the same as the input xgrid:

In [11]:
x_ -x
Out[11]:
array([[ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.]])
Changing the convention

Suppose we instead required the transform

\[\int e^{-\pi r^2} e^{-i \nu \cdot x} d^2x = e^{-\nu^2/4\pi}.\]

This is the same transform but with the Fourier-convention \((a,b) = (1,1)\). We would do this like:

In [14]:
# Generate the k-space field, the 1D k-space grid, and the 2D magnitude grid.
k_field, k, rk = fft(field,L=L,          # Pass the field to transform, and its size
                     ret_cubegrid=True,   # Tell it to return the grid of magnitudes.
                     a=1,b=1             # SET THE FOURIER CONVENTION
                    )

# Plot the field minus the analytic result
plt.imshow(np.abs(k_field)-np.exp(-1./(4*np.pi)*rk**2),extent=(k.min(),k.max(),k.min(),k.max()))
plt.colorbar()
Out[14]:
<matplotlib.colorbar.Colorbar at 0x7fcec6148e50>
_images/demos_dft_13_1.png

Again, specifying the inverse transform with these conventions gives back the original:

In [15]:
x_field, x_, rx = ifft(k_field, L = L,   # Note we can pass L=L, or Lk as the extent of the k-space grid.
                       ret_cubegrid=True,
                       a=1,b=1
                      )

plt.imshow(np.abs(x_field)-field,extent=(x.min(),x.max(),x.min(),x.max()))
plt.colorbar()
plt.show()
_images/demos_dft_15_0.png
Mixing up conventions

It may be that sometimes the forward and inverse transforms in a certain problem will have different conventions. Say the forward transform has parameters \((a,b)\), and the inverse has parameters \((a',b')\). Then first taking the forward transform, and then inverting it (in \(n\)-dimensions) would yield:

\[\left( \frac{b'}{b(2\pi)^{a'-a}}\right)^{n/2} f\left(\frac{b'r}{b}\right),\]

and doing it the other way would yield:

\[\left( \frac{b}{b'(2\pi)^{a'-a}}\right)^{n/2} F\left(\frac{bk}{b'}\right).\]

The fft and ifft functions handle these easily. For example, if \((a,b) = (0,2\pi)\) and \((a',b') = (0,1)\), then the 2D forward-then-inverse transform should be

\[f(r/(2\pi))/ 2\pi,\]

and the inverse-then-forward should be

\[2\pi F(2\pi k).\]
In [21]:
# Import a handy function to do radial binning
from powerbox import angular_average

# Do the forward transform
k_field,k,rk = fft(field,L=L,a=0,b=2*np.pi, ret_cubegrid=True)

# Do the inverse transform, ensuring the boxsize is correct
mod_field,modx,modr = ifft(k_field,Lk=-2*k.min(),a=0,b=1, ret_cubegrid=True)

mod_field, bins = angular_average(mod_field, modr, 300)

plt.plot(bins,mod_field, label="Numerical",lw=3,ls='--')
plt.plot(bins,np.exp(-np.pi*(bins/(2*np.pi))**2)/(2*np.pi),label="Analytic")
plt.legend()
plt.yscale('log')
plt.xscale('log')
plt.ylim(1e-7,3)
plt.show()
_images/demos_dft_18_0.png
Using Different Conventions in Powerbox

These fourier-transform wrappers are used inside powerbox to do the heavy lifting. That means that one can pass a power spectrum which has been defined with arbitrary conventions, and receive a fully consistent box back.

Let’s say, for example, that the fourier convention in your field was to use \((a,b)=(0,1)\), so that the power spectrum of a 2D field, \(\delta_x\) was given by

\[P(k) = \frac{1}{2\pi} \int \delta_x e^{-ikx} d^2x.\]

We now wish to create a realisation with a power spectrum following these conventions. Let’s say the power spectrum is \(P(k) = 0.1k^{-2}\).

In [39]:
from powerbox import PowerBox

pb = PowerBox(N=512,dim=2,pk = lambda k : 0.1*k**-3.,
              a=0, b=1,        # Set the Fourier convention
              boxlength=50.0   # Has units of inverse k
              )

plt.imshow(pb.delta_x,extent=(0,50,0,50))
plt.colorbar()
plt.show()
_images/demos_dft_21_0.png

When we check the power spectrum, we also have to remember to set the Fourier convention accordingly:

In [40]:
from powerbox import get_power

power, kbins = get_power(pb.delta_x,pb.boxlength, a= 0,b =1)

plt.plot(kbins,power,label="Numerical")
plt.plot(kbins,0.1*kbins**-3.,label="Analytic")
plt.legend()
plt.xscale('log')
plt.yscale('log')
plt.show()
_images/demos_dft_23_0.png

License

Copyright (c) 2016 Steven Murray

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Changelog

v0.5.3 [22 May 2018]

Bugfixes - Fixed a bug introduced in v0.5.1 where using bin_ave=False in angular_average_nd would fail.

v0.5.2 [17 May 2018]

Enhancements - Added ability to calculate the variance of an angularly averaged quantity. - Removed a redundant calculation of the bin weights in angular_average

Internals - Updated version numbers of dev requirements.

v0.5.1 [4 May 2018]

Enhancements - Added ability to not have dimensionless power spectra from get_power. - Also return linearly-spaced radial bin edges from angular_average_nd - Python 3 compatibility

Bugfixes - Fixed bug where field was modified in-place unexpectedly in angular_average - Now correctly flattens weights before getting the field average in angular_average_nd

v0.5.0 [7 Nov 2017]

Features - Input boxes to get_power no longer need to have same length on every dimension. - New angular_average_nd function to average over first n dimensions of an array.

Enhancements - Huge (5x or so) speed-up for angular_average function (with resulting speedup for get_power). - Huge memory reduction in fft/ifft routines, with potential loss of some speed (TODO: optimise) - Better memory consumption in PowerBox classes, at the expense of an API change (cached properties no

longer cached, or properties).
  • Modified fftshift in dft to handle astropy Quantity objects (bit of a hack really)

Bugfixes - Fixed issue where if the boxlength was passed as an integer (to fft/ifft), then incorrect results occurred. - Fixed issue where incorrect first_edge assignment in get_power resulted in bad power spectrum. No longer require this arg.

v0.4.3 [29 March 2017]

Bugfixes - Fixed volume normalisation in get_power.

v0.4.2 [28 March 2017]

Features - Added ability to cross-correlate boxes in get_power.

v0.4.1

Bugfixes - Fixed cubegrid return value for dft functions when input boxes have different sizes on each dimension.

v0.4.0

Features - Added fft/ifft wrappers which consistently return fourier transforms with arbitrary Fourier conventions. - Boxes now may be composed with arbitrary Fourier conventions. - Documentation!

Enhancements - New test to compare LogNormalPowerBox with standard PowerBox. - New project structure to make for easier location of functions. - Code quality improvements - New tests, better coverage.

Bugfixes - Fixed incorrect boxsize for an odd number of cells - Ensure mean density is correct in LogNormalPowerBox

v0.3.2

Bugfixes - Fixed bug in pyFFTW cache setting

v0.3.1

Enhancements - New interface with pyFFTW to make fourier transforms ~twice as fast. No difference to the API.

v0.3.0

Features - New functionality in get_power function to measure power-spectra of discrete samples.

Enhancements - Added option to not store discrete positions in class (just return them) - get_power now more streamlined and intuitive in its API

v0.2.3 [11 Jan 2017]

Enhancements - Improved estimation of power (in get_power) for lowest k bin.

v0.2.2 [11 Jan 2017]

Bugfixes - Fixed a bug in which the output power spectrum was a factor of sqrt(2) off in normalisation

v0.2.1 [10 Jan 2017]

Bugfixes - Fixed output of create_discrete_sample when not randomising positions.

Enhancements - New option to set bounds of discrete particles to (0, boxlength) rather than centring at 0.

v0.2.0 [10 Jan 2017]

Features - New LogNormalPowerBox class for creating log-normal fields

Enhancements - Restructuring of code for more flexibility after creation. Now requires cached_property package.

v0.1.0 [27 Oct 2016]

First working version. Only Gaussian fields working.

API Summary

powerbox.powerbox The main module of powerbox.
powerbox.tools A set of tools for dealing with structured boxes, such as those output by powerbox.
powerbox.dft A module defining some “nicer” fourier transform functions.

Indices and tables