pyRSD¶
Accurate predictions for the clustering of galaxies in redshift-space in Python
pyRSD is a Python package for computing the theoretical predictions of the redshift-space power spectrum of galaxies. The package also includes functionality for performing Bayesian parameter estimation using the MCMC sampling technique or Maximum a posteriori estimation using the LBFGS algorithm to perform the nonlinear optimization.
Note
The theoretical models used in this paper are described in more detail in Hand et al. 2017. Please cite this work if you use this package in your research.
Index¶
Getting Started
Install pyRSD¶
pyRSD can be installed using the conda
utility or by installing from
source.
Conda¶
The easiest installation method uses the conda
utility, as part
of the Anaconda package
manager. We have pre-built binaries available that are compatible with
Linux and macOS platforms. The package can be installed via:
conda install -c nickhand -c astropy pyrsd
The package is available for Python versions 2.7, 3.5, and 3.6.
Install From Source¶
pyRSD can also be installed directly from source:
# clone the github source
git clone https://github.com/nickhand/pyRSD.git
cd pyRSD
# run the install
pip install .
Test¶
Test whether or not the installation succeeded by importing the module in IPython:
import pyRSD
Overview¶
pyRSD provides functionality for computing complex, theoretical models of the galaxy power spectrum in redshift space. The package has two main modules:
pyRSD.rsd
The module responsible for evaluating theoretical power spectra and related quantities for an input cosmology specified by the user.
pyRSD.rsdfit
The module responsible for performing parameter estimation; it uses the theory models inpyRSD.rsd
and finds the best-fit parameters describing an input data set.
The pyRSD.rsd module¶
This module provides the ability to compute several theoretical power spectrum quantities. These include
The galaxy power spectrum in redshift space. See this section for more details.
pyRSD.rsd.QuasarSpectrum
The quasar power spectrum in redshift space. See this section for more details
pyRSD.rsd.hzpt
A module for computing dark matter power spectra using Halo Zel’dovich Perturbation Theory. See this section for more details.
pyRSD.pygcl
A module for interfacing with the CLASS Boltzmann code and computing various clustering quantities using the CLASS transfer function. This is a swig-wrapped C++ module that computes most of the perturbation theory and other numerically intensive calculations on which the pyRSD models are based. See this section for more details.
The pyRSD.rsdfit module¶
This module handles parameter estimation, fitting the theoretical models
provided in the pyRSD.rsd
module to data provided by the user. There
are two ways parameter estimation can be performed:
- Monte Carlo Markov Chain (MCMC)
The full posterior distribution of the model parameters can be found using theemcee
Python MCMC package.
- Nonlinear optimization via the LBFGS algorithm
The best-fit parameters can be found by maximizing the likelihood distribution, which is performed using the well-known LBFGS algorithm.
Quickstart¶
The pyRSD package provides a number of examples in order to get users
up and running quickly. Once these examples have been downloaded, the
user can start running their own parameter fits using the example data
and parameter files. The executable responsible for downloading the pyRSD
examples is pyrsd-quickstart
, which has the following calling sequence:
$ pyrsd-quickstart -h
usage: pyrsd-quickstart [-h]
{galaxy/survey-poles,galaxy/periodic-poles,galaxy/periodic-pkmu}
dirname
download example data and parameter files to get up and running with pyRSD
positional arguments:
{galaxy/survey-poles,galaxy/periodic-poles,galaxy/periodic-pkmu}
which example to download
dirname the output directory to save the downloaded files; if
it doesn't exist it will be created
optional arguments:
-h, --help show this help message and exit
The first argument provided to the command is the name of the example to download, which should be of the following:
Example Name | Description |
galaxy/periodic-pkmu | Fitting \(P(k,\mu)\) galaxy data from a periodic box simulation |
galaxy/periodic-poles | Fitting \(P_\ell(k)\) galaxy data from a periodic box simulation |
galaxy/survey-poles | Fitting \(P_\ell(k)\) galaxy data from a simulation with a realistic survey window function |
And the second argument is the name of the directory to download the files too.
So, for example, to download the files from the periodic-poles
examples to
a directory called pyRSD-example
simply do
pyrsd-quickstart galaxy/periodic-poles pyRSD-example
Running Parameter Fits¶
With the example downloaded, the user can run MCMC or nonlinear optimization
fits using the default model parametrization with the rsdfit
command.
See The Free Parameters for a table of the default free parameters and
The Constrained Parameters for a table of the default constrained parameters.
For example, to run 10 nonlinear optimization steps (using the LBFGS optimization algorithm), you can do
rsdfit nlopt -m pyRSD-example/model.npy -p pyRSD-example/params.dat -o pyRSD-example-results -i 10
or to run 10 MCMC iterations (using 30 emcee
walkers), you can do
rsdfit mcmc -m pyRSD-example/model.npy -p pyRSD-example/params.dat -o pyRSD-example-results -i 10 -w 30
The number of iterations has been set to 10 here just for illustration purporse. Typically, the LBFGS algoritm will take \(\sim100-200\) steps to converge, and the mcmc algorithm will need 1000s of iterations to fully explore the posterior distributions of the parameters.
Analyzing the Fit Results¶
The rsdfit
saves the best-fit parameter set to a numpy .npz
file in
the directory specified via the -o
output, which is pyRSD-example-results
in the example above. There are two Python objects in pyRSD that can read these
files, depending on the type of fit that was run. For mcmc
fits, use the
pyRSD.rsdfit.results.EmceeResults
class and for nlopt
fits, use
the pyRSD.rsdfit.results.LBFGSResults
class.
For example, to explore the fitting results from a mcmc
fit
In [1]: from pyRSD.rsdfit.results import EmceeResults, LBFGSResults
In [2]: mcmc_results = EmceeResults.from_npz('mcmc_result.npz')
# print out a summary of the parameters, with mean values and 68% and 95% intervals
In [3]: print(mcmc_results)
Free parameters [ median (+/-68%) (+/-95%) ]
_______________
<Parameter Nsat_mult: 2.423 (+0.2082 -0.1809) (+0.4361 -0.3432)>
<Parameter alpha_par: 1.009 (+0.00682 -0.007681) (+0.01353 -0.01379)>
<Parameter alpha_perp: 1.005 (+0.004241 -0.004049) (+0.008281 -0.008406)>
<Parameter b1_cA: 1.999 (+0.05749 -0.05801) (+0.1126 -0.1265)>
<Parameter f: 0.867 (+0.02897 -0.02502) (+0.05642 -0.05073)>
<Parameter f1h_sBsB: 3.569 (+0.5521 -0.6841) (+1.224 -1.396)>
<Parameter fs: 0.1434 (+0.007553 -0.008046) (+0.01554 -0.01544)>
<Parameter fsB: 0.4662 (+0.08146 -0.0792) (+0.1685 -0.1606)>
<Parameter gamma_b1sA: 1.31 (+0.1001 -0.1032) (+0.2119 -0.2205)>
<Parameter gamma_b1sB: 2.343 (+0.1661 -0.181) (+0.3225 -0.3636)>
<Parameter sigma8_z: 0.5411 (+0.01224 -0.01362) (+0.02611 -0.02452)>
<Parameter sigma_c: 0.9297 (+0.06205 -0.06473) (+0.1126 -0.1177)>
<Parameter sigma_sA: 3.443 (+0.2779 -0.2702) (+0.5436 -0.5378)>
Constrained parameters [ median (+/-68%) (+/-95%) ]
______________________
<Parameter NsBsB: 5.178e+04 (+1.721e+04 -1.36e+04) (+4.16e+04 -2.44e+04)>
<Parameter b1_sA: 2.617 (+0.169 -0.1761) (+0.3037 -0.3725)>
<Parameter sigma_sB: 5.71 (+0.3067 -0.2659) (+0.6624 -0.4928)>
<Parameter NcBs: 2.261e+04 (+2900 -2096) (+6130 -4001)>
<Parameter b1_sB: 4.686 (+0.2679 -0.3235) (+0.5596 -0.672)>
<Parameter b1_cB: 3.175 (+0.1504 -0.1736) (+0.2942 -0.3522)>
<Parameter fcB: 0.122 (+0.01336 -0.0149) (+0.02856 -0.02761)>
<Parameter b1_c: 2.141 (+0.04721 -0.04684) (+0.08834 -0.08337)>
<Parameter b1: 2.344 (+0.05369 -0.04764) (+0.1051 -0.1053)>
<Parameter fsigma8: 0.4689 (+0.009673 -0.009076) (+0.01866 -0.01848)>
<Parameter b1_s: 3.575 (+0.1907 -0.2037) (+0.3854 -0.4186)>
<Parameter alpha: 1.006 (+0.004107 -0.004331) (+0.007706 -0.0083)>
<Parameter epsilon: 0.00125 (+0.002298 -0.002462) (+0.004696 -0.004652)>
<Parameter b1sigma8: 1.268 (+0.007575 -0.008006) (+0.01531 -0.01495)>
<Parameter F_AP: 1.004 (+0.006927 -0.007386) (+0.01419 -0.01393)>
# access parameters like a dictionary
In [4]: fsat = mcmc_results['fs']
In [5]: print(fsat.median)
0.14339263725236592
and to explore the fitting results from a nlopt
fit
In [6]: nlopt_results = LBFGSResults.from_npz('nlopt_result.npz')
# print out a summary of the parameters, with best-fit values
In [7]: print(nlopt_results)
minimum chi2 = 120.57745038002743
Free parameters [ mean ]
_______________
Nsat_mult : 2.4324089012014207
alpha_par : 1.0072991239946898
alpha_perp : 1.0048333693698863
b1_cA : 2.0068377074748778
f : 0.8735346371321935
f1h_sBsB : 3.6140366462767153
fs : 0.14175570928456935
fsB : 0.4782779433107577
gamma_b1sA : 1.31213984320964
gamma_b1sB : 2.3651886897525896
sigma8_z : 0.5363993024676117
sigma_c : 0.9194602113178405
sigma_sA : 3.420304679802984
Constrained parameters [ mean ]
_________________________
NsBsB : 51736.1
b1_sA : 2.6332517
NcBs : 23183.68
fcB : 0.11864934
sigma_sB : 5.739127
b1_sB : 4.74655
b1_cB : 3.2117057
epsilon : 0.0008172965
b1_c : 2.1497946
b1sigma8 : 1.2667638
alpha : 1.0056546
F_AP : 1.0024539
b1_s : 3.6439955
fsigma8 : 0.46856338
b1 : 2.3616061
# access best-fit values like a dictionary
In [8]: fsat = nlopt_results['fs']
In [9]: print(fsat)
0.14175570928456935
Comparing the Best-fit Model to Data¶
Users can compare the best-fitting model to the data by loading the
results of a fitting run using the pyRSD.rsdfit.FittingDriver
.
We can easily initialize this object by passing the directory where the results
were written to the pyRSD.rsdfit.FittingDriver.from_directory
function.
For the example data downloaded above, we can explore both the data
and theory simulataneously using the included result file
nlopt_result.npz
:
from pyRSD.rsdfit import FittingDriver
# load the model and results into one object
d = FittingDriver.from_directory('pyRSD-example', model_file='pyRSD-example/model.npy', results_file='pyRSD-example/nlopt_result.npz')
# set the fit results
d.set_fit_results()
# the best-fit log probability (likelihood + priors)
print(d.lnprob())
# the best-fit chi2
print(d.chi2())
# the best-fit reduced chi2
print(d.reduced_chi2())
# make a plot of the data vs the theory
d.plot()
show()

In this plot, we show the monopole, quadrupole, and hexadecapole normalized
by the smooth, no-wiggle Eisenstein and Hu
monopole. All of the above steps are identical if we are analyzing \(P(k,\mu)\)
data rather than \(P_\ell(k)\) data. For example, if the periodic-pkmu
example is downloaded, running the function FittingDriver.plot()
using
the included result file nlopt_result.npz
produces the following figure:

This plot shows the best-fit theory and data for 5 wide \(\mu\) bins, normalized by the linear Kaiser \(P(k,\mu)\), using the no-wiggle Eisenstein and Hu linear power spectrum.
RSD
The RSD module deals with computing the theoretical power spectrum predictions, given an input cosmology specified by the user.
- Specifying the Cosmology
- Interfacing with CLASS
- Halo Zel’dovich Perturbation Theory
- Galaxy Power Spectrum
- Quasar Power Spectrum
Specifying the Cosmology¶
The pyRSD.rsd.cosmology
module allows users to specify the desired
cosmological parameters, and it also provides the functionality for users
to interface with the CLASS CMB Boltzmann code.
Overview¶
Users can specify cosmological parameters by creating a new
Cosmology
object or by using one of the builtin
cosmologies (see Available Cosmologies).
When constructing a new Cosmology object, parameter values should be specified as keyword parameters. The parameters that can be specified are:
Parameter | Description |
---|---|
H0 | The Hubble constant at z=0, in km/s/Mpc |
Om0 | The matter density/critical density at z=0 |
Ob0 | The baryon density/critical density at z=0 |
Ode0 | The dark energy density/critical density at z=0 |
w0 | The dark energy equation of state |
Tcmb0 | The temperature of the CMB in K at z=0 |
Neff | The the effective number of neutrino species |
m_nu | The mass of neutrino species in eV |
sigma8 | The the mass variance on the scale of R=8 Mpc/h at z=0, which sets the normalization of the linear power spectrum |
n_s | The the spectral index of the primoridal power spectrum |
flat | if True, automatically set Ode0 such that Ok0 is zero |
Note
The pyRSD.rsd.cosmology.Cosmology
class is nearly identical to the
astropy.cosmology.FLRW
object, with the addition of the n_s
and sigma8
attributes
Examples¶
In [1]: from pyRSD.rsd import cosmology
# initialize a new Cosmology
In [2]: cosmo = cosmology.Cosmology(H0=70, sigma8=0.80, n_s=0.96)
# access parameters as attribute or key entry
In [3]: print(cosmo['sigma8'], cosmo.sigma8)
0.8 0.8
# compute the comoving distance to z = 0.4
In [4]: Dz = cosmo.comoving_distance(0.4)
In [5]: print(Dz)
1547.1248846885328 Mpc
The Cosmology class is read-only; changes to the parameters should be
performed with the clone()
function,
which creates a copy of the class, with any specified changes.
In [6]: new_cosmo = cosmo.clone(sigma8=0.85, Om0=0.27)
# compare sigma8
In [7]: print(cosmo.sigma8, new_cosmo.sigma8)
0.8 0.85
# compare Om0
In [8]: print(cosmo.Om0, new_cosmo.Om0)