Welcome to RH 1.5D’s documentation!¶
Contents:
Introduction¶
What is RH 1.5D¶
RH 1.5D is a modified version of the RH radiative transfer code that runs through a 3D/2D/1D atmosphere column-by-column, in the same fashion of rhf1d. It was developed as a way to efficiently run RH column-by-column (1.5D) over large atmospheres, simplifying the output and being able to run in supercomputers. It is MPI-parallel and scales well to at least 10,000 processes.
While initially developed as another geometry on the RH tree, the requirements of the parallel version required changes to the RH core tree. Thus, it is more than a wrapper over RH and is distributed with a complete modified version of RH (see below for differences from RH distributed by Han Uitenbroek).
Acknowledging RH 1.5D¶
If you use RH 1.5D for your work, we would appreciate if you would acknowledge it appropriately in a publication, presentation, poster, or talk. For a publication, this is best done by citing the RH 1.5D (Pereira & Uitenbroek 2015) and RH (Uitenbroek 2001) papers. In addition, if the journal allows it please include a link to its Github repository.
What this manual covers¶
Because there is much in common with RH, this manual should be seen as an incremental documentation of the 1.5D parallel side. This manual focuses on what is different from RH. Users should refer to the RH documentation by Han Uitenbroek for more detailed information on RH.
Comparison with RH¶
RH 1.5D inherits most of the code base from RH, but some features are new. The code is organised in the same was as the RH source tree, with the routines specific to the 1.5D version residing on a subdirectory rh15d
of the rh
source tree. In this way, it works similarly to the subdirectories in rh
for different geometries. The compilation and linking proceeds as for the other geometries: first the general librh.a
library should be compiled, and then the code in rh15d
will be compiled and linked to it. The run directory is very similar to that of a given geometry in RH: most of the *.input
files are used in the same way.
The lists below show a comparison between RH 1.5D and RH for a 1D plane-parallel geometry:
Commonalities between RH 1.5D and RH¶
- Core RH library
- Structure and location of
*.input
files (some new options available) - Wavetable, Atom, Molecule, line list, and any other input files except the atmospheres
- Directory-level compilation and general structure of run directory (new subdirectories needed)
What is new in RH 1.5D¶
- MPI-parallelism with dynamic load balancing and efficient I/O
- Different format of atmosphere files
- Different formats of output files
- Select which quantities should be in output
- New options for
keyword.input
andatoms.input
- Hybrid angle-dependent PRD mode
- PRD-switching
- Option for using escape probability approximation as initial solution
- Exclude from the calculations the higher parts of the atmosphere, above a user-defined temperature threshold
- Depth scale optimisation
- Option for cubic Hermite interpolation of source function in formal solver
- Option for cubic Bézier and DELO Bézier interpolation of source function in formal solvers, both for polarised and unpolarised light
- Support for more types of collisional excitations
- Easy re-run of non-converged columns
- Option for keeping background opacities in memory and not in disk
- New analysis suite in Python
What is not supported in RH 1.5D¶
- Currently only a fraction of the RH output is written to disk (to save space), but more output can be added
- The old IDL analysis suite does not currently support the new output format
solveray
is no longer usedbackgrcontr
no longer works with the new output- Any other geometry aside from 1.5D
- Continuing old run by reading populations and output files
- Full Stokes NLTE iterations and background polarisation (might work with little effort, but has not been tested)
- Thread parallelisation
Installation¶
Getting the code¶
The code is available on a git repository, hosted on github: https://github.com/ita-solar/rh. If you don’t have git installed and just want to get started, the easiest way is to download a zip file with the latest revision: https://github.com/ita-solar/rh/archive/master.zip. If you have git installed and would like to be up-to-date with the repository, you can do a git clone:
git clone https://github.com/ita-solar/rh.git
or using SSH (only for contributors):
git clone git@github.com:ita-solar/rh.git
Whether you unpack the zip file or do one of the above it will create a directory called rh
in your current path. This directory will have the following subdirectories:
Directory | Contents |
---|---|
rh | Main RH source |
rh/Atmos | Used to keep atmosphere files |
rh/Atoms_example | Used to keep atom files, line and wavelength lists (example) |
rh/idl | Old RH IDL routines, not used |
rh/Molecules | Used to keep molecule files |
rh/python | Utility Python programs |
rh/rh15d_mpi | Source files for RH 1.5D |
rh/rhf1d | Source files for 1D geometry, deprecated, do not use |
rh/rhsc2d | Source files for 2D geometry, deprecated, do not use |
rh/rhsc3d | Source files for 3D geometry, deprecated, do not use |
rh/rhsphere | Source files for spherical geometry, deprecated, do not use |
rh/tools | Associate C programs for RH, not tested. |
Warning
The source code directories for different geometries (rhf1d, rhsc2d, rhsc3d, rhsphere) are still in the code tree, but they are deprecated and will be removed soon. With the latest changes related to rh15d, they are not guaranteed to work or even run. Do not use.
Dependencies¶
HDF5¶
RH 1.5D makes use of the HDF5 library to read the atmosphere files and write the output. It is not possible to run the code without this library. RH 1.5D requires HDF5 version 1.8.1 or newer (versions from branch 1.10.x do not currently work).
Note
RH 1.5D previously made use of the netCDF4 library for its output (which in turn also required HDF5). The latest changes mean RH 1.5D needs only HDF5. Because netCDF4 files are also HDF5 files, the output is still readable in the same way as before and input files in netCDF version 4 format can still be read in the same way by RH 1.5D. If you used input atmospheres in netCDF version 3 format, then these will have to be converted to HDF5. It is recommended that new atmosphere files be created in HDF5 only.
Because HDF5 is commonly used in high-performance computing, many supercomputers already have them available. In Fram, they can be loaded as (also loading the intel compilers):
module load HDF5/1.8.18-intel-2017a intel/2017a
in Pleiades:
module load hdf5/1.8.7/gcc/mpt
in Hexagon:
module load cray-hdf5-parallel
and at ITA’s Linux system:
module load hdf5/Intel/1.8.19
MPI¶
RH 1.5D is parallelised via MPI, therefore an MPI library is necessary even if running with only one CPU. These are readily available in supercomputers and clusters, but not always in individual workstations. In such cases, users will have to manually install both MPI and HDF5 libraries (HDF5 should be compiled with the MPI library so that the parallel I/O module works).
If using Linux or MacOS, the easiest way to install both MPI and HDF5 with parallel support is via the Anaconda Python distribution. Once you install Anaconda (version 3.6 or above), you can install the hdf5-parallel
package, which installs both MPI and HDF5 and has been tested with RH 1.5D:
conda install -c spectraldns hdf5-parallel
Warning
If you obtain pre-compiled binaries or packages for HDF5, you need to make sure they have parallel support enabled. Most available packages do not have parallel support. This includes typical packages from Linux distributions and the hdf5
Anaconda package (but not the above hdf5-parallel
package).
Compilation¶
Compilation of RH 1.5D consists of two steps:
- Compilation of the geometry-independent main libraries (
librh.a
andlibrh_f90.a
) - Compilation of the
rh15d_mpi
tree and main binaries
RH 1.5D has been compiled in a variety of architectures and compilers, including gcc, the Intel compilers, and clang. As for MPI implementations, it has been tested with SGI’s mpt, OpenMPI, mpich, mvapich, and Intel’s MPI.
Main libraries¶
First, one needs to set the environment variables OS
and CPU
:
export CPU=`uname -m`
export OS=`uname -s`
Note
All the shell commands given in this manual are for bash, so you’ll have to modify them if using another shell.
The main Makefile will then look for an architecture-dependent Makefile in rh/makefile.$CPU.$OS
. If a Makefile for your system combination does not exist, you’ll have to create a new Makefile and adapt it to your configuration. You need to make sure that the architecture-dependent Makefile reflects your system’s configuration (i.e., compiler names and flags).
After setting the correct compiler, just build the main libraries with make
on the rh
directory. If successful, the compilation will produce the two library files librh.a
and librh_f90.a
.
Program binaries¶
Go to the rh/rh15d_mpi/
directory and update the Makefile with your compiler and flags. You will need to set CC
to the MPI alias (e.g. mpicc
). The path to the HDF5 library needs to be explicitly set in HDF5_DIR
. In Fram and Hexagon this is already stored in the HDF5_DIR
environment variable.
If you are using HDF5 parallel from Anaconda, you need to set HDF5_DIR
to your Anaconda directory plus include/
(e.g. ~/anaconda/include
).
If your version of HDF5 was not built as a shared binary, you need to link HDF5 and other used libraries directly. Set the LDFLAGS
accordingly, and update the LIBS
variable to contain all the other libraries. For Pleiades, make sure your Makefile contains the following:
OTHER_LIBRARY_DIR = /path/to/library
HDF5_DIR = /path/to/hdf5
LDFLAGS = -L../ -L$(OTHER_LIBRARY_DIR)/lib/ -L$(HDF5_DIR)/lib/
LIBS = -lrh -lrh_f90 $(ARCHLIBS) -lhdf5_hl -lhdf5 -lz -lm
Once the Makefile is updated, compilation is achieved with make
. The following executables will be created:
File | Description |
---|---|
rh15d_ray_pool | Main RH 1.5D binary, uses a job pool (see Binaries and execution) |
rh15d_ray | Alternative RH 1.5D binary. Deprecated. This program runs much slower than rh15d_ray_pool and is kept for backwards compatibility only. Will be removed in a future revision. |
rh15d_lteray | Special binary for running in LTE |
Run directory¶
Once compiled, you can copy or link the binaries to a run directory. This directory will contain all the necessary input files, and it should contain two subdirectories called output
and scratch
.
Warning
If the subdirectories output
and scratch
do not exist in the directory where the code is run, the code will crash with an obscure error message.
Input files¶
Configuration files¶
The configuration of an RH 1.5D is made primarily through several text files that reside in the run directory. The main file is keyword.input
. All the other files and their locations are specified in keyword.input
. The source tree contains a sample rh/rh15d/run/
directory with the following typically used configuration files:
File | Description |
---|---|
atoms.input | Lists the atom files to be used |
keyword.input | Main configuration file |
kurucz.input | Contains list of line lists to be used |
molecules.input | Lists the molecule files to be used |
ray.input | Selects output mu and wavelengths for detailed output |
The kurucz.input
and the molecules.input
files are identical under RH, so we refer to the RH manual for more information about them. Most of the other files behave very similarly in RH and RH 1.5D, with a few differences.
The atoms.input
file is identical in RH, but it can also have a new starting solution, ESCAPE_PROBABILITY
.
The keyword.input
file functions in very much the same manner under RH and RH 1.5D. The main difference is that there are new options for the 1.5D version, and some options should not be used.
The new keyword.input
options for the 1.5D version are:
Name | Default value | Description |
---|---|---|
SNAPSHOT |
0 |
Snapshot index from the atmosphere file. |
X_START |
0 |
Starting column in the x direction. If < 0, will be set to 0. |
X_END |
-1 |
Ending column in the x direction. If <= 0, will be set to NX in the
atmosphere file. |
X_STEP |
1 |
How many columns to sample in the y direction. If < 1, will be set to 1. |
Y_START |
0 |
Starting column in the y direction. If < 0, will be set to 0. |
Y_END |
-1 |
Ending column in the y direction. If <= 0, will be set to NY in the
atmosphere file. |
Y_STEP |
1 |
How many columns to sample in the y direction. If < 1, will be set to 1. |
15D_WRITE_POPS |
FALSE |
If TRUE , will write the level populations (including LTE) for each active
atom to output_aux.hdf5 . |
15D_WRITE_RRATES |
FALSE |
If TRUE , will write the radiative rates (lines and continua) for each
active atom to output_aux.hdf5 . |
15D_WRITE_TAU1 |
FALSE |
If TRUE , will write the height of tau=1 to output_ray.hdf5 , for all
wavelengths (this takes up as much space as the intensity). |
15D_RERUN |
FALSE |
If TRUE , will rerun for non-converged columns. |
15D_DEPTH_ZCUT |
TRUE |
If TRUE , will perform a cut in z for points above a threshold temperature |
15D_TMAX_CUT |
-1 |
Threshold temperature (in K) over which the above depth cut ids made. If < 0, no temperature cut will be made. |
15D_DEPTH_REFINE |
FALSE |
If TRUE , will perform an optimisation of the depth scale,
based on optical depth, density and temperature gradients. |
BACKGR_IN_MEM |
FALSE |
If TRUE , will keep background opacity coefficients in memory instead of
scratch files on disk. |
BARKLEM_DATA_DIR |
../../Atoms |
Directory where the Barklem_*data.dat data files are saved. |
COLLRAD_SWITCH |
0.0 |
Defines if collisional radiative switching is on.
If < 0, switching parameter is constant (and equal to COLLRAD_SWITCH_INI ).
If = 0, no collisional radiative switching.
If > 0, collisional radiative switching decreases by
COLLRAD_SWITCH per log decade, starting with COLLRAD_SWITCH_INI . |
COLLRAD_SWITCH_INIT |
1.0 |
Initial increment for collisional-radiative switching |
LIMIT_MEMORY |
FALSE |
If TRUE , will not keep several large arrays in memory but rather save them
to scratch files. Not recommended unless memory usage is critical. |
N_PESC_ITER |
3 |
Number of escape probability iterations, if any atoms have it as initial solution. |
PRD_SWITCH |
0.0 |
If > 0, the PRD effects will be added gradually, converging
to the full PRD solution in 1/sqrt(PRD_SWITCH) iterations. |
PRDH_LIMIT_MEM |
FALSE |
If TRUE and using PRD_ANGLE_APPROX , will not keep in memory quantities
necessary to calculate the current PRD weights, but rather calculate them
again. Will affect the performance, so should be used only when necessary. |
S_INTERPOLATION |
LINEAR |
Type of source function interpolation to use in formal solver. Can be
LINEAR , BEZIER , BEZIER3 or CUBIC_HERMITE . |
S_INTERPOLATION_STOKES |
DELO_PARABOLIC |
Type of source function interpolation to use in formal solver for polarised
cases. Can be DELO_PARABOLIC or DELO_BEZIER3 , see [1] . |
VTURB_MULTIPLIER |
1.0 |
Atmospheric vturb will be multiplied by this value |
VTURB_ADD |
0.0 |
Value to be added to atmospheric vturb |
[1] | de la Cruz Rodríguez, J.; Piskunov, N. 2013, ApJ, 764, 33, ADS link. |
The X_START
, X_END
, and X_STEP
keywords (and the equivalent for the y direction) define which columns of the atmosphere file are going to be run. They can be used to calculate only a specific region. RH 1.5D chooses the columns to calculate using the (start, end, step)
parameters as in the range()
function in Python: the result is [start, start + step, start + 2 * step, ...]
. The last element is the largest start + i * step
less than end
. This means that the numbers given by X_END
and Y_END
are not inclusive (e.g. if nx = 50
and X_END = 49
, the column with the index 49 will not be calculated). One must set X_END = nx
to calculate all the columns.
The following options have a different meaning under RH 1.5D:
Name | Default value | Description |
---|---|---|
PRD_ANGLE_DEP |
PRD_ANGLE_INDEP |
This keyword is no longer boolean. To accommodate for new options, it
now takes the values PRD_ANGLE_INDEP for angle-independent PRD,
PRD_ANGLE_DEP for angle-dependent PRD, and PRD_ANGLE_APPROX for
the approximate angle-dependent scheme of Leenaarts et al. (2012) [2] . |
BACKGROUND_FILE |
This keyword is no longer the name of the background file, but the prefix of
the background files. There will be one file per process, and the filenames are
this prefix plus _i.dat , where i is the process number. |
|
STOKES_INPUT |
This option is not used in RH 1.5D because the magnetic fields are now written
to the atmosphere file. However, it must be set to any string if one is
using any STOKES_MODE other than NO_STOKES (RH won’t read B otherwise). |
[2] | Leenaarts, J., Pereira, T. M. D., & Uitenbroek, H. 2012, A&A, 543, A109, ADS link. |
And the following options are valid for RH but may not work with RH 1.5D:
Name | Default value | Description |
---|---|---|
LIMIT_MEMORY |
FALSE |
This option has not been tested and may not work well with RH 1.5D. |
PRINT_CPU |
FALSE |
This option does not work with RH 1.5D and should always be FALSE . |
N_THREADS |
0 |
Thread parallelism will not work with RH 1.5D.
This option should always be 0 or 1 . |
The ray.input
has the same structure in RH1D and RH 1.5D. In RH it is used as input for the solveray
program, but in RH 1.5D it is used for the main program. It should contain the following:
1.00
Nsource
The first line is the mu
angle for the output ray, and it should always be 1.00. The second line is Nsource
, the number of wavelengths for which detailed output (typically source function, opacity, and emissivities) will be written. If Nsource > 0
, it should be followed in the same line by the indices of the wavelengths (e.g. 0 2 10 20
).
Atom and molecule files¶
The atom and molecule files have the same format as in RH. In the rh/Atoms
and rh/Molecules
directories there are a few sample files. They are read by the procedures in readatom.c
and readmolecule.c
. The atom files have the following basic structure:
Input | Format |
---|---|
ID |
(A2). Two-character atom identifier. |
Nlevel Nline Ncont Nfixed |
(4I). Number of levels, lines, continua, and fixed radiation temperature transitions. |
level_entries |
Nlevel * (2F, A20, I) |
line entries |
Nline * (2I, F, A, I, A, 2F, A, 6F) |
continuum_entries |
Ncont * (I, I, F, I, A, F) |
fixed_entries |
Ncont * (2I, 2F, A) |
Atmosphere files¶
The atmosphere files for RH 1.5D are a significant departure from RH. They are written in the flexible and self-describing HDF5 format. They can be written with any version, except the 1.10.x development branch.
The atmosphere files contain all the atmospheric variables necessary for RH 1.5D, and they may contain one or more simulation snapshots. The basic dimensions of the file are:
nt |
Number of snapshots. |
nx |
Number of x points |
ny |
Number of y points. |
nz |
Number of depth points. |
nhydr |
Number of hydrogen levels. |
While strictly 3D atmosphere files, 2D and 1D snapshots can also be used provided that one or both of nx
and ny
are equal to 1.
The atmosphere file can contain the following variables:
Name | Dimensions | Units | Notes |
---|---|---|---|
B_x |
(nt, nx, ny, nz) |
T | Magnetic field x component. Optional |
B_y |
(nt, nx, ny, nz) |
T | Magnetic field y component. Optional |
B_z |
(nt, nx, ny, nz) |
T | Magnetic field z component. Optional |
electron_density |
(nt, nx, ny, nz) |
m-3 | Optional. |
hydrogen_populations |
(nt, nhydr, nx, ny, nz) |
m-3 | nhydr must correspond to the number
of levels in the hydrogen atom used. If
nhydr=1 , this variable should contain
the total number of hydrogen atoms
(in all levels), and LTE populations will
be calculated. |
snapshot_number |
(nt) |
None | The snapshot number is an array of integers to identify each snapshot in the output files. |
temperature |
(nt, nx, ny, nz) |
K | |
velocity_z |
(nt, nx, ny, nz) |
m s-1 | Vertical component of velocity. Positive velocity is upflow. |
velocity_turbulent |
(nt, nx, ny, nz) |
m s-1 | Turbulent velocity (microturbulence). |
z |
(nt, nx, ny, nz) or
(nt, nz) |
m | Height grid. Can vary with column and
snapshot. First nz index is top of
the atmosphere (closest to observer). |
Any other variable in the file will not be used. In addition, the atmosphere file must have a global attribute called has_B
. This attribute should be 1 when the magnetic field variables are present, and 0 otherwise. Also recommended, but optional, is a global attribute called description
with a brief description of the atmosphere file (e.g. how and from they were generated).
Note
Variables in the atmosphere files can be compressed (zlib or szip), but compression is not recommended for performance reasons.
As HDF5 files, the contents of the atmosphere files can be examined with the h5dump
utility. To see a summary of what’s inside a given file, one can do:
h5dump -H atmosfile
Here is the output of the above for a sample file:
HDF5 "example.hdf5" {
GROUP "/" {
ATTRIBUTE "boundary_bottom" {
DATATYPE H5T_STD_I64LE
DATASPACE SCALAR
}
ATTRIBUTE "boundary_top" {
DATATYPE H5T_STD_I64LE
DATASPACE SCALAR
}
ATTRIBUTE "description" {
DATATYPE H5T_STRING {
STRSIZE H5T_VARIABLE;
STRPAD H5T_STR_NULLTERM;
CSET H5T_CSET_UTF8;
CTYPE H5T_C_S1;
}
DATASPACE SCALAR
}
ATTRIBUTE "has_B" {
DATATYPE H5T_STD_I64LE
DATASPACE SCALAR
}
ATTRIBUTE "nhydr" {
DATATYPE H5T_STD_I64LE
DATASPACE SCALAR
}
ATTRIBUTE "nx" {
DATATYPE H5T_STD_I64LE
DATASPACE SCALAR
}
ATTRIBUTE "ny" {
DATATYPE H5T_STD_I64LE
DATASPACE SCALAR
}
ATTRIBUTE "nz" {
DATATYPE H5T_STD_I64LE
DATASPACE SCALAR
}
DATASET "electron_density" {
DATATYPE H5T_IEEE_F64LE
DATASPACE SIMPLE { ( 1, 512, 512, 425 ) / ( H5S_UNLIMITED, 512, 512, 425 ) }
}
DATASET "hydrogen_populations" {
DATATYPE H5T_IEEE_F32LE
DATASPACE SIMPLE { ( 1, 6, 512, 512, 425 ) / ( H5S_UNLIMITED, 6, 512, 512, 425 ) }
}
DATASET "snapshot_number" {
DATATYPE H5T_STD_I32LE
DATASPACE SIMPLE { ( 1 ) / ( 1 ) }
}
DATASET "temperature" {
DATATYPE H5T_IEEE_F32LE
DATASPACE SIMPLE { ( 1, 512, 512, 425 ) / ( H5S_UNLIMITED, 512, 512, 425 ) }
}
DATASET "velocity_z" {
DATATYPE H5T_IEEE_F32LE
DATASPACE SIMPLE { ( 1, 512, 512, 425 ) / ( H5S_UNLIMITED, 512, 512, 425 ) }
}
DATASET "x" {
DATATYPE H5T_IEEE_F32LE
DATASPACE SIMPLE { ( 512 ) / ( 512 ) }
}
DATASET "y" {
DATATYPE H5T_IEEE_F32LE
DATASPACE SIMPLE { ( 512 ) / ( 512 ) }
}
DATASET "z" {
DATATYPE H5T_IEEE_F32LE
DATASPACE SIMPLE { ( 1, 425 ) / ( H5S_UNLIMITED, 425 ) }
}
}
}
All the floating point variables can be either double or single precision.
Line lists and wavelength files¶
Other auxiliary files that can be used are line lists files and wavelength files.
The line list files are used to include additional lines not included in the different atoms. These lines will be treated in LTE. The line lists are specified in the kurucz.input
file (one per line), and have the Kurucz line list format (link).
Just adding new transitions doesn’t mean that they will be included in the synthetic spectra. The extra lines will only be included in the existing wavelength grid, which depends on the active atoms used. The calculation of additional wavelengths can be forced by using a wavelength file. This file is specified in keyword.input
using the keyword WAVETABLE
. The format is a binary XDR file. Its contents are, in order: the number of new wavelengths (1 XDR int), vacuum wavelength values (XDR doubles).
Running the code¶
Binaries and execution¶
Compilation should produce three executables: rh15d_ray_pool
, rh15d_ray
, and rh15d_lteray
. The latter is a special case for running only in LTE. The other two are the main programs. They represent two different modes of job distribution: normal and pool.
In the pool mode there is a process that works as overlord: its function is to distribute the work to other processes. The other processes (drones) ask the overlord for a work unit. When they finish their unit, they go back and ask for more, until all tasks are completed. Because of race conditions and because different columns will run at different speeds, it is not possible to know which columns a given process will run beforehand. Due to the overlord, rh15d_ray_pool
needs to run with two or more processes. The advantage of the pool mode is that the dynamic load allocation ensures the most efficient use of the resources. With the normal mode it may happen that some processors will work on columns that take longer to converge (especially as they are adjacent), and in the end the execution will have to wait for the process that takes longer. In some cases (especially with PRD) the pool mode can be 2-3 times faster than the normal mode. When one runs with a large number of processes (> 2000) and each column takes little time to calculate, the pool mode can suffer from communication bottlenecks and may be slower because a single overlord cannot distribute the tasks fast enough. The only disadvantage of the pool mode (so far) is that not all output is currently supported with this mode.
Warning
The normal mode is deprecated and will be removed in a later revision. Use only for single processor runs or if you know what you’re doing!
In the normal mode the jobs (all the atmosphere columns for which one wants to calculate) are divided by the number of processes at the start of the execution. There is no communication between processes, and each process knows from the start all the columns it is going to run. These columns are adjacent. If the number of columns is not a multiple of the number of processes, there will be some processes with larger workloads. There is no minimum number of processes to run, and rh15d_ray
can also be run in a single process. Regions of an atmosphere can take a lot longer to run than others, and the processes that work on those will take longer to finish. In the normal mode this means that the slowest process will set the overall running time, and therefore in practice it can take more than 10x longer than the pool mode (and is therefore not recommended).
As an MPI program, the binaries should be launched with the appropriate command. Some examples:
mpirun -np N ./rh15d_ray_pool
mpiexec ./rh15d_ray_pool # use in Pleiades
aprun -B ./rh15d_ray # use in Hexagon or other Cray
The run directory¶
Warning
Before running, make sure you have the sub-directories scratch
and output
in the run directory.
The run directory contains the configuration files, the binaries, and the scratch
and output
directories. As the names imply, temporary files will be placed under scratch
and the final output files in output
. No files under scratch
will be used after the run is finished (they are not read for re-runs).
The scratch
directory contains different types of files. Most of them are binary files write by RH 1.5D to save memory. Example files are the background_p*.dat
with the background opacities, files with PRD weights, and the rh_p*.log
log files. Each process creates one of those files, and they will have the suffix _pN.*
, where N
is the process number. The log files have the same format as in RH. The writing of each process’s log file is buffered by line. Because these are updated often, when running with many processes this can be a drag on some systems. Therefore, it is possible to run full buffering (meaning log files are only written when the main program finishes). This option is not exposed in the configuration files, so one needs to change the file parallel.c
in the following part:
/* _IOFBF for full buffering, _IOLBF for line buffering */
setvbuf(mpi.logfile, NULL, _IOLBF, BUFSIZ_MPILOG);
One should replace _IOLBF
by _IOFBF
to change from line buffering to full buffering.
The output
directory will contain the three output files: output_aux.hdf5
, output_indata.hdf5
, and output_ray.hdf5
. See Output file structure for more details on the structure of these files. If doing a re-run, these files must already exist; they will be updated with the new results. Otherwise, if these files are already in output
before the execution, they will be overwritten. At the start of the execution, the output files are written with a special a fill value. This means that the disk space for the full output must be available at the start of the run, and no CPU time will be wasted if at the end of the run there is not enough disk space. The files are usually written every time a process finishes work on a given column. The fill value arrays are overwritten with the data. One advantage of this method is that even if the system crashes or the program stops, it is possible to recover the results already written (and a re-run can be performed for just the missing columns).
All the processes write asynchronously to all the output files. In some cases this can cause contention in the filesystem, with many processes trying to access the same data at the same time. In the worst case scenario, the contention can create bottlenecks which practically stop the execution. Therefore, it is highly recommended that the users tune their filesystem for the typical loads of RH. Many supercomputers make use of Lustre, a parallel filesystem. With Lustre, resources such as files can be divided in different stripes that can be placed in several different machines (OSTs). For running RH with more than 500 processes, one should use as many OSTs as available in the system, and select the lustre stripe size to the typical amount of data written to a file per simulation column. The stripe can set with the lfs setstripe command:
lfs setstripe -s stripe_size -c stripe_count -o stripe_offset directory|filename
It can be run per file (e.g. output_ray.hdf5
), or for the whole output
directory. Using a stripe count of -1
will ensure that the maximum number of OSTs is used. For the typical files RH 1.5D produces, it is usually ok to apply the same Lustre settings to the whole output
directory, and the following settings seem to reasonable:
lfs setstripe -s 4m -c -1 output/
Similarly, the scratch
directory can also benefit from Lustre striping. Because most files there are small, it is recommended to use a stripe count of 1 for scratch
.
Warning
You need a parallel filesystem if you are running RH 1.5D across more than one host. Most supercomputers have parallel filesystems, but if you are running in a smaller cluster this may not be the case. RH 1.5D will always run, but the HDF5 writes will not work and the results will be unreadable. NFS is not a parallel file system.
Logs and messages¶
In addition to the logs per process saved to scratch
, a much smaller log will be printed in stdout
. This log is a smaller summary of what each process is doing. Here is an example of typical messages:
Process 1: --- START task 1, (xi,yi) = ( 0,156)
Process 232: --- START task 1, (xi,yi) = ( 0,159)
Process 36: --- START task 1, (xi,yi) = ( 0,162)
Process 12: --- START task 1, (xi,yi) = ( 0,171)
(...)
Process 12: *** END task 1 iter, iterations = 121, CONVERGED
Process 3: *** END task 1 iter, iterations = 200, NO Convergence
Process 4: *** SKIP task 1 (crashed after 81 iterations)
Process 3: --- START task 2, (xi,yi) = ( 23, 64)
Process 12: --- START task 2, (xi,yi) = ( 23, 65)
Process 4: --- START task 2, (xi,yi) = ( 23, 65)
(...)
*** Job ending. Total 262144 1-D columns: 262142 converged, 1 did not converge, 1 crashed.
*** RH finished gracefully.
In this example one can see the three possible outputs for a single-column calculation: convergence, non-convergence (meaning the target ITER_LIMIT
was not met in N_MAX_ITER
iterations), or a crash (many reasons). If there are singular matrices or other causes for a column to crash, RH 1.5D will skip that column and proceed to the next work unit. Such cases can be re-run with different parameters. In some cases (e.g. inexistent files) it is not possible to prevent a crash, and RH 1.5D will finish non-gracefully.
Reruns and lack of convergence¶
Dynamic atmospheres often have large gradients in temperature, density, velocity, etc. that cause problems when solving the non-LTE problem. This may lead to some (or all) columns not converging, and is dependent on the input atmosphere, model atoms, and run options. In RH terms, non-converged or “crashed”, represent the same problem. In some cases, the iterations diverge strongly (crash), while in others they fail to reach the target limit for convergence in the allocated maximum number of iterations (non-convergence).
The output for atmosphere columns that did not converged or crashed is not saved to disk (a fill value is used instead). The recommended procedure in these cases is to rerun RH with different input options (e.g. less agressive acceleration, a larger number of maximum iterations). A special rerun mode is available to save time calculating again the columns that have already converged. When 15D_RERUN = TRUE
in keyword.input
, RH will read the output and run again only for the columns that did not converge.
The rerun mode requires all the previous output to be under output/
. The user has the possibility of changing options in keyword.input
for the rerun. Not all options can be changed, because this could lead to non-sensical results (e.g. changing the input atmosphere or atom files). Only the following options can be changed in keyword.input
:
15D_RERUN |
15D_DEPTH_CUT |
15D_TMAX_CUT |
15D_DEPTH_REFINE |
N_PESC_ITER |
COLLRAD_SWITCH |
COLLRAD_SWITCH_INIT |
PRD_SWITCH |
NRAYS |
N_MAX_SCATTER |
N_MAX_ITER |
ITER_LIMIT |
NG_DELAY |
NG_ORDER |
NG_PERIOD |
S_INTERPOLATION |
PRD_N_MAX_ITER |
PRD_ITER_LIMIT |
B_STRENGTH_CHAR |
All the other options are locked to the values used for the firs run. Likewise, it is not possible to change atoms.input
, the atom files, or the line list files. When RH 1.5D is first run, nearly all input options (except the input atmosphere and molecule files) are saved into the output, and in a rerun these are read from the output and not from the original files. This also means that a rerun can be performed even if the original files are no longer available.
Helper script¶
There is a Python script called runtools.py
designed to make it easier to run RH 1.5D for large projects. It resides in rh/python/runtools.py
. It requires Python with the numpy and h5py (or netCDF4) modules. It was made to run a given RH 1.5D setup over many simulation snapshots, spanning several atmosphere files. It supports a progressive rerun of a given problem, and allows the of use different keyword.input
parameters for different columns, tackling columns harder to converge.
The first part of runtools.py
should be modified for a users’s need. It typically contains:
atmos_dir = '/mydata_dir'
seq_file = 'RH_SEQUENCE'
outsuff = 'output/output_ray_mysim_CaII_PRD_s%03i.hdf5'
mpicmd = 'mpiexec'
bin = './rh15d_ray_pool'
defkey = 'keyword.save'
log = 'rh_running.log'
tm = 40
rerun = True
rerun_opt = [ {'NG_DELAY': 60, 'NG_PERIOD': 40, '15D_DEPTH_REFINE': 'FALSE',
'15D_ZCUT': 'TRUE', 'N_MAX_ITER': 250, 'PRD_SWITCH': 0.002 },
{'NG_DELAY': 120, 'NG_PERIOD': 100, '15D_DEPTH_REFINE': 'TRUE',
'PRD_SWITCH': 0.001 } ]
The different options are:
Name | Type | Description |
---|---|---|
atmos_dir |
string | Directory where the atmosphere files are kept. |
seq_file |
string | Location of sequence file. This file contains the names of the atmosphere files to be used (one file per line). The script will then run RH 1.5D for every snapshot in every file listed. |
outsuff |
string | Template to write the output_ray.ncdf files. The %03i
format will be replaced with the snapshot number. |
mpicmd |
string | System-dependent command to launch MPI. The script knows that for aprun the -B option should be used. This option also activates system specific routines (e.g. how to kill the run in pleiades). |
bin |
string | RH 1.5D binary to use. |
defkey |
string | Default template for keyword.input . Because the of the
rerun options, keyword.input is overwritten for every rerun.
This file is used as a template it (i.e., most of its options
will be unchanged, unless specified in rerun_opt ). |
log |
string | File where to save the main log. Will be overwritten for each new snapshot. |
tm |
int | Timeout (in minutes) to kill execution of code, if there is no
message written to main log. Used to prevent code from hanging if
there are system issues. After killed, program is relaunched.
If tm = 0 , program will never be killed. |
rerun |
bool | If True , will re-run the program (with different settings) to
achieve convergence if any columns failed. Number of reruns is
given by size of rerun_opt . |
rerun_opt |
list | Options for re-run. This is a list made of dictionaries. Each
dictionary contains the keywords to update keyword.input .
Only the keywords that differ from the defkey file are
necessary. |
Note
Only the first line of the sequence time is read at a time. The script reads the first line, deletes it from the file, and closes the file. It then reads the first line again and continues running, until there are no more lines in the file. This behaviour enables the file to be worked by multiple scripts at the same time, and allows one to dynamically change the task list at any time of the run.
Note
The script also includes a tenaciously persistent wait and relaunch feature designed to avoid corruption if there are system crashes or problems. Besides the tm
timeout, if there is any problem with the execution, the code will wait for some periods and try and relaunch the code. For example, if one of the atmosphere files does not exist, runtools.py
will try three times and then proceed to the next file.
Analysis of output¶
Output file structure¶
The output is written to three files: output_aux.hdf5
, output_indata.hdf5
, and output_ray.hdf5
. This is a big departure from RH, which contained several more output files. In particular, RH 1.5D will not write all the information that was written by RH, due to the sheer size it would take for large 3D simulations. The files are written in the machine-independent, self-describing HDF5 format. The contents of the files are organised in groups, variables, and attributes. Groups and variables can be imagined as directories and files in a filesystem. Inside groups, different variables and dimensions can be organised. The content of the output files can vary: some runs will have more detailed information and therefore more variables written to the files.
HDF5 is an open, platform-independent format, and therefore interfaces to many programming languages are available. The main interface libraries are available in C, C++, Fortran, and Java. But there are also interfaces for Python (h5py), Julia, IDL (from version 6.2), MATLAB , Octave, Perl, and R.
The RH 1.5D output format is standard HDF5 but it is also compatible with NetCDF 4 readers: in most cases one needs to specify only the variable or group name to read the data. The HDF5 and NetCDF libraries provide useful command line tools, which can be used to gather information about the RH 1.5D files or extract data. Additionally, there is a more complete set of tools written in Python to read and analyse these files.
Warning
Because of the limitations of different languages, not all interfaces support all HDF5 features. Some libraries (e.g. IDL or plain h5py in Python) will not detect missing data in arrays (written with the fill value). In such cases, reading variables with missing data (see Output file structure), the data are read with no warning or indication of those that have special fill values.
The structure of the three output files is given below.
Note
When a column fails to converge, output for that column is not written. This means that the variables that depend on (nx, ny)
will have some values missing. HDF5 marks these values as missing data and uses a fill value (of 9.9692e+36). When the 15D_DEPTH_ZCUT
option is used, not all heights will be used in the calculation. The code does not read the skipped parts of the atmosphere. When writing such variables of nz
, only the points that were used are written to the file, and the rest will be marked as missing data (typically the z cut height varies with the column).
output_aux.hdf5
¶
This file contains the level populations and radiative rates. For each active atom or molecule, it contains different groups called atom_XX
or molecule_XX
, where XX
is the identifier for the species (e.g. MG
, CO
).
Note
The atmosphere dimensions on many of the output files are not necessarily the same as in the atmosphere file. They depend on the number of columns calculated, which are a function of X/Y_START/END/STEP
.
It has the following global attributes:
atmosID |
Identifier for the atmosphere file. |
rev_id |
Revision identifier. |
nx |
Number of points in x dimension |
ny |
Number of points in y dimension |
nz |
Number of points in height dimension |
Inside each of the atom/molecule groups, the following dimensions can exist:
Name | Description |
---|---|
x |
Horizontal x dimension. |
y |
Horizontal y dimension. |
height |
Vertical dimension. |
level |
Number of atomic levels. |
line |
Number of atomic transitions |
continuum |
Number of bound-free transitions. |
vibration_level |
Number of molecule vibration levels. |
molecular_line |
Number of molecular lines. |
rotational_state |
Number of rotational states. |
The atom groups can contain the following optional variables:
Name | Dimensions | Description |
---|---|---|
populations |
(level, x, y, height) |
Atomic populations. |
populations_LTE |
(level, x, y, height) |
Atomic LTE populations. |
Rij_line |
(line, x, y, height) |
Radiative rates out of the line. |
Rji_line |
(line, x, y, height) |
Radiative rates into the line. |
Rij_continuum |
(continuum, x, y, height) |
Radiative rates out of the bf transition. |
Rji_continuum |
(continuum, x, y, height) |
Radiative rates into the bf transition. |
The molecule groups can contain the following optional variables:
Name | Dimensions | Description |
---|---|---|
populations |
(vibration_level, x, y, height) |
Molecular populations. |
populations_LTE |
(vibration_level, x, y, height) |
Molecular LTE populations. |
All units are SI.
Note
In older versions it was possible to specify the keyword 15D_WRITE_EXTRA
and get additional output written to output_aux.hdf5
(e.g. a new opacity
group and more rates). While the procedures are still in writeAux_p.c
, the functionality is deprecated because other changes in the code were not compatible with this way of writing the output. It is possible that this functionality will return at a later version.
output_indata.hdf5
¶
This file contains data and metadata related to the run. It contains three groups: input
(mostly settings from keyword.input
), atmos
(atmospheric variables), and mpi
(several variables relating to the run).
It has the following global attributes:
atmosID |
Identifier for the atmosphere file. |
rev_id |
Revision identifier. |
nx |
Number of points in x dimension |
ny |
Number of points in y dimension |
nz |
Number of points in height dimension |
The input
group contains all the input files (except atmosphere and molecular data), and a few attributes that are options from keyword.input
. It contains the following string variables:
Variable | Description |
---|---|
atom_groups |
Array with names of atom groups. Other is the same as atom order in atoms.input |
atoms_file_contents |
Contents of atoms.input saved into a string |
keyword_file_contents |
Contents of keyword.input saved into a string |
kurucz_file contents |
Contents of kurucz.input saved into a string, only if used |
The input
group also has other groups inside. If Kurucz line lists are used, it contains groups called Kurucz_line_file0
, …, Kurucz_line_fileN
, where N-1 is the total number of line list files. The other groups are all atom files (PASSIVE
and ACTIVE
), and they take the names of atom_XX
, where XX
is the element name (for a list of these, see the variable atom_groups
above). Inside all of these groups (Kurucz and atom) there is one variable, called `file_contents
, which contains the file saved intro a string and an attribute, called file_name
, which contains the file name and path. These input options and files are read instead of the original files when doing a rerun.
The atmos
groups contains the dimensions x
, y
, height
, element
and ray
. It also contains the following variables:
Name | Dimensions | Units | Description |
---|---|---|---|
temperature |
(x, y, height) |
K | Temperatures |
velocity_z |
(x, y, height) |
m s-1 | Vertical velocities |
height_scale |
(x, y, height) |
m | Height scale used. Can be different for every column when depth refine is used. |
element_weight |
(element) |
a.m.u. | Atomic weights |
element_abundance |
(element) |
Log of element abundances relative to hydrogen (A(H) = 12). | |
muz |
(ray) |
mu values for each ray. | |
muz |
(ray) |
mu weights for each ray. | |
x |
(x) |
m | Spatial coordinates along x axis. |
y |
(y) |
m | Spatial coordinates along y axis. |
Note
When 15D_DEPTH_REFINE
is used, each column will have a different (optimised) height scale, but they all have the same number of depth points (nz
). In these cases, it is very important to save the height
variable because otherwise one does not know how to relate the height relations of quantities from different columns.
The atmos
group also contains the following attributes:
moving |
Unsigned int, 1 if velocity fields present. |
stokes |
Unsigned int, 1 if stokes output present. |
The mpi
group contains the dimensions x
, y
, and iteration
(maximum number of iterations).
Warning
iteration
is currently hardcoded in the code to a maximum of 1500. If you try to run more than 1500 iterations, there will be an error writing to the output.
The mpi
group also contains several variables:
Name | Dimensions | Description |
---|---|---|
xnum |
(x) |
Indices of x positions calculated. |
xnum |
(x) |
Indices of x positions calculated. |
task_map |
(x, y) |
Maps which process ran which column. |
task_map_number |
(x, y) |
Maps which task number each column was. |
iterations |
(x, y) |
Number of iterations used for each column. |
convergence |
(x, y) |
Indicates if each column converged or not. Possible
values are 1 (converged), 0 (non converged),
or -1 (crashed). |
delta_max |
(x, y) |
Final value for delta_max when iteration finished. |
delta_max_history |
(x, y, iteration) |
Evolution of delta_max |
z_cut |
(x, y) |
Height index of the temperature cut. |
The mpi
group also contains the following attributes: x_start
, x_end
, x_step
, y_start
, y_end
, and y_step
, all of which are options from keyword.input
.
output_ray.hdf5
¶
This file contains the synthetic spectra and can also contain extra information such as opacities and the source function. It contains only the root group. Its dimensions are x
, y
, wavelength
, and eventually wavelength_selected
and height
. The latter two are only present when ray.input
specifies more than 0
wavelengths for detailed output, and it matches Nsource
, the number of those wavelengths entered in ray.input
.
It can contain the following variables:
Name | Dimensions | Units | Description |
---|---|---|---|
wavelength |
(wavelength) |
nm | Wavelength scale. |
intensity |
(x, y, wavelength) |
W m-2 Hz-1 sr-1 | Synthetic disk-centre intensity (Stokes I). |
stokes_Q |
(x, y, wavelength) |
W m-2 Hz-1 sr-1 | Stokes Q. Optional. |
stokes_U |
(x, y, wavelength) |
W m-2 Hz-1 sr-1 | Stokes U. Optional. |
stokes_V |
(x, y, wavelength) |
W m-2 Hz-1 sr-1 | Stokes V. Optional. |
tau_one_height |
(x, y, wavelength) |
m | Height where optical depth reaches unity, for each column. Optional. |
wavelength_selected |
(wavelength_selected) |
Wavelength scale for the detailed output variables below. Optional. | |
wavelength_indices |
(wavelength_selected) |
Indices of wavelengths selected for variables below. Optional. | |
chi |
(x, y, height, wavelength_selected) |
m-1 | Total opacity (line and continuum). Optional. |
source_function |
(x, y, height, wavelength_selected) |
W m-2 Hz-1 sr-1 | Total opacity (line and continuum). Optional. |
Jlambda |
(x, y, height, wavelength_selected) |
W m-2 Hz-1 sr-1 | Angle-averaged radiation field. Optional. |
scattering |
(x, y, height, wavelength_selected) |
Scattering term multiplied by Jlambda. Optional. |
The wavelength
is in nm, air or vacuum units, depending if VACUUM_TO_AIR
is TRUE
or FALSE
(in keyword.input
). chi
is in m-1and tau_one_height
in m.
Despite internally being calculated in double precision, all the output (except the wavelength scale) is written in single precision to save disk space.
The full Stokes vector is only written when in keyword.input
STOKES_MODE
is not NO_STOKES
and the STOKES_INPUT
is set.
The chi
, source_function
, and Jlambda
variables depend on the 3D grid and on wavelength. Therefore, for even moderate grid sizes they can take huge amounts of space. If nx = ny = nz = 512
and wavelength_selected = 200
, each of these variables will need 100Gb of disk space. For a simulation with a cubic grid of 10243 points and saving the full output for 1000 wavelength points, output_ray.hdf5
will occupy a whopping 12Tb per snapshot of disk space. To avoid such problems, these large arrays are only written when ray.input
contains Nsource > 0
, and for the wavelengths selected.
The output_ray.hdf5
file contains the following global attributes:
atmosID |
Identifier for the atmosphere file |
snapshot_number |
Number of simulation snapshot (from atmosphere file) |
rev_id |
Revision identifier |
nx |
Number of points in x dimension |
ny |
Number of points in y dimension |
nz |
Number of points in height dimension |
nwave |
Number of wavelength points |
wavelength_selected |
Number of wavelength points selected for detailed output |
creation_time |
Local time when file was created |
Command line tools¶
Two useful command line tools that come with HDF5 are h5dump and h5repack.
h5dump
can be used with the -H
option to look at the header of a file: see the dimensions, variables, groups. It can also be used to print a text version of any variable in an HDF5 file (e.g. this can be redirected to a text file). When printing a variable (dataset in HDF5) one uses the option -d variable
, and the resulting output is the same as in the -H
mode, with the variable printed at the end. The NetCDF ncdump program offers an even clearer look into the file (e.g. used with the -h
option to print out the header).
The h5repack
program can be used to copy and modify the parameters of HDF5 files. It can convert the files between different format versions, compress variables, etc. Of particular importance is the option for rechunking a file. Chunking in HDF5 files can be used to improve performance by changing the disk structures to improve different read patterns. It is analogous to fully or partially transposing the variables along certain dimensions.
See also
- h5dump guide
- Detailed information about
h5dump
. - h5repack guide
- Detailed information about
h5repack
. - Chunking in HDF5
- Description on the advantages of chunking.
Reading output in Python¶
The helita package has a complete python interface to read the output, input, and visualise files from RH 1.5D. The `helita
tools are described in detail in section helita interface.
If `helita
is not available, the easiest and fastest way to read the RH 1.5D output (or input) files in Python is via the xarray package. xarray
can load the output files as a dataset directly, but in the case of the output_aux.hdf5
and output_indata.hdf5
one needs to specify which group to read (see above).
Here is a quick example on how to read some output from RH 1.5D with xarray
:
>>> import xarray
>>> ray = xarray.open_dataset("output_ray.hdf5")
>>> ray
<xarray.Dataset>
Dimensions: (height: 82, wavelength: 902, wavelength_selected: 10, x: 1, y: 1)
Coordinates:
* wavelength (wavelength) float64 28.0 31.4 32.8 33.7 34.3 35.3 ...
* wavelength_selected (wavelength_selected) float64 85.1 276.4 278.5
* x (x) float64 0.0
* y (y) float64 0.0
Dimensions without coordinates: height
Data variables:
Jlambda (x, y, height, wavelength_selected) float64 ...
chi (x, y, height, wavelength_selected) float64 ...
intensity (x, y, wavelength) float64 ...
scattering (x, y, height, wavelength_selected) float64 ...
source_function (x, y, height, wavelength_selected) float64 ...
wavelength_indices (wavelength_selected) int32 ...
Attributes:
atmosID: FALC_82_5x5.hdf5 (Wed Jan 10 15:29:28 2018)
snapshot_number: 0
rev_id: 001d537 Tiago Pereira 2018-01-10 12:34:07 +0100
nx: 1
ny: 1
nz: 82
nwave: 902
wavelength_selected: 3
creation_time: 2018-01-10T16:16:42+0100
>>> aux = xarray.open_dataset("output_aux.hdf5", group="atom_MG")
>>> aux
<xarray.Dataset>
Dimensions: (continuum: 10, height: 82, level: 11, line: 15, x: 1, y: 1)
Coordinates:
* x (x) float64 0.0
* y (y) float64 0.0
Dimensions without coordinates: continuum, height, level, line
Data variables:
Rij_continuum (continuum, x, y, height) float64 ...
Rij_line (line, x, y, height) float64 ...
Rji_continuum (continuum, x, y, height) float64 ...
Rji_line (line, x, y, height) float64 ...
populations (level, x, y, height) float64 ...
populations_LTE (level, x, y, height) float64 ...
Attributes:
nlevel: 11
nline: 15
ncontinuum: 10
Reading output in IDL¶
There are no specific IDL routines for reading the output from RH 1.5D. However, there is a utility function that can be used to variables from HDF5/netCDF4 files, under the idl/
directory in a file named read_ncdf_var.pro
. The function read_ncdf_var()
can be used to read variables from an HDF5 or netCDF4 file, e.g.:
IDL> data = read_ncdf_var("output_ray.hdf5", "intensity")
IDL> help, data
DATA FLOAT = Array[902, 512, 512]
IDL> pops = read_ncdf_var("output_aux.hdf5", "populations", groupname="atom_CA")
IDL> help, pops
POPS FLOAT = Array[400, 512, 512, 5]
Note
The IDL analysis suite of RH does not work with RH 1.5D.
helita
interface¶
The helita Python package contains several routines to interface with RH 1.5D. Installation instructions are available in its website.
Reading and writing input files¶
Writing atmosphere files¶
The rh15d
module in helita.sim
contains a function to write an input atmosphere in RH 1.5D format, assuming the user already has the required data to write at hand. Its function definition is:
def make_xarray_atmos(outfile, T, vz, z, nH=None, x=None, y=None, Bz=None, By=None,
Bx=None, rho=None, ne=None, vx=None, vy=None, vturb=None,
desc=None, snap=None, boundary=None, append=False):
"""
Creates HDF5 input file for RH 1.5D using xarray.
Parameters
----------
outfile : string
Name of destination. If file exists it will be wiped.
T : n-D array
Temperature in K. Its shape will determine the output
dimensions. Shape is generally (nt, nx, ny, nz), but any
dimensions except nz can be omitted. Therefore the array can
be 1D, 2D, or 3D, 4D but ultimately will always be saved as 4D.
vz : n-D array
Line of sight velocity in m/s. Same shape as T.
z : n-D array
Height in m. Can have same shape as T (different height scale
for each column) or be only 1D (same height for all columns).
nH : n-D array, optional
Hydrogen populations in m^-3. Shape is (nt, nhydr, nx, ny, nz),
where nt, nx, ny can be omitted but must be consistent with
the shape of T. nhydr can be 1 (total number of protons) or
more (level populations). If nH is not given, rho must be given!
ne : n-D array, optional
Electron density in m^-3. Same shape as T.
rho : n-D array, optional
Density in kg m^-3. Same shape as T. Only used if nH is not given.
vx : n-D array, optional
x velocity in m/s. Same shape as T. Not in use by RH 1.5D.
vy : n-D array, optional
y velocity in m/s. Same shape as T. Not in use by RH 1.5D.
vturb : n-D array, optional
Turbulent velocity (Microturbulence) in km/s. Not usually needed
for MHD models, and should only be used when a depth dependent
microturbulence is needed (constant microturbulence can be added
in RH).
Bx : n-D array, optional
Magnetic field in x dimension, in Tesla. Same shape as T.
By : n-D array, optional
Magnetic field in y dimension, in Tesla. Same shape as T.
Bz : n-D array, optional
Magnetic field in z dimension, in Tesla. Same shape as T.
x : 1-D array, optional
Grid distances in m. Same shape as first index of T.
y : 1-D array, optional
Grid distances in m. Same shape as second index of T.
x : 1-D array, optional
Grid distances in m. Same shape as first index of T.
snap : array-like, optional
Snapshot number(s).
desc : string, optional
Description of file
boundary : Tuple, optional
Tuple with [bottom, top] boundary conditions. Options are:
0: Zero, 1: Thermalised, 2: Reflective.
append : boolean, optional
If True, will append to existing file (if any).
"""
Note that while in this routine the writing of the hydrogen populations is optional (they can be derived from the mass density, if available), RH 1.5D does not support this yet.
Note
The variables passed to make_xarray_atmos
must be consistent with the height scale. The first height index must be the top of the atmosphere (closest to observer), and the height scale must be strictly decreasing.
Reading atmosphere files¶
Once written, the input atmosphere files can be read in Python with xarray
, and do not require helita
.
For example:
>>> import xarray
>>> atmos = xarray.open_dataset('my_atmos.hdf5')
>>> atmos
<xarray.Dataset>
Dimensions: (depth: 82, nhydr: 6, snapshot_number: 1, x: 5, y: 5)
Coordinates:
* x (x) int64 0 1 2 3 4
* y (y) int64 0 1 2 3 4
z (snapshot_number, depth) float32 ...
* snapshot_number (snapshot_number) int32 0
Dimensions without coordinates: depth, nhydr
Data variables:
temperature (snapshot_number, x, y, depth) float32 ...
velocity_z (snapshot_number, x, y, depth) float32 ...
electron_density (snapshot_number, x, y, depth) float64 ...
hydrogen_populations (snapshot_number, nhydr, x, y, depth) float32 ...
velocity_turbulent (snapshot_number, x, y, depth) float32 ...
Attributes:
comment: Created with make_xarray_atmos on 2018-01-25 15:28:10.4...
boundary_top: 0
boundary_bottom: 1
has_B: 0
description: FAL C model with 82 depth points replicated to 5x5 colu...
nx: 5
ny: 5
nz: 82
nt: 1
The amount of detail loaded by xarray
will depend how the atmosphere was written. Older atmosphere files may not have as much verbose attributes or labeled coordinates (especially if written by plain HDF5 with no attaching of dimension scales), but they are still valid. Older netCDF atmospheres should work fine with xarray
.
It is also possible to modify the data with xarray
, and saving and updated atmosphere is done via the to_netcdf()
method:
>>> atmos.to_netcdf("newfile.hdf5", format='NETCDF4')
Be sure to use format='NETCDF4'
so that the file is internally HDF5!
Writing wavelength files¶
Another utility function in rh15d.py
is make_wave_file
. This creates an RH wavelength file (to be used with the option WAVETABLE
in keyword.input
) that contains additional wavelengths to be calculated. The function’s usage is documented in its function call:
def make_wave_file(outfile, start=None, end=None, step=None, new_wave=None,
ewave=None, air=True):
"""
Writes RH wave file (in xdr format). All wavelengths should be in nm.
Parameters
----------
start: number
Starting wavelength.
end: number
Ending wavelength (non-inclusive)
step: number
Wavelength separation
outfile: string
Name of file to write.
ewave: 1-D array, optional
Array of existing wavelengths. Program will make discard points
to make sure no step is enforced using these points too.
air: boolean, optional
If true, will at the end convert the wavelengths into vacuum
wavelengths.
"""
You can either supply an array with the wavelengths, or give a range of wavelengths and a fixed spacing, e.g.:
>>> from helita.sim import rh15d
>>> rr = rh15d.Rh15dout()
# this will write wavelenghts from 650 to 650 nm, 0.01 nm spacing:
>>> rh15d.make_wave_file('my.wave', 650, 660, 0.01)
# this will write an existing array "my_waves", if it exists
>>> rh15d.make_wave_file('my.wave', ewave=my_waves)
Reading output files¶
The main class to read the output is called Rh15dout
. It uses xarray
under the hood and populates an object with all the different datasets. It can be initiated in the following way:
>>> from helita.sim import rh15d
>>> rr = rh15d.Rh15dout()
--- Read ./output_aux.hdf5 file.
--- Read ./output_indata.hdf5 file.
--- Read ./output_ray.hdf5 file.
By default, it will look for the three files in the directory specified as main argument (defaults to current directory). Additionally, the method read_group(infile)
can be used to manually load the output_aux.hdf5
or output_indata.hdf5
and the method and read_ray(infile)
can be used to manually load the output_ray.hdf5
file. The variables themselves are not read into memory, but are rather a memmap object (file pointer; only read when needed) that xarray
opens.
After loading the files, the Rh15dout
instance loads each file as an xarray
dataset with the base name of each group (e.g. ray
, atmos
, atom_CA
, mpi
).The ray
attribute contains the same dataset as shown in the xarray
example above.
The attributes of each file are still accessible under the attributes of each object, e.g.:
>>> rr.ray.creation_time
'2018-01-10T16:16:42+0100'
>>> rr.atmos.nrays
5
>>> rr.mpi.nprocesses
2048
With xarray
it is easy to quickly inspect and plot different quantities. For example, to plot the intensity at (x, y) = (0, 0)
:
>>> rr.ray.intensity[0, 0].plot()
Or the intensity at a fixed wavelength:
>>> rr.ray.intensity.sel(wavelength=279.55, method='nearest').plot()
(This only shows a 2D image if you calculated the intensity from a 3D model, otherwise an histogram or line plot is shown.)
Visualisation and notebooks¶
helita
includes a visualisation module, helita.sim.rh15d_vis
, with widgets that are meant to be used inside the Jupyter notebook. To use these, you will need to install not only helita
but also the Matplotlib Jupyter Extension and the IPython widgets for Jupyter. If you have Anaconda, both can be installed with conda:
conda install -c conda-forge ipywidgets ipympl widgetsnbextension
jupyter nbextension enable --py widgetsnbextension
You can also install them with pip
(check their pages for details).
Currently we have the following Jupyter notebooks for visualisation of RH 1.5D output:
To use the above notebooks, you need to have run RH 1.5D and have the output files ready!
You can also explore the input atmosphere files with the Jupyter widget rh15d_vis.InputAtmosphere
in helita
:
>>> from helita.sim import rh15d_vis
>>> rh15d_vis.InputAtmosphere('my_atmos.hdf5');
Known bugs and limitations¶
RH 1.5D is always evolving, and there are likely to be bugs and limitations. Please send all bug reports to tiago.pereira-at-astro.uio.no and they will be dealt with as time permits.
Current issues¶
- Check the github RH issues page for an updated list.
- If the
scratch
oroutput
directories are not present, the code will crash. The error message is not very clear. - In
keyword.input
, if one sets aSNAPSHOT
value to be more than what is in the atmosphere file, the code will stop with an error message:Index exceeds dimension bound
. This error should be made more clear. - The atom files must not end with a blank line, otherwise
gencol
will fail and the program stops. - Line buffered or full buffered log options still require the user to change the source code.
- Depth refinement fails in some cases due to problems caused by cubic interpolation artefacts.
- Using more than 4000 cores and writing full output may cause I/O slowdowns and Lustre contention in some systems.
Planned features¶
- Support for multiple snapshots in the output files.
- pool mode be more flexible, with the possibility of several overlord nodes, useful for running with more than 4000 processes.
- More flexible control of what output is written.
\ Sort by:\ best rated\ newest\ oldest\
\\
Add a comment\ (markup):
\``code``
, \ code blocks:::
and an indented block after blank line