Organize your computational models transparently¶
Welcome! This package attempts to create an interface for managing the usage of
a climate model. It provides the ModelOrganizer class, which manages
different experiments in projects.
Content¶
Getting started¶
Motivation¶
Developing computational models can be quite a mess in terms of file and configuration management. Therefore, most (climate) models are accompanied by some kind of framework to guide the user through the software. Those frameworks, however, can be very complex, and every model follows its own strategy. This package therefore tries to organize the procedure with a focus on the end user, by providing an outer framework that allows the user to interact with the model from the command line.
How does the package work¶
When doing research, we usually have a specific (funding) project that requests multiple runs (experiments) of our model. The framework of the model-organization package mirrors this basic structure: it works with projects, and each project contains several experiments. The package keeps track of your projects and experiments and saves the configuration in separate configuration (.yml) files.
Configuration files¶
All the paths to the projects are stored in the configuration directory
determined by the name attribute of your model (see the
config.get_configdir() function). By default, it is "~/.config/<name>"
(on Linux and macOS), but you can set it via the <NAME>CONFIGDIR
environment variable, where <NAME> must be replaced by the upper-case
ModelOrganizer.name of your model. Our example below would store the
configuration files in "$HOME/.config/square", and the corresponding
environment variable is SQUARECONFIGDIR.
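The lookup order described above can be sketched as follows. This is only an illustration: resolve_configdir is a hypothetical helper written for this example, not the package's config.get_configdir() function, whose actual behaviour (for instance on other platforms) may differ.

```python
import os
from pathlib import Path


def resolve_configdir(name, environ=None):
    """Return the configuration directory for the model called `name`.

    Hypothetical helper that mimics the lookup described in the text;
    it is NOT the implementation used by the package.
    """
    environ = os.environ if environ is None else environ
    # An explicit <NAME>CONFIGDIR environment variable takes precedence.
    override = environ.get(name.upper() + "CONFIGDIR")
    if override:
        return Path(override)
    # Otherwise fall back to the Linux/macOS default, ~/.config/<name>.
    return Path(os.path.expanduser("~")) / ".config" / name


# Default location for the model named 'square'
print(resolve_configdir("square", environ={}))
# Location overridden through SQUARECONFIGDIR (the path is made up)
print(resolve_configdir("square", environ={"SQUARECONFIGDIR": "/tmp/square-conf"}))
```

Passing the environment as a parameter keeps the sketch easy to try out without changing your real shell environment.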
The above directory contains 3 files:
- globals.yml
The global configuration that should be applicable to all projects
- projects.yml
A mapping from project name to project directory
- experiments.yml
The list of experiments
Additionally, each project directory contains a '.project' directory where
each experiment is represented by one YAML file (e.g. 'sine.yml' in
our example), and the project configuration is stored in
'.project/.project.yml'. To get the specific name of the configuration
file, you can also use the exp_path parameter of the info() method or the
model info command, respectively.
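As a rough illustration of this layout, the two paths can be assembled with pathlib. The helper names below are made up for this sketch and the project directory is a placeholder; in practice, the exp_path parameter of info() is the way to obtain the real file name.

```python
from pathlib import Path


def experiment_config_path(project_dir, experiment):
    """Path of the YAML file describing one experiment (layout as described above)."""
    return Path(project_dir) / ".project" / (experiment + ".yml")


def project_config_path(project_dir):
    """Path of the project-wide configuration file (layout as described above)."""
    return Path(project_dir) / ".project" / ".project.yml"


# 'my_project' is a placeholder directory used only for this example
print(experiment_config_path("my_project", "sine"))
print(project_config_path("my_project"))
```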
Creating your own ModelOrganizer¶
The ModelOrganizer class is designed for subclassing in order to fit
your specific model. See the in-code documentation of the
ModelOrganizer for more information.
Using the command line argument¶
Each method that is listed in the commands attribute
is implemented as a subparser for the main command line utility (see
our Starting example: square). If you subclassed the model organizer, you can use
the main() method to run your model, for example as we did in the example:
if __name__ == '__main__':
    SquareModelOrganizer.main()
You can see the results of this approach in the Command Line API Reference.
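To give an idea of the general technique (one subparser per listed method), here is a minimal, self-contained sketch built on the standard argparse and inspect modules. MiniOrganizer and everything in it are hypothetical and only mirror the idea; the real ModelOrganizer builds its parser in its own way and offers many more options.

```python
import argparse
import inspect


class MiniOrganizer:
    """Hypothetical stand-in used for illustration; NOT the real ModelOrganizer."""

    name = "mini"
    # Names of the methods that are exposed as subcommands
    commands = ["setup", "init"]

    def setup(self, root_dir):
        """Perform the initial setup for the project"""
        return "setup in " + root_dir

    def init(self, description=""):
        """Initialize a new experiment"""
        return "init: " + description

    @classmethod
    def build_parser(cls):
        parser = argparse.ArgumentParser(prog=cls.name)
        subparsers = parser.add_subparsers(dest="command", required=True)
        for command in cls.commands:
            method = getattr(cls, command)
            sub = subparsers.add_parser(command, help=method.__doc__)
            # Skip `self`; required parameters become positional arguments,
            # parameters with a default become --options.
            for param in list(inspect.signature(method).parameters.values())[1:]:
                if param.default is param.empty:
                    sub.add_argument(param.name)
                else:
                    sub.add_argument("--" + param.name, default=param.default)
        return parser

    @classmethod
    def main(cls, args=None):
        namespace = vars(cls.build_parser().parse_args(args))
        command = namespace.pop("command")
        # Dispatch to the method that has the same name as the subcommand
        return getattr(cls(), command)(**namespace)


print(MiniOrganizer.main(["setup", "."]))
print(MiniOrganizer.main(["init", "--description", "Squared Sine"]))
```

Deriving the arguments from the method signature keeps the command line and the Python API in sync, which is the main appeal of this kind of design.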
Parallel usage¶
The usage of configuration files implies some limitations because the
configuration cannot be accessed in parallel. Hence, you should not
set up projects and initialize experiments in parallel. The same
applies to the archiving method. However, the run, postproc and
preproc methods, as we implemented them in our example,
could be used in parallel.
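The following sketch illustrates this division of work with plain Python and no dependency on the package: the step that writes shared configuration is done serially, and only the independent model runs are executed concurrently. The registry dictionary, the run_experiment function and the input values are assumptions made for this example; threads are used merely to keep it short, whereas separate processes or batch jobs would be more typical for a real model.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def compute(x):
    """The toy model of the example: square the input."""
    return x ** 2


# Hypothetical inputs, one per experiment
inputs = {"sine": math.sin(1.0), "cosine": math.cos(1.0), "tangent": math.tan(1.0)}

# Step 1 (serial): register the experiments. This stands in for the
# set-up and initialization steps that modify the shared configuration.
registry = {}
for exp_id in inputs:
    registry[exp_id] = {"id": exp_id, "done": False}


def run_experiment(exp_id):
    """Run one experiment; it only reads its own input and shares no state."""
    return compute(inputs[exp_id])


# Step 2 (parallel): the runs are independent of each other.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = dict(zip(inputs, pool.map(run_experiment, inputs)))

# Step 3 (serial again): record the outcome in the shared registry.
for exp_id in results:
    registry[exp_id]["done"] = True

print(results)
```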
Starting example: square¶
This small example demonstrates the basic methodology of how our package works. We use a very simple model, represented by only one function that squares the input data, and we create a project to investigate trigonometric functions (sine, cosine, etc.).
So let’s define our model function
"""A small example for a computational model
"""
def compute(x):
    return x ** 2
and save it in a file called 'calculate.py'. This function runs
only in Python and does not know anything about where the input comes from or
where the output is saved.
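To see what the model does on its own, independent of any project or experiment, one can apply it to a few sine values. The function is repeated here so that the snippet runs by itself; the chosen angles are arbitrary.

```python
import math


def compute(x):
    """The model function from above: square the input."""
    return x ** 2


# A few arbitrary angles (in radians) and the squared sine at each of them
angles = [0.0, math.pi / 6, math.pi / 2]
squared_sine = [compute(math.sin(angle)) for angle in angles]

for angle, value in zip(angles, squared_sine):
    print("sin^2(%.4f) = %.4f" % (angle, value))
```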
We start from the pure ModelOrganizer to manage our model named square
#!/usr/bin/env python
from model_organization import ModelOrganizer


class SquareModelOrganizer(ModelOrganizer):

    name = 'square'


if __name__ == '__main__':
    SquareModelOrganizer.main()
and save it in the file called 'square.py'. If we run our model from the
command line (for technical reasons we run it here from inside IPython), we
already see several preconfigured commands
In [1]: !./square.py -h
usage: square [-h] [-id str] [-l] [-n] [-v] [-vl str or int] [-nm] [-E]
{setup,init,set-value,get-value,del-value,info,unarchive,configure,archive,remove}
...
The main function for parsing global arguments
positional arguments:
{setup,init,set-value,get-value,del-value,info,unarchive,configure,archive,remove}
setup Perform the initial setup for the project
init Initialize a new experiment
set-value Set a value in the configuration
get-value Get one or more values in the configuration
del-value Delete a value in the configuration
info Print information on the experiments
unarchive Extract archived experiments
configure Configure the project and experiments
archive Archive one or more experiments or a project instance
remove Delete an existing experiment and/or projectname
optional arguments:
-h, --help show this help message and exit
-id str, --experiment str
experiment: str
The id of the experiment to use. If the `init` argument is called, the `new` argument is automatically set. Otherwise, if not specified differently, the last created experiment is used.
-l, --last If True, the last experiment is used
-n, --new If True, a new experiment is created
-v, --verbose Increase the verbosity level to DEBUG. See also `verbosity_level`
for a more specific determination of the verbosity
-vl str or int, --verbosity-level str or int
The verbosity level to use. Either one of ``'DEBUG', 'INFO',
'WARNING', 'ERROR'`` or the corresponding integer (see pythons
logging module)
-nm, --no-modification
If True/set, no modifications in the configuration files will be
done
-E, --match If True/set, interprete `experiment` as a regular expression
(regex) und use the matching experiment
So to set up our new project, we use the setup() method
In [2]: !./square.py setup . -p trigo
INFO:square.trigo:Initializing project trigo
And we initialize one experiment for the sine, one for the cosine, and one for the tangent functions
In [3]: !./square.py -id sine init -d "Squared Sine"
INFO:square.sine:Initializing experiment sine of project trigo
In [4]: !./square.py -id cosine init -d "Squared Cosine"