project_manager¶
A utility which makes running the same projects with various configurations as easy as pie.
Installation¶
$ pip install project_manager
Usage¶
Configuration file¶
Create a configuration:
project_source: <url or path> # project you want to run
working_dir: <path> # where everything will run
exec_command: # list of commands that will be executed in each project setup
- <python ..>
result_files: # list of files/folders that will be extracted after successful execution
- <result file>
- <result dir>
base_config: <path> # path to the raw configuration file (typically part of your project)
symlinks: # list of symlinks to include in each project setup
- <path 1>
- <path 2>
config_parameters: # how to modify the configuration
- key: param1
values: [0, 1, 2]
paired:
- key: param2
values: [a, b, c]
- key: [nested, param3]
values: ['a', 'b', 'c']
extra_parameters: # special extra parameters
git_branch: ['master']
repetitions: 1
Commands¶
After setting up the configuration file, you can run all commands.
$ project_manager build
$ project_manager run
$ project_manager gather
In order, these commands do the following:
- Create individual folders for each run and adapt the configuration accordingly
- Run the specified commands per previously created setup
- Retrieve all specified results into a single directory. Each individual files is annotated with its origin.
Example¶
This document provides a brief overview of project_manager
’s basic functionality.
Environment setup¶
[2]:
tree
.
|____dummy_project
| |____my_conf.yaml
| |____run.py
|____config.yaml
[3]:
cat config.yaml
project_source: dummy_project
working_dir: tmp
exec_command:
- python3 run.py
result_files:
- results
base_config: dummy_project/my_conf.yaml
config_parameters:
- key: message
values: [A, B, C]
[4]:
cat dummy_project/my_conf.yaml
message: 'this is important'
[5]:
cat dummy_project/run.py
import os
import yaml
def main():
with open('my_conf.yaml') as fd:
config = yaml.full_load(fd)
os.makedirs('results')
with open('results/data.txt', 'w') as fd:
fd.write(config['message'])
if __name__ == '__main__':
main()
Pipeline execution¶
Setup directory for each configuration¶
[6]:
project_manager build -c config.yaml
Setting up environments: 100%|████████████████████| 3/3 [00:00<00:00, 7.09it/s]
Execute scripts for each configuration¶
[7]:
project_manager run -c config.yaml
0%| | 0/3 [00:00<?, ?it/s]run.message=A
> python3 run.py
33%|███████████████ | 1/3 [00:00<00:00, 5.92it/s]run.message=C
> python3 run.py
67%|██████████████████████████████ | 2/3 [00:00<00:00, 6.61it/s]run.message=B
> python3 run.py
100%|█████████████████████████████████████████████| 3/3 [00:00<00:00, 7.33it/s]
Gather results from each run¶
[8]:
project_manager gather -c config.yaml
run.message=A
> results/data.txt
run.message=C
> results/data.txt
run.message=B
> results/data.txt
Investigate results¶
[9]:
tree tmp/
|____
|____aggregated_results
| |____results
| | |____data.message=B.txt
| | |____data.message=A.txt
| | |____data.message=C.txt
|____run.message=A
| |____my_conf.yaml
| |____results
| | |____data.txt
| |____run.py
|____run.message=C
| |____my_conf.yaml
| |____results
| | |____data.txt
| |____run.py
|____run.message=B
| |____my_conf.yaml
| |____results
| | |____data.txt
| |____run.py
[10]:
cat tmp/aggregated_results/results/*
ABC