Welcome to libsubmit’s documentation!¶
Libsubmit is responsible for managing execution resources with a Local Resource Manager (LRM). For instance, campus clusters and supercomputers generally have schedulers such as Slurm, PBS, and Condor. Clouds, on the other hand, have APIs that allow much more fine-grained composition of an execution environment. An execution provider abstracts these resources and provides a single uniform interface to them.
This module provides the following functionality:
- A standard interface to schedulers
- Support for submitting, monitoring and cancelling jobs
- A modular design, making it simple to add support for new resources
- Support for pushing files from the client side to resources
Quickstart¶
Libsubmit is an adapter to a variety of computational resources such as clouds, campus clusters, and supercomputers. This Python module is designed to expose a simple, uniform interface to a seemingly diverse class of resource schedulers. The library originated in Parsl (Parallel Scripting Library) and is designed to bring dynamic resource management capabilities to it.
Installing¶
Libsubmit is available on PyPI, but first make sure you have Python 3.5+:
$ python3 --version
Installing on Linux¶
Install Libsubmit:
$ python3 -m pip install libsubmit
Libsubmit supports a variety of computational resources via resource-specific libraries. You might only need a subset of these, which can be installed by specifying the resource names:
$ python3 -m pip install libsubmit[<aws>,<azure>,<jetstream>]
For Developers¶
Download Libsubmit:
$ git clone https://github.com/Parsl/libsubmit
Install:
$ cd libsubmit
$ python3 setup.py install
Use Libsubmit!
Requirements¶
Libsubmit requires the following:
- Python 3.5+
- paramiko
- ipyparallel
- boto3 - for AWS
- azure, haikunator - for Azure
- python-novaclient - for Jetstream
For testing:
- nose
- coverage
User guide¶
Overview¶
Under construction. Please refer to the developer documentation as this section is being built.
Configuration¶
The primary way to interact with libsubmit is to instantiate an ExecutionProvider with a configuration data structure and, if the ExecutionProvider requires them, Channel objects.
The configuration data structure expected by an ExecutionProvider, along with the provider-specific options, is described below.
The config structure looks like this:
config = { "poolName" : <string: Name of the pool>,
           "provider" : <string: Name of provider>,
           "scriptDir" : <string: Path to script directory>,
           "minBlocks" : <int: Minimum number of blocks>,
           "maxBlocks" : <int: Maximum number of blocks>,
           "initBlocks" : <int: Initial number of blocks>,
           "block" : { # Specify the shape of the block
               "nodes" : <int: Number of nodes in each block>,
               "taskBlocks" : <int: Number of task blocks in each block>,
               "walltime" : <string: Walltime in HH:MM:SS format for the block>,
               "options" : { # These are provider-specific options
                   "partition" : <string: Name of partition/queue>,
                   "account" : <string: Account id>,
                   "overrides" : <string: String to override and specify options to scheduler>
               }
           }
         }
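As an illustration, the structure above could be filled in as follows. This is a minimal sketch: every value is hypothetical, only the key names come from the structure above, and the two helper functions (walltime_to_seconds and check_config) are illustrative and not part of libsubmit. The ordering check on the block counts is an assumption about what a sensible configuration looks like, not a documented rule.

```python
# All values below are hypothetical placeholders; only the key names
# are taken from the config structure described in this section.
config = {
    "poolName": "demo_pool",
    "provider": "slurm",
    "scriptDir": ".scripts",
    "minBlocks": 0,
    "maxBlocks": 2,
    "initBlocks": 1,
    "block": {
        "nodes": 1,
        "taskBlocks": 1,
        "walltime": "00:30:00",
        "options": {
            "partition": "debug",
            "account": "my_account",
            "overrides": "",
        },
    },
}


def walltime_to_seconds(walltime):
    """Convert an HH:MM:SS walltime string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in walltime.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def check_config(cfg):
    """Run basic sanity checks (illustrative helper, not a libsubmit API)."""
    # Assumption: the block counts should be ordered min <= init <= max.
    if not cfg["minBlocks"] <= cfg["initBlocks"] <= cfg["maxBlocks"]:
        raise ValueError("expected minBlocks <= initBlocks <= maxBlocks")
    if walltime_to_seconds(cfg["block"]["walltime"]) <= 0:
        raise ValueError("walltime must be positive")
    return True
```

Running check_config(config) before handing the dictionary to a provider catches simple mistakes, such as a malformed walltime string, early.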
Reference guide¶
- libsubmit.channels.local.local.LocalChannel: This is not really a channel, since opening a local shell is cheap and happens so infrequently that no persistent channel is needed
- libsubmit.channels.ssh.ssh.SshChannel
- libsubmit.providers.aws.aws.EC2Provider
- libsubmit.providers.azureProvider.azureProvider.AzureProvider
- libsubmit.providers.cobalt.cobalt.Cobalt
- libsubmit.providers.condor.condor.Condor
- libsubmit.providers.googlecloud.googlecloud.GoogleCloud
- libsubmit.providers.gridEngine.gridEngine.GridEngine
- libsubmit.providers.jetstream.jetstream.Jetstream
- libsubmit.providers.local.local.Local
- libsubmit.providers.sge.sge.GridEngine
- libsubmit.providers.slurm.slurm.Slurm
- libsubmit.providers.torque.torque.Torque
- libsubmit.providers.provider_base.ExecutionProvider: Defines the strict interface for all execution providers
Changelog¶
Libsubmit 0.4.1¶
Released June 18th, 2018. This release folds in massive contributions from @annawoodard.
New functionality¶
- Several code cleanups, doc improvements, and consistent naming
- All providers have the initialization and actual start of resources decoupled.
Developer documentation¶
Libsubmit¶
A uniform interface to a diverse and multi-lingual set of computational resources.
libsubmit.set_stream_logger(name='libsubmit', level=10, format_string=None)¶
Add a stream log handler.
Parameters:
- name – Set the logger name.
- level – Set to logging.DEBUG by default.
- format_string – Set to None by default.
Returns:
- None
libsubmit.set_file_logger(filename, name='libsubmit', level=10, format_string=None)¶
Add a file log handler.
Parameters:
- filename – Name of the file to write logs to
- name – Logger name
- level – Set the logging level.
- format_string – Set the format string
Returns:
- None
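To make the behavior of the two logger helpers concrete, here is a minimal sketch of equivalent functions built only on the standard logging module. It is not libsubmit's implementation: the names make_stream_logger and make_file_logger and the default format string are assumptions chosen for illustration, and unlike the documented functions (which return None) these return the logger for convenience.

```python
import logging
import sys

# Assumed default format; the format libsubmit actually uses may differ.
DEFAULT_FORMAT = "%(asctime)s %(name)s [%(levelname)s] %(message)s"


def make_stream_logger(name="libsubmit", level=logging.DEBUG, format_string=None):
    """Attach a stream (stderr) handler to the named logger."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    handler = logging.StreamHandler(sys.stderr)
    handler.setLevel(level)
    handler.setFormatter(logging.Formatter(format_string or DEFAULT_FORMAT))
    logger.addHandler(handler)
    return logger


def make_file_logger(filename, name="libsubmit", level=logging.DEBUG, format_string=None):
    """Attach a file handler that writes log records to `filename`."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    handler = logging.FileHandler(filename)
    handler.setLevel(level)
    handler.setFormatter(logging.Formatter(format_string or DEFAULT_FORMAT))
    logger.addHandler(handler)
    return logger
```

Note that level=10 in the documented signatures is the numeric value of logging.DEBUG.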
ExecutionProviders¶
An execution provider is an adapter to various types of execution resources. The providers abstract away the interfaces that various systems offer to request, monitor, and cancel compute resources.
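The request, monitor, and cancel operations can be pictured as a small abstract interface. The sketch below is an assumption made for illustration: the class names, the method names submit, status, and cancel, and the status strings are hypothetical and are not taken from this page. The in-memory provider only tracks state in a dictionary and never talks to a real scheduler.

```python
from abc import ABC, abstractmethod


class ProviderSketch(ABC):
    """Hypothetical uniform interface: request, monitor, cancel."""

    @abstractmethod
    def submit(self, command):
        """Request resources to run `command`; return a job id."""

    @abstractmethod
    def status(self, job_ids):
        """Return one status string per job id."""

    @abstractmethod
    def cancel(self, job_ids):
        """Cancel the jobs; return one boolean per job id."""


class InMemoryProvider(ProviderSketch):
    """Toy provider that only records job state in a dictionary."""

    def __init__(self):
        self._jobs = {}
        self._next_id = 0

    def submit(self, command):
        job_id = str(self._next_id)
        self._next_id += 1
        self._jobs[job_id] = "PENDING"
        return job_id

    def status(self, job_ids):
        return [self._jobs.get(job_id, "UNKNOWN") for job_id in job_ids]

    def cancel(self, job_ids):
        results = []
        for job_id in job_ids:
            known = job_id in self._jobs
            if known:
                self._jobs[job_id] = "CANCELLED"
            results.append(known)
        return results
```

Code written against such an interface does not need to know whether the jobs end up on a batch scheduler or a cloud API, which is the point of the abstraction.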
Slurm¶
Cobalt¶
Condor¶
Torque¶
Local¶
AWS¶
Channels¶
For certain resources, such as campus clusters or supercomputers at research laboratories, access may require authentication. For instance, some resources allow access to their job schedulers only from their login nodes, which require you to authenticate through SSH or GSI-SSH, and sometimes even require two-factor authentication. Channels are simple abstractions that enable the ExecutionProvider component to talk to the resource managers of compute facilities. The simplest channel, LocalChannel, simply executes commands locally in a shell, while the SshChannel authenticates you to remote systems.
class libsubmit.channels.channel_base.Channel¶
Define the interface to all channels. Channels are usually called via the execute_wait function. For channels that execute remotely, a push_file function allows you to copy over files.
                      +------------------
                      |
cmd, wtime    ------->|  execute_wait
(ec, stdout, stderr)<-|---+
                      |
cmd, wtime    ------->|  execute_no_wait
(ec, stdout, stderr)<-|---+
                      |
src, dst_dir  ------->|  push_file
   dst_path  <--------|----+
                      |
dst_script_dir <------|  script_dir
                      |
                      +-------------------
execute_no_wait(cmd, walltime, envs={}, *args, **kwargs)¶
Optional. This is infrequently used.
Parameters:
- cmd – Command string to execute over the channel
- walltime – Timeout in seconds
Kwargs:
- envs (dict) – Environment variables to push to the remote side
Returns:
- (exit_code(None), stdout, stderr) (int, io_thing, io_thing)
execute_wait(cmd, walltime, envs={}, *args, **kwargs)¶
Executes the cmd, with a defined walltime.
Parameters:
- cmd – Command string to execute over the channel
- walltime – Timeout in seconds
Kwargs:
- envs (dict) – Environment variables to push to the remote side
Returns:
- (exit_code, stdout, stderr) (int, string, string)
LocalChannel¶
class libsubmit.channels.local.local.LocalChannel(userhome='.', envs={}, script_dir='./.scripts', **kwargs)¶
This is not really a channel, since opening a local shell is cheap and happens so infrequently that no persistent channel is needed.
__init__(userhome='.', envs={}, script_dir='./.scripts', **kwargs)¶
Initialize the local channel. script_dir is required but set to a default.
Kwargs:
- userhome (string): (default='.') This is provided as a way to override and set a specific userhome
- envs (dict): A dictionary of env variables to be set when launching the shell
- script_dir (string): (default='./.scripts') Directory to place scripts
close()¶
There’s nothing to close here, and this really doesn’t do anything.
Returns:
- False, because it really did not “close” this channel.
execute_no_wait(cmd, walltime, envs={})¶
Synchronously execute a commandline string on the shell.
Parameters:
- cmd – Commandline string to execute
- walltime – Walltime in seconds; this is not really used now.
Returns:
- retcode: Return code from the execution, -1 on fail
- stdout: stdout string
- stderr: stderr string
Raises: None.
execute_wait(cmd, walltime, envs={})¶
Synchronously execute a commandline string on the shell.
Parameters:
- cmd – Commandline string to execute
- walltime – Walltime in seconds; this is not really used now.
Kwargs:
- envs (dict): Dictionary of env variables. This will be used to override the envs set at channel initialization.
Returns:
- retcode: Return code from the execution, -1 on fail
- stdout: stdout string
- stderr: stderr string
Raises: None.
push_file(source, dest_dir)¶
If the source file’s directory is the same as dest_dir, a copy is not necessary and nothing is done. Otherwise a copy is made.
Parameters:
- source – Path to the source file
- dest_dir – Path to the directory to which the file is to be copied
Returns:
- destination_path (String): Absolute path of the destination file
Raises:
- FileCopyException – If file copy failed.
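The behaviors described for the local channel (a command run synchronously in a shell, per-call envs overriding those given at initialization, and push_file skipping the copy when the file already sits in dest_dir) can be sketched with the standard library alone. This is a hedged illustration, not libsubmit's code: the class name LocalChannelSketch is hypothetical, it assumes a POSIX shell, and unlike the documented LocalChannel it applies walltime as a timeout and lets copy errors propagate instead of raising FileCopyException.

```python
import os
import shutil
import subprocess


class LocalChannelSketch:
    """Illustrative stand-in for a local channel (not the libsubmit class)."""

    def __init__(self, envs=None, script_dir="./.scripts"):
        self.envs = dict(envs or {})
        self.script_dir = script_dir

    def execute_wait(self, cmd, walltime, envs=None):
        """Run cmd in a shell and return (retcode, stdout, stderr)."""
        env = dict(os.environ)
        env.update(self.envs)      # envs set at channel initialization
        env.update(envs or {})     # per-call envs take precedence
        try:
            proc = subprocess.run(
                cmd, shell=True, env=env, timeout=walltime,
                stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                universal_newlines=True,
            )
        except subprocess.TimeoutExpired:
            # Mirror the documented "-1 on fail" convention.
            return -1, "", "command timed out"
        return proc.returncode, proc.stdout, proc.stderr

    def push_file(self, source, dest_dir):
        """Copy source into dest_dir unless it is already there."""
        source = os.path.abspath(source)
        dest_dir = os.path.abspath(dest_dir)
        dest_path = os.path.join(dest_dir, os.path.basename(source))
        if os.path.dirname(source) != dest_dir:
            os.makedirs(dest_dir, exist_ok=True)
            shutil.copyfile(source, dest_path)
        return dest_path
```

For example, LocalChannelSketch(envs={"GREETING": "hi"}).execute_wait("echo $GREETING", 10) runs the command with GREETING set in the child shell's environment.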
SshChannel¶
SshILChannel¶
Launchers¶
Launchers are wrappers around user-submitted scripts, applied as the scripts are submitted to a specific execution resource.
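One way to picture a launcher is as a function that takes the user's command and returns the script text that is actually submitted. The sketch below is purely illustrative: the function name wrap_command, its parameters, and the generated shell loop are assumptions and do not reproduce any launcher shipped with libsubmit.

```python
def wrap_command(command, task_blocks):
    """Return shell text that starts `command` task_blocks times and waits.

    Hypothetical launcher: each copy runs in the background and the
    final `wait` blocks until all of them have finished.
    """
    if task_blocks < 1:
        raise ValueError("task_blocks must be at least 1")
    lines = [
        "for i in $(seq 1 {})".format(task_blocks),
        "do",
        "    {} &".format(command),
        "done",
        "wait",
    ]
    return "\n".join(lines)
```

A resource-specific launcher would follow the same shape but emit whatever the target scheduler expects in place of the plain shell loop.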
Packaging¶
Currently packaging is managed by Yadu.
Here are the steps:
# Depending on permissions, all of the following might have to be run as root.
sudo su
# Make sure to have twine installed
pip3 install twine
# Create a source distribution
python3 setup.py sdist
# Create a wheel package, which is a prebuilt package
python3 setup.py bdist_wheel
# Upload the package with twine
# This step will ask for the username and password for the PyPI account.
twine upload dist/*