Welcome to pyfarm.agent’s documentation!
This package contains PyFarm’s agent and job types which are responsible for the execution of tasks allocated to a host by the master.
Commands
Note
The default values provided are based on the configuration at the time this page was generated. They may not be the same defaults you will see.
Standard Commands
pyfarm-agent
usage: pyfarm-agent [status|start|stop]
positional arguments:
{start,stop,status} individual operations pyfarm-agent can run
start starts the agent
stop stops the agent
status query the 'running' state of the agent
optional arguments:
-h, --help show this help message and exit
Agent Network Service:
Main flags which control the network services running on the agent.
--port PORT The port number which the agent is either running on
or will run on when started. This port is also
reported to the master when an agent starts. [default:
None]
--host HOST The host to communicate with or hostname to present to
the master when starting. Defaults to the fully
qualified hostname.
--agent-api-username AGENT_API_USERNAME
The username required to access or manipulate the
agent using REST. [default: agent]
--agent-api-password AGENT_API_PASSWORD
The password required to access or manipulate the agent
using REST. [default: agent]
--agent-id AGENT_ID The UUID used to identify this agent to the master. By
default the agent will attempt to load a cached value
however a specific UUID could be provided with this
flag.
--agent-id-file AGENT_ID_FILE
The location to store the agent's id. By default the
path is platform specific and defined by the
`agent_id_file_platform_defaults` key in the
configuration. [default: /etc/pyfarm/agent/uuid.dat]
Network Resources:
Resources which the agent will be communicating with.
--master MASTER This is a convenience flag which will allow you to set
the hostname for the master. By default this value
will be substituted in --master-api
--master-api MASTER_API
The location where the master's REST api is located.
[default: None]
--master-api-version MASTER_API_VERSION
Sets the version of the master's REST api the agent
should use. [default: None]
Process Control:
These settings apply to the parent process of the agent and contribute to
allowing the process to run as other users or remain isolated in an
environment. They also assist in maintaining the 'running state' via a
process id file.
--pidfile PIDFILE The file to store the process id in. [default: None]
-n, --no-daemon If provided then do not run the process in the
background.
--chdir CHDIR The working directory to change the agent into upon
launch
--uid UID The user id to run the agent as. *This setting is
ignored on Windows.*
--gid GID The group id to run the agent as. *This setting is
ignored on Windows.*
--pdb-on-unhandled When set pdb.set_trace() will be called if an
unhandled error is caught in the logger
pyfarm-agent is a command line client for working with a local agent. You can
use it to stop, start, and report the general status of a running agent
process.
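The usage text above can be driven from a script. The following is a minimal sketch, not part of PyFarm: `build_agent_command` is a hypothetical helper, and it assumes that the top-level flags (`--port`, `--master`) precede the sub-command, as is usual for argparse sub-parsers.

```python
def build_agent_command(action, port=None, master=None):
    """Return an argv list for the pyfarm-agent command line client.

    Hypothetical helper for illustration only. It assumes top-level
    flags are placed before the sub-command.
    """
    if action not in ("start", "stop", "status"):
        raise ValueError("unknown action: %r" % (action,))

    argv = ["pyfarm-agent"]
    if port is not None:
        argv += ["--port", str(port)]
    if master is not None:
        argv += ["--master", master]
    argv.append(action)
    return argv


# Example (not executed here):
#   import subprocess
#   subprocess.call(build_agent_command("status"))
```

Building the argument list separately from running it keeps the command easy to log and to test without a running agent.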
usage: pyfarm-agent [status|start|stop] status [-h]
optional arguments:
-h, --help show this help message and exit
usage: pyfarm-agent [status|start|stop] start [-h] [--state STATE]
[--time-offset TIME_OFFSET]
[--ntp-server NTP_SERVER]
[--ntp-server-version NTP_SERVER_VERSION]
[--no-pretty-json]
[--shutdown-timeout SHUTDOWN_TIMEOUT]
[--updates-drop-dir UPDATES_DROP_DIR]
[--run-control-file RUN_CONTROL_FILE]
[--farm-name FARM_NAME]
[--cpus CPUS] [--ram RAM]
[--ram-check-interval RAM_CHECK_INTERVAL]
[--ram-max-report-frequency RAM_MAX_REPORT_FREQUENCY]
[--ram-report-delta RAM_REPORT_DELTA]
[--master-reannounce MASTER_REANNOUNCE]
[--log LOG]
[--capture-process-output]
[--task-log-dir TASK_LOG_DIR]
[--ip-remote IP_REMOTE]
[--enable-manhole]
[--manhole-port MANHOLE_PORT]
[--manhole-username MANHOLE_USERNAME]
[--manhole-password MANHOLE_PASSWORD]
[--html-templates-reload]
[--static-files STATIC_FILES]
[--http-retry-delay-offset HTTP_RETRY_DELAY_OFFSET]
[--http-retry-delay-factor HTTP_RETRY_DELAY_FACTOR]
[--jobtype-no-cache]
optional arguments:
-h, --help show this help message and exit
General Configuration:
These flags configure parts of the agent related to hardware, state, and
certain timing and scheduling attributes.
--state STATE The current agent state, valid values are ['disabled',
'offline', 'running', 'online']. [default: online]
--time-offset TIME_OFFSET
If provided then don't talk to the NTP server at all
to calculate the time offset. If you know for a fact
that this host's time is always up to date then
setting this to 0 is probably a safe bet.
--ntp-server NTP_SERVER
The default network time server this agent should
query to retrieve the real time. This will be used to
help determine the agent's clock skew if any. Setting
this value to '' will effectively disable this query.
[default: None]
--ntp-server-version NTP_SERVER_VERSION
The version of the NTP server in case it's running an
older or newer version. [default: None]
--no-pretty-json If provided do not dump human readable json via the
agent's REST api
--shutdown-timeout SHUTDOWN_TIMEOUT
How many seconds the agent should spend attempting to
inform the master that it's shutting down.
--updates-drop-dir UPDATES_DROP_DIR
The directory to drop downloaded updates in. This
should be the same directory pyfarm-supervisor will
look for updates in. [default: None]
--run-control-file RUN_CONTROL_FILE
The path to a file that will signal to the supervisor
that the agent is supposed to be restarted if it stops for
whatever reason. [default:
/tmp/pyfarm/agent/should_be_running]
--farm-name FARM_NAME
The name of the farm the agent should join. If unset,
the agent will join any farm.
Physical Hardware:
Command line flags which describe the hardware of the agent.
--cpus CPUS The total amount of cpus installed on the system.
Defaults to the number of cpus installed on the
system.
--ram RAM The total amount of ram installed on the system in
megabytes. Defaults to the amount of ram the system
has installed.
Interval Controls:
Controls which dictate when certain internal intervals should occur.
--ram-check-interval RAM_CHECK_INTERVAL
How often ram resources should be checked for changes.
The amount of memory currently being consumed on the
system is checked after certain events occur such as a
process but this flag specifically controls how often
we should check when no such events are occurring.
[default: None]
--ram-max-report-frequency RAM_MAX_REPORT_FREQUENCY
This is a limiter that prevents the agent from
reporting memory changes to the master more often than
a specific time interval. This is done in order to
ensure that when hundreds of events firing in a short
period of time cause changes in ram usage, only one or
two will be reported to the master. [default: None]
--ram-report-delta RAM_REPORT_DELTA
Only report a change in ram if the value has changed
at least this many megabytes. [default: None]
--master-reannounce MASTER_REANNOUNCE
Controls how often the agent should reannounce itself
to the master. The agent may be in contact with the
master more often than this however during long periods
of inactivity this is how often the agent will
'inform' the master the agent is still online.
Logging Options:
Settings which control logging of the agent's parent process and/or any
subprocess it runs.
--log LOG If provided log all output from the agent to this
path. This will append to any existing log data.
[default: None]
--capture-process-output
If provided then all log output from each process
launched by the agent will be sent through agent's
loggers.
--task-log-dir TASK_LOG_DIR
The directory tasks should log to.
Network Service:
Controls how the agent is seen or interacted with by external services
such as the master.
--ip-remote IP_REMOTE
The remote IPv4 address to report. In situations where
the agent is behind a firewall this value will
typically be different.
Manhole Service:
Controls the manhole service which allows a telnet connection to be made
directly into the agent as it's running.
--enable-manhole When provided the manhole service will be started once
the reactor is running.
--manhole-port MANHOLE_PORT
The port the manhole service should run on if enabled.
--manhole-username MANHOLE_USERNAME
The telnet username that's allowed to connect to the
manhole service running on the agent.
--manhole-password MANHOLE_PASSWORD
The telnet password to use when connecting to the
manhole service running on the agent.
HTTP Configuration:
Options for how the agent will interact with the master's REST api and how
it should run its own REST api.
--html-templates-reload
If provided then force Jinja2, the html template
system, to check the file system for changes with
every request. This flag should not be used in
production but is useful for development and debugging
purposes.
--static-files STATIC_FILES
The default location where the agent's http server
should find static files to serve.
--http-retry-delay-offset HTTP_RETRY_DELAY_OFFSET
If a http request to the master has failed, wait at
least this amount of time before resending the
request.
--http-retry-delay-factor HTTP_RETRY_DELAY_FACTOR
The value provided here is used in combination with
--http-retry-delay-offset to calculate the retry
delay. This is used as a multiplier against random()
before being added to the offset.
Job Types:
--jobtype-no-cache If provided then do not cache job types, always
directly retrieve them. This is beneficial if you're
testing the agent or a new job type class.
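The retry delay described for `--http-retry-delay-offset` and `--http-retry-delay-factor` can be sketched as below. This is an illustration of the stated formula (offset plus `random()` times factor), not the agent's actual code. The function name `http_retry_delay` is hypothetical, and the default values of 1 and 5 are taken from the `agent.yml` defaults listed on this page.

```python
import random


def http_retry_delay(offset=1.0, factor=5.0, rng=random.random):
    """Return the number of seconds to wait before retrying a request.

    `rng` must return a float in [0.0, 1.0), it is a parameter only so
    the calculation can be tested deterministically.
    """
    return offset + rng() * factor
```

With the defaults, the delay always falls in the range from 1 second up to (but not including) 6 seconds. The random component spreads retries out so that many agents do not hit the master at the same moment.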
usage: pyfarm-agent [status|start|stop] stop [-h] [--no-wait]
optional arguments:
-h, --help show this help message and exit
optional flags:
Flags that control how the agent is stopped
--no-wait If provided then don't wait on the agent to shut itself down. By
default we would want to wait on each task to stop so we can
catch any errors and then finally wait on the agent to shutdown
too. If you're in a hurry or stopping a bunch of agents at once
then setting this flag will let the agent continue to stop
itself without waiting for each agent
usage: pyfarm-supervisor [-h] [--updates-drop-dir UPDATES_DROP_DIR]
[--agent-package-dir AGENT_PACKAGE_DIR]
[--pidfile PIDFILE] [-n] [--chdir CHDIR] [--uid UID]
[--gid GID]
Start and monitor the agent process
optional arguments:
-h, --help show this help message and exit
--updates-drop-dir UPDATES_DROP_DIR
Where to look for agent updates
--agent-package-dir AGENT_PACKAGE_DIR
Path to the actual agent code
--pidfile PIDFILE The file to store the process id in. [default: None]
-n, --no-daemon If provided then do not run the process in the
background.
--chdir CHDIR The directory to chdir to upon launch.
--uid UID The user id to run the supervisor as. *This setting is
ignored on Windows.*
--gid GID The group id to run the supervisor as. *This setting
is ignored on Windows.*
Development Commands
pyfarm-dev-fakerender
usage: pyfarm-dev-fakerender [-h] [--ram RAM] [--duration DURATION]
[--return-code RETURN_CODE]
[--duration-jitter DURATION_JITTER]
[--ram-jitter RAM_JITTER] -s START [-e END]
[-b BY] [--spew] [--segfault]
Very basic command line tool which vaguely simulates a render.
optional arguments:
-h, --help show this help message and exit
--ram RAM How much ram in megabytes the fake command should
consume
--duration DURATION How many seconds it should take to run this command
--return-code RETURN_CODE
The return code to return, declaring this flag
multiple times will result in a random return code.
[default: [0]]
--duration-jitter DURATION_JITTER
Randomly add or subtract this amount to the total
duration
--ram-jitter RAM_JITTER
Randomly add or subtract this amount to the ram
-s START, --start START
The start frame. If no other flags are provided this
will also be the end frame.
-e END, --end END The end frame
-b BY, --by BY The by frame
--spew Spews lots of random output to stdout which is
generally a decent stress test for log processing
issues. Do note however that this will disable the
code which is consuming extra CPU cycles. Also, use
this option with care as it can generate several
gigabytes of data per frame.
--segfault If provided then there's a 25% chance of causing a
segmentation fault.
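The relationship between `--start`, `--end` and `--by` can be illustrated with a small sketch. The help text only states that the start frame doubles as the end frame when no other flags are given. The inclusive end frame and the default step of 1 are assumptions made for this example, and `frame_range` is a hypothetical name.

```python
def frame_range(start, end=None, by=1):
    """Expand start/end/by into a list of frame numbers.

    Assumptions (not confirmed by the help text): the end frame is
    inclusive and the step defaults to 1.
    """
    if end is None:
        end = start  # documented: start is also the end frame
    if by <= 0:
        raise ValueError("by must be greater than zero")

    frames = []
    frame = start
    while frame <= end:
        frames.append(frame)
        frame += by
    return frames
```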
pyfarm-dev-fakework
usage: pyfarm-dev-fakework [-h] [--master-api MASTER_API]
[--agent-api AGENT_API] [--jobtype JOBTYPE]
[--job JOB]
Quick and dirty script to create a job type, a job, and some tasks which are
then posted directly to the agent. The primary purpose of this script is to
test the internals of the job types.
optional arguments:
-h, --help show this help message and exit
--master-api MASTER_API
The url to the master's api [default:
http://127.0.0.1/api/v1]
--agent-api AGENT_API
The url to the agent's api [default:
http://127.0.0.1:50000/api/v1]
--jobtype JOBTYPE The job type to use [default: FakeRender]
--job JOB If provided then this will be the job we pull tasks
from and assign to the agent. Please note we'll only
be pulling tasks that aren't running or assigned.
Environment Variables
PyFarm’s agent has several environment variables which can be used to change the operation at runtime. For more information see the individual sections below.
- PYFARM_JOBTYPE_ALLOW_CODE_EXECUTION_IN_MODULE_ROOT: If True, then function calls in the root of a job type’s source code will result in an error when the work is assigned. By default, this value is set to True.
- PYFARM_JOBTYPE_SUBCLASSES_BASE_CLASS: If True, then job types which do not subclass from pyfarm.jobtypes.core.jobtype.JobType will raise an exception when work is assigned. By default, this value is set to True.
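Boolean environment variables such as the two above have to be converted from strings before they can be used. The sketch below shows one common way to do that. It is not PyFarm's implementation: `read_env_bool` is a hypothetical helper, and the set of spellings treated as true is an assumption.

```python
import os

# Assumption: these spellings count as True, PyFarm may accept others.
TRUE_VALUES = frozenset(["1", "true", "yes", "on"])


def read_env_bool(name, default=True, environ=None):
    """Read a boolean flag from the environment.

    Returns `default` when the variable is unset, which matches the
    documented default of True for both variables.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUE_VALUES
```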
Configuration Files
Below are the configuration files for this subproject. These files are installed alongside the source code when the package is installed. These are only the defaults, however; you can always override these values in your own environment. See the Configuration object documentation for more detailed information.
Agent
The below is the current configuration file for the agent. This file lives at pyfarm/agent/etc/agent.yml in the source tree.
# The name of the render farm this agent belongs to. If specified, the agent
# will not join a farm with a different name. If not specified, the agent will
# join any farm.
farm_name: null
# The platform specific locations where the agent uuid
# file is stored by default. This can be overridden with the --agent-id-file flag.
agent_id_file_platform_defaults:
  linux: /etc/pyfarm/agent/uuid.dat
  mac: /Library/pyfarm/agent/uuid.dat
  bsd: /etc/pyfarm/agent/uuid.dat
  windows: $LOCALAPPDATA/pyfarm/agent/uuid.dat
# The platform specific locations where the run control file is looked for
run_control_file_by_platform:
  linux: /tmp/pyfarm/agent/should_be_running
  mac: /tmp/pyfarm/agent/should_be_running
  bsd: /tmp/pyfarm/agent/should_be_running
  windows: $TEMP/pyfarm/agent/should_be_running
# The default location to store data. $temp will expand to
# whatever pyfarm's data root is plus the application
# name (agent). For example on Linux this would expand to
# /tmp/pyfarm/agent
agent_data_root: $temp
# Defines the number of seconds between iterations of pyfarm-supervisor's
# agent status check.
supervisor_interval: 5
# The location where the agent should change directories
# into upon starting. If this value is not set then no
# changes will be made.
agent_chdir:
# The location where static web files should be served from. This
# will default to using PyFarm's installation root.
agent_static_root: auto
# The default location where lock files should be stored. By
# default these will be stored alongside other data
# inside the `agent_data_root` value above.
lock_file_root: $agent_data_root/lock
# Locations of specific lock files
agent_lock_file: $lock_file_root/agent.pid
supervisor_lock_file: $lock_file_root/supervisor.pid
# Where user data for the agent is stored. ~ will be expanded
# to the current user's home directory.
agent_user_data: ~/.pyfarm/agent
# The default location where the agent should save logs to. This
# includes both logs from processes and the agent log itself.
agent_logs_root: $agent_data_root/logs
# The location where agent updates should be stored.
agent_updates_dir: $agent_data_root/updates
# The default port which the agent should use to serve the
# REST api.
agent_api_port: 50000
# The location where the agent should save its own
# logging output to.
agent_log: $agent_logs_root/agent.log
supervisor_log: $agent_logs_root/supervisor.log
# Controls the log level for all loggers. Valid levels
# are 'debug', 'info', 'warning', 'error' and 'critical'.
agent_global_logger_level: info
# The user agent the master will use when connecting to the agent's
# REST api. This value should only be changed if the master's code
# is updated with a new user agent. Changing this value has no effect
# on the master.
master_user_agent: PyFarm/1.0 (master)
# Configuration values which control how the url
# for the master is constructed. If 'master' is not set
# the --master flag will be required to start the agent.
master:
master_api_version: 1
master_api: http://$master/api/v$master_api_version
# The user agent the master uses to talk to the agent's
# REST api. This value should not be modified unless
# there's a specific reason to do so.
master_user_agent: PyFarm/1.0 (master)
# Controls how often the agent should reannounce itself
# to the master. The agent may be in contact with the master
# more often than this however during long periods of
# inactivity this is how often the agent will 'inform' the
# master the agent is still online.
agent_master_reannounce: 120
# How many seconds the agent should spend attempting to inform
# the master that it's shutting down.
agent_shutdown_timeout: 15
# Number of seconds to offset from zero when calculating the
# http retry delay.
agent_http_retry_delay_offset: 1
# If an http request fails, use this as the base value
# to help determine how long we should wait before retrying. This
# value is then used to calculate the final delay:
# agent_http_retry_delay_offset + (random() * agent_http_retry_delay_factor)
agent_http_retry_delay_factor: 5
# Controls if the http client connection should be persistent or
# not. Generally this should always be True because the connection
# self-terminates after a short period of time anyway. For higher
# latency situations or with larger deployments this value should
# be False.
agent_http_persistent_connections: True
# When using persistent connections, twisted will sometimes run requests over
# connections that have already been closed by the server without realizing it.
# When this happens, we catch the resulting failure and retry the request.
# We do this up to $broken_connection_max_retry times, assuming that if
# it still fails at that point, there is probably something else wrong.
broken_connection_max_retry: 20
# If True then html templates will be reloaded with
# every request instead of cached.
agent_html_template_reload: False
# If True then reformat json output to be more human
# readable.
agent_pretty_json: True
# How often the agent should check for changes in ram. This value
# is used to ensure ram usage is checked at least this often though
# it may be checked more often due to other events (such as jobs
# running)
agent_ram_check_interval: 30
# If the ram has changed this many megabytes since the last
# check then report the change to the master.
agent_ram_report_delta: 100
# How much the agent should wait, in seconds, between
# each report about a change in ram.
agent_ram_max_report_frequency: 10
# The default network time server and version the agent
# should use to calculate its clock skew.
agent_ntp_server: pool.ntp.org
agent_ntp_server_version: 2
# The amount of time this agent is offset from what
# would be considered correct based on an atomic
# clock. If this value is set to auto the time will
# be calculated using NTP.
agent_time_offset: auto
# Physical and network information about the host the agent
# is running on. Setting these values to 'auto' will cause
# them to be initialized to the system's current
# configuration values.
agent_ram: auto
agent_cpus: auto
agent_hostname: auto
# When True this will enable a telnet connection
# to the agent which will present a Python interpreter
# upon connection. This is mainly used for debugging
# and direct manipulation of the agent. You can use
# the show() function once connected to see what
# objects are available.
agent_manhole: False
agent_manhole_port: 50001
agent_manhole_username: admin
agent_manhole_password: admin
# NOTE: The following values are used by the unittests and should be
# generally ignored for anything other than development.
agent_unittest:
  client_redirect_target: http://example.com
  client_api_test_url_https: https://httpbin.org
  client_api_test_url_http: http://httpbin.org
# A list of paths or names where the `lspci` command can
# be called from on Linux. This is used to retrieve information
# about graphics cards installed on the system in
# `pyfarm.agent.sysinfo.graphics.graphics_cards`.
# If you need to run the command with sudo you may also specify an entry
# like this:
# - sudo lspci
sysinfo_command_lspci:
- lspci
- /bin/lspci
- /sbin/lspci
- /usr/sbin/lspci
- /usr/bin/lspci
# If this is False, the agent will not allow two or more assignments to run
# on this node at the same time.
# If we still have at least one assignment with at least one task that isn't
# failed or done, new assignments will be rejected.
agent_allow_sharing: False
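Several values in the configuration above refer to other keys with a `$name` placeholder, for example `master_api: http://$master/api/v$master_api_version`. The sketch below shows how such a value could be expanded using Python's standard `string.Template`. PyFarm's own Configuration object performs the real expansion and may behave differently. The `expand` helper and the host name used here are hypothetical.

```python
from string import Template


def expand(value, variables):
    """Substitute $name placeholders in `value` from `variables`.

    Raises KeyError if a placeholder has no matching entry.
    """
    return Template(value).substitute(variables)


# Hypothetical settings, the host name is made up for this example.
settings = {"master": "farm-master.example.com", "master_api_version": 1}
master_api = expand("http://$master/api/v$master_api_version", settings)
```

Here `master_api` becomes `http://farm-master.example.com/api/v1`, which is the shape of URL the `--master` convenience flag is described as producing.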
Job Types
The below is the current configuration file for job types. This file lives at pyfarm/jobtypes/etc/jobtypes.yml in the source tree.
# When set to True caching of job types will be enabled. When set to
# False caching is disabled and every job type will be retrieved from
# the master directly.
jobtype_enable_cache: True
# If True then output from all processes will be sent directly to
# the agent's logger(s) instead of to the log file associated
# with each process.
jobtype_capture_process_output: False
# The location where tasks should be logged
jobtype_task_logs: $agent_logs_root/tasks
# The filename to an individual log file. This filename supports several
# internal variables:
#
# $YEAR - The current year
# $MONTH - The current month
# $DAY - The current day
# $HOUR - The current hour
# $MINUTE - The current minute
# $SECOND - The current second
# $JOB - The id of the job this log is for
# $PROCESS - The uuid of the process object responsible for creating the log
#
# In addition to the above you can, as with any configuration variable,
# also use environment variables in the filename.
# Path separators ("/" and "\") are not allowed.
jobtype_task_log_filename:
  $YEAR-$MONTH-$DAY_$HOUR-$MINUTE-$SECOND_$JOB_$PROCESS.csv
# The directory to store cached source code from the master. Note
# that $temp will be expanded to the local system's
# temp directory. If this directory does not exist
# it will be created. Leaving this value blank will
# disable job type caching.
jobtype_cache_directory: $temp/jobtype_cache
# The root directory that the default implementation of JobType.tempdir()
# will create a path using tempfile.mkdtemp.
jobtype_tempdir_root: $temp/tempdir/$JOBTYPE_UUID
# If True then expand environment variables in file paths.
jobtype_expandvars: True
# If True, then ignore any errors produced when trying
# to map users and groups to IDs. This will cause the
# underlying methods in the job type to run
# as the job type's owner instead, ignoring what the
# incoming job requests.
# NOTE: This value is not used on Windows.
jobtype_ignore_id_mapping_errors: False
# Any additional key/value pairs to include
# in the environment of a process launched
# by a job type.
jobtype_default_environment: {}
# If True then a job type's get_environment() method
# will also include the operating system's environment.
# The environment is constructed at the time the agent
# is launched, modifications to the agent's environment
# during execution are not included.
jobtype_include_os_environ: False
# Configures the thread pool used by job types
# for logging.
jobtype_logging_threadpool:
  # Setting this value to something smaller than `1` will result
  # in an exception being raised. This value also cannot be larger
  # than `max_threads` below.
  min_threads: 3
  # This value must be greater than or equal to `min_threads`
  # above. You may also set this value to 'auto' meaning the
  # number of processors times 1.5 or 20 (whichever is lower).
  max_threads: auto
  # As log messages are sent from processes they are stored
  # in an in memory queue. When the number of messages is higher
  # than this number a thread will be spawned to consume the
  # data and flush it into a file object.
  max_queue_size: 10
  # Most often the operating system will control how often data
  # is written to disk from a file object. This value overrides
  # that behavior and forces the file object to flush to disk
  # after this many messages have been processed.
  flush_lines: 100
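The `max_threads: auto` rule in the thread pool settings above (the number of processors times 1.5, or 20, whichever is lower) can be sketched as follows. This is an illustration, not the job type code. `resolve_max_threads` is a hypothetical name, and both rounding down and raising the result to at least `min_threads` on very small machines are assumptions.

```python
import multiprocessing


def resolve_max_threads(value, min_threads=3, cpu_count=None):
    """Resolve the `max_threads` setting to a concrete integer."""
    if min_threads < 1:
        raise ValueError("min_threads must be at least 1")
    if cpu_count is None:
        cpu_count = multiprocessing.cpu_count()

    if value == "auto":
        # Documented rule: processors * 1.5, capped at 20.
        max_threads = min(int(cpu_count * 1.5), 20)
        # Assumption: never go below min_threads on small hosts.
        return max(max_threads, min_threads)

    max_threads = int(value)
    if max_threads < min_threads:
        raise ValueError("max_threads must be >= min_threads")
    return max_threads
```

On an 8 core host, `auto` resolves to 12 threads. On a 32 core host the cap applies and it resolves to 20.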
pyfarm.agent package
Subpackages
pyfarm.agent.entrypoints package
Submodules
pyfarm.agent.entrypoints.development module
pyfarm.agent.entrypoints.main module
pyfarm.agent.entrypoints.parser module
Module which forms the basis of a custom argparse-based command line parser which handles setting configuration values automatically.
pyfarm.agent.entrypoints.parser.assert_parser(func)
    ensures that the instance argument passed along to the validation function contains data we expect
pyfarm.agent.entrypoints.parser.ip(*args, **kwargs)
    make sure the ip address provided is valid
pyfarm.agent.entrypoints.parser.port(*args, **kwargs)
    convert and check to make sure the provided port is valid
pyfarm.agent.entrypoints.parser.uuid_type(*args, **kwargs)
    validates that a string is a valid UUID type
pyfarm.agent.entrypoints.parser.uidgid(*args, **kwargs)
    Retrieves and validates the user or group id for a command line flag
pyfarm.agent.entrypoints.parser.direxists(*args, **kwargs)
    checks to make sure the directory exists
pyfarm.agent.entrypoints.parser.fileexists(*args, **kwargs)
    checks to make sure the provided file exists
pyfarm.agent.entrypoints.parser.number(*args, **kwargs)
    convert the given value to a number
pyfarm.agent.entrypoints.parser.enum(*args, **kwargs)
    ensures that value is a valid entry in enum
class pyfarm.agent.entrypoints.parser.ActionMixin(*args, **kwargs)
    Bases: object
    A mixin which overrides the __init__ and __call__ methods on an action so we can:
    - Set up attributes to manipulate the config object when the arguments are parsed
    - Ensure all required arguments are present
    - Convert the type keyword into an internal representation so we don’t require as much work when we add arguments to the parser
    TYPE_MAPPING = {<function isdir>: <function direxists>, <type 'int'>: <functools.partial object>, <function isfile>: <function fileexists>}
pyfarm.agent.entrypoints.parser.mix_action(class_)
pyfarm.agent.entrypoints.parser.StoreAction
    alias of _StoreAction
pyfarm.agent.entrypoints.parser.SubParsersAction
    alias of _SubParsersAction
pyfarm.agent.entrypoints.parser.StoreConstAction
    alias of _StoreConstAction
pyfarm.agent.entrypoints.parser.StoreTrueAction
    alias of _StoreTrueAction
pyfarm.agent.entrypoints.parser.StoreFalseAction
    alias of _StoreFalseAction
pyfarm.agent.entrypoints.parser.AppendAction
    alias of _AppendAction
pyfarm.agent.entrypoints.parser.AppendConstAction
    alias of _AppendConstAction
class pyfarm.agent.entrypoints.parser.AgentArgumentParser(*args, **kwargs)
    Bases: argparse.ArgumentParser
    A modified ArgumentParser which interfaces with the agent’s configuration.
pyfarm.agent.entrypoints.supervisor module
pyfarm.agent.entrypoints.utility module
Small objects and functions which facilitate operations on the main entry point class.
pyfarm.agent.entrypoints.utility.start_daemon_posix(log, chdir, uid, gid)
    Runs the agent process via a double fork. This is basically a duplicate of Marechal’s original code with some adjustments:
    http://www.jejik.com/articles/2007/02/a_simple_unix_linux_daemon_in_python/
    Source files from his post are here:
    - http://www.jejik.com/files/examples/daemon.py
    - http://www.jejik.com/files/examples/daemon3x.py
pyfarm.agent.http package
Subpackages
pyfarm.agent.http.api package
class pyfarm.agent.http.api.assign.Assign(agent)
    Bases: pyfarm.agent.http.api.base.APIResource
    isLeaf = False
-
SCHEMAS
= {'POST': <Schema({'job': <Schema({'cpus': Any([<type 'int'>, <type 'long'>]), 'ram': Any([<type 'int'>, <type 'long'>]), 'ram_max': Any([Any([<type 'int'>, <type 'long'>]), <type 'NoneType'>]), 'title': Any([<type 'str'>, <type 'unicode'>]), 'notes': Any([<type 'str'>, <type 'unicode'>]), 'num_tiles': Any([Any([<type 'int'>, <type 'long'>]), <type 'NoneType'>]), 'batch': Any([<type 'int'>, <type 'long'>]), 'by': Any([<type 'int'>, <type 'long'>, <type 'float'>, <class 'decimal.Decimal'>]), 'priority': Any([<type 'int'>, <type 'long'>]), 'notified_users': [<Schema({'username': Any([<type 'str'>, <type 'unicode'>]), 'on_failure': <type 'bool'>, 'on_success': <type 'bool'>, 'on_deletion': <type 'bool'>}, extra=PREVENT_EXTRA, required=False) object>], 'tags': [Any([<type 'str'>, <type 'unicode'>])], 'environ': <function validate_environment>, 'agent_id': Any([<type 'int'>, <type 'long'>]), 'job_group': Any([<type 'str'>, <type 'unicode'>]), 'job_group_id': Any([<type 'int'>, <type 'long'>]), 'data': <type 'dict'>, 'id': Any([<type 'int'>, <type 'long'>]), 'ram_warning': Any([Any([<type 'int'>, <type 'long'>]), <type 'NoneType'>]), 'user': Any([<type 'str'>, <type 'unicode'>])}, extra=PREVENT_EXTRA, required=False) object>, 'tasks': <function <lambda>>, 'jobtype': <Schema({'version': Any([<type 'int'>, <type 'long'>]), 'name': Any([<type 'str'>, <type 'unicode'>])}, extra=PREVENT_EXTRA, required=False) object>}, extra=PREVENT_EXTRA, required=False) object>}¶
-
Contains the base resources used for building up the root of the agent’s api.
class pyfarm.agent.http.api.base.APIResource
    Bases: pyfarm.agent.http.core.resource.Resource
    Base class for all api resources
    isLeaf = True
    ALLOWED_CONTENT_TYPE = frozenset(['application/json', None])
    DEFAULT_CONTENT_TYPE = frozenset(['application/json'])
    ALLOWED_ACCEPT = frozenset(['*/*', 'application/json', None])
    DEFAULT_ACCEPT = frozenset(['application/json'])
class pyfarm.agent.http.api.base.APIRoot
    Bases: pyfarm.agent.http.api.base.APIResource
    isLeaf = False
class pyfarm.agent.http.api.base.Versions
    Bases: pyfarm.agent.http.api.base.APIResource
    Returns a list of api versions which this agent will support
    GET /api/v1/versions/ HTTP/1.1
    Request
        GET /api/v1/versions/ HTTP/1.1
        Accept: application/json
    Response
        HTTP/1.1 200 OK
        Content-Type: application/json

        { "versions": [1] }
    isLeaf = True
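A client can use the versions response shown above to decide whether it can talk to an agent. The sketch below only parses a response body of the documented shape. `agent_supports` is a hypothetical helper, and fetching the body over HTTP is left out.

```python
import json


def agent_supports(response_body, version):
    """Return True if `version` is listed in a /versions/ response body."""
    payload = json.loads(response_body)
    return version in payload.get("versions", [])
```

For the example response above, `agent_supports('{ "versions": [1] }', 1)` evaluates to `True`.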
-
This endpoint is used to instruct the agent to check whether a given version of a software is installed and usable locally.
-
class
pyfarm.agent.http.api.software.
CheckSoftware
[source]¶ Bases:
pyfarm.agent.http.api.base.APIResource
Requests the agent to check whether a given software version is installed locally.
The agent will asynchronously update its software information on the master.
-
POST
/api/v1/check_software HTTP/1.1
¶ Request
POST /api/v1/check_software HTTP/1.1 Accept: application/json { "software": "Blender", "version": "2.72" }
Response
HTTP/1.1 200 ACCEPTED Content-Type: application/json
-
SCHEMAS
= {'POST': <Schema({'version': Any([<type 'str'>, <type 'unicode'>]), 'software': Any([<type 'str'>, <type 'unicode'>])}, extra=PREVENT_EXTRA, required=False) object>}¶
-
isLeaf
= False¶
-
testing
= False¶
-
-
class
pyfarm.agent.http.api.state.
Stop
[source]¶ Bases:
pyfarm.agent.http.api.base.APIResource
-
isLeaf
= False¶
-
SCHEMAS
= {'POST': <Schema({'wait': <type 'bool'>}, extra=PREVENT_EXTRA, required=False) object>}¶
-
-
class
pyfarm.agent.http.api.state.
Restart
[source]¶ Bases:
pyfarm.agent.http.api.base.APIResource
-
isLeaf
= False¶
-
-
class
pyfarm.agent.http.api.tasklogs.
TaskLogs
[source]¶ Bases:
pyfarm.agent.http.api.base.APIResource
-
get
(**kwargs)[source]¶ Get the contents of the specified task log. The log will be returned in CSV format with the following fields:
- ISO8601 timestamp
- Stream number (0 == stdout, 1 == stderr)
- Line number
- Parent process ID
- Message the process produced
The log file identifier is configurable and relies on the jobtype_task_log_filename configuration option. See the configuration documentation for more information about the default value.
-
GET
/api/v1/tasklogs/<identifier> HTTP/1.1
¶ Request
GET /api/v1/tasklogs/<identifier> HTTP/1.1
Response
HTTP/1.1 200 OK Content-Type: text/csv 2015-05-07T23:42:53.730975,0,15,42,Hello world
Statuscode 200: The log file was found; its content will be returned.
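A client consuming this endpoint could parse the CSV body roughly as follows. This is a minimal Python 3 sketch, not part of pyfarm: the `LogLine` tuple, the `STREAMS` mapping, and the `parse_task_log` helper are hypothetical names, and the field order is assumed to match the list above.

```python
import csv
import io
from collections import namedtuple

# Field order follows the documented CSV layout; the tuple name is hypothetical.
LogLine = namedtuple(
    "LogLine", ["timestamp", "stream", "line_number", "pid", "message"])

# Stream numbers as documented: 0 == stdout, 1 == stderr.
STREAMS = {0: "stdout", 1: "stderr"}


def parse_task_log(text):
    """Parse the CSV body of a task log response into LogLine tuples."""
    lines = []
    for row in csv.reader(io.StringIO(text)):
        if not row:  # skip blank lines
            continue
        timestamp, stream, line_number, pid, message = row
        lines.append(LogLine(
            timestamp, STREAMS[int(stream)], int(line_number),
            int(pid), message))
    return lines


body = "2015-05-07T23:42:53.730975,0,15,42,Hello world\n"
parsed = parse_task_log(body)
```

Using the `csv` module rather than splitting on commas keeps messages that themselves contain commas intact, provided the agent quotes them.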
-
-
class
pyfarm.agent.http.api.tasks.
Tasks
[source]¶ Bases:
pyfarm.agent.http.api.base.APIResource
-
get
(**kwargs)[source]¶ Returns all tasks which are currently being processed locally by the agent.
-
GET
/api/v1/tasks/ HTTP/1.1
¶ Request
GET /api/v1/tasks/ HTTP/1.1
Response
HTTP/1.1 200 OK Content-Type: application/json [{ "id": "732c1ef0-9488-4914-adef-c29f481f3f5b", "frame": 1, "attempt": 1 }, { "id": "34ce3964-b654-4ad4-8416-f5ddba67806e", "frame": 2, "attempt": 1 }]
Statuscode 200: The request was processed successfully
-
delete
(**kwargs)[source]¶ HTTP endpoint for stopping and deleting an individual task from this agent.
Warning
If the specified task is part of a multi-task assignment, all tasks in this assignment will be stopped, not just the specified one.
This will try to asynchronously stop the assignment by killing all its child processes. If that isn’t successful, this will have no effect.
-
DELETE
/api/v1/tasks/<int:task_id> HTTP/1.1
¶ Request
DELETE /api/v1/tasks/1 HTTP/1.1
Response
HTTP/1.1 202 ACCEPTED Content-Type: application/json
Statuscode 202: The task was found and will be stopped.
Statuscode 204: Nothing to do, no task matching the request was found.
Statuscode 400: There was a problem with the request, check the error.
-
This endpoint is used to instruct the agent to download and apply an update.
-
class
pyfarm.agent.http.api.update.
Update
[source]¶ Bases:
pyfarm.agent.http.api.base.APIResource
Requests the agent to download and apply the specified version of itself. Will make the agent restart at the next opportunity.
-
POST
/api/v1/update HTTP/1.1
¶ Request
POST /api/v1/update HTTP/1.1 Accept: application/json { "version": "1.2.3" }
Response
HTTP/1.1 200 ACCEPTED Content-Type: application/json
-
SCHEMAS
= {'POST': <Schema({'version': Any([<type 'str'>, <type 'unicode'>])}, extra=PREVENT_EXTRA, required=False) object>}¶
-
isLeaf
= False¶
-
pyfarm.agent.http.core package¶
The client library the manager uses to communicate with the master server.
-
class
pyfarm.agent.http.core.client.
HTTPLog
[source]¶ Bases:
object
Provides a wrapper around the http logger so requests and responses can be logged in a standardized fashion.
-
pyfarm.agent.http.core.client.
build_url
(url, params=None)[source]¶ Builds the full url when provided the base
url
and some url parameters:
>>> build_url("/foobar", {"first": "foo", "second": "bar"})
'/foobar?first=foo&second=bar'
>>> build_url("/foobar bar/")
'/foobar%20bar/'
Parameters: - url (str) – The url to build off of.
- params (dict) – A dictionary of parameters that should be added on to url. If this value is not provided, url will be returned by itself. Arguments to a url are unordered by default; however, they will be sorted alphabetically so the results are repeatable from call to call.
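The behavior described above (quote the path, then append alphabetically sorted parameters) can be approximated with the standard library. This is a Python 3 sketch under a hypothetical name, `build_url_sketch`; it is not the agent's implementation and only handles path-style inputs like the ones in the doctest.

```python
from urllib.parse import quote, urlencode


def build_url_sketch(url, params=None):
    """Quote the path and append alphabetically sorted parameters."""
    quoted = quote(url)  # "/" is left as-is, spaces become %20
    if not params:
        return quoted
    # Sorting the items makes the output repeatable from call to call.
    return quoted + "?" + urlencode(sorted(params.items()))
```

Sorting before encoding is what makes two calls with the same dictionary produce the same string, which matters when urls are compared or logged.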
-
pyfarm.agent.http.core.client.
http_retry_delay
(offset=None, factor=None, rand=None)[source]¶ Returns a floating point value that can be used to delay an http request. The main purpose of this is to ensure that not all requests are run with the same interval between them.
The basic formula for the retry delay is:
offset * (random() * factor)
Parameters: - factor (int or float) – The factor to multiply the output from random() by. This defaults to the agent_http_retry_delay_factor configuration variable.
- offset – The initial offset to start the calculation at. This defaults to the agent_http_retry_delay_offset configuration variable.
- rand – A callable to determine randomness, defaulting to random(). This is mainly used for testing purposes.
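The formula above is small enough to sketch directly. The function name `retry_delay_sketch` is hypothetical, and unlike the documented function it takes `offset` and `factor` as required arguments instead of reading defaults from the configuration.

```python
import random


def retry_delay_sketch(offset, factor, rand=random.random):
    """Return offset * (rand() * factor) as a float."""
    return float(offset * (rand() * factor))


# A fixed `rand` makes the result predictable, which is what tests rely on.
fixed = retry_delay_sketch(offset=2, factor=3, rand=lambda: 0.5)

# With the default `rand`, each call yields a different delay.
jittered = [retry_delay_sketch(offset=2, factor=3) for _ in range(5)]
```

Because `random()` returns values in [0, 1), the delay never exceeds `offset * factor`, and agents retrying at the same moment are unlikely to pick the same interval.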
-
class
pyfarm.agent.http.core.client.
Request
[source]¶ Bases:
pyfarm.agent.http.core.client.Request
Contains all the information used to perform a request such as the method, url, and original keyword arguments (kwargs). These values contain the basic information necessary in order to retry() a request.
-
class
pyfarm.agent.http.core.client.
Response
(deferred, response, request)[source]¶ Bases:
twisted.internet.protocol.Protocol
This class receives the incoming response body from a request and constructs some convenience methods and attributes around the data.
Parameters: - deferred (Deferred) – The deferred object which contains the target callback and errback.
- response – The initial response object which will be passed along to the target deferred.
- request (Request) – Named tuple object containing the method name, url, headers, and data.
-
data
()[source]¶ Returns the data currently contained in the buffer.
Raises: RuntimeError – Raised if this method is called before all data has been received.
-
json
(loader=<function loads>)[source]¶ Returns the json data from the incoming request
Raises: - RuntimeError – Raised if this method is called before all data has been received.
- ValueError – Raised if the content type for this request is not application/json.
-
pyfarm.agent.http.core.client.
request
(method, url, **kwargs)[source]¶ Wrapper around treq.request() with some added arguments and validation.
Parameters: - method (str) – The HTTP method to use when making the request.
- url (str) – The url this request will be made to.
- data (str, list, tuple, set, dict) – The data to send along with some types of requests such as POST or PUT.
- headers (dict) – The headers to send along with the request to url. Currently only single values per header are supported.
- callback (function) – The function to deliver an instance of Response once we receive and unpack a response.
- errback (function) – The function to deliver an error message to. By default this will use log.err().
- response_class (class) – The class to use to unpack the internal response. This is mainly used by the unittests but could be used elsewhere to add some custom behavior to the unpack process for the incoming response.
Raises: NotImplementedError – Raised whenever a request is made of this function that we can’t implement such as an invalid http scheme, request method or a problem constructing data to an api.
-
pyfarm.agent.http.core.client.
random
() → x in the interval [0, 1).¶
Base resources which can be used to build top level documents, pages, or other types of data for the web.
-
class
pyfarm.agent.http.core.resource.
Resource
[source]¶ Bases:
twisted.web.resource.Resource
Basic subclass of _Resource for passing requests to specific methods. Unlike _Resource, however, this will also handle:
- Templates
- Content type discovery and validation
- Handling of deferred responses
- Validation of POST/PUT data against a schema
Variables: - TEMPLATE (string) – The name of the template this class will use when rendering an html view.
- SCHEMAS – A dictionary of schemas to validate the data of an incoming request against. The structure of this dictionary is: {http method: <instance of voluptuous.Schema>}. If the schema validation fails the request will be rejected with 400 BAD REQUEST.
- ALLOWED_CONTENT_TYPE – An instance of frozenset which describes what this resource is going to allow in the Content-Type header. The request and this instance must share at least one entry in common. If not, the request will be rejected with 415 UNSUPPORTED MEDIA TYPE. This must be defined in the subclass.
- ALLOWED_ACCEPT – An instance of frozenset which describes what this resource is going to allow in the Accept header. The request and this instance must share at least one entry in common. If not, the request will be rejected with 406 NOT ACCEPTABLE. This must be defined in the subclass.
- DEFAULT_ACCEPT – If the Accept header is not present in the request, use this as the value instead. This defaults to frozenset(["*/*"]).
- DEFAULT_CONTENT_TYPE – If the Content-Type header is not present in the request, use this as the value instead. This defaults to frozenset([""]).
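The "must share at least one entry" rule for ALLOWED_CONTENT_TYPE and ALLOWED_ACCEPT amounts to a set intersection. The sketch below illustrates that check with the values shown for APIResource and the Resource defaults; the `negotiate` helper, the shape of `headers` (header name mapped to a list of values), and the order of the two checks are assumptions made for illustration, not the agent's actual code.

```python
ALLOWED_CONTENT_TYPE = frozenset(["application/json", None])
DEFAULT_CONTENT_TYPE = frozenset([None])
ALLOWED_ACCEPT = frozenset(["*/*", "application/json", None])
DEFAULT_ACCEPT = frozenset(["*/*"])


def negotiate(headers):
    """Return an HTTP error code, or None if the headers are acceptable.

    `headers` maps a header name to a list of values (an assumption).
    """
    # Fall back to the defaults when a header is missing or empty.
    content_type = (
        frozenset(headers.get("Content-Type", [])) or DEFAULT_CONTENT_TYPE)
    accept = frozenset(headers.get("Accept", [])) or DEFAULT_ACCEPT

    # An empty intersection means nothing in common, so reject.
    if not content_type & ALLOWED_CONTENT_TYPE:
        return 415  # UNSUPPORTED MEDIA TYPE
    if not accept & ALLOWED_ACCEPT:
        return 406  # NOT ACCEPTABLE
    return None
```

Using frozensets for both the allowed and the default values keeps the comparison a single `&` operation regardless of how many values a header carries.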
-
TEMPLATE
= NotImplemented¶
-
SCHEMAS
= {}¶
-
ALLOWED_ACCEPT
= NotImplemented¶
-
ALLOWED_CONTENT_TYPE
= NotImplemented¶
-
DEFAULT_ACCEPT
= frozenset(['*/*'])¶
-
DEFAULT_CONTENT_TYPE
= frozenset([None])¶
-
template
¶ Loads the template provided by the partial path in TEMPLATE on the class.
-
get_content_type
(request)[source]¶ Return the Content-Type header(s) in the request or DEFAULT_CONTENT_TYPE if the header is not set.
-
get_accept
(request)[source]¶ Return the Accept header(s) in the request or DEFAULT_ACCEPT if the header is not set.
-
putChild
(path, child)[source]¶ Overrides the builtin putChild() so we can return the results for each call and use them externally.
-
error
(request, code, message)[source]¶ Writes out the proper error response message depending on the content type in the request
-
set_response_code_if_not_set
(request, code)[source]¶ Sets the response code if one has not already been set
-
render_tuple
(request, response)[source]¶ Takes a response tuple of (body, code, headers) or (body, code) and renders the resulting data onto the request.
HTTP server responsible for serving requests that
control or query the running agent. This file produces
a service that the pyfarm.agent.manager.service.ManagerServiceMaker
class can consume on start.
-
class
pyfarm.agent.http.core.server.
Site
(resource, requestFactory=None, *args, **kwargs)[source]¶ Bases:
twisted.web.server.Site
Site object similar to Twisted’s except it also carries along some of the internal agent data.
-
displayTracebacks
= True¶
-
-
class
pyfarm.agent.http.core.server.
StaticPath
(*args, **kwargs)[source]¶ Bases:
twisted.web.static.File
More secure version of File that does not list directories. In addition, this will also send along a response header asking clients to cache the data.
-
EXPIRES
= 604800¶
-
ALLOW_DIRECTORY_LISTING
= False¶
-
Submodules¶
pyfarm.agent.http.system module¶
-
class
pyfarm.agent.http.system.
HTMLResource
[source]¶ Bases:
pyfarm.agent.http.core.resource.Resource
-
ALLOWED_CONTENT_TYPE
= frozenset(['', 'text/html'])¶
-
ALLOWED_ACCEPT
= frozenset(['*/*', 'text/html'])¶
-
-
class
pyfarm.agent.http.system.
Index
[source]¶ Bases:
pyfarm.agent.http.system.HTMLResource
Serves requests for the root, ‘/’, target
-
TEMPLATE
= 'index.html'¶
-
-
class
pyfarm.agent.http.system.
Configuration
[source]¶ Bases:
pyfarm.agent.http.system.HTMLResource
-
TEMPLATE
= 'configuration.html'¶
-
HIDDEN_FIELDS
= ('agent', 'agent_pretty_json')¶
-
EDITABLE_FIELDS
= ('agent_cpus', 'agent_hostname', 'master_api', 'master', 'agent_ram_check_interval', 'agent_ram', 'agent_ram_report_delta', 'agent_time_offset', 'state', 'agent_http_retry_delay_factor', 'agent_http_retry_delay_offset')¶
-
pyfarm.agent.sysinfo package¶
Submodules¶
pyfarm.agent.sysinfo.cpu module¶
Contains information about the cpu and its relation to the operating system such as load, processing times, etc.
-
pyfarm.agent.sysinfo.cpu.
cpu_name
()[source]¶ Returns the full name of the CPU installed in the system.
-
pyfarm.agent.sysinfo.cpu.
total_cpus
(logical=True)[source]¶ Returns the total number of cpus installed on the system.
Parameters: logical (bool) – If True then return the number of cores the system has. Setting this value to False will instead return the number of physical cpus present on the system.
-
pyfarm.agent.sysinfo.cpu.
load
(interval=1)[source]¶ Returns the load across all cpus as a value from zero to one. A value of 1.0 means the average load across all cpus is 100%.
-
pyfarm.agent.sysinfo.cpu.
user_time
()[source]¶ Returns the amount of time spent by the cpu in user space
-
pyfarm.agent.sysinfo.cpu.
system_time
()[source]¶ Returns the amount of time spent by the cpu in system space
pyfarm.agent.sysinfo.disks module¶
Contains information about the local disks.
-
class
pyfarm.agent.sysinfo.disks.
DiskInfo
(mountpoint, free, size)¶ Bases:
tuple
-
free
¶ Alias for field number 1
-
mountpoint
¶ Alias for field number 0
-
size
¶ Alias for field number 2
-
-
pyfarm.agent.sysinfo.disks.
disks
(as_dict=False)[source]¶ Returns a list of disks in the system, in the form of DiskInfo objects.
Parameters: as_dict (bool) – If True then return a dictionary value instead of DiskInfo instances. This is mainly used by the agent to eliminate an extra loop for translation.
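The DiskInfo field order (mountpoint, free, size) and the dictionary form can be illustrated with a plain named tuple. This sketch uses `shutil.disk_usage()` from the Python 3 standard library to fill in the values; how the agent itself gathers them is not shown here, and the `disk_info` helper is a hypothetical name.

```python
import shutil
from collections import namedtuple

# Same field order as documented: 0 = mountpoint, 1 = free, 2 = size.
DiskInfo = namedtuple("DiskInfo", ("mountpoint", "free", "size"))


def disk_info(path):
    """Build a DiskInfo for `path` from shutil.disk_usage()."""
    usage = shutil.disk_usage(path)
    return DiskInfo(mountpoint=path, free=usage.free, size=usage.total)


info = disk_info(".")

# Dictionary form, comparable to what as_dict=True is described as returning.
as_dict = dict(info._asdict())
```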
pyfarm.agent.sysinfo.graphics module¶
pyfarm.agent.sysinfo.memory module¶
pyfarm.agent.sysinfo.network module¶
Returns information about the network including ip address, dns, data sent/received, and some error information.
const IP_PRIVATE: set of private class A, B, and C network ranges. See also
const IP_NONNETWORK: set of non-network address ranges including all of the above constants except the
-
pyfarm.agent.sysinfo.network.
mac_addresses
(long_addresses=False, as_integers=False)[source]¶ Returns a tuple of all mac addresses on the system.
Parameters:
-
pyfarm.agent.sysinfo.network.
hostname
(trust_name_from_ips=True)[source]¶ Returns the hostname which the agent should send to the master.
Parameters: trust_resolved_name (bool) – If True and all addresses provided by addresses()
resolve to a single hostname then just return that name as it’s the most likely hostname to be accessible by the rest of the network.
pyfarm.agent.sysinfo.software module¶
Contains utilities to check the availability of certain software on the local machine.
-
pyfarm.agent.sysinfo.software.
get_software_version_data
(*args, **kwargs)[source]¶ Asynchronously fetches the known data about the given software version from the master.
Parameters: Returns: Returns information about the given software version from the master
-
pyfarm.agent.sysinfo.software.
get_discovery_code
(*args, **kwargs)[source]¶ Asynchronously fetches the discovery code for the given software version from the master.
Parameters: Returns: Returns the discovery code from the master.
-
pyfarm.agent.sysinfo.software.
check_software_availability
(*args, **kwargs)[source]¶ Asynchronously checks for the availability of a given software in a given version. Will pass True to its callback function if the software could be found, False otherwise. Works only for software versions that have a discovery registered on the master.
Parameters:
pyfarm.agent.sysinfo.system module¶
Information about the operating system including type, filesystem information, and other relevant information. This module may also contain os specific information such as the Linux distribution, Windows version, bitness, etc.
-
pyfarm.agent.sysinfo.system.
filesystem_is_case_sensitive
()[source]¶ returns True if the file system is case sensitive
-
pyfarm.agent.sysinfo.system.
environment_is_case_sensitive
()[source]¶ returns True if the environment is case sensitive
-
pyfarm.agent.sysinfo.system.
machine_architecture
(arch='x86_64')[source]¶ returns the architecture of the host itself
-
pyfarm.agent.sysinfo.system.
interpreter_architecture
()[source]¶ returns the architecture of the interpreter itself (32 or 64)
-
pyfarm.agent.sysinfo.system.
uptime
()[source]¶ Returns the amount of time the system has been running in seconds.
-
pyfarm.agent.sysinfo.system.
operating_system
(plat='linux2')[source]¶ Returns the operating system for the given platform. Please note that while you can call this function directly, you’re likely better off using values in
pyfarm.core.enums
instead.
pyfarm.agent.sysinfo.user module¶
Returns information about the current user such as the user name, admin access, or other related information.
Module contents¶
Top level module which provides information about the operating system, system memory, network, and processor related information
Submodules¶
pyfarm.agent.config module¶
Configuration¶
Central module for storing and working with live configuration objects. This module instances ConfigurationWithCallbacks onto config. Attempting to reload this module will not reinstance the config object.
The config object should be directly imported from this module to be used:
>>> from pyfarm.agent.config import config
-
class
pyfarm.agent.config.
LoggingConfiguration
(data=None, environment=None, load=True)[source]¶ Bases:
pyfarm.core.config.Configuration
Special configuration object which logs when a key is changed in a dictionary. If the reactor is not running then log messages will be queued until they can be emitted so they are not lost.
-
_expandvars
(value)¶ Performs variable expansion for value. This method is run when a string value is returned from get() or __getitem__(). The default behavior of this method is to recursively expand variables using sources in the following order:
- The environment, os.environ
- The environment (from the configuration), env
- Other values in the configuration
- ~ to the user’s home directory
For example, the following configuration:
foo: foo
bar: bar
foobar: $foo/$bar
path: ~/$foobar/$TEST
Would result in the following assuming $TEST is an environment variable set to somevalue and the current user’s name is user:
{ "foo": "foo", "bar": "bar", "foobar": "foo/bar", "path": "/home/user/foo/bar/somevalue" }
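The lookup order described above can be sketched with a regular expression and a bounded loop. This is an illustration in Python 3, not the method's actual implementation: the `expandvars` helper, the `$NAME` pattern, and the `max_depth` guard against circular references are all assumptions.

```python
import os
import re

_VARIABLE = re.compile(r"\$(\w+)")  # matches $NAME style references


def expandvars(value, config, environ=None, max_depth=10):
    """Recursively expand $NAME references in `value` (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    config_env = config.get("env", {})

    def lookup(match):
        name = match.group(1)
        # Sources are consulted in the documented order.
        for source in (environ, config_env, config):
            if name in source:
                return str(source[name])
        return match.group(0)  # leave unknown variables untouched

    for _ in range(max_depth):
        expanded = _VARIABLE.sub(lookup, value)
        if expanded == value:  # nothing left to expand
            break
        value = expanded
    return os.path.expanduser(value)  # finally resolve a leading ~


config = {
    "foo": "foo",
    "bar": "bar",
    "foobar": "$foo/$bar",
    "path": "~/$foobar/$TEST",
}
environ = {"TEST": "somevalue"}
foobar = expandvars(config["foobar"], config, environ)
path = expandvars(config["path"], config, environ)
```

Expanding `~` last means a home directory that happens to contain a `$` is never re-interpreted as a variable.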
-
MODIFIED
= 'modified'¶
-
CREATED
= 'created'¶
-
DELETED
= 'deleted'¶
-
clear
()[source]¶ Deletes all keys in this object and triggers a delete event using changed() for each one.
-
update
(data=None, **kwargs)[source]¶ Updates the data held within this object and triggers the appropriate events with changed().
-
-
class
pyfarm.agent.config.
ConfigurationWithCallbacks
(data=None, environment=None, load=True)[source]¶ Bases:
pyfarm.agent.config.LoggingConfiguration
Subclass of LoggingConfiguration that provides the ability to run a function when a value is changed.
-
callbacks
= {}¶
-
classmethod
register_callback
(key, callback, append=False)[source]¶ Register a function as a callback for key. When key is set the given callback will be run by changed()
Parameters:
-
classmethod
deregister_callback
(key, callback)[source]¶ Removes any callback(s) that are registered with the provided
key
-
clear
(callbacks=False)[source]¶ Performs the same operations as
dict.clear()
except this method can also clear any registered callbacks if requested.
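The callback mechanism of ConfigurationWithCallbacks can be pictured with a small dictionary subclass. Everything in this sketch is hypothetical: the `CallbackDict` class, the `(change, key, value)` callback signature, and the assumption that `append=False` replaces previously registered callbacks are illustrative choices, not the documented behavior.

```python
class CallbackDict(dict):
    """Dictionary that runs registered callbacks when a key is set."""
    callbacks = {}  # class-level registry shared by all instances

    @classmethod
    def register_callback(cls, key, callback, append=False):
        if append:
            cls.callbacks.setdefault(key, []).append(callback)
        else:
            cls.callbacks[key] = [callback]  # assumed: replace existing ones

    @classmethod
    def deregister_callback(cls, key, callback):
        remaining = [c for c in cls.callbacks.get(key, []) if c != callback]
        if remaining:
            cls.callbacks[key] = remaining
        else:
            cls.callbacks.pop(key, None)

    def __setitem__(self, key, value):
        change = "modified" if key in self else "created"
        super().__setitem__(key, value)
        for callback in self.callbacks.get(key, []):
            callback(change, key, value)


events = []


def on_change(change, key, value):
    events.append((change, key, value))


config = CallbackDict()
CallbackDict.register_callback("agent_ram", on_change)
config["agent_ram"] = 1024
config["agent_ram"] = 2048
CallbackDict.deregister_callback("agent_ram", on_change)
config["agent_ram"] = 4096  # no callback fires after deregistration
```

Keeping the registry on the class rather than the instance mirrors the `callbacks = {}` class attribute and the classmethod signatures shown above.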
-
pyfarm.agent.manhole module¶
Manhole¶
Provides a way to access the internals of the agent via the telnet protocol.
-
class
pyfarm.agent.manhole.
LoggingManhole
(namespace=None)[source]¶ Bases:
twisted.conch.manhole.ColoredManhole
A slightly modified implementation of
ColoredManhole
which logs information to the logger so we can track activity in the agent’s log.
-
class
pyfarm.agent.manhole.
TransportProtocolFactory
(portal)[source]¶ Bases:
object
Glues together a portal along with the TelnetTransport and AuthenticatingTelnetProtocol objects. This class is instanced onto the protocol attribute of the ServerFactory class in build_manhole().
-
class
pyfarm.agent.manhole.
TelnetRealm
[source]¶ Bases:
object
Wraps together ITelnetProtocol, TelnetBootstrapProtocol, ServerProtocol and ColoredManhole in requestAvatar() which will provide the interface to the manhole.
-
NAMESPACE
= None¶
-
pyfarm.agent.service module¶
Manager Service¶
Sends and receives information from the master and performs systems level tasks such as log reading, system information gathering, and management of processes.
-
class
pyfarm.agent.service.
Agent
[source]¶ Bases:
object
Main class associated with getting the internals of the agent’s operations up and running including adding or updating itself with the master, starting the periodic task manager, and handling shutdown conditions.
-
classmethod
agent_api
()[source]¶ Return the API url for this agent or None if agent_id has not been set
-
classmethod
agents_endpoint
()[source]¶ Returns the API endpoint used for updating or creating agents on the master
-
shutting_down
¶
-
repeating_call
(delay, function, function_args=None, function_kwargs=None, now=True, repeat_max=None, function_id=None)[source]¶ Causes function to be called repeatedly up until repeat_max or until stopped.
Parameters: - delay (int) – Number of seconds to delay between calls of function.
Note
delay is an approximate interval between when one call ends and the next one begins. The exact time can vary due to how the Twisted reactor runs, how long it takes function to run and what else may be going on in the agent at the time.
- function – A callable function to run
- function_args (tuple, list) – Arguments to pass into function
- function_kwargs (dict) – Keywords to pass into function
- now (bool) – If True then run function right now in addition to scheduling it.
- repeat_max (int) – Repeat calling function this many times. If not provided then we’ll continue to repeat calling function until the agent shuts down.
- function_id (uuid.UUID) – Used internally to track a function’s execution count. This keyword exists so if you call repeating_call() multiple times on the same function or method it will handle repeat_max properly.
-
should_reannounce
()[source]¶ Small method which acts as a trigger for
reannounce()
-
reannounce
(*args, **kwargs)[source]¶ Method which is used to periodically contact the master. This method is generally called as part of a scheduled task.
-
system_data
(requery_timeoffset=False)[source]¶ Returns a dictionary of data containing information about the agent. This is the information that is also passed along to the master.
-
start
(shutdown_events=True, http_server=True)[source]¶ Internal code which starts the agent, registers it with the master, and performs the other steps necessary to get things running.
Parameters:
-
stop
(*args, **kwargs)[source]¶ Internal code which stops the agent. This will terminate any running processes, inform the master of the terminated tasks, and update the state of the agent on the master.
-
post_shutdown_to_master
(*args, **kwargs)[source]¶ This method is called before the reactor shuts down and lets the master know that the agent’s state is now
offline
-
post_agent_to_master
(*args, **kwargs)[source]¶ Runs the POST request to contact the master. Running this method multiple times should be considered safe but is generally something that should be avoided.
-
pyfarm.agent.testutil module¶
-
class
pyfarm.agent.testutil.
skipIf
(should_skip, reason)[source]¶ Bases:
object
Wrapping a test with this class will allow the test to be skipped if
should_skip
evals as True.
-
pyfarm.agent.testutil.
random_port
(bind='127.0.0.1')[source]¶ Returns a random port which is not in use
-
pyfarm.agent.testutil.
create_jobtype
(classname=None, sourcecode=None)[source]¶ Creates a job type on the master and fires a deferred when finished
-
class
pyfarm.agent.testutil.
APITestServerResource
[source]¶ Bases:
twisted.web.resource.Resource
-
isLeaf
= False¶
-
render_POST
(request)¶
-
render_PUT
(request)¶
-
render_GET
(request)¶
-
render_DELETE
(request)¶
-
-
class
pyfarm.agent.testutil.
APITestServer
(url, code=None, response=None, headers=None)[source]¶ Bases:
object
An object used for setting up a fake HTTP server which can respond to requests during a test.
-
class
pyfarm.agent.testutil.
DummyRequest
(postpath='/', session=None)[source]¶ Bases:
twisted.web.test.requesthelper.DummyRequest
-
code
= 200¶
-
setHeader
(name, value)[source]¶ Default override, _DummyRequest.setHeader does not actually set the response headers. Instead it sets the value in a different location that’s never used in an actual request.
-
-
class
pyfarm.agent.testutil.
TestCase
(methodName='runTest')[source]¶ Bases:
twisted.trial._asynctest.TestCase
-
longMessage
= True¶
-
POP_CONFIG_KEYS
= []¶
-
RAND_LENGTH
= 8¶
-
maxDiff
= None¶
-
timeout
= 15¶
-
assertRaisesRegexp
(expected_exception, expected_regexp, callable_obj=None, *args, **kwargs)[source]¶
-
create_file
(content=None, dir=None, suffix='')[source]¶ Creates a test file on disk using
tempfile.mkstemp()
and uses the lower level file interfaces to manage it. This is done to ensure we have more control of the file descriptor itself so on platforms such as Windows we don’t have to worry about running out of file handles.
-
-
class
pyfarm.agent.testutil.
BaseRequestTestCase
(methodName='runTest')[source]¶ Bases:
pyfarm.agent.testutil.TestCase
-
HTTP_SCHEME
= 'http'¶
-
TEST_URL
= 'http://httpbin.org'¶
-
REDIRECT_TARGET
= 'http://example.com'¶
-
HTTP_REQUEST_SUCCESS
= None¶
-
-
class
pyfarm.agent.testutil.
BaseHTTPTestCase
(methodName='runTest')[source]¶ Bases:
pyfarm.agent.testutil.TestCase
-
URI
= NotImplemented¶
-
CLASS
= NotImplemented¶
-
CLASS_FACTORY
= NotImplemented¶
-
DEFAULT_HEADERS
= NotImplemented¶
-
-
class
pyfarm.agent.testutil.
BaseAPITestCase
(methodName='runTest')[source]¶ Bases:
pyfarm.agent.testutil.BaseHTTPTestCase
-
DEFAULT_HEADERS
= {'Accept': ['application/json']}¶
-
pyfarm.agent.utility module¶
Utilities¶
Top level utilities for the agent to use internally. Many of these are copied over from the master (which we can’t import here).
-
pyfarm.agent.utility.
validate_environment
(values)[source]¶ Ensures that
values
is a dictionary and that it only contains string keys and values.
-
pyfarm.agent.utility.
validate_uuid
(value)[source]¶ Ensures that
value
can be converted to or is a UUID object.
-
pyfarm.agent.utility.
TASKS_SCHEMA
(values)¶
-
pyfarm.agent.utility.
json_safe
(source)[source]¶ Recursively converts source into something that should be safe for json.dumps() to handle. This is used in conjunction with default_json_encoder() to also convert keys to something the json encoder can understand.
-
pyfarm.agent.utility.
quote_url
(source_url)[source]¶ This function serves as a wrapper around urlsplit() and quote() and returns a url that has the path quoted.
-
pyfarm.agent.utility.
dumps
(*args, **kwargs)[source]¶ Agent’s implementation of json.dumps() or pyfarm.master.utility.jsonify()
-
pyfarm.agent.utility.
request_from_master
(request)[source]¶ Returns True if the request appears to be coming from the master
-
class
pyfarm.agent.utility.
UTF8Recoder
(f, encoding)[source]¶ Bases:
object
Iterator that reads an encoded stream and reencodes the input to UTF-8
-
class
pyfarm.agent.utility.
UnicodeCSVReader
(f, dialect=<class csv.excel>, encoding='utf-8', **kwds)[source]¶ Bases:
object
A CSV reader which will iterate over lines in the CSV file “f”, which is encoded in the given encoding.
-
class
pyfarm.agent.utility.
UnicodeCSVWriter
(f, dialect=<class csv.excel>, **kwds)[source]¶ Bases:
object
A CSV writer which will write rows to CSV file “f”, which is encoded in the given encoding.
-
pyfarm.agent.utility.
total_seconds
(td)[source]¶ Returns the total number of seconds in the time delta object. This function is provided for backwards compatibility with Python 2.6.
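Python 2.6 lacks `timedelta.total_seconds()`, so the value has to be derived from the `days`, `seconds`, and `microseconds` attributes. The sketch below shows that arithmetic in Python 3 under a hypothetical name, `total_seconds_sketch`; it is assumed, not confirmed, that the agent's function computes the same expression.

```python
from datetime import timedelta


def total_seconds_sketch(td):
    """Compute the total seconds of a timedelta from its three fields."""
    whole_seconds = td.seconds + td.days * 24 * 3600
    return (td.microseconds + whole_seconds * 10**6) / 10**6


delta = timedelta(days=1, seconds=1, microseconds=500000)
result = total_seconds_sketch(delta)
```

On interpreters that do provide `timedelta.total_seconds()`, the two results agree.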
-
class
pyfarm.agent.utility.
AgentUUID
[source]¶ Bases:
object
This class wraps all the functionality required to load, cache and retrieve an Agent’s UUID.
-
log
= <pyfarm.agent.logger.python.Logger object>¶
-
classmethod
load
(path)[source]¶ A classmethod to load a UUID object from a path. If the provided path does not exist or does not contain data which can be converted into a UUID object, None will be returned.
-
classmethod
save
(agent_uuid, path)[source]¶ Saves agent_uuid to path. This classmethod will also create the necessary parent directories and handle conversion from the input type uuid.UUID.
-
classmethod
generate
()[source]¶ Generates a UUID object. This simply wraps
uuid.uuid4()
and logs a warning.
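The load and save behavior described for AgentUUID can be sketched with two plain functions. This is a Python 3 illustration only: `load_uuid` and `save_uuid` are hypothetical names, and storing the UUID as text is an assumption about the file format.

```python
import os
import tempfile
import uuid


def load_uuid(path):
    """Return the UUID stored at `path`, or None if missing or invalid."""
    try:
        with open(path) as stream:
            return uuid.UUID(stream.read().strip())
    except (OSError, ValueError):
        # OSError: the file could not be read; ValueError: not a UUID.
        return None


def save_uuid(agent_uuid, path):
    """Write `agent_uuid` to `path`, creating parent directories first."""
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    with open(path, "w") as stream:
        stream.write(str(agent_uuid))  # assumed: stored as plain text


with tempfile.TemporaryDirectory() as root:
    path = os.path.join(root, "agent", "uuid.dat")
    missing = load_uuid(path)  # nothing saved yet
    original = uuid.uuid4()
    save_uuid(original, path)
    loaded = load_uuid(path)
```

Returning None instead of raising lets a caller fall back to generating a fresh UUID when no cached value is usable.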
-
-
pyfarm.agent.utility.
remove_file
(path, retry_on_exit=False, raise_=True, ignored_errnos=(2, ))[source]¶ Simple function to remove the provided file or retry on exit if requested. This function standardizes the log output, ensures it’s only called once per path on exit and handles platform specific exceptions (i.e. WindowsError).
Parameters: - retry_on_exit (bool) – If True, retry removal of the file when Python exits.
- raise_ (bool) – If True, raise any exceptions produced. This will always be False if remove_file() is being executed by atexit.
- ignored_errnos (tuple) – A tuple of ignored error numbers. By default this function only ignores ENOENT.
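The interaction between `ignored_errnos`, `retry_on_exit`, and `raise_` can be sketched as follows. This is a simplified Python 3 illustration under a hypothetical name, `remove_file_sketch`: it omits the logging and the once-per-path bookkeeping described above, and the exact order in which the real function applies these options is an assumption.

```python
import atexit
import errno
import os
import tempfile


def remove_file_sketch(path, retry_on_exit=False, raise_=True,
                       ignored_errnos=(errno.ENOENT,)):
    """Remove `path`, optionally retrying once at interpreter exit."""
    try:
        os.remove(path)
    except OSError as error:
        if error.errno in ignored_errnos:
            return  # e.g. the file is already gone
        if retry_on_exit:
            # Try once more at interpreter exit, without raising there.
            atexit.register(
                remove_file_sketch, path, retry_on_exit=False, raise_=False)
        if raise_:
            raise


handle, temp_path = tempfile.mkstemp()
os.close(handle)
remove_file_sketch(temp_path)
remove_file_sketch(temp_path)  # second call hits ENOENT and is ignored
```

The default `ignored_errnos=(2, )` in the signature above corresponds to `errno.ENOENT`, which is why removing a file that no longer exists is silent.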
-
pyfarm.agent.utility.
remove_directory
(path, retry_on_exit=False, raise_=True, ignored_errnos=(2, ))[source]¶ Simple function to recursively remove the provided directory or retry on exit if requested. This function standardizes the log output, ensures it’s only called once per path on exit and handles platform specific exceptions (i.e. WindowsError).
Parameters: - retry_on_exit (bool) – If True, retry removal of the directory when Python exits.
- raise_ (bool) – If True, raise any exceptions produced. This will always be False if remove_directory() is being executed by atexit.
- ignored_errnos (tuple) – A tuple of ignored error numbers. By default this function only ignores ENOENT.
-
exception
pyfarm.agent.utility.
LockTimeoutError
[source]¶ Bases:
exceptions.Exception
Raised if we timeout while attempting to acquire a deferred lock
-
class
pyfarm.agent.utility.
TimedDeferredLock
[source]¶ Bases:
twisted.internet.defer.DeferredLock
A subclass of DeferredLock which has a timeout for the acquire() call.
-
acquire
(timeout=None)[source]¶ This method operates the same as DeferredLock.acquire() does except it requires a timeout argument.
Parameters: timeout (int) – The number of seconds to wait before timing out.
Raises: LockTimeoutError – Raised if the timeout was reached before we could acquire the lock.
-
pyfarm.jobtypes package¶
Subpackages¶
pyfarm.jobtypes.core package¶
Submodules¶
pyfarm.jobtypes.core.internals module¶
Contains classes which contain internal methods for
the pyfarm.jobtypes.core.jobtype.JobType
class.
-
class
pyfarm.jobtypes.core.internals.
ProcessData
(protocol, started, stopped)¶ Bases:
tuple
-
protocol
¶ Alias for field number 0
-
started
¶ Alias for field number 1
-
stopped
¶ Alias for field number 2
-
-
exception
pyfarm.jobtypes.core.internals.
InsufficientSpaceError
[source]¶ Bases:
exceptions.Exception
-
class
pyfarm.jobtypes.core.internals.
Cache
[source]¶ Bases:
object
Internal methods for caching job types
-
cache
= {}¶
-
JOBTYPE_VERSION_URL
= '%(master_api)s/jobtypes/%(name)s/versions/%(version)s'¶
-
CACHE_DIRECTORY
= '/tmp/pyfarm/agent/jobtype_cache'¶
-
e
= OSError(17, 'File exists')¶
-
pyfarm.jobtypes.core.jobtype module¶
This module contains the core job type from which all
other job types are built. All other job types must
inherit from the JobType
class in this module.
-
class
pyfarm.jobtypes.core.jobtype.
CommandData
(command, *arguments, **kwargs)[source]¶ Bases:
object
Stores data to be returned by
JobType.get_command_data()
. Instances of this class are also used by JobType.spawn_process_inputs()
at execution time.
Note
This class does not perform any kind of path resolution by default. It is assumed this has already been done using something like
JobType.map_path()
Parameters: - command (string) – The command that will be executed when the process runs.
- arguments – Any additional arguments to be passed along to the command being launched.
- env (dict) – If provided, this will be the environment to launch the command
with. If this value is not provided then a default environment
will be setup using
set_default_environment()
when JobType.start()
is called. JobType.start()
itself will use JobType.set_default_environment()
to generate the default environment.
- cwd (string) – The working directory the process should execute in. If not provided, the process will execute in whatever directory the agent is running in.
- user (string or integer) – The username or user id that the process should run as. On Windows
this keyword is ignored and on Linux this requires the agent to be
executing as root. The value provided here will be run through
JobType.get_uid_gid()
to map the incoming value to an integer.
- group (string or integer) – Same as
user
above except this sets the group the process will execute as.
- id – An arbitrary id to associate with the resulting process protocol. This can help identify
-
validate
()[source]¶ Validates that the attributes on an instance of this class contain values we expect. This method is called externally by the job type in
JobType.start()
and may correct some instance attributes.
-
class
pyfarm.jobtypes.core.jobtype.
JobType
(assignment)[source]¶ Bases:
pyfarm.jobtypes.core.internals.Cache
,pyfarm.jobtypes.core.internals.System
,pyfarm.jobtypes.core.internals.Process
,pyfarm.jobtypes.core.internals.TypeChecks
Base class for all other job types. This class is intended to abstract away many of the asynchronous operations necessary to run a job type on an agent.
Variables: - PERSISTENT_JOB_DATA (dict) – A dictionary of job ids and data that
prepare_for_job()
has produced. This is used during __init__()
to set persistent_job_data
.
- COMMAND_DATA_CLASS (CommandData) – If you need to provide your own class to represent command data you should override this attribute. This attribute is used by methods within this class to do type checking.
- PROCESS_PROTOCOL (ProcessProtocol) – The protocol object used to communicate with each process spawned
- ASSIGNMENT_SCHEMA (voluptuous.Schema) – The schema of an assignment. This object helps to validate the incoming assignment to ensure it’s not missing any data.
- uuid (UUID) – This is the unique identifier for the job type instance and is automatically set when the class is instanced. This is used by the agent to track assignments and job type instances.
- finished_tasks (set) – A set of tasks that have had their state changed to finished through
set_task_state()
. At the start of the assignment, this set is empty.
- failed_tasks (set) – This is analogous to
finished_tasks
except it contains failed tasks only.
Parameters: assignment (dict) – This attribute is a dictionary with the keys “job”, “jobtype” and “tasks”. self.assignment[“job”] is itself a dict with keys “id”, “title”, “data”, “environ” and “by”. The most important of those is usually “data”, which is the dict specified when submitting the job and contains jobtype specific data. self.assignment[“tasks”] is a list of dicts representing the tasks in the current assignment. Each of these dicts has the keys “id” and “frame”. The list is ordered by frame number.
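To make the shape of an assignment concrete, here is a small sketch of a dictionary with the keys described above. Every value in it (ids, title, paths, job type name) is made up for illustration, and a real assignment carries additional keys.

```python
# Hypothetical assignment; only the keys described above are shown.
assignment = {
    "job": {
        "id": 1,
        "title": "example render",
        "data": {"scene": "/shots/010/scene.ma"},  # jobtype specific data
        "environ": {"FOO": "bar"},
        "by": 1,
    },
    "jobtype": {"name": "ExampleJobType", "version": 1},
    "tasks": [
        {"id": 10, "frame": 1},
        {"id": 11, "frame": 2},
    ],
}

# Typical reads inside a job type: the submitted data and the frames to run.
job_data = assignment["job"]["data"]
frames = [task["frame"] for task in assignment["tasks"]]
```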
-
PERSISTENT_JOB_DATA
= {}¶
-
COMMAND_DATA
¶ alias of
CommandData
-
PROCESS_PROTOCOL
¶ alias of
ProcessProtocol
-
ASSIGNMENT_SCHEMA
= <Schema({'job': <Schema({'cpus': Any([<type 'int'>, <type 'long'>]), 'ram': Any([<type 'int'>, <type 'long'>]), 'ram_max': Any([Any([<type 'int'>, <type 'long'>]), <type 'NoneType'>]), 'title': Any([<type 'str'>, <type 'unicode'>]), 'notes': Any([<type 'str'>, <type 'unicode'>]), 'num_tiles': Any([Any([<type 'int'>, <type 'long'>]), <type 'NoneType'>]), 'batch': Any([<type 'int'>, <type 'long'>]), 'by': Any([<type 'int'>, <type 'long'>, <type 'float'>, <class 'decimal.Decimal'>]), 'priority': Any([<type 'int'>, <type 'long'>]), 'notified_users': [<Schema({'username': Any([<type 'str'>, <type 'unicode'>]), 'on_failure': <type 'bool'>, 'on_success': <type 'bool'>, 'on_deletion': <type 'bool'>}, extra=PREVENT_EXTRA, required=False) object>], 'tags': [Any([<type 'str'>, <type 'unicode'>])], 'environ': <function validate_environment>, 'agent_id': Any([<type 'int'>, <type 'long'>]), 'job_group': Any([<type 'str'>, <type 'unicode'>]), 'job_group_id': Any([<type 'int'>, <type 'long'>]), 'data': <type 'dict'>, 'id': Any([<type 'int'>, <type 'long'>]), 'ram_warning': Any([Any([<type 'int'>, <type 'long'>]), <type 'NoneType'>]), 'user': Any([<type 'str'>, <type 'unicode'>])}, extra=PREVENT_EXTRA, required=False) object>, 'tasks': <function <lambda>>, 'id': <function validate_uuid>, 'jobtype': <Schema({'version': Any([<type 'int'>, <type 'long'>]), 'name': Any([<type 'str'>, <type 'unicode'>])}, extra=PREVENT_EXTRA, required=False) object>}, extra=PREVENT_EXTRA, required=False) object>¶
-
classmethod
load
(assignment)[source]¶ Given an assignment this class method will load the job type either from cache or from the master.
Parameters: assignment (dict) – The dictionary containing the assignment. This will be passed into an instance of ASSIGNMENT_SCHEMA
to validate that the internal data is correct.
-
classmethod
prepare_for_job
(job)[source]¶ Note
This method is not yet implemented
Called before a job executes on the agent for the first time. Whatever this classmethod returns will be available as
persistent_job_data
on the job type instance.
Parameters: job (int) – The job id which prepare_for_job is being run for.
By default this method does nothing.
-
classmethod
cleanup_after_job
(persistent_data)[source]¶ Note
This method is not yet implemented
This classmethod will be called after the last assignment from a given job has finished on this node.
Parameters: persistent_data – The persistent data that prepare_for_job()
produced. The value for this data may be None
if prepare_for_job()
returned None or was not implemented.
-
classmethod
spawn_persistent_process
(job, command_data)[source]¶ Note
This method is not yet implemented
Starts one child process using an instance of
CommandData
or similar input. This process is intended to keep running until the last task from this job has been processed, potentially spanning more than one assignment. If the spawned process is still running then we’ll clean up the process after cleanup_after_job().
-
node
()[source]¶ Returns live information about this host, the operating system, hardware, and several other pieces of global data which is useful inside of the job type. Currently data from this method includes:
- master_api - The base url the agent is using to communicate with the master.
- hostname - The hostname as reported to the master.
- agent_id - The unique identifier used to identify this agent to the master.
- id - The database id of the agent as given to us by the master on startup of the agent.
- cpus - The number of CPUs reported to the master
- ram - The amount of ram reported to the master.
- total_ram - The amount of ram, in megabytes, that’s installed on the system regardless of what was reported to the master.
- free_ram - How much ram, in megabytes, is free for the entire system.
- consumed_ram - How much ram, in megabytes, is being consumed by the agent and any processes it has launched.
- admin - Set to True if the current user is an administrator or ‘root’.
- user - The username of the current user.
- case_sensitive_files - True if the file system is case sensitive.
- case_sensitive_env - True if environment variables are case sensitive.
- machine_architecture - The architecture of the machine the agent is running on. This will return 32 or 64.
- operating_system - The operating system the agent is executing on. This value will be ‘linux’, ‘mac’ or ‘windows’. In rare circumstances this could also be ‘other’.
Raises: KeyError – Raised if one or more keys are not present in the global configuration object.
This should rarely if ever be a problem under normal circumstances. The exception to this rule is in unittests or standalone libraries where the global config object may not be populated.
-
tempdir
(new=False, remove_on_finish=True)[source]¶ Returns a temporary directory to be used within a job type. By default once called the directory will be created on disk and returned from this method.
Calling this method multiple times will return the same directory instead of creating a new directory unless
new
is set to True.
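The caching behavior described above can be sketched as follows. This is a hypothetical stand-in class, not the JobType implementation. In particular it assumes that a directory requested with new=True does not replace the cached default, and it uses an explicit cleanup() method to play the role of remove_on_finish.

```python
import os
import shutil
import tempfile


class TempDirSketch(object):
    """Hypothetical stand-in showing the caching behavior of tempdir()."""

    def __init__(self):
        self._tempdir = None  # the cached default directory
        self._created = []    # every directory handed out so far

    def tempdir(self, new=False):
        # Reuse the cached directory unless the caller asks for a new one.
        if not new and self._tempdir is not None:
            return self._tempdir
        path = tempfile.mkdtemp(prefix="jobtype_sketch_")
        self._created.append(path)
        if not new:
            self._tempdir = path
        return path

    def cleanup(self):
        # Plays the role of remove_on_finish=True in this sketch.
        for path in self._created:
            shutil.rmtree(path, ignore_errors=True)
        self._created = []
        self._tempdir = None


jobtype = TempDirSketch()
first = jobtype.tempdir()
second = jobtype.tempdir()         # same directory as ``first``
extra = jobtype.tempdir(new=True)  # a separate directory
existed = os.path.isdir(first) and os.path.isdir(extra)
jobtype.cleanup()
```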
-
get_uid_gid
(user, group)[source]¶ Overridable. This method converts a named user and group into their respective user and group ids.
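As a rough sketch of what such a conversion can look like on POSIX systems, the function below uses the standard pwd and grp modules. It is a standalone function written for this example rather than the method itself. It assumes integers are passed through unchanged and lets KeyError propagate for unknown names, which may differ from the real behavior.

```python
import grp  # POSIX only
import pwd  # POSIX only


def get_uid_gid(user, group):
    """Map a user and group, given as names or integer ids, to integer ids."""
    uid = user if isinstance(user, int) else pwd.getpwnam(user).pw_uid
    gid = group if isinstance(group, int) else grp.getgrnam(group).gr_gid
    return uid, gid


# Integer ids are returned as-is; names would be looked up in the user
# and group databases of the host.
ids = get_uid_gid(0, 0)
```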
-
get_environment
()[source]¶ Constructs an environment dictionary that can be used when a process is spawned by a job type.
-
get_command_list
(commands)[source]¶ Convert a list of commands to a tuple with any environment variables expanded.
Parameters: commands (list) – A list of strings to expand. Each entry in the list will be passed into and returned from expandvars()
.
Raises: TypeError – Raised if commands
is not a list or tuple.
Return type: tuple
Returns: The expanded list of commands.
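The contract above (type check, expand each entry, return a tuple) can be sketched in a few lines. This standalone function uses os.path.expandvars from the standard library in place of the job type's own expandvars() method, and the SKETCH_ROOT variable is invented for the example.

```python
import os


def get_command_list(commands):
    """Expand environment variables in each entry and return a tuple."""
    if not isinstance(commands, (list, tuple)):
        raise TypeError("expected a list or tuple for `commands`")
    return tuple(os.path.expandvars(entry) for entry in commands)


os.environ["SKETCH_ROOT"] = "/opt/tool"  # hypothetical variable
expanded = get_command_list(["$SKETCH_ROOT/bin/run", "--verbose"])

# A plain string is rejected rather than being split into characters.
try:
    get_command_list("ls -l")
    rejected = False
except TypeError:
    rejected = True
```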
-
get_csvlog_path
(protocol_uuid, create_time=None)[source]¶ Returns the path to the comma separated value (csv) log file. The agent stores logs from processes in a csv format so we can store additional information such as a timestamp, line number, stdout/stderr identification and the log message itself.
Note
This method should not attempt to create the parent directories of the resulting path. This is already handled by the logger pool in a non-blocking fashion.
Parameters: - protocol_uuid (uuid.UUID) – The UUID of the job type’s protocol instance.
- create_time (datetime.datetime) – If provided then the create time of the log file will equal
this value. Otherwise this method will use the current UTC
time for
create_time.
Raises: TypeError – Raised if
protocol_uuid
or create_time
are not the correct types.
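To illustrate why a csv layout suits process logs, the sketch below writes two log lines with the kinds of fields mentioned above and reads them back. The column order, the 0/1 stream markers and the helper name write_log_line are assumptions made for this example and are not taken from the agent.

```python
import csv
import io
from datetime import datetime

STDOUT, STDERR = 0, 1  # hypothetical stream markers


def write_log_line(writer, timestamp, stream, line_number, message):
    # One row per line of process output (assumed column order).
    writer.writerow([timestamp.isoformat(), stream, line_number, message])


buffer = io.StringIO()
writer = csv.writer(buffer)
write_log_line(writer, datetime(2015, 1, 1, 12, 0, 0), STDOUT, 1, "hello")
write_log_line(writer, datetime(2015, 1, 1, 12, 0, 1), STDERR, 2, "error, with comma")

# Reading the data back keeps each message intact, embedded commas included.
rows = list(csv.reader(io.StringIO(buffer.getvalue())))
```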
-
get_command_data
()[source]¶ Overridable. This method returns the arguments necessary for executing a command. For job types which execute a single process per assignment, this is the most important method to implement.
Warning
This method should not be used when this jobtype requires more than one process for one assignment and may not get called at all if
start()
was overridden.
The default implementation does nothing. When overriding this method you should return an instance of
COMMAND_DATA_CLASS
:
return self.COMMAND_DATA("/usr/bin/python", "-c", "print 'hello world'", env={"FOO": "bar"}, user="bob")
See
CommandData
’s class documentation for a full description of possible arguments.
Please note, however, that the default command data class,
CommandData
, does not perform path expansion, so you have to handle this yourself with map_path()
.
-
map_path
(path)[source]¶ Takes a string argument. Translates a given path for any OS to what it should be on this particular node. This does not communicate with the master.
Parameters: path (string) – The path to translate to an OS specific path for this node. Raises: TypeError – Raised if path
is not a string.
-
expandvars
(value, environment=None, expand=None)[source]¶ Expands variables inside of a string using an environment.
Parameters: - value (string) – The path to expand.
- environment (dict) – The environment to use for expanding
value
. If this value is None (the default) then we’ll use get_environment()
to build this value.
- expand (bool) – When not provided we use the
jobtype_expandvars
configuration value to set the default. When this value is True we’ll perform environment variable expansion; otherwise we return value
untouched.
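The interaction between environment and expand can be sketched with string.Template from the standard library. This is an approximation written for illustration. The real method takes its default for expand from the jobtype_expandvars configuration value, whereas this sketch simply defaults to True, and how the real method treats unknown variables is not stated here.

```python
from string import Template


def expandvars(value, environment=None, expand=True):
    """Expand $NAME and ${NAME} in ``value`` using ``environment``."""
    if not expand:
        return value  # returned untouched
    # safe_substitute leaves unknown variables in place instead of raising.
    return Template(value).safe_substitute(environment or {})


environment = {"ROOT": "/shots", "FRAME": "0010"}  # hypothetical values
expanded = expandvars("$ROOT/scene.$FRAME.exr", environment)
unknown = expandvars("$MISSING/file", environment)
untouched = expandvars("$ROOT/scene.exr", environment, expand=False)
```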
-
start
()[source]¶ This method is called when the job type should start working. Depending on the job type’s implementation this will prepare and start one or more processes.
-
stop
(assignment_failed=False, avoid_reassignment=False, error=None, signal='KILL')[source]¶ This method is called when the job type should stop running. This will terminate any processes associated with this job type and also inform the master of any state changes to an associated task or tasks.
Parameters: - assignment_failed (boolean) – Whether this means the assignment has genuinely failed. By default, we assume that stopping this assignment was the result of deliberate user action (like stopping the job or shutting down the agent), and won’t treat it as a failed assignment.
- avoid_reassignment (boolean) – If set to true, the agent will add itself to the lists of agents that failed the tasks in this assignment. Can be useful when we want to return the assignment to the master without increasing its failures counter, but still don’t want it to be reassigned to us.
- error (string) – If the assignment has failed, this string is uploaded as last_error for the failed tasks.
- signal (string) – The signal to send to any running processes. Valid options are KILL, TERM or INT.
-
format_error
(error)[source]¶ Takes some kind of object, typically an instance of
Exception
or Failure, and produces a human readable string.
Return type: string or None
Returns: Returns a string if we know how to format the error. Otherwise this method returns None
and logs an error.
-
set_states
(tasks, state, error=None)[source]¶ Wrapper around
set_state()
that allows you to set the state on the master for multiple tasks at once.
-
add_self_to_failed_on_agents
(*args, **kwargs)[source]¶ Adds this agent to the list of agents that failed to execute the given task, without explicitly setting this task to failed.
Parameters: task (dict) – The dictionary containing the task
-
set_task_started_now
(*args, **kwargs)[source]¶ Sets the time_started of the given task to the current time on the master.
This method is useful for batched tasks, where the actual work on a single task may start much later than the work on the assignment as a whole.
Parameters: task (dict) – The dictionary containing the task we’re changing the start time for.
-
set_task_state
(*args, **kwargs)[source]¶ Sets the state of the given task
Parameters: - task (dict) – The dictionary containing the task we’re changing the state for.
- state (string) – The state to change
task
to.
- error (string,
Exception
) – If the state is changing to ‘error’ then also set the last_error
column. Any exception instance that is passed to this keyword will be passed through format_exception()
first to format it.
-
get_local_task_state
(task_id)[source]¶ Returns None if the state of this task has not been changed locally since this assignment has started. This method does not communicate with the master.
-
is_successful
(protocol, reason)[source]¶ Overridable. This method determines whether the process referred to by a protocol instance has exited successfully.
The default implementation returns
True
if the process’s return code was 0 and False in all other cases. If you need to modify this behavior please be aware that reason
may be an integer or an instance of twisted.internet.error.ProcessTerminated
if the process terminated without errors or an instance of twisted.python.failure.Failure
if there were problems.
Raises: NotImplementedError – Raised if we encounter a condition that the base implementation is unable to handle.
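Since reason can arrive in several forms, an override usually has to normalize it first. The sketch below shows one way to do that using stand-in classes so it runs without Twisted. The Fake* classes, the success_codes parameter and the standalone function are all hypothetical; the sketch relies on ProcessTerminated exposing exitCode and on Failure wrapping its exception in value.

```python
class FakeProcessTerminated(Exception):
    """Stand-in for twisted.internet.error.ProcessTerminated."""

    def __init__(self, exitCode=None):
        Exception.__init__(self, "process ended with code %r" % exitCode)
        self.exitCode = exitCode


class FakeFailure(object):
    """Stand-in for twisted.python.failure.Failure."""

    def __init__(self, value):
        self.value = value  # the wrapped exception


def is_successful(reason, success_codes=(0,)):
    """Return True if ``reason`` carries an exit code listed in success_codes."""
    if isinstance(reason, int):
        return reason in success_codes
    # Unwrap Failure-like objects, then look for an exit code.
    wrapped = getattr(reason, "value", reason)
    exit_code = getattr(wrapped, "exitCode", None)
    if exit_code is None:
        raise NotImplementedError("don't know how to handle %r" % (reason,))
    return exit_code in success_codes


# A hypothetical tool that exits with 2 when it only emitted warnings.
warned = FakeFailure(FakeProcessTerminated(exitCode=2))
strict = is_successful(warned)
lenient = is_successful(warned, success_codes=(0, 2))
```

Accepting extra exit codes this way is useful for programs that use a non-zero code to signal warnings rather than failure.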
-
before_start
()[source]¶ Overridable. This method is called directly before
start()
itself is called.
The default implementation does nothing and values returned from this method are ignored.
-
before_spawn_process
(command, protocol)[source]¶ Overridable. This method is called directly before a process is spawned.
By default this method does nothing except log information about the command we’re about to launch both to the agent’s log and to the log file on disk.
Parameters: - command (CommandData) – An instance of
CommandData
which contains the environment to use, command and arguments. Modifications to this object will be applied to the process being spawned.
- protocol (ProcessProtocol) – An instance of
pyfarm.jobtypes.core.process.ProcessProtocol
which contains the protocol used to communicate between the process and this job type.
-
process_stopped
(protocol, reason)[source]¶ Overridable. This method is called when a child process stops running.
The default implementation will mark all tasks in the current assignment as done, or as failed if there was at least one failed process.
-
process_started
(protocol)[source]¶ Overridable. This method is called when a child process started running.
The default implementation will mark all tasks in the current assignment as running.
-
process_output
(protocol, output, line_fragments, line_handler)[source]¶ This is a mid-level method which takes output from a process protocol then splits and processes it to ensure we pass complete output lines to the other methods.
Implementors who wish to process the output line by line should override
preprocess_stdout_line()
, preprocess_stderr_line()
, process_stdout_line()
or process_stderr_line()
instead. This method is a glue method between other parts of the job type and should only be overridden if there’s a problem or you want to change how lines are split.
Parameters: - protocol (
ProcessProtocol
) – The protocol instance which produced output
- output (string) – The blob of text or line produced
- line_fragments (dict) – The line fragment dictionary containing individual line
fragments. This will be either
self._stdout_line_fragments
or self._stderr_line_fragments
.
- line_handler (callable) – The function to handle any lines produced. This will be either
handle_stdout_line()
or handle_stderr_line()
Returns: This method returns nothing by default and any return value produced by this method will not be consumed by other methods.
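The splitting described above comes down to buffering incomplete lines between chunks of output. The following standalone sketch shows the general technique and is not the method's actual code. Keying the fragment dictionary by a protocol identifier and stripping trailing carriage returns are assumptions made for the example.

```python
def process_output(protocol_id, output, line_fragments, line_handler):
    """Pass only complete lines from ``output`` on to ``line_handler``."""
    # Prepend any partial line left over from the previous chunk.
    data = line_fragments.pop(protocol_id, "") + output
    lines = data.split("\n")
    # The last element is "" when the chunk ended with a newline; otherwise
    # it is an incomplete line that has to wait for more output.
    remainder = lines.pop()
    if remainder:
        line_fragments[protocol_id] = remainder
    for line in lines:
        line_handler(protocol_id, line.rstrip("\r"))


fragments = {}
received = []


def handler(protocol_id, line):
    received.append(line)


# Output rarely arrives on line boundaries, so feed it in awkward chunks.
process_output("p1", "frame 1 do", fragments, handler)
process_output("p1", "ne\nframe 2 done\nfra", fragments, handler)
process_output("p1", "me 3 done\r\n", fragments, handler)
```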
-
handle_stdout_line
(protocol, stdout)[source]¶ Takes a
ProcessProtocol
instance and stdout
line produced by process_output()
and runs it through all the steps necessary to preprocess, format, log and handle the line.
The default implementation will run
stdout
through several methods in order:
Warning
This method is not private however it’s advisable to override the methods above instead of this one. Unlike this method, which is more generalized and invokes several other methods, the above provide more targeted functionality.
Parameters: - protocol (
ProcessProtocol
) – The protocol instance which produced stdout
- stdout (string) – A complete line to
stdout
being emitted by the process
Returns: This method returns nothing by default and any return value produced by this method will not be consumed by other methods.
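The chain of line handling methods can be sketched with a small stand-in class. The order used here (preprocess, format, log, process) is inferred from the descriptions of the individual methods and is an assumption, as is treating a None return value as "keep the line unchanged". The class names and the progress parsing are invented for the example.

```python
class LineHandlingSketch(object):
    """Hypothetical stand-in for the stdout handling chain of a job type."""

    def __init__(self):
        self.logged = []
        self.processed = []

    def preprocess_stdout_line(self, protocol, stdout):
        return None  # default: leave the line alone

    def format_stdout_line(self, protocol, stdout):
        return None  # default: leave the line alone

    def log_stdout_line(self, protocol, stdout):
        self.logged.append(stdout)

    def process_stdout_line(self, protocol, stdout):
        pass  # default: nothing to do

    def handle_stdout_line(self, protocol, stdout):
        # Assumed order: preprocess, format, log, process.
        for hook in (self.preprocess_stdout_line, self.format_stdout_line):
            result = hook(protocol, stdout)
            if result is not None:
                stdout = result
        self.log_stdout_line(protocol, stdout)
        self.process_stdout_line(protocol, stdout)


class ProgressJobType(LineHandlingSketch):
    """Overrides the targeted hooks rather than handle_stdout_line()."""

    def format_stdout_line(self, protocol, stdout):
        return stdout.strip().lower()

    def process_stdout_line(self, protocol, stdout):
        if stdout.startswith("progress:"):
            self.processed.append(int(stdout.split(":", 1)[1].strip(" %")))


jobtype = ProgressJobType()
jobtype.handle_stdout_line(None, "  Progress: 50%  ")
jobtype.handle_stdout_line(None, "rendering")
```

Overriding only format_stdout_line() and process_stdout_line() keeps logging and the overall flow intact, which is the reason the warning above steers implementors toward the targeted methods.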
-
handle_stderr_line
(protocol, stderr)[source]¶ Overridable. Takes a
ProcessProtocol
instance and stderr
produced by process_output()
and runs it through all the steps necessary to preprocess, format, log and handle the line.
The default implementation will run
stderr
through several methods in order:
Warning
This method is overridable however it’s advisable to override the methods above instead. Unlike this method, which is more generalized and invokes several other methods, the above provide more targeted functionality.
Parameters: - protocol (
ProcessProtocol
) – The protocol instance which produced stderr
- stderr (string) – A complete line to
stderr
being emitted by the process
Returns: This method returns nothing by default and any return value produced by this method will not be consumed by other methods.
-
preprocess_stdout_line
(protocol, stdout)[source]¶ Overridable. Provides the ability to manipulate
stdout
or protocol
before it’s passed into any other line handling methods.
The default implementation does nothing.
Parameters: - protocol (
ProcessProtocol
) – The protocol instance which produced stdout
- stdout (string) – A complete line to
stdout
before any formatting or logging has occurred.
Returns: This method returns nothing by default but when overridden should return a string which will be used in line handling methods such as
format_stdout_line()
, log_stdout_line()
and process_stdout_line()
.
-
preprocess_stderr_line
(protocol, stderr)[source]¶ Overridable. Formats a line from
stderr
before it’s passed onto methods such as log_stderr_line()
and process_stderr_line()
.
The default implementation does nothing.
Parameters: - protocol (
ProcessProtocol
) – The protocol instance which produced stderr
- stderr (string) – A complete line to
stderr
before any formatting or logging has occurred.
Returns: This method returns nothing by default but when overridden should return a string which will be used in line handling methods such as
format_stderr_line()
, log_stderr_line()
and process_stderr_line()
.
-
format_stdout_line
(protocol, stdout)[source]¶ Overridable. Formats a line from
stdout
before it’s passed onto methods such as log_stdout_line()
and process_stdout_line()
.
The default implementation does nothing.
Parameters: - protocol (
ProcessProtocol
) – The protocol instance which produced stdout
- stdout (string) – A complete line from process to format and return.
Returns: This method returns nothing by default but when overridden should return a string which will be used in
log_stdout_line()
and process_stdout_line()
-
format_stderr_line
(protocol, stderr)[source]¶ Overridable. Formats a line from
stderr
before it’s passed onto methods such as log_stderr_line()
and process_stderr_line()
.
The default implementation does nothing.
Parameters: - protocol (
ProcessProtocol
) – The protocol instance which produced stderr
- stderr (string) – A complete line from the process to format and return.
Returns: This method returns nothing by default but when overridden should return a string which will be used in
log_stderr_line()
and process_stderr_line()
-
log_stdout_line
(protocol, stdout)[source]¶ Overridable. Called when we receive a complete line on stdout from the process.
The default implementation will use the global logging pool to log
stdout
to a file.
Parameters: - protocol (
ProcessProtocol
) – The protocol instance which produced stdout
- stdout (string) – A complete line to
stdout
that has been formatted and is ready to log to a file.
Returns: This method returns nothing by default and any return value produced by this method will not be consumed by other methods.
-
log_stderr_line
(protocol, stderr)[source]¶ Overridable. Called when we receive a complete line on stderr from the process.
The default implementation will use the global logging pool to log
stderr
to a file.
Parameters: - protocol (
ProcessProtocol
) – The protocol instance which produced stderr
- stderr (string) – A complete line to
stderr
that has been formatted and is ready to log to a file.
Returns: This method returns nothing by default and any return value produced by this method will not be consumed by other methods.
-
process_stderr_line
(protocol, stderr)[source]¶ Overridable. This method is called when we receive a complete line to
stderr
. The line will be preformatted and will already have been sent for logging.
The default implementation sends stderr and protocol to process_stdout_line().
Parameters: - protocol (
ProcessProtocol
) – The protocol instance which produced stderr
- stderr (string) – A complete line to
stderr
after it has been formatted and logged.
Returns: This method returns nothing by default and any return value produced by this method will not be consumed by other methods.
-
process_stdout_line
(protocol, stdout)[source]¶ Overridable. This method is called when we receive a complete line to
stdout
. The line will be preformatted and will already have been sent for logging.
The default implementation does nothing.
Parameters: - protocol (
ProcessProtocol
) – The protocol instance which produced stdout
- stdout (string) – A complete line to
stdout
after it has been formatted and logged.
Returns: This method returns nothing by default and any return value produced by this method will not be consumed by other methods.
pyfarm.jobtypes.core.process module¶
Module responsible for connecting a Twisted process object and a job type. Additionally this module contains other classes which are useful in starting or managing a process.
-
class
pyfarm.jobtypes.core.process.
ReplaceEnvironment
(frozen_environment, environment=None)[source]¶ Bases:
object
A context manager which will replace the contents of
os.environ
, or a dictionary of your choosing, for a short period of time. After exiting the context manager the original environment will be restored.
This is useful if you have something like a process that’s using the global environment and you want to ensure that the global environment is always consistent.
Parameters: environment (dict) – If provided, use this as the environment dictionary instead of os.environ
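The swap-and-restore pattern described above can be sketched as a small context manager. This is a hypothetical re-creation for illustration, not the class from pyfarm. It mirrors the documented constructor arguments but assumes the swap is done by clearing and updating the target mapping in place.

```python
import os


class ReplaceEnvironmentSketch(object):
    """Temporarily replace the contents of an environment mapping."""

    def __init__(self, frozen_environment, environment=None):
        # Fall back to the process environment when no mapping is given.
        self.environment = os.environ if environment is None else environment
        self.frozen_environment = frozen_environment
        self.original_environment = None

    def __enter__(self):
        self.original_environment = dict(self.environment)
        self.environment.clear()
        self.environment.update(self.frozen_environment)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Always restore the original, even if the body raised.
        self.environment.clear()
        self.environment.update(self.original_environment)
        return False


# Demonstrated on a plain dict so the real process environment is untouched.
env = {"PATH": "/usr/bin", "HOME": "/home/bob"}
with ReplaceEnvironmentSketch({"FOO": "bar"}, env):
    inside = dict(env)
after = dict(env)
```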
-
class
pyfarm.jobtypes.core.process.
ProcessProtocol
(jobtype)[source]¶ Bases:
twisted.internet.protocol.ProcessProtocol
Subclass of
Protocol
which hooks into the various systems necessary to run and manage a process. More specifically, this helps to act as plumbing between the process being run and the job type.-
uuid
¶
-
pid
¶
-
process
¶ The underlying Twisted process object
-
psutil_process
¶ Returns a
psutil.Process
object for the running process
-
connectionMade
()[source]¶ Called when the process first starts and the file descriptors have opened.
-
Module contents¶
Module contents¶
Job Types¶
This package, pyfarm.jobtypes
, contains the code which
executes a task on an agent.