Track
Installation
pip install -r requirements
python setup.py install
Documentation
sphinx-build -W --color -c docs/src/ -b html docs/src/ docs/build/html
(cd docs/build/html && python -m http.server 8000 --bind 127.0.0.1)
Overview
from track import TrackClient
client = TrackClient('file://client_test.json')
client.set_project(name='test_client')
trial = client.new_trial()
trial.log_arguments(batch_size=256)
with trial:
    trial.log_metrics(step=1, epoch_loss=1)
    trial.log_metrics(accuracy=0.98)
client.save()
client.report()
Overview
Track has three kinds of objects: Project, TrialGroup, and Trial.
- Project is the top-level object that holds all of its trials and groups.
- TrialGroup is a set of trials. Groups are used to organize trials together; a trial can belong to multiple groups.
- Trial is the object holding all the information about a given training session. The Trial object is the backbone of Track, and it is the object you will deal with most often.
Overview
from track import TrackClient
client = TrackClient('file://client_test.json')
project = client.set_project(name='paper_78997')
group = client.set_group(name='idea_4573')
trial = client.new_trial(name='final_trial_2', description='almost graduating')
trial.log_arguments(batch_size=256, lr=0.01, momentum=0.99)
trial.log_metadata(gpu='V100')
# start the trial explicitly
with trial:
    for e in range(epochs):
        for batch in dataset:
            # trial helper that computes the elapsed time inside a block
            with trial.chrono('batch_time'):
                ...
                loss += ...
        trial.log_metrics(step=e, epoch_loss=loss)
    trial.log_metrics(accuracy=0.98)
client.report()
A sample report is shown below:
{
  "revision": 1,
  "name": "final_trial_2",
  "description": "almost graduating",
  "version": "a8c3",
  "tags": {
    "workers": 8,
    "hpo": "byopt"
  },
  "parameters": {
    "batch_size": 32,
    "cuda": true,
    "workers": 0,
    "seed": 0,
    "epochs": 2,
    "arch": "convnet",
    "lr": 0.1,
    "momentum": 0.9,
    "opt_level": "O0",
    "break_after": null,
    "data": "mnist",
    "backend": null
  },
  "metadata": {},
  "metrics": {
    "epoch_loss": {
      "0": 2.306920262972514,
      "1": 2.307889754740397
    }
  },
  "chronos": {
    "runtime": 3142.5199086666107,
    "batch_time": {
      "avg": 0.6737696465350126,
      "min": 0.019209623336791992,
      "max": 445.9658739566803,
      "sd": 12.500646799505962,
      "count": 3751,
      "unit": "s"
    }
  },
  "errors": [],
  "status": {
    "value": 302,
    "name": "Completed"
  }
}
Log Metrics
Metrics can be logged with or without a step. The step is used as a key in a dictionary and should therefore be unique.
trial.log_metrics(step=e, epoch_loss=loss, metric2=value)
trial.log_metrics(cost=val)
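To illustrate why the step should be unique, here is a minimal sketch of step-keyed metric storage. It is not Track's implementation: the MetricLog class is hypothetical, and storing step-less metrics under the key None is an assumption made only for this example.

```python
from collections import defaultdict


class MetricLog:
    """Hypothetical stand-in for a trial's metric storage (not Track's code)."""

    def __init__(self):
        # metric name -> {step: value}
        self.metrics = defaultdict(dict)

    def log_metrics(self, step=None, **kwargs):
        for name, value in kwargs.items():
            # Assumption: metrics logged without a step use the key None
            self.metrics[name][step] = value


log = MetricLog()
log.log_metrics(step=0, epoch_loss=2.3)
log.log_metrics(step=1, epoch_loss=2.1)
log.log_metrics(step=1, epoch_loss=1.9)  # same step: replaces the earlier value
log.log_metrics(cost=0.5)                # no step given
print(dict(log.metrics["epoch_loss"]))
```

Because the step is a dictionary key, logging the same metric twice with the same step keeps only the last value in this sketch.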
Time things
You can easily time things with chrono. When measuring GPU compute time, remember to synchronize first, so that the computation has finished before the elapsed time is computed.
with trial.chrono('long_compute'):
    sleep(100)
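The idea behind a timing block can be sketched with a plain context manager. This is an illustration only, not how trial.chrono is implemented: the chrono function, the ChronoStats class and the sync parameter below are hypothetical names. The sync callback is where a GPU synchronization call (for example torch.cuda.synchronize in PyTorch) would go.

```python
import time
from contextlib import contextmanager


class ChronoStats:
    """Hypothetical accumulator for elapsed times, in seconds."""

    def __init__(self):
        self.count = 0
        self.total = 0.0
        self.min = float("inf")
        self.max = 0.0

    def add(self, elapsed):
        self.count += 1
        self.total += elapsed
        self.min = min(self.min, elapsed)
        self.max = max(self.max, elapsed)

    @property
    def avg(self):
        return self.total / self.count


stats = {}


@contextmanager
def chrono(name, sync=None):
    """Time the enclosed block; call sync() first so pending work is included."""
    start = time.perf_counter()
    try:
        yield
    finally:
        if sync is not None:
            sync()  # e.g. wait for queued GPU kernels to finish
        elapsed = time.perf_counter() - start
        stats.setdefault(name, ChronoStats()).add(elapsed)


for _ in range(3):
    with chrono("batch_time"):
        time.sleep(0.001)
```

Synchronizing before reading the clock matters because GPU work is typically launched asynchronously; without it the block may appear to finish before the computation does.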
Save arbitrary data
You can use metadata to save information about a specific trial that might not be reflected by its parameters.
trial.log_metadata(had_short_hair_when_running_this_trial=False)
Experiment Report
Get a quick overview of all the data that was saved during training.
trial.report()
Backends
Track was designed to support different backends; you can even implement your own!
Local Backend
Track implements a local storage backend for quick and simple experiments.
client = TrackClient('file://report.json')
CockroachDB backend
Track implements a backend that can use a running CockroachDB instance as storage.
address = '127.0.0.1'
port = 8123
client = TrackClient(f'cockroach://{address}:{port}')
Socket backend
Track implements a backend that uses sockets to forward requests to a remote server.
Server
A simple server receives requests from clients and forwards them all to another backend. The example below forwards every request to the local backend, so that a single process modifies the file.
from track.persistence.socketed import start_track_server
address = '127.0.0.1'
port = 8123
layer = 'AES'
start_track_server('file:server_test.json', address, port, security_layer=layer)
Client
Start a client that forwards all requests to a remote server.
username = ...
password = ...
address = '127.0.0.1'
port = 8123
layer = 'AES'  # supported values: 'AES', or omit the parameter for no security layer
client = TrackClient(f'socket://{username}:{password}@{address}:{port}?security_layer={layer}')
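To make the structure of the socket URI explicit, the sketch below splits one into its parts using only the standard library. The parse_socket_uri helper is hypothetical and is not part of Track; how Track itself parses the URI is not shown in this documentation.

```python
from urllib.parse import urlsplit, parse_qs


def parse_socket_uri(uri):
    """Hypothetical helper: split a socket URI into its components."""
    parts = urlsplit(uri)
    options = parse_qs(parts.query)
    return {
        "scheme": parts.scheme,
        "username": parts.username,
        "password": parts.password,
        "hostname": parts.hostname,
        "port": parts.port,
        # parse_qs returns lists; take the first value, or None when absent
        "security_layer": options.get("security_layer", [None])[0],
    }


info = parse_socket_uri("socket://alice:secret@127.0.0.1:8123?security_layer=AES")
print(info)
```

The credentials sit before the @, the host and port after it, and optional settings such as security_layer travel in the query string.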
Bring Your Own Backend
To implement your own backend, simply extend track.persistence.protocol.Protocol:
from track.persistence import register
from track.persistence.protocol import Protocol
class MyOwnBackend(Protocol):
    ...
register('byob', MyOwnBackend)
You can then use it like any other backend:
client = TrackClient('byob://....')
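The registration mechanism can be pictured as a mapping from URI scheme to backend class. The sketch below is a self-contained illustration under that assumption; the register and make_backend functions and the InMemoryBackend class defined here are hypothetical and do not reproduce track.persistence.

```python
from urllib.parse import urlsplit

_BACKENDS = {}  # URI scheme -> backend class


def register(scheme, backend_cls):
    """Associate a URI scheme with a backend class (illustrative only)."""
    _BACKENDS[scheme] = backend_cls


def make_backend(uri):
    """Instantiate the backend registered for the scheme of uri."""
    scheme = urlsplit(uri).scheme
    try:
        backend_cls = _BACKENDS[scheme]
    except KeyError:
        raise ValueError(f"no backend registered for scheme {scheme!r}") from None
    return backend_cls(uri)


class InMemoryBackend:
    """Hypothetical backend that keeps everything in a dictionary."""

    def __init__(self, uri):
        self.uri = uri
        self.trials = {}


register("memory", InMemoryBackend)
backend = make_backend("memory://scratch")
```

With this design the client only needs the URI: the scheme selects the class, and the rest of the URI is left for the backend to interpret.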
Simple example
Installation and setup
In this tutorial you will run a very simple MNIST example in PyTorch using Track.
First, install Track, then install PyTorch and torchvision, and clone the PyTorch examples repository:
$ pip3 install torch torchvision
$ git clone git@github.com:pytorch/examples.git
Adapting the code of the MNIST example
After cloning the PyTorch examples repository, cd to the mnist folder:
$ cd examples/mnist
In main, just after parsing the arguments, you can initialize the Track client. The client specifies how the data will be saved on your computer; several storage methods are supported. Once the client is initialized, you can create a new trial.
A trial is a set of data recorded for a given set of arguments.
from track import TrackClient
...
args = parser.parse_args()
client = TrackClient('file:mnist_example.json')
trial = client.new_trial(arguments=args)
Then you can store any kind of data that you think will be useful. In our example, we decided to save the error rate on the test set.
def test(args, model, device, test_loader, trial):
    ...
    trial.log_metrics(error_rate=1 - (correct / len(test_loader.dataset)))
At the end of training, the file mnist_example.json will be generated, holding all the data you saved during training.
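The error rate logged in test is simply one minus the accuracy. As a small worked example, independent of PyTorch and Track, the hypothetical error_rate function below computes it from plain lists of predictions and targets.

```python
def error_rate(predictions, targets):
    """Fraction of predictions that differ from the targets."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        raise ValueError("cannot compute an error rate on an empty set")
    correct = sum(p == t for p, t in zip(predictions, targets))
    return 1 - correct / len(targets)


# Four of the five predictions are correct, so one in five is wrong
rate = error_rate([7, 2, 1, 0, 4], [7, 2, 1, 0, 9])
```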
track¶
track package¶
Subpackages¶
track.aggregators package¶
-
class
track.aggregators.aggregator.
Aggregator
[source]¶ Bases:
object
Attributes: val
Return the last observed value
Methods
lazy
(aggregator_t, \*\*kwargs)Lazily instantiate the underlying aggregator append to_json -
val
¶ Return the last observed value
-
class
track.aggregators.aggregator.
RingAggregator
(n, dtype='f')[source]¶ Bases:
track.aggregators.aggregator.Aggregator
Saves the last n elements. Once n elements have been stored, new elements start overwriting the oldest ones.
Attributes: val
Return the last observed value
Methods
lazy
(n, dtype)Lazily instantiate the underlying aggregator append to_json -
val
¶ Return the last observed value
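The ring behaviour can be sketched with collections.deque, which discards the oldest element once maxlen is reached. This is an illustration only: the RingBuffer class is hypothetical, and RingAggregator may well use a typed array internally, as its dtype parameter suggests.

```python
from collections import deque


class RingBuffer:
    """Hypothetical ring container keeping only the last n values."""

    def __init__(self, n):
        self.values = deque(maxlen=n)

    def append(self, value):
        # Once n values are stored, the oldest one is dropped automatically
        self.values.append(value)

    @property
    def val(self):
        """Return the last observed value."""
        return self.values[-1]

    def to_json(self):
        return list(self.values)


ring = RingBuffer(3)
for x in [1.0, 2.0, 3.0, 4.0, 5.0]:
    ring.append(x)
```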
-
class
track.aggregators.aggregator.
StatAggregator
(skip_obs=10)[source]¶ Bases:
track.aggregators.aggregator.Aggregator
Computes the mean, sd, min and max without keeping the entire history. This is useful if you are worried about memory usage and the values are not expected to vary much, i.e. when keeping the entire history is not useful.
Attributes: - avg
- max
- min
- sd
- sum
- total
val
Return the last observed value
Methods
lazy
(skip)Lazily instantiate the underlying aggregator append from_json to_json -
avg
¶
-
max
¶
-
min
¶
-
sd
¶
-
sum
¶
-
total
¶
-
val
¶ Return the last observed value
-
class
track.aggregators.aggregator.
TimeSeriesAggregator
[source]¶ Bases:
track.aggregators.aggregator.Aggregator
Keeps the entire history of the metric
Attributes: val
Return the last observed value
Methods
lazy
()Lazily instantiate the underlying aggregator append to_json -
val
¶ Return the last observed value
-
class
track.aggregators.aggregator.
ValueAggregator
(val=None)[source]¶ Bases:
track.aggregators.aggregator.Aggregator
Does not aggregate; only keeps the latest value.
Attributes: val
Return the last observed value
Methods
lazy
()Lazily instantiate the underlying aggregator append to_json -
val
¶ Return the last observed value
track.containers package¶
track.distributed package¶
-
class
track.distributed.cockroachdb.
CockRoachDB
(location, addrs, join=None, clean_on_exit=True)[source]¶ Bases:
object
CockroachDB is a highly resilient database that allows us to remove the master in a traditional distributed setup.
This spawns a CockroachDB node that will store its data in location.
Attributes: - build
- client_flags
- node_id
- sql
- status
- webui
Methods
parse start stop wait -
build
¶
-
client_flags
¶
-
node_id
¶
-
sql
¶
-
status
¶
-
webui
¶
track.persistence package¶
-
class
track.persistence.local.
FileProtocol
(uri, strict=True, eager=True)[source]¶ Bases:
track.persistence.protocol.Protocol
Local File storage to manage experiments
Parameters: - uri: str
resource to use to store the experiment file://my_file.json
- strict: bool
forces the storage to be correct. If the file protocol is used as an in-memory storage, some inconsistencies may appear; this flag can be used to ignore them
- eager: bool
eagerly update the underlying files. This is necessary if multiple processes are reading from the file
Methods
commit
(self[, file_name_override])Forces to persist the change add_group_trial add_project_trial add_trial_tags fetch_and_update_group fetch_and_update_trial fetch_groups fetch_projects fetch_trials get_project get_trial get_trial_group log_trial_arguments log_trial_chrono_finish log_trial_chrono_start log_trial_finish log_trial_metadata log_trial_metrics log_trial_start new_project new_trial new_trial_group set_group_metadata set_trial_status
-
class
track.persistence.local.
LockFileRemover
(filename)[source]¶ Bases:
track.utils.signal.SignalHandler
Methods
atexit remove sigint sigterm
-
track.persistence.local.
execute_query
(obj, query)[source]¶ Check if the object obj matches the query.
The query is a dictionary specifying constraints on each of the object's attributes
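A minimal sketch of this kind of matching is shown below. It assumes that every constraint is a plain equality test on an attribute, which may be narrower than what execute_query supports; the matches function and the FakeTrial class are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FakeTrial:
    """Hypothetical object used only to demonstrate the matching."""
    name: str
    revision: int


def matches(obj, query):
    """True when every attribute named in query equals the requested value."""
    missing = object()  # sentinel, so a missing attribute never matches
    return all(getattr(obj, key, missing) == value for key, value in query.items())


trial = FakeTrial(name="final_trial_2", revision=1)
same_name = matches(trial, {"name": "final_trial_2"})
wrong_revision = matches(trial, {"name": "final_trial_2", "revision": 2})
```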
-
track.persistence.local.
lock_atomic_write
(fun)¶
-
track.persistence.local.
lock_guard
(readonly, atomic=False)[source]¶ Protect a function call with a lock. Reloads the database before the action and saves it afterwards
-
track.persistence.local.
lock_read
(fun)¶
-
track.persistence.local.
lock_write
(fun)¶
-
class
track.persistence.multiplexer.
ProtocolMultiplexer
(*backends)[source]¶ Bases:
object
Methods
get_project
(self, \*args, \*\*kwargs)add_group_trial add_project_trial add_trial_tags commit fetch_and_update_group fetch_and_update_trial fetch_groups fetch_projects fetch_trials get_trial get_trial_group log_trial_arguments log_trial_chrono_finish log_trial_chrono_start log_trial_finish log_trial_metadata log_trial_metrics log_trial_start new_project new_trial new_trial_group set_trial_status
-
class
track.persistence.protocol.
Protocol
[source]¶ Bases:
object
Methods
add_group_trial
(self, group, trial)Add a trial to a group add_project_trial
(self, project, trial)Add a trial to a project add_trial_tags
(self, trial, \*\*kwargs)Add tags to a trial commit
(self, \*\*kwargs)Forces to persist the change fetch_and_update_group
(self, query, attr, …)Fetch and update a single group fetch_and_update_trial
(self, query, attr, …)Fetch and update a single trial fetch_groups
(self, query)Fetch groups according to a given query fetch_projects
(self, query)Fetch projects according to a given query fetch_trials
(self, query)Fetch trials according to a given query get_project
(self, project)Fetch a project according to the given definition get_trial
(self, trial)Fetch trials according to a given definition get_trial_group
(self, group)Fetch a group according to a given definition log_trial_arguments
(self, trial, \*\*kwargs)Save the arguments of a trial log_trial_chrono_finish
(self, trial, name, …)Send the end signal for an event log_trial_chrono_start
(self, trial, name, …)Send the start signal for an event log_trial_finish
(self, trial, exc_type, …)Send the trial end signal log_trial_metadata
(self, trial, aggregator, …)Save metadata for a given trial log_trial_metrics
(self, trial, step, …)Save metrics for a given trial log_trial_start
(self, trial)Send the trial start signal new_project
(self, project)Insert a new project new_trial
(self, trial[, auto_increment])Insert a new trial new_trial_group
(self, group)Create a new group set_trial_status
(self, trial, status[, error])Change trial status -
add_group_trial
(self, group: track.structure.TrialGroup, trial: track.structure.Trial)[source]¶ Add a trial to a group
-
add_project_trial
(self, project: track.structure.Project, trial: track.structure.Trial)[source]¶ Add a trial to a project
Add tags to a trial
Parameters: - trial: Trial
trial reference
- kwargs:
key value pair of the data to save
-
fetch_and_update_group
(self, query, attr, *args, **kwargs)[source]¶ Fetch and update a single group
Parameters: - query: Dict
dictionary to fetch groups
- attr: str
name of the update function to call on each selected group
- *args:
additional positional arguments for the attr function
- **kwargs:
additional keyword arguments for the attr function
Returns: - returns the modified group
-
fetch_and_update_trial
(self, query, attr, *args, **kwargs)[source]¶ Fetch and update a single trial
Parameters: - query: Dict
dictionary to fetch trials
- attr: str
name of the update function to call on each selected trials
- *args:
additional positional arguments for the attr function
- **kwargs:
additional keyword arguments for the attr function
Returns: - returns the modified trial
-
fetch_trials
(self, query) → List[track.structure.Trial][source]¶ Fetch trials according to a given query
-
get_project
(self, project: track.structure.Project) → Union[track.structure.Project, NoneType][source]¶ Fetch a project according to the given definition
Parameters: - project: Project
project definition used for the lookup
Returns: - returns a project object or None
-
get_trial
(self, trial: track.structure.Trial) → List[track.structure.Trial][source]¶ Fetch trials according to a given definition
Parameters: - trial: Trial
trial definition used for the lookup
-
get_trial_group
(self, group: track.structure.TrialGroup) → Union[track.structure.TrialGroup, NoneType][source]¶ Fetch a group according to a given definition
Parameters: - group: TrialGroup
group definition used for the lookup
Returns: - returns a group
-
log_trial_arguments
(self, trial: track.structure.Trial, **kwargs)[source]¶ Save the arguments of a trial
Parameters: - trial: Trial
trial to which the arguments belong
- kwargs:
key value pair of arguments
-
log_trial_chrono_finish
(self, trial, name, exc_type, exc_val, exc_tb)[source]¶ Send the end signal for an event
Parameters: - trial: Trial
trial sending the event
- name: str
name of the event
- exc_type:
Exception object
- exc_val:
Exception value
- exc_tb:
Traceback
-
log_trial_chrono_start
(self, trial, name: str, aggregator: Callable[[], track.aggregators.aggregator.Aggregator] = <function StatAggregator.lazy.<locals>.<lambda> at 0x7f2d953ed048>, start_callback=None, end_callback=None)[source]¶ Send the start signal for an event
Parameters: - trial: Trial
trial sending the event
- name: str
name of the event
- aggregator: Aggregator
container used to accumulate elapsed time
- start_callback: Callable
function called at start time
- end_callback: Callable
function called at the end
-
log_trial_finish
(self, trial, exc_type, exc_val, exc_tb)[source]¶ Send the trial end signal
Parameters: - trial: Trial
reference to the trial that finished
-
log_trial_metadata
(self, trial: track.structure.Trial, aggregator: Callable[[], track.aggregators.aggregator.Aggregator] = <function ValueAggregator.lazy.<locals>.<lambda> at 0x7f2d953d7e18>, **kwargs)[source]¶ Save metadata for a given trial
Parameters: - trial: Trial
trial reference
- kwargs:
key value pair of the data to save
-
log_trial_metrics
(self, trial: track.structure.Trial, step: <built-in function any> = None, aggregator: Callable[[], track.aggregators.aggregator.Aggregator] = None, **kwargs)[source]¶ Save metrics for a given trial
Parameters: - trial: Trial
trial reference
- kwargs:
key value pair of the data to save
-
log_trial_start
(self, trial)[source]¶ Send the trial start signal
Parameters: - trial: Trial
reference to the trial being started
-
new_project
(self, project: track.structure.Project)[source]¶ Insert a new project
Parameters: - project: Project
project definition used for the insert
-
new_trial
(self, trial: track.structure.Trial, auto_increment=False)[source]¶ Insert a new trial
Parameters: - trial: Trial
trial definition used for the insert
- auto_increment: bool
If the trial already exists, increment its revision number
Returns: - Returns None if Trial already exists and auto_increment is False
-
new_trial_group
(self, group: track.structure.TrialGroup)[source]¶ Create a new group
Parameters: - group: TrialGroup
group definition used for the insert
-
set_trial_status
(self, trial: track.structure.Trial, status, error=None)[source]¶ Change trial status
Parameters: - trial: Trial
trial reference
- status:
new status to assign to the trial
- error:
when changing to a state that represents an error, the user can also provide an error identification string
-
- Implement a Remote Logger.
- Client forwards all the user’s requests down to the server that executes them one by one.
-
class
track.persistence.socketed.
ServerSignalHandler
(server)[source]¶ Bases:
track.utils.signal.SignalHandler
Methods
atexit sigint sigterm
-
class
track.persistence.socketed.
SocketClient
(uri)[source]¶ Bases:
track.persistence.protocol.Protocol
Forwards all the local track requests to the track server, which executes the requests and sends back the results
Clients can provide a username and password for authentication
Methods
add_group_trial
(self, group, trial)Add a trial to a group add_project_trial
(self, project, trial)Add a trial to a project add_trial_tags
(self, trial, \*\*kwargs)Add tags to a trial authenticate
(self, uri)returns the username and password used for authentication purposes you can override this function to implement a custom authentication method commit
(self, \*\*kwargs)Forces to persist the change fetch_and_update_group
(self, query, attr, …)Fetch and update a single group fetch_and_update_trial
(self, query, attr, …)Fetch and update a single trial fetch_groups
(self, query)Fetch groups according to a given query fetch_projects
(self, query)Fetch projects according to a given query fetch_trials
(self, query)Fetch trials according to a given query get_project
(self, project)Fetch a project according to the given definition get_trial
(self, trial)Fetch trials according to a given definition get_trial_group
(self, group)Fetch a group according to a given definition log_trial_arguments
(self, trial, \*\*kwargs)Save the arguments of a trial log_trial_chrono_finish
(self, trial, name, …)Send the end signal for an event log_trial_chrono_start
(self, trial, name, …)Send the start signal for an event log_trial_finish
(self, trial, exc_type, …)Send the trial end signal log_trial_metadata
(self, trial, aggregator, …)Save metadata for a given trial log_trial_metrics
(self, trial, step, …)Save metrics for a given trial log_trial_start
(self, trial)Send the trial start signal new_project
(self, project)Insert a new project new_trial
(self, trial)Insert a new trial new_trial_group
(self, group)Create a new group set_trial_status
(self, trial, status[, error])Change trial status -
add_group_trial
(self, group: track.structure.TrialGroup, trial: track.structure.Trial)[source]¶ Add a trial to a group
-
add_project_trial
(self, project: track.structure.Project, trial: track.structure.Trial)[source]¶ Add a trial to a project
Add tags to a trial
Parameters: - trial: Trial
trial reference
- kwargs:
key value pair of the data to save
-
authenticate
(self, uri)[source]¶ Returns the username and password used for authentication purposes. You can override this function to implement a custom authentication method
-
get_project
(self, project: track.structure.Project)[source]¶ Fetch a project according to the given definition
Parameters: - project: Project
project definition used for the lookup
Returns: - returns a project object or None
-
get_trial
(self, trial: track.structure.Trial)[source]¶ Fetch trials according to a given definition
Parameters: - trial: Trial
trial definition used for the lookup
-
get_trial_group
(self, group: track.structure.TrialGroup)[source]¶ Fetch a group according to a given definition
Parameters: - group: TrialGroup
group definition used for the lookup
Returns: - returns a group
-
log_trial_arguments
(self, trial: track.structure.Trial, **kwargs)[source]¶ Save the arguments of a trial
Parameters: - trial: Trial
trial to which the arguments belong
- kwargs:
key value pair of arguments
-
log_trial_chrono_finish
(self, trial, name, exc_type, exc_val, exc_tb)[source]¶ Send the end signal for an event
Parameters: - trial: Trial
trial sending the event
- name: str
name of the event
- exc_type:
Exception object
- exc_val:
Exception value
- exc_tb:
Traceback
-
log_trial_chrono_start
(self, trial, name: str, aggregator: Callable[[], track.aggregators.aggregator.Aggregator] = <function StatAggregator.lazy.<locals>.<lambda> at 0x7f2d9351fd90>, start_callback=None, end_callback=None)[source]¶ Send the start signal for an event
Parameters: - trial: Trial
trial sending the event
- name: str
name of the event
- aggregator: Aggregator
container used to accumulate elapsed time
- start_callback: Callable
function called at start time
- end_callback: Callable
function called at the end
-
log_trial_finish
(self, trial, exc_type, exc_val, exc_tb)[source]¶ Send the trial end signal
Parameters: - trial: Trial
reference to the trial that finished
-
log_trial_metadata
(self, trial: track.structure.Trial, aggregator: Callable[[], track.aggregators.aggregator.Aggregator] = None, **kwargs)[source]¶ Save metadata for a given trial
Parameters: - trial: Trial
trial reference
- kwargs:
key value pair of the data to save
-
log_trial_metrics
(self, trial: track.structure.Trial, step: <built-in function any> = None, aggregator: Callable[[], track.aggregators.aggregator.Aggregator] = None, **kwargs)[source]¶ Save metrics for a given trial
Parameters: - trial: Trial
trial reference
- kwargs:
key value pair of the data to save
-
log_trial_start
(self, trial)[source]¶ Send the trial start signal
Parameters: - trial: Trial
reference to the trial being started
-
new_project
(self, project: track.structure.Project)[source]¶ Insert a new project
Parameters: - project: Project
project definition used for the insert
-
new_trial
(self, trial: track.structure.Trial)[source]¶ Insert a new trial
Parameters: - trial: Trial
trial definition used for the insert
- auto_increment: bool
If trial exist increment revision number
Returns: - Returns None if Trial already exists and auto_increment is False
-
new_trial_group
(self, group: track.structure.TrialGroup)[source]¶ Create a new group
Parameters: - group: TrialGroup
group definition used for the insert
-
set_trial_status
(self, trial: track.structure.Trial, status, error=None)[source]¶ Change trial status
Parameters: - trial: Trial
trial reference
- status:
new status to assign to the trial
- error:
when changing to a state that represents an error, the user can also provide an error identification string
-
-
class
track.persistence.socketed.
SocketServer
(uri)[source]¶ Bases:
track.persistence.protocol.Protocol
Start a track server inside an asyncio loop
Parameters: - uri: str
socket://{hostname}:{port}?security_layer={}&backend={protocol} with
- Users inherit this class to implement their own custom authentication
Methods
add_group_trial
(self, group, trial)Add a trial to a group add_project_trial
(self, project, trial)Add a trial to a project add_trial_tags
(self, trial, \*\*kwargs)Add tags to a trial authenticate
(self, reader, username, password)User defined authentication function commit
(self, \*\*kwargs)Forces to persist the change fetch_and_update_group
(self, query, attr, …)Fetch and update a single group fetch_and_update_trial
(self, query, attr, …)Fetch and update a single trial fetch_groups
(self, query)Fetch groups according to a given query fetch_projects
(self, query)Fetch projects according to a given query fetch_trials
(self, query)Fetch trials according to a given query get_project
(self, project)Fetch a project according to the given definition get_trial
(self, trial)Fetch trials according to a given definition get_trial_group
(self, group)Fetch a group according to a given definition log_trial_arguments
(self, trial, \*\*kwargs)Save the arguments of a trial log_trial_chrono_finish
(self, trial, name, …)Send the end signal for an event log_trial_chrono_start
(self, trial, name, …)Send the start signal for an event log_trial_finish
(self, trial, exc_type, …)Send the trial end signal log_trial_metadata
(self, trial, aggregator, …)Save metadata for a given trial log_trial_metrics
(self, trial, step, …)Save metrics for a given trial log_trial_start
(self, trial)Send the trial start signal new_project
(self, project)Insert a new project new_trial
(self, trial[, auto_increment])Insert a new trial new_trial_group
(self, group)Create a new group process_args
(self, args[, cache])replace ids by their object reference so the backend modifies the objects and not a copy run_server
(self)set_trial_status
(self, trial, status[, error])Change trial status close close_connection exec get_username handle_client is_authenticated wait_closed -
authenticate
(self, reader, username, password)[source]¶ User defined authentication function
Parameters: - reader: StreamReader
client socket / reader, can be used to link client socket -> username
- username: str
client username
- password: str
client password
-
track.persistence.socketed.
start_track_server
(protocol, hostname, port, security_layer=None)[source]¶ Start a track server inside an asyncio loop
Parameters: - protocol: str
URI that defines which backend to forward the request to
- hostname: str
server host name
- port: int
server port to listen to
- security_layer: str
backend used for encryption (only AES is supported)
-
class
track.persistence.storage.
LocalStorage
(target_file: str = None, _objects: Dict[uuid.UUID, <built-in function any>] = <factory>, _projects: Set[uuid.UUID] = <factory>, _groups: Set[uuid.UUID] = <factory>, _trials: Set[uuid.UUID] = <factory>, _project_names: Dict[str, uuid.UUID] = <factory>, _group_names: Dict[str, uuid.UUID] = <factory>, _trial_names: Dict[str, uuid.UUID] = <factory>, _old_rev_tags: Dict[str, int] = <factory>)[source]¶ Bases:
object
Attributes: - group_names
- groups
- objects
- project_names
- projects
- target_file
- trials
Methods
reload
(self[, filename])Reload storage and discard current objects smart_reload
(self[, filename])Updates current objects with new data commit get_current_version_tag get_previous_version_tag -
group_names
¶
-
groups
¶
-
objects
¶
-
project_names
¶
-
projects
¶
-
target_file
= None¶
-
trials
¶
track.utils package¶
-
class
track.utils.delay.
DelayedCall
(fun, kwargs)[source]¶ Bases:
object
Delay a call until later
Methods
__call__
(self, \*args, \*\*kwargs)Call self as a function. add_arguments get_future
-
class
track.utils.encrypted.
EncryptedSocket
(*args, **kwargs)[source]¶ Bases:
socket.socket
Socket with an encrypted layer
Attributes: family
Read-only access to the address family for this socket.
proto
the socket protocol
timeout
the socket timeout
type
Read-only access to the socket type.
Methods
accept
(self)Accept an incoming connection & initialize the encryption layer for that client bind
(address)Bind the socket to a local address. close
()Close the socket. connect
(address)Connect the socket to a remote address. connect_ex
()This is like connect(address), but returns an error code (the errno value) instead of raising an exception when an error occurs. detach
(self)Close the socket object without closing the underlying file descriptor. dup
(self)Duplicate the socket. fileno
()Return the integer file descriptor of the socket. get_inheritable
(self)Get the inheritable flag of the socket getblocking
()Returns True if socket is in blocking mode, or False if it is in non-blocking mode. getpeername
()Return the address of the remote endpoint. getsockname
()Return the address of the local endpoint. getsockopt
()Get a socket option. gettimeout
()Returns the timeout in seconds (float) associated with socket operations. listen
([backlog])Enable a server to accept connections. makefile
(self[, mode, buffering, encoding, …])The arguments are as for io.open() after the filename, except the only supported mode values are ‘r’ (default), ‘w’ and ‘b’. recv
(self, buffersize, flags[, context])Receive up to buffersize bytes from the socket. recv_into
()A version of recv() that stores its data into a buffer rather than creating a new string. recvfrom
(buffersize[, flags])Like recv(buffersize, flags) but also return the sender’s address info. recvfrom_into
(buffer[, nbytes[, flags]])Like recv_into(buffer[, nbytes[, flags]]) but also return the sender’s address info. recvmsg
(bufsize[, ancbufsize[, flags]])Receive normal data (up to bufsize bytes) and ancillary data from the socket. recvmsg_into
(buffers[, ancbufsize[, flags]])Receive normal data and ancillary data from the socket, scattering the non-ancillary data into a series of buffers. send
(self, data, flags)Send a data string to the socket. sendall
(data[, flags])Send a data string to the socket. sendfile
(self, file[, offset, count])Send a file until EOF is reached by using high-performance os.sendfile() and return the total number of bytes which were sent. sendmsg
()Send normal and ancillary data to the socket, gathering the non-ancillary data from a series of buffers and concatenating it into a single message. sendmsg_afalg
([msg], *, op[, iv[, assoclen[)Set operation mode, IV and length of associated data for an AF_ALG operation socket. sendto
()Like send(data, flags) but allows specifying the destination address. set_inheritable
(self, inheritable)Set the inheritable flag of the socket setblocking
(flag)Set the socket to blocking (flag is true) or non-blocking (false). setsockopt
(level, option, value, option, …)Set a socket option. settimeout
(timeout)Set a timeout on socket operations. shutdown
(flag)Shut down the reading side of the socket (flag == SHUT_RD), the writing side of the socket (flag == SHUT_WR), or both ends (flag == SHUT_RDWR). readsize -
accept
(self)[source]¶ Accept an incoming connection & initialize the encryption layer for that client
Returns: - returns (socket, addr) of the client
-
recv
(self, buffersize, flags: int = 0, context=None)[source]¶ Receive up to buffersize bytes from the socket. For the optional flags argument, see the Unix manual. When no data is available, block until at least one byte is available or until the remote end is closed. When the remote end is closed and all data is read, return the empty string.
-
class
track.utils.eta.
EstimatedTime
(stat_timer: track.utils.stat.StatStream, total: Union[int, List[int]], start: int = 0, name: str = None)[source]¶ Bases:
object
Compute estimated time to arrival given average time and remaining steps
Examples
>>> timer = StatStream()
>>> total = (10, 1000)
>>> eta = EstimatedTime(timer, total)
>>> eta.estimate_time((1, 2))
Attributes: - total
Methods
count
(item[, offset])Return the current iteration it given the completion of each steps elapsed
(self, unit)Return the elapsed time since the class was created estimated_time
(self, step, unit)Estimate the time remaining before the end of the computation set_totals
(self, t)Set the total number of iteration for each step show_eta
(self, step[, msg, show])Print the estimate time until the processing is done -
static
count
(item, offset=0)[source]¶ Return the current iteration given the completion of each step
-
estimated_time
(self, step: int, unit: int = 60)[source]¶ Estimate the time remaining before the end of the computation
-
show_eta
(self, step, msg='', show=True)[source]¶ Print the estimated time until the processing is done
-
total
¶
-
class
track.utils.stat.
StatStream
(drop_first_obs=10)[source]¶ Bases:
object
Sharable object
Store the sum of the observations and the sum of the observations squared. The first few observations are discarded (they are usually slower than the rest).
The average and the standard deviation are computed at the user’s request.
In order to make the computation stable we store the first observation and subtract it from every other observation. The idea is that if x ~ N(mu, sigma), then x - x0 and the sum of x - x0 should be close(r) to 0, allowing for greater precision; without that trick, var was getting negative on some iterations.
Attributes: - avg
- count
- current_count
- current_obs
- drop_obs
- first_obs
- max
- min
- sd
- sum
- sum_sqr
- total
- val
- var
Methods
from_dict state_dict to_array to_dict to_json update -
avg
¶
-
count
¶
-
current_count
¶
-
current_obs
¶
-
drop_obs
¶
-
first_obs
¶
-
max
¶
-
min
¶
-
sd
¶
-
sum
¶
-
sum_sqr
¶
-
total
¶
-
val
¶
-
var
¶
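The stability trick described for StatStream (subtracting the first observation before accumulating the sum and the sum of squares) can be sketched in a few lines. The class below is a hypothetical stand-in, not the library's code: it does not drop the first observations, and it uses the population variance, which may differ from what StatStream computes.

```python
import math


class ShiftedStats:
    """Running mean/variance from shifted sums (hypothetical stand-in, not StatStream)."""

    def __init__(self):
        self.first_obs = None
        self.count = 0
        self.sum = 0.0      # sum of (x - first_obs)
        self.sum_sqr = 0.0  # sum of (x - first_obs) ** 2

    def update(self, x: float) -> None:
        if self.first_obs is None:
            self.first_obs = x
        shifted = x - self.first_obs
        self.count += 1
        self.sum += shifted
        self.sum_sqr += shifted * shifted

    @property
    def avg(self) -> float:
        # Undo the shift to recover the mean of the raw observations.
        return self.first_obs + self.sum / self.count

    @property
    def var(self) -> float:
        # Population variance; shifting by a constant does not change it.
        mean_shifted = self.sum / self.count
        return self.sum_sqr / self.count - mean_shifted * mean_shifted

    @property
    def sd(self) -> float:
        return math.sqrt(max(self.var, 0.0))


stats = ShiftedStats()
for value in (1e9 + 4.0, 1e9 + 7.0, 1e9 + 13.0, 1e9 + 16.0):
    stats.update(value)
```

With raw sums, values near 1e9 would make sum_sqr about 1e18, so the subtraction in var would lose most of its significant digits. The shifted sums stay small and keep the variance exact here.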
-
class
track.utils.stat.
StatStreamStruct
[source]¶ Bases:
_ctypes.Structure
Attributes: - current_count
Structure/Union member
- current_obs
Structure/Union member
- drop_obs
Structure/Union member
- first_obs
Structure/Union member
- max
Structure/Union member
- min
Structure/Union member
- sum
Structure/Union member
- sum_sqr
Structure/Union member
-
current_count
¶ Structure/Union member
-
current_obs
¶ Structure/Union member
-
drop_obs
¶ Structure/Union member
-
first_obs
¶ Structure/Union member
-
max
¶ Structure/Union member
-
min
¶ Structure/Union member
-
sum
¶ Structure/Union member
-
sum_sqr
¶ Structure/Union member
-
class
track.utils.throttle.
ThrottleRepeatedCalls
(fun: Callable[[A], R], every=10)[source]¶ Bases:
object
Limit how often the function fun is called, measured in number of calls
Methods
__call__
(self, \*args, \*\*kwargs)Call self as a function.
-
class
track.utils.throttle.
Throttler
(fun: Callable[[A], R], throttle=1)[source]¶ Bases:
object
Limit how often the function fun is called by calling it only once every throttle calls
Methods
__call__
(self, \*args, \*\*kwargs)Call self as a function.
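Count-based throttling of the kind described above can be sketched as a small callable wrapper. The class name CallEveryN is hypothetical, and which call of each window is forwarded (here the last one) is an assumption, not something the reference states.

```python
from typing import Any, Callable, Optional


class CallEveryN:
    """Forward only one call out of every `every` calls (hypothetical sketch)."""

    def __init__(self, fun: Callable[..., Any], every: int = 10):
        self.fun = fun
        self.every = every
        self.calls = 0

    def __call__(self, *args, **kwargs) -> Optional[Any]:
        self.calls += 1
        if self.calls % self.every == 0:
            return self.fun(*args, **kwargs)
        return None  # skipped call


seen = []
log_every_3 = CallEveryN(seen.append, every=3)
for i in range(1, 10):
    log_every_3(i)
```

A wrapper like this is typically placed around progress printing inside a training loop, so that the loop body can call it unconditionally.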
-
class
track.utils.throttle.
TimeThrottler
(fun: Callable[[A], R], every=10)[source]¶ Bases:
object
Limit how often the function fun is called in seconds
Methods
__call__
(self, \*args, \*\*kwargs)Call self as a function.
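Time-based throttling can be sketched in the same way. The class name CallEverySeconds and the injectable clock parameter are assumptions made for this illustration so that the behaviour can be shown without sleeping; they are not part of the documented API.

```python
import time
from typing import Any, Callable, Optional


class CallEverySeconds:
    """Forward a call only if `every` seconds passed since the last forwarded call."""

    def __init__(self, fun: Callable[..., Any], every: float = 10.0,
                 clock: Callable[[], float] = time.monotonic):
        self.fun = fun
        self.every = every
        self.clock = clock      # injectable so the sketch can run without sleeping
        self.last_call: Optional[float] = None

    def __call__(self, *args, **kwargs) -> Optional[Any]:
        now = self.clock()
        if self.last_call is None or now - self.last_call >= self.every:
            self.last_call = now
            return self.fun(*args, **kwargs)
        return None  # called too soon, skipped


# Fake clock for a deterministic demonstration
fake_now = [0.0]
printed = []
throttled = CallEverySeconds(printed.append, every=10.0, clock=lambda: fake_now[0])
for t in (0.0, 3.0, 9.9, 10.0, 15.0, 20.5):
    fake_now[0] = t
    throttled(t)
```

time.monotonic is used as the default clock because it is not affected by system clock adjustments.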
Submodules¶
track.chrono module¶
track.client module¶
-
class
track.client.
TrackClient
(backend='none')[source]¶ Bases:
object
TrackClient. A client tracks a single Trial being run
Parameters: - backend: str
Storage backend to use
Methods
add_tags(self, **kwargs): Add tags to the current trial
get_arguments(self, args, …[, show]): See log_arguments() for possible arguments
get_device(): Helper function that returns a CUDA device if available, else a CPU
log_arguments(self, args, …[, show]): Store the arguments that were used to run the trial.
new_trial(self[, force]): Create a new trial
report(self[, short]): Print a digest of the logged metrics
save(self[, file_name_override]): Save logged metrics into a JSON file
set_group(self, group, …): Set or create a new group
set_project(self, project, …): Set or create a new project
set_trial(self, trial, …): Set a new trial
set_version(self[, version]): Compute the version tag from the function call stack.
finish
start
-
get_arguments
(self, args: Union[argparse.ArgumentParser, argparse.Namespace, Dict] = None, show=False, **kwargs) → argparse.Namespace[source]¶ See
log_arguments()
for possible arguments
-
log_arguments
(self, args: Union[argparse.ArgumentParser, argparse.Namespace, Dict] = None, show=False, **kwargs) → argparse.Namespace[source]¶ Store the arguments that were used to run the trial.
Parameters: - args: Union[ArgumentParser, Namespace, Dict]
save up the trial’s arguments
- show: bool
print the arguments on the command line
- kwargs
more trial’s arguments
Returns: - returns the trial’s arguments
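Since args may be an ArgumentParser, a Namespace or a Dict, and the return value is always a Namespace, the three forms have to be normalized somewhere. The sketch below shows one way this could be done; normalize_arguments is a hypothetical helper, not the library's code. It parses an empty command line so that only the parser defaults are used, whereas the real method presumably reads the actual command line.

```python
import argparse
from typing import Dict, Optional, Union

ArgsLike = Union[argparse.ArgumentParser, argparse.Namespace, Dict]


def normalize_arguments(args: Optional[ArgsLike] = None, **kwargs) -> argparse.Namespace:
    """Merge any accepted input form plus keyword arguments into one Namespace."""
    if args is None:
        values = {}
    elif isinstance(args, argparse.ArgumentParser):
        # Assumption: parse an empty command line so only the defaults are used.
        values = vars(args.parse_args([]))
    elif isinstance(args, argparse.Namespace):
        values = dict(vars(args))
    else:
        values = dict(args)
    values.update(kwargs)  # keyword arguments extend or override the base values
    return argparse.Namespace(**values)


parser = argparse.ArgumentParser()
parser.add_argument("--batch-size", type=int, default=256)
parser.add_argument("--lr", type=float, default=0.01)

from_parser = normalize_arguments(parser, momentum=0.99)
from_dict = normalize_arguments({"batch_size": 256, "lr": 0.01}, momentum=0.99)
```

Both calls yield the same Namespace, which is why the caller can pass whichever form is most convenient.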
-
new_trial
(self, force=False, **kwargs)[source]¶ Create a new trial
Parameters: - force: bool
by default once the trial is set it cannot be changed. use force to override this behaviour.
- kwargs:
See
Trial()
for possible arguments
Returns: - returns a trial logger
-
set_group
(self, group: Union[track.structure.TrialGroup, NoneType] = None, force: bool = False, get_only: bool = False, **kwargs)[source]¶ Set or create a new group
Parameters: - group: Optional[TrialGroup]
group definition you can use to create or set the group
- force: bool
by default once the trial group is set it cannot be changed. use force to override this behaviour.
- get_only: bool
if true does not insert the group if missing. default to false
- kwargs
arguments used to create a
TrialGroup
object if no TrialGroup object was provided. See TrialGroup()
for possible arguments
Returns: - returns created trial group
-
set_project
(self, project: Union[track.structure.Project, NoneType] = None, force: bool = False, get_only: bool = False, **kwargs)[source]¶ Set or create a new project
Parameters: - project: Optional[Project]
project definition you can use to create or set the project
- force: bool
by default once the project is set it cannot be changed. use force to override this behaviour.
- get_only: bool
if true does not insert the project if missing. default to false
- kwargs
arguments used to create a
Project
object if no Project object was provided. See Project()
for possible arguments
Returns: - returns created project
-
set_trial
(self, trial: Union[track.structure.Trial, NoneType] = None, force: bool = False, **kwargs)[source]¶ Set a new trial
Parameters: - trial: Optional[Trial]
trial definition you can use to create or set the trial
- force: bool
by default once the trial is set it cannot be changed. use force to override this behaviour.
- kwargs: {uid, hash, revision}
arguments used to create a
Trial
object if no Trial object was provided. You should specify uid or the pair (hash, revision). See Trial()
for possible arguments
Returns: - returns a trial logger
-
set_version
(self, version=None, version_fun: Callable[[], str] = None)[source]¶ Compute the version tag from the function call stack. Defaults to computing the hash of the executed file
Parameters: - version: str
version string you want to use for the trial
- version_fun: Callable[[], str]
version function to call to set the trial version
track.configuration module¶
track.logger module¶
-
class
track.logger.
LogSignalHandler
(logger)[source]¶ Bases:
track.utils.signal.SignalHandler
Methods
atexit sigint sigterm
-
class
track.logger.
LoggerChronoContext
(protocol, trial, acc=s<{'avg': 0.0, 'min': inf, 'max': -inf, 'sd': 0.0, 'count': 1, 'unit': 's'}>, name=None, **kwargs)[source]¶ Bases:
object
-
class
track.logger.
TrialLogger
(trial: track.structure.Trial, protocol: track.persistence.protocol.Protocol)[source]¶ Bases:
object
Unified logger interface. This object should be created through the TrackClient interface
Parameters: - trial: Trial
the trial that the logger modifies
- protocol: Protocol
the storage protocol used to persist the log calls
Methods
capture_output(self[, output_size]): Capture standard output
chrono(self, name, aggregator, …[, …]): Start a timer to measure the time spent in that block
finish(self[, exc_type, exc_val, exc_tb]): Finish the trial, record the end time and set the trial status to completed or interrupted
log_arguments(self, **kwargs): Log the trial arguments.
log_metadata(self, aggregator, …): Insert metadata values inside a trial
log_metrics(self, step, aggregator, …): Insert metric values inside a trial
set_status(self, status[, error]): Update the trial status
start(self): Start the trial, record the start time and set the trial status to running
add_tags
log_code
log_directory
log_file
set_eta_total
show_eta
-
chrono
(self, name: str, aggregator: Callable[[], track.aggregators.aggregator.Aggregator] = <function StatAggregator.lazy.<locals>.<lambda>>, start_callback=None, end_callback=None)[source]¶ Start a timer to measure the time spent in that block
Parameters: - name: str
name of the timer
- aggregator:
how to save the values, by default it uses the
StatAggregator
and only the mean, sd, max, min values are kept once the training is done
- start_callback: Callable
function that is called once the timer starts
- end_callback: Callable
function that is called once the timer ends
Returns: - returns a context manager that represents the timer
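A timer of this shape can be sketched with contextlib. The function below is an illustration only: it stores raw elapsed times in a module-level dictionary (a hypothetical store), whereas the library feeds an aggregator, and the callbacks are assumed to take no arguments.

```python
import time
from contextlib import contextmanager
from typing import Callable, Dict, List, Optional

timings: Dict[str, List[float]] = {}  # hypothetical store; the library uses an aggregator


@contextmanager
def chrono(name: str,
           start_callback: Optional[Callable[[], None]] = None,
           end_callback: Optional[Callable[[], None]] = None):
    """Time the enclosed block and append the elapsed seconds under `name`."""
    if start_callback is not None:
        start_callback()
    start = time.perf_counter()
    try:
        yield
    finally:
        # Record the time even if the block raised.
        timings.setdefault(name, []).append(time.perf_counter() - start)
        if end_callback is not None:
            end_callback()


events = []
for _ in range(3):
    with chrono("batch_time",
                start_callback=lambda: events.append("start"),
                end_callback=lambda: events.append("end")):
        sum(range(1000))  # stand-in for the work being timed
```

Keeping only summary statistics, as the default aggregator is described to do, avoids storing one value per batch over a long training run.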
-
finish
(self, exc_type=None, exc_val=None, exc_tb=None)[source]¶ finish trial, record end time and set the trial status to completed or interrupted
-
log_arguments
(self, **kwargs)[source]¶ log the trial arguments. This function has no effect if the trial was already created.
-
log_metadata
(self, aggregator: Callable[[], track.aggregators.aggregator.Aggregator] = None, **kwargs)[source]¶ insert metadata value inside a trial
Parameters: - kwargs:
dictionary of metadata (metadata_name: value)
-
log_metrics
(self, step: any = None, aggregator: Callable[[], track.aggregators.aggregator.Aggregator] = None, **kwargs)[source]¶ insert metrics values inside a trial
Parameters: - step: any
a value representing a training step (could be epoch, timestamp, …)
- kwargs:
dictionary of metrics (metric_name: value)
- aggregator: Optional[Callable[[], Aggregator]]
how to store the values locally
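The relationship between step and the keyword metrics can be pictured with a small stand-in store. MetricStore is hypothetical, and the way metrics logged without a step are kept (here as single values) is an assumption; the library delegates storage to the aggregator.

```python
from collections import defaultdict
from typing import Any, Dict


class MetricStore:
    """Keep metric values per name; hypothetical stand-in for a trial's metrics."""

    def __init__(self):
        self.metrics: Dict[str, Dict[Any, Any]] = defaultdict(dict)
        self.final: Dict[str, Any] = {}

    def log_metrics(self, step: Any = None, **kwargs) -> None:
        for name, value in kwargs.items():
            if step is None:
                self.final[name] = value          # no step: treat as a one-off value
            else:
                self.metrics[name][step] = value  # keyed by step (epoch, timestamp, ...)


store = MetricStore()
for epoch, loss in enumerate([0.9, 0.5, 0.3]):
    store.log_metrics(step=epoch, epoch_loss=loss)
store.log_metrics(accuracy=0.98)
```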
track.serialization module¶
-
class
track.serialization.
SerializerChronoContext
[source]¶ Bases:
track.serialization.SerializerAspect
Methods
from_json to_json
-
class
track.serialization.
SerializerDatetime
[source]¶ Bases:
track.serialization.SerializerAspect
Methods
from_json to_json
-
class
track.serialization.
SerializerProject
[source]¶ Bases:
track.serialization.SerializerAspect
Methods
from_json to_json
-
class
track.serialization.
SerializerStatStream
[source]¶ Bases:
track.serialization.SerializerAspect
Methods
from_json to_json
-
class
track.serialization.
SerializerStatus
[source]¶ Bases:
track.serialization.SerializerAspect
Methods
from_json to_json
-
class
track.serialization.
SerializerTrial
[source]¶ Bases:
track.serialization.SerializerAspect
Methods
from_json to_json -
ignore_meta
= {'heartbeat', '_last_change', '_update_count'}¶
-
ignore_short
= {'hash', 'uid', 'dtype', 'project_id', 'group_id'}¶
-
-
class
track.serialization.
SerializerTrialGroup
[source]¶ Bases:
track.serialization.SerializerAspect
Methods
from_json maybe_unflatten to_json
-
class
track.serialization.
SerializerUUID
[source]¶ Bases:
track.serialization.SerializerAspect
Methods
from_json to_json
track.structure module¶
Holds the basic data type classes that all backends need to implement
-
class
track.structure.
CustomStatus
(name, value)[source]¶ Bases:
object
Attributes: - name
- value
-
name
¶
-
value
¶
-
class
track.structure.
Project
(_uid: str = None, name: Union[str, NoneType] = None, description: Union[str, NoneType] = None, metadata: Dict[str, any] = <factory>, groups: Set[track.structure.TrialGroup] = <factory>, trials: Set[track.structure.Trial] = <factory>) → None[source]¶ Bases:
object
Set of Trial Groups & trials. If a project defines tags, then all children inherit those tags. Children cannot override the tag of a parent.
Attributes: - description
- name
- uid
Methods
compute_uid -
description
= None¶
-
name
= None¶
-
uid
¶
-
class
track.structure.
Status
[source]¶ Bases:
enum.Enum
An enumeration.
-
Broken
= 203¶
-
Completed
= 302¶
-
CreatedGroup
= 0¶
-
ErrorGroup
= 200¶
-
Exception
= 202¶
-
FinishedGroup
= 300¶
-
Interrupted
= 201¶
-
Running
= 101¶
-
RunningGroup
= 100¶
-
Suspended
= 301¶
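The values listed above suggest that each status belongs to the group whose value is the next lower multiple of 100 (for example Interrupted = 201 under ErrorGroup = 200). That reading is an inference from the numbers, not something the reference states. Under that assumption, group membership could be checked as in the sketch below, where the enum is re-declared locally and status_group and is_error are hypothetical helpers.

```python
from enum import Enum


class Status(Enum):
    # Values copied from the listing above
    CreatedGroup = 0
    RunningGroup = 100
    Running = 101
    ErrorGroup = 200
    Interrupted = 201
    Exception = 202
    Broken = 203
    FinishedGroup = 300
    Suspended = 301
    Completed = 302


def status_group(status: Status) -> Status:
    """Map a status to its group by rounding the value down to a multiple of 100."""
    return Status(status.value // 100 * 100)


def is_error(status: Status) -> bool:
    """True for any status that falls in the error group (assumed convention)."""
    return status_group(status) is Status.ErrorGroup
```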
-
-
class
track.structure.
Trial
(_hash: str = None, revision: int = 0, name: Union[str, NoneType] = None, description: Union[str, NoneType] = None, tags: Dict[str, any] = <factory>, version: Union[str, NoneType] = None, group_id: Union[int, NoneType] = None, project_id: Union[int, NoneType] = None, parameters: Dict[str, any] = <factory>, metadata: Dict[str, any] = <factory>, metrics: Dict[str, any] = <factory>, chronos: Dict[str, any] = <factory>, status: Union[track.structure.Status, NoneType] = <Status.CreatedGroup: 0>, errors: List[str] = <factory>) → None[source]¶ Bases:
object
A single training run
Attributes: - description
- group_id
- hash
- name
- project_id
- uid
- version
Methods
compute_hash -
description
= None¶
-
group_id
= None¶
-
hash
¶
-
name
= None¶
-
project_id
= None¶
-
revision
= 0¶
-
status
= 0¶
-
uid
¶
-
version
= None¶
-
class
track.structure.
TrialGroup
(_uid: str = None, name: Union[str, NoneType] = None, description: Union[str, NoneType] = None, metadata: Dict[str, any] = <factory>, trials: Set[track.structure.Trial] = <factory>, project_id: Union[int, NoneType] = None) → None[source]¶ Bases:
object
Namespace / Set of trials
Attributes: - description
- name
- project_id
- uid
Methods
compute_uid -
description
= None¶
-
name
= None¶
-
project_id
= None¶
-
uid
¶
track.versioning module¶
-
track.versioning.
default_version_hash
()[source]¶ Get the current stack frames and compute the version from the executed file
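Deriving a version tag from the call stack can be sketched with inspect and hashlib. The helper below is hypothetical: the choice of the outermost frame, of SHA-256 and of an 8-character tag are all assumptions, and the library's own function may differ on each point.

```python
import hashlib
import inspect


def file_version_hash(length: int = 8) -> str:
    """Hash the source file of the outermost stack frame (hypothetical helper)."""
    outermost = inspect.stack(0)[-1]  # last entry is the frame that started the run
    digest = hashlib.sha256()
    try:
        with open(outermost.filename, "rb") as source:
            digest.update(source.read())
    except OSError:
        # Interactive sessions have no real file; fall back to hashing the name.
        digest.update(outermost.filename.encode())
    return digest.hexdigest()[:length]


version = file_version_hash()
```

With such a tag, two runs of an unmodified script share a version, while any edit to the script produces a new one.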
Module contents¶
-
class
track.
TrackClient
(backend='none')[source]¶ Bases:
object
TrackClient. A client tracks a single Trial being run
Parameters: - backend: str
Storage backend to use
Methods
add_tags(self, **kwargs): Add tags to the current trial
get_arguments(self, args, …[, show]): See log_arguments() for possible arguments
get_device(): Helper function that returns a CUDA device if available, else a CPU
log_arguments(self, args, …[, show]): Store the arguments that were used to run the trial.
new_trial(self[, force]): Create a new trial
report(self[, short]): Print a digest of the logged metrics
save(self[, file_name_override]): Save logged metrics into a JSON file
set_group(self, group, …): Set or create a new group
set_project(self, project, …): Set or create a new project
set_trial(self, trial, …): Set a new trial
set_version(self[, version]): Compute the version tag from the function call stack.
finish
start
-
get_arguments
(self, args: Union[argparse.ArgumentParser, argparse.Namespace, Dict] = None, show=False, **kwargs) → argparse.Namespace[source]¶ See
log_arguments()
for possible arguments
-
log_arguments
(self, args: Union[argparse.ArgumentParser, argparse.Namespace, Dict] = None, show=False, **kwargs) → argparse.Namespace[source]¶ Store the arguments that were used to run the trial.
Parameters: - args: Union[ArgumentParser, Namespace, Dict]
save up the trial’s arguments
- show: bool
print the arguments on the command line
- kwargs
more trial’s arguments
Returns: - returns the trial’s arguments
-
new_trial
(self, force=False, **kwargs)[source]¶ Create a new trial
Parameters: - force: bool
by default once the trial is set it cannot be changed. use force to override this behaviour.
- kwargs:
See
Trial()
for possible arguments
Returns: - returns a trial logger
-
set_group
(self, group: Union[track.structure.TrialGroup, NoneType] = None, force: bool = False, get_only: bool = False, **kwargs)[source]¶ Set or create a new group
Parameters: - group: Optional[TrialGroup]
group definition you can use to create or set the group
- force: bool
by default once the trial group is set it cannot be changed. use force to override this behaviour.
- get_only: bool
if true does not insert the group if missing. default to false
- kwargs
arguments used to create a
TrialGroup
object if no TrialGroup object was provided. See TrialGroup()
for possible arguments
Returns: - returns created trial group
-
set_project
(self, project: Union[track.structure.Project, NoneType] = None, force: bool = False, get_only: bool = False, **kwargs)[source]¶ Set or create a new project
Parameters: - project: Optional[Project]
project definition you can use to create or set the project
- force: bool
by default once the project is set it cannot be changed. use force to override this behaviour.
- get_only: bool
if true does not insert the project if missing. default to false
- kwargs
arguments used to create a
Project
object if no Project object was provided. See Project()
for possible arguments
Returns: - returns created project
-
set_trial
(self, trial: Union[track.structure.Trial, NoneType] = None, force: bool = False, **kwargs)[source]¶ Set a new trial
Parameters: - trial: Optional[Trial]
trial definition you can use to create or set the trial
- force: bool
by default once the trial is set it cannot be changed. use force to override this behaviour.
- kwargs: {uid, hash, revision}
arguments used to create a Trial object if no Trial object was provided. You should specify uid or the pair (hash, revision). See
Trial()
for possible arguments
Returns: - returns a trial logger
-
set_version
(self, version=None, version_fun: Callable[[], str] = None)[source]¶ Compute the version tag from the function call stack. Defaults to computing the hash of the executed file
Parameters: - version: str
version string you want to use for the trial
- version_fun: Callable[[], str]
version function to call to set the trial version
-
class
track.
Project
(_uid: str = None, name: Union[str, NoneType] = None, description: Union[str, NoneType] = None, metadata: Dict[str, any] = <factory>, groups: Set[track.structure.TrialGroup] = <factory>, trials: Set[track.structure.Trial] = <factory>) → None[source]¶ Bases:
object
Set of Trial Groups & trials. If a project defines tags, then all children inherit those tags. Children cannot override the tag of a parent.
Attributes: - description
- name
- uid
Methods
compute_uid -
description
= None¶
-
name
= None¶
-
uid
¶
-
class
track.
TrialGroup
(_uid: str = None, name: Union[str, NoneType] = None, description: Union[str, NoneType] = None, metadata: Dict[str, any] = <factory>, trials: Set[track.structure.Trial] = <factory>, project_id: Union[int, NoneType] = None) → None[source]¶ Bases:
object
Namespace / Set of trials
Attributes: - description
- name
- project_id
- uid
Methods
compute_uid -
description
= None¶
-
name
= None¶
-
project_id
= None¶
-
uid
¶