Welcome to Neural Pipeline’s documentation!

Getting started guide

First, take a look at the main classes of Neural Pipeline:

Training stages are needed to customize the training process. With them, Trainer works according to the following scheme (data flow for a single epoch):

_images/data_flow.svg

Implement dataset class

In Neural Pipeline, a dataset is an iterable class. This means the class must implement the __getitem__ and __len__ methods.

For every index i, the dataset must return a Python dict with the keys ‘data’ and ‘target’.
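To make this contract concrete, here is a minimal sketch of a dataset that satisfies it without any PyTorch dependency. SquaresDataset is a hypothetical toy class, not part of Neural Pipeline; a real dataset would normally inherit from AbstractDataset, as in the MNIST example.

```python
class SquaresDataset:
    """Hypothetical toy dataset: item i holds the number i and its square."""

    def __init__(self, size: int):
        self._size = size

    def __len__(self) -> int:
        # number of items the dataset can produce
        return self._size

    def __getitem__(self, item: int) -> dict:
        if not 0 <= item < self._size:
            raise IndexError(item)
        # every item is a dict with the 'data' and 'target' keys
        return {'data': float(item), 'target': float(item * item)}


dataset = SquaresDataset(5)
samples = [dataset[i] for i in range(len(dataset))]
print(samples[3])
```

Any object that behaves this way can, in principle, be indexed from 0 to len(dataset) - 1.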

Let’s create an MNIST dataset based on the built-in torchvision dataset:

from torchvision import datasets, transforms

from neural_pipeline.data_producer.data_producer import AbstractDataset

class MNISTDataset(AbstractDataset):
    # transforms applied to every image
    transforms = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))])

    def __init__(self, data_dir: str, is_train: bool):
        # instantiate the torchvision MNIST dataset
        self.dataset = datasets.MNIST(data_dir, train=is_train, download=True)

    # return the dataset length
    def __len__(self):
        return len(self.dataset)

    # return a single sample by index
    def __getitem__(self, item):
        data, target = self.dataset[item]
        return {'data': self.transforms(data), 'target': target}

To work with this dataset, we need to wrap it in a DataProducer:

from neural_pipeline import DataProducer

# create train and validation data producers
train_dataset = DataProducer([MNISTDataset('data/dataset', True)], batch_size=4, num_workers=2)
validation_dataset = DataProducer([MNISTDataset('data/dataset', False)], batch_size=4, num_workers=2)

Create TrainConfig

Now let’s define a TrainConfig that will contain the training hyperparameters.

In this tutorial we use the predefined stages TrainStage and ValidationStage. TrainStage iterates over a DataProducer and trains the model in train() mode. ValidationStage does the same, but in eval() mode.

import torch
from neural_pipeline import TrainConfig, TrainStage, ValidationStage

# define train stages
train_stages = [TrainStage(train_dataset), ValidationStage(validation_dataset)]

# `model` is an instance of the network defined in the next section
loss = torch.nn.NLLLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.5)

# define TrainConfig
train_config = TrainConfig(train_stages, loss, optimizer)

Create Trainer

First, we need to specify the model that will be trained:

import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 20, 5, 1)
        self.conv2 = nn.Conv2d(20, 50, 5, 1)
        self.fc1 = nn.Linear(4 * 4 * 50, 500)
        self.fc2 = nn.Linear(500, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, 2, 2)
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2, 2)
        # flatten the feature maps before the fully connected layers
        x = x.view(-1, 4 * 4 * 50)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)

# instantiate the model used by the optimizer and the Trainer
model = Net()

Now we need to build the training process. This is done by creating a Trainer instance:

from neural_pipeline import FileStructManager, Trainer

# define file structure for experiment
fsm = FileStructManager(base_dir='data', is_continue=False)

# create trainer
trainer = Trainer(model, train_config, fsm, torch.device('cuda:0'))

# specify training epochs number
trainer.set_epoch_num(50)

The last parameter of the Trainer constructor is the target device that will be used for training.

Start training

Now we can start the training process:

trainer.train()

That’s all. The console output will look like this:

Epoch: [1]; train: [0.004141, 1.046422, 3.884116]; validation: [0.002027, 0.304710, 2.673034]
Epoch: [2]; train: [0.000519, 0.249564, 4.938250]; validation: [0.000459, 0.200972, 2.594026]
Epoch: [3]; train: [0.000182, 0.180328, 5.218509]; validation: [0.000135, 0.155546, 2.512275]
train: 31%|███ | 4651/15000 [00:31<01:07, 154.06it/s, loss=[0.154871]]

The first 3 lines are the standard output of ConsoleMonitor. This monitor is included in MonitorHub by default. Every line shows the loss values of the corresponding stage in the format [min, mean, max].

The last line is built by tqdm and comes from TrainStage and ValidationStage. It shows the current mean value of the metrics (here, the loss) for the running stage.
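The [min, mean, max] format described above can be reproduced with a few lines of plain Python. This is only a sketch of the formatting, not the actual ConsoleMonitor implementation; format_epoch_line and the sample loss values are hypothetical.

```python
def format_epoch_line(epoch: int, stage_losses: dict) -> str:
    """Build one console line from per-stage lists of loss values."""
    parts = ["Epoch: [{}]".format(epoch)]
    for stage_name, losses in stage_losses.items():
        lowest = min(losses)
        mean = sum(losses) / len(losses)
        highest = max(losses)
        parts.append("{}: [{:.6f}, {:.6f}, {:.6f}]".format(stage_name, lowest, mean, highest))
    return "; ".join(parts)


line = format_epoch_line(1, {'train': [0.5, 1.0, 1.5], 'validation': [0.25, 0.75]})
print(line)
```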

Add Tensorboard monitor

To get the most useful information about training, we can connect Tensorboard.

To do this, connect the built-in TensorboardMonitor to the Trainer before training starts:

from neural_pipeline.builtin.monitors.tensorboard import TensorboardMonitor

trainer.monitor_hub.add_monitor(TensorboardMonitor(fsm, is_continue=False))

Now the Tensorboard output will look like this:

_images/tensorboard_loss.jpg _images/tensorboard_hist.jpg

Continue training

If we need to train for more epochs but no longer have the previously defined objects, we need to do the following:

# define everything from the previous steps again
# ...

# define FileStructManager with parameter is_continue=True
fsm = FileStructManager(base_dir='data', is_continue=True)

# create trainer
trainer = Trainer(model, train_config, fsm, torch.device('cuda:0'))

# specify training epochs number
trainer.set_epoch_num(50)

# add TensorboardMonitor with parameter is_continue=True
trainer.monitor_hub.add_monitor(TensorboardMonitor(fsm, is_continue=True))

# set Trainer to resume mode and run training
trainer.resume(from_best_checkpoint=False).train()

The parameter from_best_checkpoint=False tells Trainer to continue from the last checkpoint. Neural Pipeline can also save the best checkpoints according to a specified rule. For more information, see the enable_best_states_saving method of Trainer.

Don’t worry about the training history being displayed incorrectly. If a history already exists, the monitors simply append new data to it.

After this tutorial, take a look at the segmentation example to explore how to work with specific metrics.

API

Trainer

The main module for training process

class neural_pipeline.train.Trainer(train_config: neural_pipeline.train_config.train_config.TrainConfig, fsm: neural_pipeline.utils.fsm.FileStructManager, device: torch.device = None)[source]

Class that runs the training process.

Trainer gets a list of training stages and loops over them every epoch.

The training process looks like this:

for epoch in range(epochs_num):
    for stage in training_stages:
        stage.run()
        monitor_hub.update_metrics(stage.metrics_processor().get_metrics())
    save_state()
    on_epoch_end_callback()
Parameters:
  • train_config – TrainConfig object
  • fsm – FileStructManager object
  • device – device for training process
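The loop above can be tried out with stand-in objects. The following is a minimal sketch under stated assumptions: FakeStage and the log list are hypothetical helpers that only record the order of calls, and the monitor update, state saving and callback are reduced to log entries.

```python
class FakeStage:
    """Hypothetical stand-in for a training stage; it only records that it ran."""

    def __init__(self, name, log):
        self.name = name
        self._log = log

    def run(self):
        self._log.append('run:' + self.name)


def train(stages, epochs_num, log):
    for epoch in range(epochs_num):
        for stage in stages:
            stage.run()
            # stands in for monitor_hub.update_metrics(...)
            log.append('metrics:' + stage.name)
        # stands in for save_state()
        log.append('save_state')
        # stands in for on_epoch_end_callback()
        log.append('epoch_end:{}'.format(epoch))


log = []
train([FakeStage('train', log), FakeStage('validation', log)], 2, log)
print(log[:6])
```

Every epoch runs all stages in order, then saves the state and fires the end-of-epoch callbacks.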
exception TrainerException(msg)[source]
add_on_epoch_end_callback(callback: callable) → neural_pipeline.train.Trainer[source]

Add a callback that will be called at the end of every epoch

Parameters:callback – method that will be called. This method takes no parameters
Returns:self object
add_stop_rule(rule: callable) → neural_pipeline.train.Trainer[source]

Add a rule that controls interruption of the training process

Params:
rule (callable): callable that takes no parameters and returns a boolean. When any of the rules returns True, the training loop is interrupted
Returns:
self object

Examples:

trainer.add_stop_rule(lambda: trainer.data_processor().get_lr() < 1e-6)
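How such rules can interrupt a loop is sketched below in plain Python. This is not the Trainer implementation: run_epochs and halve_lr are hypothetical, and checking the rules once after every epoch is an assumption.

```python
def run_epochs(max_epochs, stop_rules, run_one_epoch):
    """Run up to max_epochs epochs; stop early once any rule returns True."""
    epochs_done = 0
    for epoch in range(max_epochs):
        run_one_epoch(epoch)
        epochs_done += 1
        # Assumption: rules are checked once per epoch, after the epoch finishes.
        if any(rule() for rule in stop_rules):
            break
    return epochs_done


state = {'lr': 1.0}


def halve_lr(epoch):
    # Stand-in for an epoch of training that also decays the learning rate.
    state['lr'] /= 2


epochs_done = run_epochs(50, [lambda: state['lr'] < 0.1], halve_lr)
print(epochs_done, state['lr'])
```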
data_processor() → neural_pipeline.data_processor.data_processor.TrainDataProcessor[source]

Get data processor object

Returns:data processor
disable_best_states_saving() → neural_pipeline.train.Trainer[source]

Disable best states saving

Returns:self object
enable_best_states_saving(rule: callable) → neural_pipeline.train.Trainer[source]

Enable best states saving

Best states are saved when the value returned by rule reaches a new minimum

Parameters:rule – callback that returns the value used to decide when the best state should be stored
Returns:self object
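The "new minimum" criterion can be illustrated with a small standalone sketch. BestStateTracker is a hypothetical helper, not a Neural Pipeline class, and the loss values are made up for the illustration.

```python
class BestStateTracker:
    """Hypothetical helper: reports when the rule value reaches a new minimum."""

    def __init__(self, rule):
        self._rule = rule
        self._best = None

    def should_save(self) -> bool:
        value = self._rule()
        if self._best is None or value < self._best:
            self._best = value
            return True
        return False


# made-up validation losses, one per epoch
losses = iter([0.9, 0.7, 0.8, 0.6])
tracker = BestStateTracker(lambda: next(losses))
decisions = [tracker.should_save() for _ in range(4)]
print(decisions)
```

With these values the state would be stored after epochs 1, 2 and 4, but not after epoch 3, where the value got worse.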
enable_lr_decaying(coeff: float, patience: int, target_val_clbk: callable) → neural_pipeline.train.Trainer[source]

Enable learning rate decaying. The learning rate is decayed when the value returned by target_val_clbk doesn’t reach a new minimum for patience steps

Parameters:
  • coeff – lr decay coefficient
  • patience – number of steps
  • target_val_clbk – callback which returns the value that is used for lr decaying
Returns:

self object
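A patience-based decay of this kind might look like the sketch below. PatienceLRDecay is hypothetical; multiplying the learning rate by coeff and resetting the counter after each decay are assumptions, since the source doesn't specify either.

```python
class PatienceLRDecay:
    """Hypothetical sketch of learning rate decay with patience."""

    def __init__(self, lr, coeff, patience, target_val_clbk):
        self.lr = lr
        self._coeff = coeff
        self._patience = patience
        self._clbk = target_val_clbk
        self._best = None
        self._bad_steps = 0

    def step(self) -> float:
        value = self._clbk()
        if self._best is None or value < self._best:
            # new minimum: remember it and reset the counter
            self._best = value
            self._bad_steps = 0
        else:
            self._bad_steps += 1
            if self._bad_steps >= self._patience:
                # Assumption: decay multiplies the learning rate by coeff.
                self.lr *= self._coeff
                self._bad_steps = 0
        return self.lr


values = iter([1.0, 0.9, 0.95, 0.93, 0.92, 0.8])
decay = PatienceLRDecay(lr=1.0, coeff=0.5, patience=2, target_val_clbk=lambda: next(values))
lrs = [decay.step() for _ in range(6)]
print(lrs)
```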

resume(from_best_checkpoint: bool) → neural_pipeline.train.Trainer[source]

Resume training from the last checkpoint

Parameters:from_best_checkpoint – whether to continue from the best checkpoint
Returns:self object
set_epoch_num(epoch_number: int) → neural_pipeline.train.Trainer[source]

Define the number of epochs for training. One epoch is one iteration over all train stages

Parameters:epoch_number – number of training epochs
Returns:self object
train() → None[source]

Run training process

train_config() → neural_pipeline.train_config.train_config.TrainConfig[source]

Get train config

Returns:TrainConfig object

Train Config

class neural_pipeline.train_config.train_config.TrainConfig(model: torch.nn.modules.module.Module, train_stages: [], loss: torch.nn.modules.module.Module, optimizer: torch.optim.optimizer.Optimizer)[source]

Storage for training process settings

Parameters:
  • model – model to train
  • train_stages – list of stages for train loop
  • loss – loss criterion
  • optimizer – optimizer object
loss() → torch.nn.modules.module.Module[source]

Get loss object

Returns:loss object
optimizer() → torch.optim.optimizer.Optimizer[source]

Get optimizer object

Returns:optimizer object
stages() → [<class 'neural_pipeline.train_config.train_config.AbstractStage'>][source]

Get list of stages

Returns:list of stages
class neural_pipeline.train_config.train_config.ComparableTrainConfig(name: str = None)[source]

Named storage for training process settings. Used for training with several train configs at once

Parameters:name – name of train config
get_metric_for_compare() → float[source]

Get the metric used to compare train configs

Returns:metric value or None if comparison isn’t needed
get_params() → {}[source]

Get params of this config

Returns:
get_train_config() → neural_pipeline.train_config.train_config.TrainConfig[source]

Get train config

Returns:TrainConfig object
class neural_pipeline.train_config.train_config.TrainStage(data_producer: neural_pipeline.data_producer.data_producer.DataProducer, metrics_processor: neural_pipeline.train_config.train_config.MetricsProcessor = None, name: str = 'train')[source]

Standard training stage

When run() is called, it iterates process_batch() of the data processor over the data loader with the is_train=True flag.

After iteration stops, TrainStage accumulates losses from DataProcessor.

Parameters:
  • data_producer – DataProducer object
  • metrics_processor – MetricsProcessor
  • name – name of stage. By default ‘train’
disable_hard_negative_mining() → neural_pipeline.train_config.train_config.TrainStage[source]

Disable hard negative mining.

Returns:self object
enable_hard_negative_mining(part: float) → neural_pipeline.train_config.train_config.TrainStage[source]

Enable hard negative mining. Hard negatives are selected by their loss values

Parameters:part – part of the data that is repeated after the train stage
Returns:self object
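Selecting hard negatives by loss value can be sketched as follows. pick_hard_negatives is a hypothetical helper, and picking the items with the highest losses is an assumption about the selection rule, which the source doesn't spell out.

```python
def pick_hard_negatives(indices, losses, part):
    """Return the `part` fraction of indices with the highest losses."""
    count = int(len(indices) * part)
    # sort (loss, index) pairs from the highest loss to the lowest
    ranked = sorted(zip(losses, indices), key=lambda pair: pair[0], reverse=True)
    return [index for _, index in ranked[:count]]


indices = ['0_0', '0_1', '0_2', '0_3']
losses = [0.1, 0.9, 0.4, 0.7]
hard = pick_hard_negatives(indices, losses, 0.5)
print(hard)
```

The selected items would then be repeated after the train stage.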
on_epoch_end()[source]

Method that is called after every epoch

run(data_processor: neural_pipeline.data_processor.data_processor.TrainDataProcessor) → None[source]

Run stage

Parameters:data_processor – TrainDataProcessor object
class neural_pipeline.train_config.train_config.ValidationStage(data_producer: neural_pipeline.data_producer.data_producer.DataProducer, metrics_processor: neural_pipeline.train_config.train_config.MetricsProcessor = None, name: str = 'validation')[source]

Standard validation stage.

When run() is called, it iterates process_batch() of the data processor over the data loader with the is_train=False flag.

After iteration stops, ValidationStage accumulates losses from DataProcessor.

Parameters:
  • data_producer – DataProducer object
  • metrics_processor – MetricsProcessor
  • name – name of stage. By default ‘validation’
class neural_pipeline.train_config.train_config.AbstractMetric(name: str)[source]

Abstract class for metrics. When it is used in neural_pipeline, it stores the metric value for every call of calc()

Parameters:name – name of metric. The name will be used in monitors, so be careful with unsupported characters
calc(output: torch.Tensor, target: torch.Tensor) → numpy.ndarray[source]

Calculate metric by output from model and target

Parameters:
  • output – output from model
  • target – ground truth
get_values() → numpy.ndarray[source]

Get array of metric values

Returns:array of values
static max_val() → float[source]

Get the maximum value of the metric. This is used for correct histogram visualisation in some monitors

Returns:maximum value
static min_val() → float[source]

Get the minimum value of the metric. This is used for correct histogram visualisation in some monitors

Returns:minimum value
name() → str[source]

Get name of metric

Returns:metric name
reset() → None[source]

Reset array of metric values

class neural_pipeline.train_config.train_config.MetricsGroup(name: str)[source]

Class for uniting metrics or other MetricsGroups in one namespace. Note: MetricsGroup may contain only 2 levels of MetricsGroups, so MetricsGroup().add(MetricsGroup().add(MetricsGroup())) will raise MGException

Parameters:name – group name. The name will be used in monitors, so be careful with unsupported characters
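The two-level limit can be illustrated with a simplified standalone sketch. Group and NestingError are hypothetical stand-ins for MetricsGroup and MGException. The check here only inspects the group being added, so it would miss a group that is extended after it has already been nested.

```python
class NestingError(Exception):
    """Hypothetical stand-in for MGException."""


class Group:
    """Hypothetical stand-in for MetricsGroup with a two-level nesting limit."""

    MAX_DEPTH = 2

    def __init__(self, name):
        self.name = name
        self.groups = []

    def depth(self) -> int:
        # a group without nested groups has depth 1
        return 1 + max((group.depth() for group in self.groups), default=0)

    def add(self, group):
        if 1 + group.depth() > self.MAX_DEPTH:
            raise NestingError('only {} levels of groups are allowed'.format(self.MAX_DEPTH))
        self.groups.append(group)
        return self


two_level = Group('a').add(Group('b'))

try:
    Group('a').add(Group('b').add(Group('c')))
    raised = False
except NestingError:
    raised = True

print(two_level.depth(), raised)
```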
exception MGException(msg: str)[source]

Exception for MetricsGroup

add(item: neural_pipeline.train_config.train_config.AbstractMetric) → neural_pipeline.train_config.train_config.MetricsGroup[source]

Add AbstractMetric or MetricsGroup

Parameters:item – object to add
Returns:self object
Return type:MetricsGroup
calc(output: torch.Tensor, target: torch.Tensor) → None[source]

Recursively calculate all metrics in this group and all nested groups

Parameters:
  • output – predict value
  • target – target value
groups() → ['MetricsGroup'][source]

Get list of metrics groups

Returns:list of metrics groups
have_groups() → bool[source]

Whether this group contains other metrics groups

Returns:True if contains, otherwise - False
metrics() → [<class 'neural_pipeline.train_config.train_config.AbstractMetric'>][source]

Get list of metrics

Returns:list of metrics
name() → str[source]

Get group name

Returns:name
reset() → None[source]

Recursively reset all metrics in this group and all nested groups

class neural_pipeline.train_config.train_config.MetricsProcessor[source]

Collection of all AbstractMetric and MetricsGroup objects

add_metric(metric: neural_pipeline.train_config.train_config.AbstractMetric) → neural_pipeline.train_config.train_config.AbstractMetric[source]

Add AbstractMetric object

Parameters:metric – metric to add
Returns:metric object
Return type:AbstractMetric
add_metrics_group(group: neural_pipeline.train_config.train_config.MetricsGroup) → neural_pipeline.train_config.train_config.MetricsGroup[source]

Add MetricsGroup object

Parameters:group – metrics group to add
Returns:metrics group object
Return type:MetricsGroup
calc_metrics(output, target) → None[source]

Recursively calculate all metrics

Parameters:
  • output – predict value
  • target – target value
get_metrics() → {}[source]

Get metrics and groups as dict

Returns:dict of metrics and groups with keys [metrics, groups]
reset_metrics() → None[source]

Recursively reset all metric values

class neural_pipeline.train_config.train_config.AbstractStage(name: str)[source]

Stage of the training process. For example, there may be 2 stages: train and validation. Every epoch in the train loop is an iteration over the stages.

Parameters:name – name of stage
get_losses() → numpy.ndarray[source]

Get losses from this stage

Returns:array of losses or None if this stage doesn’t need losses
metrics_processor() → neural_pipeline.train_config.train_config.MetricsProcessor[source]

Get metrics processor

Returns:MetricsProcessor object or None
name() → str[source]

Get name of stage

Returns:name
on_epoch_end() → None[source]

Callback for train epoch end

run(data_processor: neural_pipeline.data_processor.data_processor.TrainDataProcessor) → None[source]

Run stage

class neural_pipeline.train_config.train_config.StandardStage(stage_name: str, is_train: bool, data_producer: neural_pipeline.data_producer.data_producer.DataProducer, metrics_processor: neural_pipeline.train_config.train_config.MetricsProcessor = None)[source]

Standard stage for train process.

When run() is called, it iterates process_batch() of the data processor over the data loader

After iteration stops, the stage accumulates losses from DataProcessor.

Parameters:
get_losses() → numpy.ndarray[source]

Get losses from this stage

Returns:array of losses
metrics_processor() → neural_pipeline.train_config.train_config.MetricsProcessor[source]

Get the metrics processor of this stage

Returns:MetricsProcessor if specified otherwise None
on_epoch_end() → None[source]

Method that is called after every epoch

run(data_processor: neural_pipeline.data_processor.data_processor.TrainDataProcessor) → None[source]

Run stage. This iterates over the DataProducer and shows progress in stdout

Parameters:data_processor – DataProcessor object

Data Producer

class neural_pipeline.data_producer.data_producer.DataProducer(datasets: [<class 'neural_pipeline.data_producer.data_producer.AbstractDataset'>], batch_size: int = 1, num_workers: int = 0)[source]

Data Producer. Accumulates one or more datasets and passes their data in batches for processing. It uses the built-in PyTorch DataLoader to increase the performance of data delivery.

Parameters:
  • datasets – list of datasets. Every dataset must be iterable (contain the methods __getitem__ and __len__)
  • batch_size – size of output batch
  • num_workers – number of processes that load data from datasets and pass it to the output
get_data(dataset_idx: int, data_idx: int) → object[source]

Get a single data item by dataset_idx and data_idx

Parameters:
  • dataset_idx – index of dataset
  • data_idx – index of data in this dataset
Returns:

dataset output

get_indices() → [<class 'str'>][source]

Get current indices

Returns:list of current indices or None if the method set_indices() wasn’t called
get_loader(indices: [<class 'str'>] = None) → torch.utils.data.dataloader.DataLoader[source]

Get a PyTorch DataLoader object that aggregates this DataProducer. If indices are specified, the DataLoader will output data only for these indices. In this case the indices will not be passed.

Parameters:indices – list of indices. Each item of the list is a string in the format ‘{}_{}’.format(dataset_idx, data_idx)
Returns:DataLoader object
global_shuffle(is_need: bool) → neural_pipeline.data_producer.data_producer.DataProducer[source]

Whether global shuffling is needed. If global shuffling is enabled, batches are compiled from random indices of all datasets. In this case shuffling of the datasets order is ignored

Parameters:is_need – whether global shuffling is needed
Returns:self object
pass_indices(need_pass: bool) → neural_pipeline.data_producer.data_producer.DataProducer[source]

Pass indices of data in every batch. Disabled by default

Parameters:need_pass – whether to pass indices
pin_memory(is_need: bool) → neural_pipeline.data_producer.data_producer.DataProducer[source]

Whether to pin memory on loading. Pinning memory increases data loading performance (especially when data is loaded to the GPU) but is incompatible with swap

Parameters:is_need – whether pinning is needed
Returns:self object
set_indices(indices: [<class 'str'>]) → neural_pipeline.data_producer.data_producer.DataProducer[source]

Set indices to DataProducer. After that, DataProducer produces data only for these indices

Parameters:indices – list of indices in the format ‘<dataset_idx>_<data_idx>’, for example: [‘0_0’, ‘0_1’, ‘1_0’]
Returns:self object
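The index format can be built and parsed with a few helpers. The functions below are hypothetical and only illustrate the ‘<dataset_idx>_<data_idx>’ string layout; they are not part of DataProducer.

```python
def make_index(dataset_idx: int, data_idx: int) -> str:
    """Build an index string in the '<dataset_idx>_<data_idx>' format."""
    return '{}_{}'.format(dataset_idx, data_idx)


def parse_index(index: str) -> tuple:
    """Split an index string back into (dataset_idx, data_idx)."""
    dataset_idx, data_idx = index.split('_')
    return int(dataset_idx), int(data_idx)


def all_indices(dataset_lengths) -> list:
    """List the indices of every item of every dataset, in order."""
    return [make_index(dataset_idx, data_idx)
            for dataset_idx, length in enumerate(dataset_lengths)
            for data_idx in range(length)]


# two datasets: the first with 2 items, the second with 1 item
indices = all_indices([2, 1])
print(indices)
print(parse_index('1_0'))
```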
shuffle_datasets_order(is_need: bool) → neural_pipeline.data_producer.data_producer.DataProducer[source]

Whether to shuffle the datasets order. Shuffling is performed after every access to index 0

Parameters:is_need – whether shuffling is needed
Returns:self object

File structure management utils

Monitoring

Main module for monitoring training process

There is:

class neural_pipeline.monitoring.MonitorHub[source]

Aggregator of monitors. This class collects monitors and provides a unified interface to them

add_monitor(monitor: neural_pipeline.monitoring.AbstractMonitor) → neural_pipeline.monitoring.MonitorHub[source]

Connect monitor to hub

Parameters:monitor – AbstractMonitor object
Returns:
set_epoch_num(epoch_num: int) → None[source]

Set current epoch num

Parameters:epoch_num – num of current epoch
update_losses(losses: {}) → None[source]

Update monitor

Parameters:losses – losses values with keys ‘train’ and ‘validation’
update_metrics(metrics: {}) → None[source]

Update metrics in all monitors

Parameters:metrics – metrics dict with keys ‘metrics’ and ‘groups’
class neural_pipeline.monitoring.AbstractMonitor[source]

Basic class for every monitor.

set_epoch_num(epoch_num: int) → None[source]

Set current epoch num

Parameters:epoch_num – num of current epoch
update_losses(losses: {}) → None[source]

Update losses on monitor

Parameters:losses – dict of losses values whose keys are the names of stages in the train pipeline (e.g. [train, validation])
update_metrics(metrics: {}) → None[source]

Update metrics on monitor

Parameters:metrics – metrics dict with keys ‘metrics’ and ‘groups’
class neural_pipeline.monitoring.ConsoleMonitor[source]

Monitor that writes metrics to the console.

Output looks like: Epoch: [#]; train: [-1, 0, 1]; validation: [-1, 0, 1]. These 3 numbers are the [min, mean, max] values of the stage loss values

update_losses(losses: {}) → None[source]

Update losses on monitor

Parameters:losses – dict of losses values whose keys are the names of stages in the train pipeline (e.g. [train, validation])
class neural_pipeline.monitoring.LogMonitor(fsm: neural_pipeline.utils.fsm.FileStructManager)[source]

Monitor used for logging metrics. It writes a full log and can also write the last metrics to a separate file if required

All output files are in JSON format and are stored in <base_dir_path>/monitors/metrics_log

Parameters:fsm – FileStructManager object
close() → None[source]

Close monitor

get_final_metrics_file() → str[source]

Get final metrics file path

Returns:path or None if writing wasn’t enabled by write_final_metrics()
update_losses(losses: {}) → None[source]

Update losses on monitor

Parameters:losses – dict of losses values whose keys are the names of stages in the train pipeline (e.g. [train, validation])
update_metrics(metrics: {}) → None[source]

Update metrics on monitor

Parameters:metrics – metrics dict with keys ‘metrics’ and ‘groups’
write_final_metrics(path: str = None) → neural_pipeline.monitoring.LogMonitor[source]

Enable saving final metrics to a separate file

Parameters:path – path to the result file. If not defined, the file will be placed next to the full metrics log and named ‘metrics.json’
Returns:self object

Data Processor

class neural_pipeline.data_processor.data_processor.DataProcessor(model: torch.nn.modules.module.Module, device: torch.device = None)[source]

DataProcessor manages: model, data processing, device choosing

Args:
model (Module): model that will be used to process data
device (torch.device): device to which data is passed for processing
load() → None[source]

Load model weights from checkpoint

model() → torch.nn.modules.module.Module[source]

Get current module

predict(data: torch.Tensor) → object[source]

Make a prediction from data

Parameters:data – data as torch.Tensor or dict with key data
Returns:processed output
Return type:the model output type
save_state() → None[source]

Save the state of the optimizer and the number of epochs performed

set_pick_model_input(pick_model_input: callable) → neural_pipeline.data_processor.data_processor.DataProcessor[source]

Set a callback that gets the output from DataLoader and returns the model input.

Default mode:


lambda data: data['data']

Args:
pick_model_input (callable): callable that picks the model input. This callback takes one parameter: the dataset output
Returns:
self object

Examples:

data_processor.set_pick_model_input(lambda data: data['data'])
data_processor.set_pick_model_input(lambda data: data[0])
class neural_pipeline.data_processor.data_processor.TrainDataProcessor(train_config: TrainConfig, device: torch.device = None)[source]

TrainDataProcessor does everything DataProcessor does and also runs the training process.

Parameters:train_config – train config
exception TDPException(msg)[source]
get_lr() → float[source]

Get learning rate from optimizer

get_state() → {}[source]

Get model and optimizer state dicts

Returns:dict with keys [weights, optimizer]
load() → None[source]

Load state of model, optimizer and TrainDataProcessor from checkpoint

predict(data, is_train=False) → torch.Tensor[source]

Make a prediction from data. If is_train is True, this operation computes gradients. If is_train is False, it works with model.eval() and torch.no_grad

Parameters:
  • data – data in dict
  • is_train – whether the data processor needs to train on the data or just predict
Returns:

processed output

Return type:

model return type

process_batch(batch: {}, is_train: bool, metrics_processor: AbstractMetricsProcessor = None) → numpy.ndarray[source]

Process one batch of data

Parameters:
  • batch – dict that contains the ‘data’ and ‘target’ keys. The value for each key must be an instance of torch.Tensor or dict
  • is_train – whether the batch is processed for training
  • metrics_processor – metrics processor for collecting metrics after the batch is processed
Returns:

array of losses with shape (N, …) where N is batch size

save_state() → None[source]

Save the state of the optimizer and the number of epochs performed

set_data_preprocess(data_preprocess: callable) → neural_pipeline.data_processor.data_processor.DataProcessor[source]

Set a callback that gets the output from DataLoader and returns preprocessed data. For example, it may be used to pass data to a device.

Default mode:


_pass_data_to_device()

Args:
data_preprocess (callable): preprocess callable. This callback takes one parameter: the dataset output
Returns:
self object

Examples:

from neural_pipeline.utils import dict_recursive_bypass
data_processor.set_data_preprocess(lambda data: dict_recursive_bypass(data, lambda v: v.cuda()))
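To show what a recursive bypass over a nested batch dict does, here is a standalone sketch. recursive_bypass is a hypothetical re-implementation written for illustration, and the actual dict_recursive_bypass helper may behave differently. Multiplying by 10 stands in for an operation such as moving a tensor to a device.

```python
def recursive_bypass(data, on_leaf):
    """Apply on_leaf to every non-dict value of a (possibly nested) dict."""
    if isinstance(data, dict):
        return {key: recursive_bypass(value, on_leaf) for key, value in data.items()}
    return on_leaf(data)


batch = {'data': {'image': 1, 'mask': 2}, 'target': 3}
moved = recursive_bypass(batch, lambda value: value * 10)
print(moved)
```

The nested structure is preserved and the input dict is left unchanged; only the leaf values are transformed.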
set_pick_target(pick_target: callable) → neural_pipeline.data_processor.data_processor.DataProcessor[source]

Set a callback that gets the output from DataLoader and returns the target.

Default mode:


lambda data: data['target']

Args:
pick_target (callable): callable that picks the target. This callback takes one parameter: the dataset output
Returns:
self object

Examples:

data_processor.set_pick_target(lambda data: data['target'])
data_processor.set_pick_target(lambda data: data[1])
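The effect of the two pick callbacks can be sketched without the library. split_batch is a hypothetical helper that applies both callbacks to one batch; its defaults mirror the default modes documented above.

```python
def split_batch(batch,
                pick_model_input=lambda data: data['data'],
                pick_target=lambda data: data['target']):
    """Return (model input, target) picked from a single batch."""
    return pick_model_input(batch), pick_target(batch)


# dataset output as a dict with the default keys
from_dict = split_batch({'data': [1, 2], 'target': 0})

# dataset output as a tuple, so both callbacks are replaced
from_tuple = split_batch(([1, 2], 0), lambda data: data[0], lambda data: data[1])

print(from_dict, from_tuple)
```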
update_lr(lr: float) → None[source]

Update the learning rate directly in the optimizer

Parameters:lr – target learning rate

Model

class neural_pipeline.data_processor.model.Model(base_model: torch.nn.modules.module.Module)[source]

Wrapper for torch.nn.Module. This class provides initialization, calling and serialization for it

Parameters:base_model – torch.nn.Module object
exception ModelException(msg)[source]
load_weights(weights_file: str = None) → None[source]

Load weights from checkpoint

model() → torch.nn.modules.module.Module[source]

Get internal torch.nn.Module object

Returns:internal torch.nn.Module object
save_weights(weights_file: str = None) → None[source]

Serialize weights to file

set_checkpoints_manager(manager: neural_pipeline.utils.fsm.CheckpointsManager) → neural_pipeline.data_processor.model.Model[source]

Set the checkpoints manager that will be used to identify the path for reading and writing the weights file

Parameters:manager – CheckpointsManager instance
Returns:self object
to_device(device: torch.device) → neural_pipeline.data_processor.model.Model[source]

Pass model to specified device

Predictor

The main module for running inference

class neural_pipeline.predict.Predictor(model: neural_pipeline.data_processor.model.Model, fsm: neural_pipeline.utils.fsm.FileStructManager, from_best_state: bool = False)[source]

Predictor runs inference using trained parameters

Parameters:
  • model – model object, used for predict
  • fsm – FileStructManager object
predict(data: torch.Tensor)[source]

Predict one data item

Parameters:data – data as torch.Tensor or dict with key data
Returns:processed output
Return type:model output type

Builtin modules

The builtin module contains all modules that can’t be tested or that have a specific field of application.

Tensorboard

This module contains Tensorboard monitor interface

class neural_pipeline.builtin.monitors.tensorboard.TensorboardMonitor(fsm: neural_pipeline.utils.fsm.FileStructManager, is_continue: bool, network_name: str = None)[source]

Class that manages monitoring of metrics and events. It works with tensorboard. The monitor gets metrics after each epoch ends and visualises them. Metrics may be float or np.array values. If a metric is an np.array, it is shown as a histogram and as scalars (scalar plots contain the mean values of the array).

Parameters:
  • fsm – file structure manager
  • is_continue – whether the data processor continues training
  • network_name – network name
update_losses(losses: {}) → None[source]

Update monitor

Parameters:losses – losses values with keys ‘train’ and ‘validation’
update_metrics(metrics: {}) → None[source]

Update monitor

Parameters:metrics – metrics dict with keys ‘metrics’ and ‘groups’
update_scalar(name: str, value: float, epoch_idx: int = None) → None[source]

Update scalar on tensorboard

Parameters:
  • name – the classic tag for TensorboardX
  • value – scalar value
  • epoch_idx – epoch idx. If not set, the last epoch idx stored in this class is used
visualize_model(model: neural_pipeline.data_processor.model.Model, tensor) → None[source]

Visualize model graph

Parameters:
  • model – torch.nn.Module object
  • tensor – dummy input for tracing the model
write_to_txt_log(text: str, tag: str = None) → None[source]

Write to txt log

Parameters:
  • text – text that will be written
  • tag – tag

Matplotlib

This module contains Matplotlib monitor interface

class neural_pipeline.builtin.monitors.mpl.MPLMonitor[source]

This monitor shows all data in Matplotlib plots

realtime(is_realtime: bool) → neural_pipeline.builtin.monitors.mpl.MPLMonitor[source]

Whether to show data updates in realtime

Parameters:is_realtime – whether realtime updates are needed
Returns:self object
update_losses(losses: {})[source]

Update losses on monitor

Parameters:losses – dict of losses values whose keys are the names of stages in the train pipeline (e.g. [train, validation])
update_metrics(metrics: {}) → None[source]

Update metrics on monitor

Parameters:metrics – metrics dict with keys ‘metrics’ and ‘groups’

AlbUNet

This module provides AlbUNet: a U-Net with a ResNet encoder. The model was written by Alexander Buslaev and spoiled by me.

This model can be constructed with the ‘resnet18’, ‘resnet34’, ‘resnet50’, ‘resnet101’ and ‘resnet152’ encoders.

To create a model, just call the corresponding resnet<number> method

neural_pipeline.builtin.models.albunet.resnet18(classes_num: int, in_channels: int, pretrained: bool = True)[source]

Constructs an AlbUNet with ResNet-18 encoder.

Parameters:
  • classes_num – number of classes (number of masks in output)
  • in_channels – number of input channels
  • pretrained – If True, returns a model with encoder pre-trained on ImageNet
neural_pipeline.builtin.models.albunet.resnet34(classes_num: int, in_channels: int, pretrained: bool = True)[source]

Constructs an AlbUNet with ResNet-34 encoder.

Parameters:
  • classes_num – number of classes (number of masks in output)
  • in_channels – number of input channels
  • pretrained – If True, returns a model with encoder pre-trained on ImageNet
neural_pipeline.builtin.models.albunet.resnet50(classes_num: int, in_channels: int, pretrained: bool = True)[source]

Constructs an AlbUNet with ResNet-50 encoder.

Parameters:
  • classes_num – number of classes (number of masks in output)
  • in_channels – number of input channels
  • pretrained – If True, returns a model with encoder pre-trained on ImageNet
neural_pipeline.builtin.models.albunet.resnet101(classes_num: int, in_channels: int, pretrained: bool = True)[source]

Constructs an AlbUNet with ResNet-101 encoder.

Parameters:
  • classes_num – number of classes (number of masks in output)
  • in_channels – number of input channels
  • pretrained – If True, returns a model with encoder pre-trained on ImageNet
neural_pipeline.builtin.models.albunet.resnet152(classes_num: int, in_channels: int, pretrained: bool = True)[source]

Constructs an AlbUNet with ResNet-152 encoder.

Parameters:
  • classes_num – number of classes (number of masks in output)
  • in_channels – number of input channels
  • pretrained – If True, returns a model with encoder pre-trained on ImageNet

DVC

Portrait segmentation network.

This is based on PyTorch, Neural Pipeline and a high-level pipeline built with [DVC](dvc.org).

Repo creation tutorial (note that the resulting code already exists in the repo):

These steps are already done and their results are contained in the repo. To reproduce them, run:
` dvc destroy `
` git commit -m 'deinit DVC' `

### Clone repo

1) Add the PixArt dataset as a submodule
` git submodule add http://172.26.40.23:3000/datasets/pixart.git datasets/ `
2) Load everything from the submodule
` git submodule update --init `

### Build DVC pipeline:

1) Initialize DVC
` dvc init `
` git commit -m 'add DVC' `
2) Set up the pipeline
` dvc run -d train.py -M data/monitors/metrics_log/metrics.json -o data/checkpoints/last/last_checkpoint.zip --no-exec python train.py `
` dvc run -d predict.py -d data/checkpoints/last/last_checkpoint.zip -o result --no-exec python predict.py `
3) Run the pipeline
` dvc repro result.dvc `
4) Last steps

After the pipeline finishes, we get a metrics.json file with metric values and the modified pipeline step files. Let’s add them to the git history:
` git add data/checkpoints/last/.gitignore last_checkpoint.zip.dvc result.dvc metrics.json -f `

### Run another experiment

We add hard negative mining to our training process, so we need to run a new experiment and then compare it with the existing one.

  1. Create a new branch

` git checkout -b hnm `
` dvc checkout `

  2. Repeat all steps from the previous section
  3. Compare metrics

` dvc metrics show -a `

The output will look like this:

```
hnm:
metrics.json: {"train": {"jaccard": 0.8874640464782715, "dice": 0.9423233270645142, "loss": 0.7522647976875305}, "validation": {"jaccard": 0.8573445081710815, "dice": 0.9246319532394409, "loss": 0.7623925805091858}}
master:
metrics.json: {"train": {"jaccard": 0.8774164915084839, "dice": 0.9357065558433533, "loss": 0.7595105767250061}, "validation": {"jaccard": 0.8574965596199036, "dice": 0.927370011806488, "loss": 0.7602806687355042}}
```
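To compare the two branches programmatically, the metric values shown above can be loaded into plain dicts. This is a small sketch: best_branch is a hypothetical helper, and the numbers are copied from the output above.

```python
metrics_by_branch = {
    'hnm': {
        'train': {'jaccard': 0.8874640464782715, 'dice': 0.9423233270645142, 'loss': 0.7522647976875305},
        'validation': {'jaccard': 0.8573445081710815, 'dice': 0.9246319532394409, 'loss': 0.7623925805091858},
    },
    'master': {
        'train': {'jaccard': 0.8774164915084839, 'dice': 0.9357065558433533, 'loss': 0.7595105767250061},
        'validation': {'jaccard': 0.8574965596199036, 'dice': 0.927370011806488, 'loss': 0.7602806687355042},
    },
}


def best_branch(metrics, stage, metric, higher_is_better=True):
    """Return the name of the branch with the best value of the given metric."""
    pick = max if higher_is_better else min
    return pick(metrics, key=lambda branch: metrics[branch][stage][metric])


print(best_branch(metrics_by_branch, 'train', 'jaccard'))
print(best_branch(metrics_by_branch, 'validation', 'jaccard'))
print(best_branch(metrics_by_branch, 'validation', 'loss', higher_is_better=False))
```

On these numbers, the hnm branch is better on the train metrics, while master is slightly better on the validation metrics.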

### Show DVC pipeline:
` dvc pipeline show --ascii result.dvc `
You may see this output:
```
+-------------------------+
| last_checkpoint.zip.dvc |
+-------------------------+

result.dvc

```

## Reproduce results:

Calling dvc repro runs the pipeline, but we need to specify its last step, so we pass the file name of the last pipeline step as a parameter:
` dvc repro result.dvc `

After the pipeline finishes, you can view the metrics (-a shows metrics from all branches):
` dvc metrics show -a `