Welcome to Neural Pipeline’s documentation!

Getting started guide

First, take a look at the main classes of Neural Pipeline.

Training stages are used to customize the training process. With them, Trainer works according to this scheme (data flow for a single epoch):

_images/data_flow.svg

Create dataset

In Neural Pipeline, a dataset is an iterable class. This means the class must implement the __getitem__ and __len__ methods.

For every index i, the dataset must return a Python dict with the keys ‘data’ and ‘target’.

Let’s create an MNIST dataset based on the built-in torchvision dataset:

from torchvision import datasets, transforms

from neural_pipeline.data_producer.data_producer import AbstractDataset

class MNISTDataset(AbstractDataset):
    # transforms applied to every image: convert to tensor, then normalize
    transforms = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))])

    def __init__(self, data_dir: str, is_train: bool):
        # instantiate the underlying torchvision dataset (downloads MNIST if needed)
        self.dataset = datasets.MNIST(data_dir, train=is_train, download=True)

    # return the dataset length
    def __len__(self):
        return len(self.dataset)

    # return a single sample by index
    def __getitem__(self, item):
        data, target = self.dataset[item]
        return {'data': self.transforms(data), 'target': target}
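The dataset contract described above (a class with __len__ and __getitem__ that returns a dict with the keys 'data' and 'target') can be tried out without MNIST or PyTorch. The following is only a minimal sketch: ToyDataset and check_dataset_contract are hypothetical names invented for this illustration and are not part of Neural Pipeline.

```python
class ToyDataset:
    """Hypothetical stand-in dataset: implements only __len__ and __getitem__."""

    def __init__(self, samples, labels):
        if len(samples) != len(labels):
            raise ValueError('samples and labels must have the same length')
        self._samples = samples
        self._labels = labels

    def __len__(self):
        return len(self._samples)

    def __getitem__(self, item):
        # every item is a dict with the 'data' and 'target' keys
        return {'data': self._samples[item], 'target': self._labels[item]}


def check_dataset_contract(dataset):
    """Return True if every item is a dict that has 'data' and 'target' keys."""
    return all(
        isinstance(dataset[i], dict) and {'data', 'target'} <= set(dataset[i])
        for i in range(len(dataset))
    )


toy = ToyDataset([[0.0, 1.0], [1.0, 0.0]], [1, 0])
contract_ok = check_dataset_contract(toy)
```

A check like this may be handy while writing a custom dataset, before wrapping it in a DataProducer.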

To work with this dataset, we need to wrap it in a DataProducer:

from neural_pipeline import DataProducer

# create DataProducer objects for the train and validation datasets
train_dataset = DataProducer([MNISTDataset('data/dataset', True)], batch_size=4, num_workers=2)
validation_dataset = DataProducer([MNISTDataset('data/dataset', False)], batch_size=4, num_workers=2)

Create TrainConfig

Now let’s define a TrainConfig, which holds the training hyperparameters.

In this tutorial we use the predefined stages TrainStage and ValidationStage. TrainStage iterates over its DataProducer and trains the model in train() mode. ValidationStage does the same, but in eval() mode.

import torch

from neural_pipeline import TrainConfig, TrainStage, ValidationStage

# define train stages
train_stages = [TrainStage(train_dataset), ValidationStage(validation_dataset)]

# `model` is an instance of the Net class from the "Create Trainer" section, e.g. model = Net()
loss = torch.nn.NLLLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.5)

# define TrainConfig
train_config = TrainConfig(train_stages, loss, optimizer)

Create Trainer

First of all, we need to specify the model that will be trained:

import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 20, 5, 1)
        self.conv2 = nn.Conv2d(20, 50, 5, 1)
        self.fc1 = nn.Linear(4 * 4 * 50, 500)
        self.fc2 = nn.Linear(500, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, 2, 2)
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2, 2)
        # flatten the 50 feature maps of size 4x4 for the fully connected layers
        x = x.view(-1, 4 * 4 * 50)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)

# instantiate the model that is passed to the optimizer and to Trainer
model = Net()

Now we need to build our training process. This is done by creating a Trainer instance:

from neural_pipeline import FileStructManager, Trainer

# define file structure for experiment
fsm = FileStructManager(base_dir='data', is_continue=False)

# create trainer
trainer = Trainer(model, train_config, fsm, torch.device('cuda:0'))

# specify training epochs number
trainer.set_epoch_num(50)

The last parameter of the Trainer constructor is the target device that will be used for training.

Start training

Now we can start the training process:

trainer.train()

That’s all. The console output will look like this:

Epoch: [1]; train: [0.004141, 1.046422, 3.884116]; validation: [0.002027, 0.304710, 2.673034]
Epoch: [2]; train: [0.000519, 0.249564, 4.938250]; validation: [0.000459, 0.200972, 2.594026]
Epoch: [3]; train: [0.000182, 0.180328, 5.218509]; validation: [0.000135, 0.155546, 2.512275]
train: 31%|███ | 4651/15000 [00:31<01:07, 154.06it/s, loss=[0.154871]]

The first 3 lines are the standard output of ConsoleMonitor. This monitor is included in MonitorHub by default. Every line shows the loss values of the corresponding stage in the format [min, mean, max].

The last line is built by tqdm and comes from TrainStage and ValidationStage. It shows the current mean loss value of the training stage.
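The [min, mean, max] summary in the epoch lines can be reproduced with a few lines of plain Python. This is a sketch of the output format only, not the ConsoleMonitor implementation; summarize_losses and format_epoch_line are hypothetical helper names, and the loss values are made up.

```python
def summarize_losses(losses):
    """Reduce a sequence of per-sample losses to [min, mean, max]."""
    losses = list(losses)
    return [min(losses), sum(losses) / len(losses), max(losses)]


def format_epoch_line(epoch, stage_losses):
    """Build one console line from a dict that maps stage name to its losses."""
    parts = ['Epoch: [{}]'.format(epoch)]
    for stage_name, losses in stage_losses.items():
        lo, mean, hi = summarize_losses(losses)
        parts.append('{}: [{:.6f}, {:.6f}, {:.6f}]'.format(stage_name, lo, mean, hi))
    return '; '.join(parts)


line = format_epoch_line(1, {'train': [0.5, 1.0, 1.5], 'validation': [0.25, 0.75]})
print(line)
```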

Add Tensorboard monitor

To get more useful information about training, we can connect Tensorboard.

To do this, connect the built-in TensorboardMonitor to the Trainer before training starts:

from neural_pipeline.builtin.monitors.tensorboard import TensorboardMonitor

trainer.monitor_hub.add_monitor(TensorboardMonitor(fsm, is_continue=False))

Now the Tensorboard output will look like this:

_images/tensorboard_loss.jpg _images/tensorboard_hist.jpg

Continue training

If we need to train for more epochs but the previously defined objects no longer exist, we need to do this:

# define everything from the previous steps again
# ...

# define FileStructManager with parameter is_continue=True
fsm = FileStructManager(base_dir='data', is_continue=True)

# create trainer
trainer = Trainer(model, train_config, fsm, torch.device('cuda:0'))

# specify training epochs number
trainer.set_epoch_num(50)

# add TensorboardMonitor with parameter is_continue=True
trainer.monitor_hub.add_monitor(TensorboardMonitor(fsm, is_continue=True))

# set Trainer to resume mode and run training
trainer.resume(from_best_checkpoint=False).train()

The parameter from_best_checkpoint=False tells Trainer to continue from the last checkpoint. Neural Pipeline can also save the best checkpoints according to a specified rule. For more information, read about the enable_best_states_saving method of Trainer.

Don’t worry about the training history being displayed incorrectly. If a history already exists, monitors just append new data to it.

After this tutorial, look at the segmentation example to explore how to work with specific metrics.

API

Trainer

The main module for the training process

class neural_pipeline.train.Trainer(model: torch.nn.modules.module.Module, train_config: neural_pipeline.train_config.train_config.TrainConfig, fsm: neural_pipeline.utils.file_structure_manager.FileStructManager, device: torch.device = None)[source]

Class that runs the training process.

Trainer gets a list of training stages and loops over them every epoch.

The training process looks like:

for epoch in epochs_num:
    for stage in training_stages:
        stage.run()
        monitor_hub.update_metrics(stage.metrics_processor().get_metrics())
    save_state()
    on_epoch_end_callback()
Parameters:
  • model – model for training
  • train_config – TrainConfig object
  • fsm – FileStructManager object
  • device – device for training process
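The loop in the pseudocode above can be made runnable with stub objects. The sketch below only illustrates the order of operations inside one epoch; ToyStage and run_training are hypothetical names, and the real Trainer additionally handles monitors, checkpoints and devices.

```python
class ToyStage:
    """Hypothetical stage stub that records when it is run."""

    def __init__(self, name, log):
        self.name = name
        self._log = log

    def run(self):
        self._log.append('run:' + self.name)


def run_training(stages, epochs_num, log, on_epoch_end=None):
    """Loop over all stages once per epoch, then save state and fire the callback."""
    for _ in range(epochs_num):
        for stage in stages:
            stage.run()
            # stands in for monitor_hub.update_metrics(...)
            log.append('metrics:' + stage.name)
        # stands in for save_state()
        log.append('save_state')
        if on_epoch_end is not None:
            on_epoch_end()


events = []
toy_stages = [ToyStage('train', events), ToyStage('validation', events)]
run_training(toy_stages, 2, events, on_epoch_end=lambda: events.append('epoch_end'))
```

After the call, events holds the same six-step sequence twice, once per epoch.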
exception TrainerException(msg)[source]
add_on_epoch_end_callback(callback: callable) → neural_pipeline.train.Trainer[source]

Add a callback that will be called at the end of every epoch

Parameters:callback – method that will be called. This method must not take any parameters
Returns:self object
data_processor() → neural_pipeline.data_processor.data_processor.TrainDataProcessor[source]

Get data processor object

Returns:data processor
disable_best_states_saving() → neural_pipeline.train.Trainer[source]

Disable best states saving

Returns:self object
enable_best_states_saving(rule: callable) → neural_pipeline.train.Trainer[source]

Enable best states saving

Best states will be saved when the value returned by rule reaches a new minimum

Parameters:rule – callback that returns the value used to decide when the best state needs to be stored
Returns:self object
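The "save when the rule reaches a new minimum" behaviour can be sketched as follows. This is an illustration only, not the Trainer implementation: BestStateTracker is a hypothetical name, and treating the very first value as a new minimum is an assumption.

```python
class BestStateTracker:
    """Hypothetical helper: decides whether the current state is the best so far."""

    def __init__(self, rule):
        self._rule = rule  # callable without parameters that returns a number
        self._best = None

    def should_save(self):
        value = self._rule()
        # assumption: the first observed value always counts as a new minimum
        if self._best is None or value < self._best:
            self._best = value
            return True
        return False


history = iter([0.9, 0.7, 0.8, 0.6])  # made-up validation losses, one per epoch
tracker = BestStateTracker(lambda: next(history))
decisions = [tracker.should_save() for _ in range(4)]
```

With these values the best state would be stored after epochs 1, 2 and 4, but not after epoch 3.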
enable_lr_decaying(coeff: float, patience: int, target_val_clbk: callable) → neural_pipeline.train.Trainer[source]

Enable learning rate decaying. The learning rate decays when the value returned by target_val_clbk doesn’t reach a new minimum for patience steps

Parameters:
  • coeff – lr decay coefficient
  • patience – number of steps
  • target_val_clbk – callback which returns the value that is used for lr decaying
Returns:

self object
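A patience-based decay of this kind can be sketched in plain Python. The sketch assumes that the learning rate is multiplied by coeff and that the patience counter restarts after each decay; both are assumptions, not confirmed details of Neural Pipeline, and DecayingLrSketch is a hypothetical name.

```python
class DecayingLrSketch:
    """Hypothetical patience-based learning rate decay."""

    def __init__(self, start_lr, coeff, patience, target_val_clbk):
        self.lr = start_lr
        self._coeff = coeff
        self._patience = patience
        self._clbk = target_val_clbk
        self._best = None
        self._steps_without_improvement = 0

    def step(self):
        value = self._clbk()
        if self._best is None or value < self._best:
            # new minimum: remember it and restart the patience counter
            self._best = value
            self._steps_without_improvement = 0
        else:
            self._steps_without_improvement += 1
            if self._steps_without_improvement >= self._patience:
                # assumption: decay multiplies the learning rate by coeff
                self.lr *= self._coeff
                self._steps_without_improvement = 0
        return self.lr


values = iter([1.0, 0.9, 0.95, 0.93, 0.8])  # made-up target values
sched = DecayingLrSketch(start_lr=0.1, coeff=0.5, patience=2,
                         target_val_clbk=lambda: next(values))
lrs = [sched.step() for _ in range(5)]
```

Here the target value fails to improve at steps 3 and 4, so the learning rate is halved at step 4.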

resume(from_best_checkpoint: bool) → neural_pipeline.train.Trainer[source]

Resume training from the last checkpoint

Parameters:from_best_checkpoint – whether to continue from the best checkpoint
Returns:self object
set_epoch_num(epoch_number: int) → neural_pipeline.train.Trainer[source]

Define the number of epochs for training. One epoch is one iteration over all train stages

Parameters:epoch_number – number of training epochs
Returns:self object
train() → None[source]

Run training process

Train Config

class neural_pipeline.train_config.train_config.TrainConfig(train_stages: [], loss: torch.nn.modules.module.Module, optimizer: torch.optim.optimizer.Optimizer)[source]

Storage for the training process settings

Parameters:
  • train_stages – list of stages for train loop
  • loss – loss criterion
  • optimizer – optimizer object
loss() → torch.nn.modules.module.Module[source]

Get loss object

Returns:loss object
optimizer() → torch.optim.optimizer.Optimizer[source]

Get optimizer object

Returns:optimizer object
stages() → [<class 'neural_pipeline.train_config.train_config.AbstractStage'>][source]

Get list of stages

Returns:list of stages
class neural_pipeline.train_config.train_config.TrainStage(data_producer: neural_pipeline.data_producer.data_producer.DataProducer, metrics_processor: neural_pipeline.train_config.train_config.MetricsProcessor = None, name: str = 'train')[source]

Standard training stage

When run() is called, it iterates process_batch() of the data processor over the data loader with the is_train=True flag.

After the iteration stops, TrainStage accumulates the losses from DataProcessor.

Parameters:
  • data_producer – DataProducer object
  • metrics_processor – MetricsProcessor
  • name – name of stage. By default ‘train’
disable_hard_negative_mining() → neural_pipeline.train_config.train_config.TrainStage[source]

Disable hard negative mining.

Returns:self object
enable_hard_negative_mining(part: float) → neural_pipeline.train_config.train_config.TrainStage[source]

Enable hard negative mining. Hard negatives are selected by their loss values

Parameters:part – part of the data that is repeated after the train stage
Returns:self object
on_epoch_end()[source]

Method that is called after every epoch

run(data_processor: neural_pipeline.data_processor.data_processor.TrainDataProcessor) → None[source]

Run stage

Parameters:data_processor – TrainDataProcessor object
class neural_pipeline.train_config.train_config.ValidationStage(data_producer: neural_pipeline.data_producer.data_producer.DataProducer, metrics_processor: neural_pipeline.train_config.train_config.MetricsProcessor = None, name: str = 'validation')[source]

Standard validation stage.

When run() is called, it iterates process_batch() of the data processor over the data loader with the is_train=False flag.

After the iteration stops, ValidationStage accumulates the losses from DataProcessor.

Parameters:
  • data_producer – DataProducer object
  • metrics_processor – MetricsProcessor
  • name – name of stage. By default ‘validation’
class neural_pipeline.train_config.train_config.AbstractMetric(name: str)[source]

Abstract class for metrics. When it works in neural_pipeline, it stores the metric value for every call of calc()

Parameters:name – name of the metric. The name will be used in monitors, so be careful not to use unsupported characters
calc(output: torch.Tensor, target: torch.Tensor) → numpy.ndarray[source]

Calculate metric by output from model and target

Parameters:
  • output – output from model
  • target – ground truth
get_values() → numpy.ndarray[source]

Get array of metric values

Returns:array of values
static max_val() → float[source]

Get the maximum value of the metric. This is used for correct histogram visualisation in some monitors

Returns:maximum value
static min_val() → float[source]

Get the minimum value of the metric. This is used for correct histogram visualisation in some monitors

Returns:minimum value
name() → str[source]

Get name of metric

Returns:metric name
reset() → None[source]

Reset array of metric values

class neural_pipeline.train_config.train_config.MetricsGroup(name: str)[source]

Class for uniting metrics or other MetricsGroups in one namespace. Note: MetricsGroup may contain only 2 levels of MetricsGroups, so MetricsGroup().add(MetricsGroup().add(MetricsGroup())) will raise MGException

Parameters:name – group name. The name will be used in monitors, so be careful not to use unsupported characters
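The two-level nesting limit can be illustrated with a toy container. The sketch below is not the MetricsGroup implementation: ToyGroup and NestingError are hypothetical names, and the check is performed only at the moment a group is added.

```python
class NestingError(Exception):
    """Hypothetical stand-in for MGException."""


class ToyGroup:
    """Toy group container that allows at most two levels of groups."""

    MAX_DEPTH = 2

    def __init__(self, name):
        self.name = name
        self.groups = []

    def depth(self):
        # a group without nested groups has depth 1
        return 1 + max((group.depth() for group in self.groups), default=0)

    def add(self, group):
        # the check runs only when a group is added to this one
        if 1 + group.depth() > self.MAX_DEPTH:
            raise NestingError('groups may be nested only 2 levels deep')
        self.groups.append(group)
        return self


two_levels = ToyGroup('outer').add(ToyGroup('inner'))
try:
    ToyGroup('a').add(ToyGroup('b').add(ToyGroup('c')))
    three_levels_rejected = False
except NestingError:
    three_levels_rejected = True
```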
exception MGException(msg: str)[source]

Exception for MetricsGroup

add(item: neural_pipeline.train_config.train_config.AbstractMetric) → neural_pipeline.train_config.train_config.MetricsGroup[source]

Add AbstractMetric or MetricsGroup

Parameters:item – object to add
Returns:self object
Return type:MetricsGroup
calc(output: torch.Tensor, target: torch.Tensor) → None[source]

Recursively calculate all metrics in this group and all nested groups

Parameters:
  • output – predict value
  • target – target value
groups() → ['MetricsGroup'][source]

Get list of metrics groups

Returns:list of metrics groups
have_groups() → bool[source]

Check whether this group contains other metrics groups

Returns:True if it does, otherwise False
metrics() → [<class 'neural_pipeline.train_config.train_config.AbstractMetric'>][source]

Get list of metrics

Returns:list of metrics
name() → str[source]

Get group name

Returns:name
reset() → None[source]

Recursively reset all metrics in this group and all nested groups

class neural_pipeline.train_config.train_config.MetricsProcessor[source]

Collection of all AbstractMetrics and MetricsGroups

add_metric(metric: neural_pipeline.train_config.train_config.AbstractMetric) → neural_pipeline.train_config.train_config.AbstractMetric[source]

Add AbstractMetric object

Parameters:metric – metric to add
Returns:metric object
Return type:AbstractMetric
add_metrics_group(group: neural_pipeline.train_config.train_config.MetricsGroup) → neural_pipeline.train_config.train_config.MetricsGroup[source]

Add MetricsGroup object

Parameters:group – metrics group to add
Returns:metrics group object
Return type:MetricsGroup
calc_metrics(output, target) → None[source]

Recursively calculate all metrics

Parameters:
  • output – predict value
  • target – target value
get_metrics() → {}[source]

Get metrics and groups as dict

Returns:dict of metrics and groups with keys [metrics, groups]
reset_metrics() → None[source]

Recursively reset all metrics values

class neural_pipeline.train_config.train_config.AbstractStage(name: str)[source]

Stage of the training process. For example, there may be 2 stages: train and validation. Every epoch in the train loop is an iteration over the stages.

Parameters:name – name of stage
get_losses() → numpy.ndarray[source]

Get losses from this stage

Returns:array of losses or None if this stage doesn’t need losses
metrics_processor() → neural_pipeline.train_config.train_config.MetricsProcessor[source]

Get metrics processor

Returns:MetricsProcessor object or None
name() → str[source]

Get name of stage

Returns:name
on_epoch_end() → None[source]

Callback for train epoch end

run(data_processor: neural_pipeline.data_processor.data_processor.TrainDataProcessor) → None[source]

Run stage

class neural_pipeline.train_config.train_config.StandardStage(stage_name: str, is_train: bool, data_producer: neural_pipeline.data_producer.data_producer.DataProducer, metrics_processor: neural_pipeline.train_config.train_config.MetricsProcessor = None)[source]

Standard stage for train process.

When run() is called, it iterates process_batch() of the data processor over the data loader.

After the iteration stops, the stage accumulates the losses from DataProcessor.

Parameters:
get_losses() → numpy.ndarray[source]

Get losses from this stage

Returns:array of losses
metrics_processor() → neural_pipeline.train_config.train_config.MetricsProcessor[source]

Get the metrics processor of this stage

Returns:MetricsProcessor if specified otherwise None
on_epoch_end() → None[source]

Method that is called after every epoch

run(data_processor: neural_pipeline.data_processor.data_processor.TrainDataProcessor) → None[source]

Run stage. This iterates over the DataProducer and shows progress in stdout

Parameters:data_processor – DataProcessor object

Data Producer

class neural_pipeline.data_producer.data_producer.DataProducer(datasets: [<class 'neural_pipeline.data_producer.data_producer.AbstractDataset'>], batch_size: int = 1, num_workers: int = 0)[source]

Data Producer. Accumulates one or more datasets and passes their data in batches for processing. It uses the built-in PyTorch DataLoader to increase data delivery performance.

Parameters:
  • datasets – list of datasets. Every dataset must be iterable (contain the methods __getitem__ and __len__)
  • batch_size – size of output batch
  • num_workers – number of processes that load data from the datasets and pass it to the output
get_data(dataset_idx: int, data_idx: int) → object[source]

Get a single data item by dataset_idx and data_idx

Parameters:
  • dataset_idx – index of dataset
  • data_idx – index of data in this dataset
Returns:

dataset output

get_loader(indices: [<class 'str'>] = None) → torch.utils.data.dataloader.DataLoader[source]

Get the PyTorch DataLoader object that aggregates this DataProducer. If indices is specified, DataLoader will output data only for these indices. In this case the indices will not be passed.

Parameters:indices – list of indices. Each item of the list is a string in the format ‘{}_{}’.format(dataset_idx, data_idx)
Returns:DataLoader object
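The index strings described above follow the pattern '{}_{}'.format(dataset_idx, data_idx). A small sketch of building and parsing them is shown below; make_index and parse_index are hypothetical helpers, not part of DataProducer.

```python
def make_index(dataset_idx, data_idx):
    """Build an index string in the '{dataset_idx}_{data_idx}' format."""
    return '{}_{}'.format(dataset_idx, data_idx)


def parse_index(index):
    """Split an index string back into (dataset_idx, data_idx) integers."""
    dataset_idx, data_idx = index.split('_')
    return int(dataset_idx), int(data_idx)


indices = [make_index(0, 5), make_index(1, 12)]
parsed = [parse_index(index) for index in indices]
```

A list such as indices is the kind of value the indices parameter of get_loader() expects.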
global_shuffle(is_need: bool) → neural_pipeline.data_producer.data_producer.DataProducer[source]

Set whether global shuffling is needed. If global shuffling is enabled, batches will be compiled from random indices across all datasets. In this case dataset order shuffling is ignored

Parameters:is_need – whether global shuffling is needed
Returns:self object
pass_indices(need_pass: bool) → neural_pipeline.data_producer.data_producer.DataProducer[source]

Pass the indices of data in every batch. Disabled by default

Parameters:need_pass – whether to pass indices
pin_memory(is_need: bool) → neural_pipeline.data_producer.data_producer.DataProducer[source]

Set whether to pin memory on loading. Pinning memory increases data loading performance (especially when data is loaded to the GPU) but is incompatible with swap

Parameters:is_need – whether pinning is needed
Returns:self object
shuffle_datasets_order(is_need: bool) → neural_pipeline.data_producer.data_producer.DataProducer[source]

Set whether to shuffle the order of datasets. Shuffling is performed after every access to index 0

Parameters:is_need – whether shuffling is needed
Returns:self object

File structure management utils

This module contains all classes that work with the file structure

class neural_pipeline.utils.file_structure_manager.FileStructManager(base_dir: str, is_continue: bool, exists_ok: bool = False)[source]

Class that provides directory registration in the base directory.

All modules that use the file structure under the base directory should register their paths in this class by passing the module to the method register_dir(). If the directory is already registered, the registration method will raise FSMException

Parameters:
  • base_dir – path to directory with checkpoints
  • is_continue – whether FileStructManager is used to continue training or to predict
  • exists_ok – if True, all checks for existing directories will be disabled
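The registration idea (every module claims its directory once, and a second claim is an error) can be sketched with a small registry. This is an illustration under simplified assumptions, not the FileStructManager implementation: DirRegistry and RegistrationError are hypothetical names, and objects are identified here by a plain string name.

```python
import os


class RegistrationError(Exception):
    """Hypothetical stand-in for FSMException."""


class DirRegistry:
    """Toy registry: each name and each relative path may be registered once."""

    def __init__(self, base_dir):
        self._base_dir = base_dir
        self._dirs = {}  # name -> path relative to base_dir

    def register_dir(self, name, rel_path):
        if name in self._dirs:
            raise RegistrationError('name already registered: ' + name)
        if rel_path in self._dirs.values():
            raise RegistrationError('path already registered: ' + rel_path)
        self._dirs[name] = rel_path

    def get_path(self, name):
        return os.path.join(self._base_dir, self._dirs[name])


registry = DirRegistry('data')
registry.register_dir('checkpoints', 'checkpoints')
try:
    registry.register_dir('checkpoints', 'other_dir')
    duplicate_rejected = False
except RegistrationError:
    duplicate_rejected = True
```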
exception FSMException(message: str)[source]
get_path(obj: neural_pipeline.utils.file_structure_manager.FolderRegistrable, create_if_non_exists: bool = False, check: bool = True) → str[source]

Get path of registered object

Parameters:
  • obj – object
  • create_if_non_exists – whether to create the object’s directory if it doesn’t exist
  • check – whether to check for the existence of the object’s directory
Returns:

path to directory

Raises:

FSMException – if directory exists and check == True

in_continue_mode() → bool[source]

Is FileStructManager in continue mode

Returns:True if in continue
register_dir(obj: neural_pipeline.utils.file_structure_manager.FolderRegistrable, check_name_registered: bool = True, check_dir_registered: bool = True) → None[source]

Register a directory in the file structure

Parameters:
  • obj – object to register
  • check_name_registered – whether to check if the object name is already registered
  • check_dir_registered – whether to check if the object path is already registered
Raises:

FSMException – if the path or object name is already registered, or if the path already exists (depending on the values of the optional parameters)

class neural_pipeline.utils.file_structure_manager.CheckpointsManager(fsm: neural_pipeline.utils.file_structure_manager.FileStructManager, prefix: str = None)[source]

Class that manages checkpoints for DataProcessor.

All states are packed into a zip file. It contains a few files: model weights, optimizer state, data processor state

Parameters:
  • fsm – FileStructManager instance
  • prefix – prefix of saved and loaded files
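Packing several state files into one zip archive and unpacking them again can be sketched with the standard library. The file names and the archive name below are made up for the illustration, and pack_states and unpack_states are hypothetical helpers rather than CheckpointsManager methods.

```python
import os
import tempfile
import zipfile


def pack_states(work_dir, file_names, archive_name):
    """Move the given files from work_dir into one zip archive."""
    archive_path = os.path.join(work_dir, archive_name)
    with zipfile.ZipFile(archive_path, 'w') as archive:
        for name in file_names:
            file_path = os.path.join(work_dir, name)
            archive.write(file_path, arcname=name)
            os.remove(file_path)  # keep only the packed copy
    return archive_path


def unpack_states(archive_path, work_dir):
    """Extract all state files and return their sorted names."""
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(work_dir)
        return sorted(archive.namelist())


with tempfile.TemporaryDirectory() as tmp:
    # hypothetical file names for weights, optimizer state and trainer state
    names = ['weights.pth', 'optimizer.pth', 'trainer.json']
    for name in names:
        with open(os.path.join(tmp, name), 'w') as out:
            out.write('placeholder for ' + name)
    archive_file = pack_states(tmp, names, 'last_checkpoint.zip')
    packed_listing = sorted(os.listdir(tmp))
    restored = unpack_states(archive_file, tmp)
```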
exception SMException(message: str)[source]

Exception for CheckpointsManager

clear_files() → None[source]

Clear unpacked files

optimizer_state_file() → str[source]

Get optimizer state file path

Returns:path
pack() → None[source]

Pack all files in zip

trainer_file() → str[source]

Get trainer state file path

Returns:path
unpack() → None[source]

Unpack state files

weights_file() → str[source]

Get model weights file path

Returns:path
class neural_pipeline.utils.file_structure_manager.FolderRegistrable(fsm: neural_pipeline.utils.file_structure_manager.FileStructManager)[source]

Abstract class for implementing classes that use folders

Parameters:fsm – FileStructManager class instance

Monitoring

Main module for monitoring the training process

class neural_pipeline.monitoring.MonitorHub[source]

Aggregator of monitors. This class collects monitors and provides a unified interface to them

add_monitor(monitor: neural_pipeline.monitoring.AbstractMonitor) → neural_pipeline.monitoring.MonitorHub[source]

Connect monitor to hub

Parameters:monitor – AbstractMonitor object
Returns:
set_epoch_num(epoch_num: int) → None[source]

Set current epoch num

Parameters:epoch_num – num of current epoch
update_losses(losses: {}) → None[source]

Update monitor

Parameters:losses – losses values with keys ‘train’ and ‘validation’
update_metrics(metrics: {}) → None[source]

Update metrics in all monitors

Parameters:metrics – metrics dict with keys ‘metrics’ and ‘groups’
class neural_pipeline.monitoring.AbstractMonitor[source]

Basic class for every monitor.

set_epoch_num(epoch_num: int) → None[source]

Set current epoch num

Parameters:epoch_num – num of current epoch
update_losses(losses: {}) → None[source]

Update losses on monitor

Parameters:losses – dict of loss values whose keys are the names of stages in the train pipeline (e.g. [train, validation])
update_metrics(metrics: {}) → None[source]

Update metrics on monitor

Parameters:metrics – metrics dict with keys ‘metrics’ and ‘groups’
class neural_pipeline.monitoring.ConsoleMonitor[source]

Monitor that writes metrics to the console.

Output looks like: Epoch: [#]; train: [-1, 0, 1]; validation: [-1, 0, 1]. These 3 numbers are the [min, mean, max] of the stage’s loss values

update_losses(losses: {}) → None[source]

Update losses on monitor

Parameters:losses – dict of loss values whose keys are the names of stages in the train pipeline (e.g. [train, validation])
class neural_pipeline.monitoring.LogMonitor(fsm: neural_pipeline.utils.file_structure_manager.FileStructManager)[source]

Monitor used for logging metrics. It writes a full log and can also write the last metrics to a separate file if required

All output files are in JSON format and are stored in <base_dir_path>/monitors/metrics_log

Parameters:fsm – FileStructManager object
close() → None[source]

Close monitor

get_final_metrics_file() → str[source]

Get final metrics file path

Returns:path or None if writing wasn’t enabled by write_final_metrics()
update_losses(losses: {}) → None[source]

Update losses on monitor

Parameters:losses – dict of loss values whose keys are the names of stages in the train pipeline (e.g. [train, validation])
update_metrics(metrics: {}) → None[source]

Update metrics on monitor

Parameters:metrics – metrics dict with keys ‘metrics’ and ‘groups’
write_final_metrics(path: str = None) → neural_pipeline.monitoring.LogMonitor[source]

Enable saving final metrics to a separate file

Parameters:path – path to the result file. If not defined, the file will be placed next to the full metrics log and named ‘metrics.json’
Returns:self object
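Writing final metrics as JSON next to the metrics log can be sketched as follows. The directory layout follows the description above (<base_dir_path>/monitors/metrics_log and the default name metrics.json); the layout of the metrics dict and its values are made up, and this standalone write_final_metrics function is a hypothetical helper, not the LogMonitor method.

```python
import json
import os
import tempfile


def write_final_metrics(base_dir, metrics, path=None):
    """Dump metrics as JSON; default to <base_dir>/monitors/metrics_log/metrics.json."""
    if path is None:
        log_dir = os.path.join(base_dir, 'monitors', 'metrics_log')
        os.makedirs(log_dir, exist_ok=True)
        path = os.path.join(log_dir, 'metrics.json')
    with open(path, 'w') as out:
        json.dump(metrics, out, indent=2)
    return path


with tempfile.TemporaryDirectory() as base_dir:
    # made-up values; the real log structure may differ
    final_metrics = {'train': {'loss': 0.18}, 'validation': {'loss': 0.15}}
    result_path = write_final_metrics(base_dir, final_metrics)
    relative_path = os.path.relpath(result_path, base_dir)
    with open(result_path) as src:
        loaded = json.load(src)
```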

Data Processor

class neural_pipeline.data_processor.data_processor.DataProcessor(model: torch.nn.modules.module.Module, device: torch.device = None)[source]

DataProcessor manages the model, data processing and device selection

Parameters:
  • model – model that will be used to process data
  • device – device to which the model and data are passed for processing
load() → None[source]

Load model weights from checkpoint

model() → torch.nn.modules.module.Module[source]

Get current module

predict(data: torch.Tensor) → object[source]

Make a prediction from data

Parameters:data – data as torch.Tensor or dict with the key data
Returns:processed output
Return type:the model output type
save_state() → None[source]

Save the state of the optimizer and the number of performed epochs

class neural_pipeline.data_processor.data_processor.TrainDataProcessor(model: torch.nn.modules.module.Module, train_config: TrainConfig, device: torch.device = None)[source]

TrainDataProcessor does everything DataProcessor does and also runs the training process.

Parameters:
  • model – model that will be used to process data
  • train_config – train config
  • device – device to which the model, data and optimizer are passed for processing
exception TDPException(msg)[source]
get_lr() → float[source]

Get learning rate from optimizer

get_state() → {}[source]

Get model and optimizer state dicts

Returns:dict with keys [weights, optimizer]
load() → None[source]

Load state of model, optimizer and TrainDataProcessor from checkpoint

predict(data, is_train=False) → torch.Tensor[source]

Make predict by data. If is_train was True

Parameters:
  • data – data in dict
  • is_train – whether the data processor needs to train on the data or just predict
Returns:

processed output

Return type:

model return type

process_batch(batch: {}, is_train: bool, metrics_processor: AbstractMetricsProcessor = None) → numpy.ndarray[source]

Process one batch of data

Parameters:
  • batch – dict containing the ‘data’ and ‘target’ keys. The value for each key must be an instance of torch.Tensor or dict
  • is_train – whether the batch is processed for training
  • metrics_processor – metrics processor for collecting metrics after the batch is processed
Returns:

array of losses with shape (N, …) where N is batch size

save_state() → None[source]

Save the state of the optimizer and the number of performed epochs

update_lr(lr: float) → None[source]

Update the learning rate directly in the optimizer

Parameters:lr – target learning rate

Model

class neural_pipeline.data_processor.model.Model(base_model: torch.nn.modules.module.Module)[source]

Wrapper for torch.nn.Module. This class provides initialization, calling and serialization for it

Parameters:base_model – torch.nn.Module object
exception ModelException(msg)[source]
load_weights(weights_file: str = None) → None[source]

Load weights from checkpoint

model() → torch.nn.modules.module.Module[source]

Get internal torch.nn.Module object

Returns:internal torch.nn.Module object
save_weights(weights_file: str = None) → None[source]

Serialize weights to file

set_checkpoints_manager(manager: neural_pipeline.utils.file_structure_manager.CheckpointsManager) → neural_pipeline.data_processor.model.Model[source]

Set the checkpoints manager that will be used to determine the path for reading and writing the weights file

Parameters:manager – CheckpointsManager instance
Returns:self object
to_device(device: torch.device) → neural_pipeline.data_processor.model.Model[source]

Pass model to specified device

Predictor

The main module for running inference

class neural_pipeline.predict.Predictor(model: neural_pipeline.data_processor.model.Model, fsm: neural_pipeline.utils.file_structure_manager.FileStructManager, device: torch.device = None)[source]

Predictor runs inference with trained parameters

Parameters:
  • model – model object, used for prediction
  • fsm – FileStructManager object
  • device – device for run inference
predict(data: torch.Tensor)[source]

Predict for one data item

Parameters:data – data as torch.Tensor or dict with key data
Returns:processed output
Return type:model output type
predict_dataset(data_producer: neural_pipeline.data_producer.data_producer.DataProducer, callback: callable) → None[source]

Run prediction by iterating over data_producer

Parameters:
  • data_producer – DataProducer object
  • callback – callback that is called for every data prediction and gets its result as a parameter
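The callback-driven flow of predict_dataset can be sketched without a model. Below, a plain function stands in for the model and a list of dicts stands in for the batches of a DataProducer; this standalone predict_dataset is a hypothetical helper that mirrors only the calling convention, not the Predictor method itself.

```python
def predict_dataset(batches, predict, callback):
    """Run predict on every batch and hand each result to callback."""
    for batch in batches:
        callback(predict(batch['data']))


results = []
toy_batches = [{'data': [1, 2]}, {'data': [3]}]
predict_dataset(
    toy_batches,
    predict=lambda data: [x * 2 for x in data],  # stand-in for a model
    callback=results.append,
)
```

Collecting results with list.append is the simplest callback; a callback could equally write each prediction to disk.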

Builtin modules

The builtin module contains all modules that can’t be tested or that have a specific field of application.

Tensorboard

This module contains Tensorboard monitor interface

class neural_pipeline.builtin.monitors.tensorboard.TensorboardMonitor(fsm: neural_pipeline.utils.file_structure_manager.FileStructManager, is_continue: bool, network_name: str = None)[source]

Class that manages metrics and events monitoring. It works with Tensorboard. The monitor gets metrics after each epoch ends and visualises them. Metrics may be float or np.array values. If a metric is an np.array, it will be shown as a histogram and as scalars (scalar plots contain the mean values from the array).

Parameters:
  • fsm – file structure manager
  • is_continue – whether the data processor continues training
  • network_name – network name
update_losses(losses: {}) → None[source]

Update monitor

Parameters:losses – losses values with keys ‘train’ and ‘validation’
update_metrics(metrics: {}) → None[source]

Update monitor

Parameters:metrics – metrics dict with keys ‘metrics’ and ‘groups’
update_scalar(name: str, value: float, epoch_idx: int = None) → None[source]

Update scalar on tensorboard

Parameters:
  • name – the classic tag for TensorboardX
  • value – scalar value
  • epoch_idx – epoch idx. If not set, the last epoch idx stored in this class is used
visualize_model(model: neural_pipeline.data_processor.model.Model, tensor) → None[source]

Visualize model graph

Parameters:
  • model – torch.nn.Module object
  • tensor – dummy input for trace model
write_to_txt_log(text: str, tag: str = None) → None[source]

Write to txt log

Parameters:
  • text – text that will be written
  • tag – tag

Matplotlib

This module contains Matplotlib monitor interface

class neural_pipeline.builtin.monitors.mpl.MPLMonitor[source]

This monitor shows all data in Matplotlib plots

realtime(is_realtime: bool) → neural_pipeline.builtin.monitors.mpl.MPLMonitor[source]

Set whether to show data updates in realtime

Parameters:is_realtime – whether realtime updates are needed
Returns:self object
update_losses(losses: {})[source]

Update losses on monitor

Parameters:losses – dict of loss values whose keys are the names of stages in the train pipeline (e.g. [train, validation])
update_metrics(metrics: {}) → None[source]

Update metrics on monitor

Parameters:metrics – metrics dict with keys ‘metrics’ and ‘groups’

AlbUNet

This module provides AlbUNet: U-Net with a ResNet encoder. This model was written by Alexander Buslaev and spoiled by me.

This model can be constructed with ‘resnet18’, ‘resnet34’, ‘resnet50’, ‘resnet101’, ‘resnet152’ encoders.

To create a model, just call the resnet<number> method

neural_pipeline.builtin.models.albunet.resnet18(classes_num: int, in_channels: int, pretrained: bool = True)[source]

Constructs an AlbUNet with ResNet-18 encoder.

Parameters:
  • classes_num – number of classes (number of masks in output)
  • in_channels – number of input channels
  • pretrained – If True, returns a model with encoder pre-trained on ImageNet
neural_pipeline.builtin.models.albunet.resnet34(classes_num: int, in_channels: int, pretrained: bool = True)[source]

Constructs an AlbUNet with ResNet-34 encoder.

Parameters:
  • classes_num – number of classes (number of masks in output)
  • in_channels – number of input channels
  • pretrained – If True, returns a model with encoder pre-trained on ImageNet
neural_pipeline.builtin.models.albunet.resnet50(classes_num: int, in_channels: int, pretrained: bool = True)[source]

Constructs an AlbUNet with ResNet-50 encoder.

Parameters:
  • classes_num – number of classes (number of masks in output)
  • in_channels – number of input channels
  • pretrained – If True, returns a model with encoder pre-trained on ImageNet
neural_pipeline.builtin.models.albunet.resnet101(classes_num: int, in_channels: int, pretrained: bool = True)[source]

Constructs an AlbUNet with ResNet-101 encoder.

Parameters:
  • classes_num – number of classes (number of masks in output)
  • in_channels – number of input channels
  • pretrained – If True, returns a model with encoder pre-trained on ImageNet
neural_pipeline.builtin.models.albunet.resnet152(classes_num: int, in_channels: int, pretrained: bool = True)[source]

Constructs an AlbUNet with ResNet-152 encoder.

Parameters:
  • classes_num – number of classes (number of masks in output)
  • in_channels – number of input channels
  • pretrained – If True, returns a model with encoder pre-trained on ImageNet