Welcome to Neural Pipeline’s documentation!¶
Getting started guide¶
First of all, look at the main classes of Neural Pipeline:
- Trainer - the class that implements the training process
- TrainConfig - the class that stores hyperparameters
- AbstractTrainStage - the base class for a single stage of the training process. Don't worry, Neural Pipeline has predefined classes for common use cases: TrainStage, ValidationStage and the more general StandardStage
- DataProducer - the class that unites datasets under a single interface
- FileStructManager - the class that manages the file structure
Training stages are needed to customize the training process: with them, Trainer works through each epoch by looping over the stages.
Implement dataset class¶
In Neural Pipeline a dataset is an iterable class, i.e. a class that implements the __getitem__ and __len__ methods. For every i-th item the dataset must produce a Python dict with the keys 'data' and 'target'.
Let's create an MNIST dataset based on the builtin PyTorch dataset:
from torchvision import datasets, transforms

# AbstractDataset lives in the data_producer module (see the API section below)
from neural_pipeline.data_producer.data_producer import AbstractDataset


class MNISTDataset(AbstractDataset):
    # define transforms
    transforms = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))])

    def __init__(self, data_dir: str, is_train: bool):
        # instantiate the builtin PyTorch dataset
        self.dataset = datasets.MNIST(data_dir, train=is_train, download=True)

    # return the dataset length
    def __len__(self):
        return len(self.dataset)

    # return a single item by index
    def __getitem__(self, item):
        data, target = self.dataset[item]
        return {'data': self.transforms(data), 'target': target}
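As a quick sanity check, every item comes back as the required dict (a minimal sketch; the shape in the comment is what MNIST with ToTensor produces):

sample = MNISTDataset('data/dataset', True)[0]
print(sample['data'].shape, sample['target'])  # torch.Size([1, 28, 28]) and an int label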
To work with this dataset we need to wrap it in a DataProducer:
from neural_pipeline import DataProducer
# create train and validation data producers
train_dataset = DataProducer([MNISTDataset('data/dataset', True)], batch_size=4, num_workers=2)
validation_dataset = DataProducer([MNISTDataset('data/dataset', False)], batch_size=4, num_workers=2)
Create TrainConfig¶
Now let's define a TrainConfig that will contain the training hyperparameters. In this tutorial we use the predefined stages TrainStage and ValidationStage. TrainStage iterates over a DataProducer and teaches the model in train() mode. Respectively, ValidationStage does the same but in eval() mode.
import torch

from neural_pipeline import TrainConfig, TrainStage, ValidationStage

# define the train stages
train_stages = [TrainStage(train_dataset), ValidationStage(validation_dataset)]
loss = torch.nn.NLLLoss()
# 'model' is the network instantiated in the next section
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.5)

# define the TrainConfig
train_config = TrainConfig(train_stages, loss, optimizer)
Create Trainer¶
First of all, we need to specify the model that will be trained:
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 20, 5, 1)
        self.conv2 = nn.Conv2d(20, 50, 5, 1)
        self.fc1 = nn.Linear(4 * 4 * 50, 500)
        self.fc2 = nn.Linear(500, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, 2, 2)
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2, 2)
        x = x.view(-1, 4 * 4 * 50)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)


model = Net()
Now we need to build our training process. This is done with the Trainer class:
from neural_pipeline import FileStructManager, Trainer

# define the file structure for the experiment
fsm = FileStructManager(base_dir='data', is_continue=False)
# create the trainer
trainer = Trainer(model, train_config, fsm, torch.device('cuda:0'))
# specify the number of training epochs
trainer.set_epoch_num(50)
The last parameter of the Trainer constructor is the target device that will be used for training.
Start training¶
Now we can just start the training process:
trainer.train()
That's all. The console output will look like this: the first 3 lines are the standard output of ConsoleMonitor, which is included in MonitorHub by default. Every line shows the loss values of the corresponding stage as [min, mean, max]. The last line is built by tqdm and comes from TrainStage and ValidationStage; it shows the current mean value of the metrics for the running stage.
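For example, a ConsoleMonitor line follows the format documented in the Monitoring section below (the numbers here are purely illustrative, not real training output):

Epoch: [1]; train: [0.021, 0.124, 0.562]; validation: [0.019, 0.118, 0.540]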
Add Tensorboard monitor¶
To get more useful information about training, we can connect Tensorboard. To do that, we need to connect the builtin TensorboardMonitor to the Trainer before training:
from neural_pipeline.builtin.monitors.tensorboard import TensorboardMonitor
trainer.monitor_hub.add_monitor(TensorboardMonitor(fsm, is_continue=False))
Now the losses and metrics will also be plotted in the Tensorboard interface.
Continue training¶
If we need to run some more training epochs but no longer have the previously defined objects, we need to do this:
# define again all from previous steps
# ...
# define FileStructManager with parameter is_continue=True
fsm = FileStructManager(base_dir='data', is_continue=True)
# create trainer
trainer = Trainer(model, train_config, fsm, torch.device('cuda:0'))
# specify the number of training epochs
trainer.set_epoch_num(50)
# add TensorboardMonitor with parameter is_continue=True
trainer.monitor_hub.add_monitor(TensorboardMonitor(fsm, is_continue=True))
# set Trainer to resume mode and run training
trainer.resume(from_best_checkpoint=False).train()
The parameter from_best_checkpoint=False tells the Trainer that it needs to continue from the last checkpoint.
Neural Pipeline can save the best checkpoints by a specified rule. For more information, read about the enable_best_states_saving method of Trainer.
Don't worry about incorrect display of the training history: if a history already exists, the monitors just append the new data to it.
After this tutorial, look at the segmentation example to explore how to work with specific metrics.
API¶
Trainer¶
The main module for the training process
class neural_pipeline.train.Trainer(train_config: neural_pipeline.train_config.train_config.TrainConfig, fsm: neural_pipeline.utils.fsm.FileStructManager, device: torch.device = None)[source]¶
Class that drives the training process.
Trainer gets a list of training stages and loops over it every epoch. The training process looks like:

for epoch in epochs_num:
    for stage in training_stages:
        stage.run()
        monitor_hub.update_metrics(stage.metrics_processor().get_metrics())
    save_state()
    on_epoch_end_callback()

Parameters:
- train_config – TrainConfig object
- fsm – FileStructManager object
- device – device for the training process
add_on_epoch_end_callback(callback: callable) → neural_pipeline.train.Trainer[source]¶
Add a callback that will be called after every epoch ends.
Parameters: callback – the method to be called; it will be called without any parameters
Returns: self object
add_stop_rule(rule: callable) → neural_pipeline.train.Trainer[source]¶
Add a rule that controls interruption of the training process.
Parameters: rule – a callable that takes no parameters and returns a boolean; when one of the rules returns True, the training loop is interrupted
Returns: self object

Example:

trainer.add_stop_rule(lambda: trainer.data_processor().get_lr() < 1e-6)
data_processor() → neural_pipeline.data_processor.data_processor.TrainDataProcessor[source]¶
Get data processor object.
Returns: data processor
disable_best_states_saving() → neural_pipeline.train.Trainer[source]¶
Disable best states saving.
Returns: self object
enable_best_states_saving(rule: callable) → neural_pipeline.train.Trainer[source]¶
Enable best states saving. A best state is saved when the value returned by rule reaches a new minimum.
Parameters: rule – callback which returns the value that is used to decide when to store the best state
Returns: self object
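For example, the rule can track the mean validation loss. A minimal sketch, assuming the ValidationStage object from the tutorial is kept in a variable and that its get_losses() array is populated after each epoch:

import numpy as np

val_stage = ValidationStage(validation_dataset)
# a checkpoint is stored whenever this value reaches a new minimum
trainer.enable_best_states_saving(lambda: np.mean(val_stage.get_losses()))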
enable_lr_decaying(coeff: float, patience: int, target_val_clbk: callable) → neural_pipeline.train.Trainer[source]¶
Enable learning rate decaying. The learning rate decays when the value returned by target_val_clbk doesn't update its minimum for patience steps.
Parameters:
- coeff – lr decay coefficient
- patience – number of steps
- target_val_clbk – callback which returns the value that is used for lr decaying
Returns: self object
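A minimal usage sketch (the target value callback mirrors the best-state example above and is an assumption, not part of the documented API):

# halve the lr when the mean validation loss hasn't improved for 5 epochs
trainer.enable_lr_decaying(coeff=0.5, patience=5, target_val_clbk=lambda: np.mean(val_stage.get_losses()))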
resume(from_best_checkpoint: bool) → neural_pipeline.train.Trainer[source]¶
Resume training from a checkpoint.
Parameters: from_best_checkpoint – whether to continue from the best checkpoint instead of the last one
Returns: self object
Train Config¶
class neural_pipeline.train_config.train_config.TrainConfig(model: torch.nn.modules.module.Module, train_stages: [], loss: torch.nn.modules.module.Module, optimizer: torch.optim.optimizer.Optimizer)[source]¶
Training process settings storage.
Parameters:
- train_stages – list of stages for the train loop
- loss – loss criterion
- optimizer – optimizer object
class neural_pipeline.train_config.train_config.ComparableTrainConfig(name: str = None)[source]¶
Training process settings storage with a name. Used for training with several train configs at the same time.
Parameters: name – name of the train config
class neural_pipeline.train_config.train_config.TrainStage(data_producer: neural_pipeline.data_producer.data_producer.DataProducer, metrics_processor: neural_pipeline.train_config.train_config.MetricsProcessor = None, name: str = 'train')[source]¶
Standard training stage. When run() is called, it iterates process_batch() of the data processor over the data loader with the is_train=True flag. After iteration stops, the stage accumulates the losses from DataProcessor.
Parameters:
- data_producer – DataProducer object
- metrics_processor – MetricsProcessor
- name – name of the stage, 'train' by default

disable_hard_negative_mining() → neural_pipeline.train_config.train_config.TrainStage[source]¶
Disable hard negative mining.
Returns: self object
class neural_pipeline.train_config.train_config.ValidationStage(data_producer: neural_pipeline.data_producer.data_producer.DataProducer, metrics_processor: neural_pipeline.train_config.train_config.MetricsProcessor = None, name: str = 'validation')[source]¶
Standard validation stage. When run() is called, it iterates process_batch() of the data processor over the data loader with the is_train=False flag. After iteration stops, ValidationStage accumulates the losses from DataProcessor.
Parameters:
- data_producer – DataProducer object
- metrics_processor – MetricsProcessor
- name – name of the stage, 'validation' by default
class neural_pipeline.train_config.train_config.AbstractMetric(name: str)[source]¶
Abstract class for metrics. When it works inside neural_pipeline, it stores the metric value on every call of calc().
Parameters: name – name of the metric. The name will be used in monitors, so be careful about using unsupported characters.

calc(output: torch.Tensor, target: torch.Tensor) → numpy.ndarray[source]¶
Calculate the metric from the model output and the target.
Parameters:
- output – output from the model
- target – ground truth

static max_val() → float[source]¶
Get the maximum value of the metric. This is used for correct histogram visualisation in some monitors.
Returns: maximum value
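For illustration, a custom accuracy metric could implement this interface like so (a minimal sketch; the argmax-based logic is an assumption that suits the log-softmax model from the tutorial, not part of the library):

import numpy as np
import torch

from neural_pipeline.train_config.train_config import AbstractMetric


class AccuracyMetric(AbstractMetric):
    def __init__(self):
        super().__init__('accuracy')

    def calc(self, output: torch.Tensor, target: torch.Tensor) -> np.ndarray:
        # per-sample correctness of the predicted class for the batch
        pred = output.argmax(dim=1)
        return (pred == target).cpu().numpy().astype(np.float32)

    @staticmethod
    def max_val() -> float:
        return 1.0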
class neural_pipeline.train_config.train_config.MetricsGroup(name: str)[source]¶
Class for uniting metrics, or other MetricsGroup's, in one namespace. Note: a MetricsGroup may contain only 2 levels of MetricsGroup's, so MetricsGroup().add(MetricsGroup().add(MetricsGroup())) will raise MGException.
Parameters: name – group name. The name will be used in monitors, so be careful about using unsupported characters.

add(item: neural_pipeline.train_config.train_config.AbstractMetric) → neural_pipeline.train_config.train_config.MetricsGroup[source]¶
Add an AbstractMetric or a MetricsGroup.
Parameters: item – object to add
Returns: self object
Return type: MetricsGroup

calc(output: torch.Tensor, target: torch.Tensor) → None[source]¶
Recursively calculate all metrics in this group and in all nested groups.
Parameters:
- output – predicted value
- target – target value

have_groups() → bool[source]¶
Whether this group contains other metrics groups.
Returns: True if it does, otherwise False
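A usage sketch for grouping (AccuracyMetric is the hypothetical metric sketched above):

group = MetricsGroup('classification')
group.add(AccuracyMetric())
# one nested level is allowed; a third level would raise MGException
group.add(MetricsGroup('extra').add(AccuracyMetric()))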
class neural_pipeline.train_config.train_config.MetricsProcessor[source]¶
Collection for all AbstractMetric's and MetricsGroup's.

add_metric(metric: neural_pipeline.train_config.train_config.AbstractMetric) → neural_pipeline.train_config.train_config.AbstractMetric[source]¶
Add an AbstractMetric object.
Parameters: metric – metric to add
Returns: metric object
Return type: AbstractMetric

add_metrics_group(group: neural_pipeline.train_config.train_config.MetricsGroup) → neural_pipeline.train_config.train_config.MetricsGroup[source]¶
Add a MetricsGroup object.
Parameters: group – metrics group to add
Returns: metrics group object
Return type: MetricsGroup

calc_metrics(output, target) → None[source]¶
Recursively calculate all metrics.
Parameters:
- output – predicted value
- target – target value
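A sketch of wiring a metrics processor into a stage, following the stage signatures documented on this page (AccuracyMetric is the hypothetical metric from above):

metrics_processor = MetricsProcessor()
metrics_processor.add_metric(AccuracyMetric())
metrics_processor.add_metrics_group(MetricsGroup('classification').add(AccuracyMetric()))

# pass it to a stage so metrics are calculated while the stage runs
train_stage = TrainStage(train_dataset, metrics_processor)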
class neural_pipeline.train_config.train_config.AbstractStage(name: str)[source]¶
A stage of the training process. For example, there may be 2 stages: train and validation. Every epoch of the train loop iterates over the stages.
Parameters: name – name of the stage

get_losses() → numpy.ndarray[source]¶
Get the losses from this stage.
Returns: array of losses, or None if this stage doesn't need losses
class neural_pipeline.train_config.train_config.StandardStage(stage_name: str, is_train: bool, data_producer: neural_pipeline.data_producer.data_producer.DataProducer, metrics_processor: neural_pipeline.train_config.train_config.MetricsProcessor = None)[source]¶
Standard stage for the training process. When run() is called, it iterates process_batch() of the data processor over the data loader. After iteration stops, the stage accumulates the losses from DataProcessor.
Parameters:
- data_producer – DataProducer object
- metrics_processor – MetricsProcessor

metrics_processor() → neural_pipeline.train_config.train_config.MetricsProcessor[source]¶
Get the metrics processor of this stage.
Returns: the MetricsProcessor if specified, otherwise None
Data Producer¶
class neural_pipeline.data_producer.data_producer.DataProducer(datasets: [AbstractDataset], batch_size: int = 1, num_workers: int = 0)[source]¶
Data Producer. Accumulates one or more datasets and passes their data in batches for processing. It uses the PyTorch builtin DataLoader to increase the performance of data delivery.
Parameters:
- datasets – list of datasets. Every dataset must be iterable (contain the __getitem__ and __len__ methods)
- batch_size – size of the output batch
- num_workers – number of processes that load data from the datasets and pass it to the output

get_data(dataset_idx: int, data_idx: int) → object[source]¶
Get a single data item by dataset index and data index.
Parameters:
- dataset_idx – index of the dataset
- data_idx – index of the data item in this dataset
Returns: dataset output

get_indices() → [str][source]¶
Get the current indices.
Returns: list of current indices, or None if set_indices() hasn't been called

get_loader(indices: [str] = None) → torch.utils.data.dataloader.DataLoader[source]¶
Get a PyTorch DataLoader object that aggregates the DataProducer. If indices is specified, the DataLoader will output data only by these indices; in this case the indices themselves will not be passed along.
Parameters: indices – list of indices. Each item of the list is a string in the format '{}_{}'.format(dataset_idx, data_idx)
Returns: DataLoader object

global_shuffle(is_need: bool) → neural_pipeline.data_producer.data_producer.DataProducer[source]¶
Enable or disable global shuffling. If global shuffling is enabled, batches are compiled from random indices across all datasets; in this case the shuffling of the dataset order is ignored.
Parameters: is_need – whether global shuffling is needed
Returns: self object

pass_indices(need_pass: bool) → neural_pipeline.data_producer.data_producer.DataProducer[source]¶
Pass the indices of the data in every batch. Disabled by default.
Parameters: need_pass – whether to pass indices

pin_memory(is_need: bool) → neural_pipeline.data_producer.data_producer.DataProducer[source]¶
Enable or disable memory pinning on loading. Pinning memory increases data loading performance (especially when data is loaded to a GPU) but is incompatible with swap.
Parameters: is_need – whether pinning is needed
Returns: self object

set_indices(indices: [str]) → neural_pipeline.data_producer.data_producer.DataProducer[source]¶
Set indices for the DataProducer. After that, the DataProducer produces data only by these indices.
Parameters: indices – list of indices in the format '<dataset_idx>_<data_idx>', e.g. ['0_0', '0_1', '1_0']
Returns: self object
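The configuration methods return self, so they can be chained. A usage sketch based on the tutorial's MNIST dataset (the index strings follow the documented '<dataset_idx>_<data_idx>' format):

producer = DataProducer([MNISTDataset('data/dataset', True)], batch_size=4, num_workers=2)
producer.global_shuffle(True).pin_memory(True)

# restrict the producer to three specific samples of the first dataset
producer.set_indices(['0_0', '0_1', '0_2'])
loader = producer.get_loader()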
File structure management utils¶
Monitoring¶
The main module for monitoring the training process.
It contains:
- MonitorHub - a monitors collection used to connect all monitors to Trainer
- AbstractMonitor - the base class for all monitors that will be connected to MonitorHub
- ConsoleMonitor - a monitor used to write epoch results to the console
- LogMonitor - a monitor used for metrics logging
class neural_pipeline.monitoring.MonitorHub[source]¶
Aggregator of monitors. This class collects monitors and provides a unified interface to them.

add_monitor(monitor: neural_pipeline.monitoring.AbstractMonitor) → neural_pipeline.monitoring.MonitorHub[source]¶
Connect a monitor to the hub.
Parameters: monitor – AbstractMonitor object
Returns: self object

set_epoch_num(epoch_num: int) → None[source]¶
Set the current epoch number.
Parameters: epoch_num – number of the current epoch
class neural_pipeline.monitoring.AbstractMonitor[source]¶
Base class for every monitor.

set_epoch_num(epoch_num: int) → None[source]¶
Set the current epoch number.
Parameters: epoch_num – number of the current epoch
class neural_pipeline.monitoring.ConsoleMonitor[source]¶
Monitor used to write metrics to the console. The output looks like: Epoch: [#]; train: [-1, 0, 1]; validation: [-1, 0, 1]. These 3 numbers are the [min, mean, max] values of the stage's loss values.
class neural_pipeline.monitoring.LogMonitor(fsm: neural_pipeline.utils.fsm.FileStructManager)[source]¶
Monitor used for logging metrics. It writes a full log and can also write the last metrics to a separate file if required. All output files are in JSON format and are stored in <base_dir_path>/monitors/metrics_log.
Parameters: fsm – FileStructManager object

get_final_metrics_file() → str[source]¶
Get the final metrics file path.
Returns: the path, or None if writing isn't enabled by write_final_metrics()

update_losses(losses: {}) → None[source]¶
Update losses on the monitor.
Parameters: losses – dict of loss values whose keys are the names of the stages in the train pipeline (e.g. [train, validation])
Data Processor¶
class neural_pipeline.data_processor.data_processor.DataProcessor(model: torch.nn.modules.module.Module, device: torch.device = None)[source]¶
DataProcessor manages the model, data processing and device choosing.
Parameters:
- model – the model that will be used to process data
- device – the device to pass the data to for processing
predict(data: torch.Tensor) → object[source]¶
Make a prediction from data.
Parameters: data – data as torch.Tensor or dict with the key data
Returns: processed output
Return type: the model output type
set_pick_model_input(pick_model_input: callable) → neural_pipeline.data_processor.data_processor.DataProcessor[source]¶
Set a callback that takes the output of the DataLoader and returns the model input.
Default mode: lambda data: data['data']
Parameters: pick_model_input – the pick-model-input callable. This callback takes one parameter: the dataset output
Returns: self object

Examples:

data_processor.set_pick_model_input(lambda data: data['data'])
data_processor.set_pick_model_input(lambda data: data[0])
class neural_pipeline.data_processor.data_processor.TrainDataProcessor(train_config: TrainConfig, device: torch.device = None)[source]¶
TrainDataProcessor does everything DataProcessor does, but also drives the training process.
Parameters: train_config – train config

get_state() → {}[source]¶
Get model and optimizer state dicts.
Returns: dict with keys [weights, optimizer]
predict(data, is_train=False) → torch.Tensor[source]¶
Make a prediction from data. If is_train is True, this operation will compute gradients. If is_train is False, it will work with model.eval() and torch.no_grad.
Parameters:
- data – data in a dict
- is_train – whether the data processor needs to train on the data or just predict
Returns: processed output
Return type: model return type
process_batch(batch: {}, is_train: bool, metrics_processor: AbstractMetricsProcessor = None) → numpy.ndarray[source]¶
Process one batch of data.
Parameters:
- batch – dict containing the 'data' and 'target' keys. The value for each key must be an instance of torch.Tensor or a dict
- is_train – whether the batch is processed for training
- metrics_processor – metrics processor used to collect metrics after the batch is processed
Returns: array of losses with shape (N, …) where N is the batch size
set_data_preprocess(data_preprocess: callable) → neural_pipeline.data_processor.data_processor.DataProcessor[source]¶
Set a callback that takes the output of the DataLoader and returns preprocessed data. For example, it may be used to pass data to a device.
Default mode: _pass_data_to_device()
Parameters: data_preprocess – the preprocess callable. This callback takes one parameter: the dataset output
Returns: self object

Examples:

from neural_pipeline.utils import dict_recursive_bypass
data_processor.set_data_preprocess(lambda data: dict_recursive_bypass(data, lambda v: v.cuda()))
set_pick_target(pick_target: callable) → neural_pipeline.data_processor.data_processor.DataProcessor[source]¶
Set a callback that takes the output of the DataLoader and returns the target.
Default mode: lambda data: data['target']
Parameters: pick_target – the pick-target callable. This callback takes one parameter: the dataset output
Returns: self object

Examples:

data_processor.set_pick_target(lambda data: data['target'])
data_processor.set_pick_target(lambda data: data[1])
Model¶
class neural_pipeline.data_processor.model.Model(base_model: torch.nn.modules.module.Module)[source]¶
Wrapper for torch.nn.Module. This class provides initialization, calling and serialization for it.
Parameters: base_model – torch.nn.Module object

model() → torch.nn.modules.module.Module[source]¶
Get the internal torch.nn.Module object.
Returns: the internal torch.nn.Module object
Predictor¶
The main module for running inference.

class neural_pipeline.predict.Predictor(model: neural_pipeline.data_processor.model.Model, fsm: neural_pipeline.utils.fsm.FileStructManager, from_best_state: bool = False)[source]¶
Predictor runs inference using the parameters of a finished training.
Parameters:
- model – model object used for prediction
- fsm – FileStructManager object
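A construction sketch under the signature above (only the constructor is documented here, so the prediction call itself is not shown; the is_continue=True flag mirrors the resume example from the tutorial and is an assumption):

fsm = FileStructManager(base_dir='data', is_continue=True)
predictor = Predictor(model, fsm, from_best_state=True)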
Builtin modules¶
The builtin module contains all modules that can't be tested or that have a specific field of application.
Tensorboard¶
This module contains the Tensorboard monitor interface.
class neural_pipeline.builtin.monitors.tensorboard.TensorboardMonitor(fsm: neural_pipeline.utils.fsm.FileStructManager, is_continue: bool, network_name: str = None)[source]¶
Class that manages metrics and events monitoring. It works with tensorboard. The monitor gets metrics after an epoch ends and visualises them. Metrics may be float or np.array values. If a metric is an np.array, it will be shown as a histogram and as scalars (scalar plots contain the mean values from the array).
Parameters:
- fsm – file structure manager
- is_continue – whether training is being continued
- network_name – network name

update_losses(losses: {}) → None[source]¶
Update the monitor.
Parameters: losses – loss values with the keys 'train' and 'validation'

update_metrics(metrics: {}) → None[source]¶
Update the monitor.
Parameters: metrics – metrics dict with the keys 'metrics' and 'groups'

update_scalar(name: str, value: float, epoch_idx: int = None) → None[source]¶
Update a scalar on tensorboard.
Parameters:
- name – the classic tag for TensorboardX
- value – scalar value
- epoch_idx – epoch index. If not set, the last epoch index stored in this class is used
Matplotlib¶
This module contains the Matplotlib monitor interface.

class neural_pipeline.builtin.monitors.mpl.MPLMonitor[source]¶
This monitor shows all data in Matplotlib plots.

realtime(is_realtime: bool) → neural_pipeline.builtin.monitors.mpl.MPLMonitor[source]¶
Set whether data updates should be shown in realtime.
Parameters: is_realtime – whether realtime display is needed
Returns: self object
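Since realtime() returns self, the monitor can be configured inline when connecting it to the trainer from the tutorial (a minimal sketch):

from neural_pipeline.builtin.monitors.mpl import MPLMonitor

trainer.monitor_hub.add_monitor(MPLMonitor().realtime(True))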
AlbUNet¶
This module provides AlbUNet: a U-Net with a ResNet encoder. This model was written by Alexander Buslaev and spoiled by me.
This model can be constructed with the 'resnet18', 'resnet34', 'resnet50', 'resnet101' and 'resnet152' encoders.
To create a model, just call the corresponding resnet<number> method.
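For example (a sketch using the documented signature; the parameter values are illustrative):

from neural_pipeline.builtin.models.albunet import resnet34

# single-mask segmentation over RGB input with an ImageNet-pretrained encoder
model = resnet34(classes_num=1, in_channels=3, pretrained=True)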
neural_pipeline.builtin.models.albunet.resnet18(classes_num: int, in_channels: int, pretrained: bool = True)[source]¶
Constructs an AlbUNet with a ResNet-18 encoder.
Parameters:
- classes_num – number of classes (number of masks in the output)
- in_channels – number of input channels
- pretrained – if True, returns a model with an encoder pre-trained on ImageNet

neural_pipeline.builtin.models.albunet.resnet34(classes_num: int, in_channels: int, pretrained: bool = True)[source]¶
Constructs an AlbUNet with a ResNet-34 encoder.
Parameters:
- classes_num – number of classes (number of masks in the output)
- in_channels – number of input channels
- pretrained – if True, returns a model with an encoder pre-trained on ImageNet

neural_pipeline.builtin.models.albunet.resnet50(classes_num: int, in_channels: int, pretrained: bool = True)[source]¶
Constructs an AlbUNet with a ResNet-50 encoder.
Parameters:
- classes_num – number of classes (number of masks in the output)
- in_channels – number of input channels
- pretrained – if True, returns a model with an encoder pre-trained on ImageNet

neural_pipeline.builtin.models.albunet.resnet101(classes_num: int, in_channels: int, pretrained: bool = True)[source]¶
Constructs an AlbUNet with a ResNet-101 encoder.
Parameters:
- classes_num – number of classes (number of masks in the output)
- in_channels – number of input channels
- pretrained – if True, returns a model with an encoder pre-trained on ImageNet

neural_pipeline.builtin.models.albunet.resnet152(classes_num: int, in_channels: int, pretrained: bool = True)[source]¶
Constructs an AlbUNet with a ResNet-152 encoder.
Parameters:
- classes_num – number of classes (number of masks in the output)
- in_channels – number of input channels
- pretrained – if True, returns a model with an encoder pre-trained on ImageNet
DVC¶
Portrait segmentation network.
This is based on PyTorch, Neural Pipeline and a high-level pipeline built with [DVC](dvc.org).
Repo creation tutorial (note that the code already exists):¶
These steps are already done and their results are contained in the repo. To reproduce the steps, run:
`
dvc destroy
git commit -m 'deinit DVC'
`
### Clone repo
1) Add the PixArt dataset as a submodule
`
git submodule add http://172.26.40.23:3000/datasets/pixart.git datasets/
`
2) Load everything from the submodule
`
git submodule update --init
`
### Build DVC pipeline:
1) Initialize DVC
`
dvc init
git commit -m 'add DVC'
`
2) Set up the pipeline
`
dvc run -d train.py -M data/monitors/metrics_log/metrics.json -o data/checkpoints/last/last_checkpoint.zip --no-exec python train.py
dvc run -d predict.py -d data/checkpoints/last/last_checkpoint.zip -o result --no-exec python predict.py
`
3) Run pipeline
`
dvc repro result.dvc
`
4) Last steps
After the pipeline finishes executing, we get a metrics.json file with the metric values, and the pipeline's step files are modified. Let's add them to the git history:
`
git add data/checkpoints/last/.gitignore last_checkpoint.zip.dvc result.dvc metrics.json -f
`
### Run another experiment
We add hard negative mining to our training process, so we need to run a new experiment and then compare it with the existing one.
- Create new branch
`
git checkout -b hnm
dvc checkout
`
- Repeat all steps from previous section
- Compare metrics
`
dvc metrics show -a
`
The output will look like this:
```
hnm:
    metrics.json: {"train": {"jaccard": 0.8874640464782715, "dice": 0.9423233270645142, "loss": 0.7522647976875305}, "validation": {"jaccard": 0.8573445081710815, "dice": 0.9246319532394409, "loss": 0.7623925805091858}}
master:
    metrics.json: {"train": {"jaccard": 0.8774164915084839, "dice": 0.9357065558433533, "loss": 0.7595105767250061}, "validation": {"jaccard": 0.8574965596199036, "dice": 0.927370011806488, "loss": 0.7602806687355042}}
```
### Show DVC pipeline:
`
dvc pipeline show --ascii result.dvc
`
You may see output like this:
```
+-------------------------+
| last_checkpoint.zip.dvc |
+-------------------------+
             |
        result.dvc
```
### Reproduce results:
Calling dvc repro will run the pipeline, but we need to specify the last step of the pipeline, so we pass the last pipeline step's file name as a parameter:
`
dvc repro result.dvc
`
After the pipeline stops executing, you can see the metrics (-a shows metrics from all branches):
`
dvc metrics show -a
`