Official English Documentation for TorchFusion!

TorchFusion is built to accelerate research and development of modern AI systems. It is based on PyTorch and allows unimpeded access to all of PyTorch's features. In creating TorchFusion, our goal is to build a deep learning framework that can easily support complex research projects while remaining simple enough to let researchers focus on research ideas rather than framework complexity. To achieve this, TorchFusion is built with multiple layers of abstraction, allowing researchers to remain productive on projects requiring varying levels of complexity. At every level of abstraction, TorchFusion lets you seamlessly use all standard PyTorch code and functions, including its support libraries. The entire framework is highly decoupled, so you can take advantage of its various features even without using TorchFusion's trainers.

TorchFusion is developed by John Olafenwa and Moses Olafenwa of the AI Commons team.

The Official GitHub Repository of TorchFusion is https://github.com/johnolafenwa/TorchFusion

Installing TorchFusion

Install Torchfusion

pip3 install --upgrade torchfusion

Hello FASHION MNIST!

TorchFusion makes data loading, network definition and training very easy, as you will see in this tutorial. We shall train a basic PyTorch model on the Fashion MNIST dataset.

FASHION MNIST DESCRIPTION

MNIST has been thoroughly explored, and with over 99% accuracy already achieved, chasing state-of-the-art results on it no longer means much. Fashion MNIST provides a more challenging version of the MNIST dataset. It contains 10 classes of grayscale images of fashion items. It has exactly the same size, dimensions and format as MNIST but is harder, so it is a dataset that is both fast to train on and challenging enough to benchmark new models. Below are samples from the Fashion MNIST dataset.

_images/fmnist.png

To learn more, visit Fashion MNIST.

Import Classes

from torchfusion.layers import *
from torchfusion.datasets import *
from torchfusion.metrics import *
import torch.nn as nn
import torch.cuda as cuda
from torch.optim import Adam
from torchfusion.learners import StandardLearner

Load the dataset

train_loader = fashionmnist_loader(size=28,batch_size=32)
test_loader = fashionmnist_loader(size=28,train=False,batch_size=32)

If you have used PyTorch before, you will notice just how much simpler the data loading process is. The loader function still allows you to specify custom transformations. By default, TorchFusion loaders normalize the images to the range -1 to 1; you can control this normalization with the mean and std args.
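
For example (a minimal sketch, assuming the mean and std values are applied as in torchvision's Normalize), you can set the normalization explicitly; the values below reproduce the default -1 to 1 scaling:

train_loader = fashionmnist_loader(size=28,batch_size=32,mean=0.5,std=0.5)
# mean=0,std=1 would instead leave the images in the 0 to 1 range produced by ToTensor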

Define the model

model = nn.Sequential(
    Flatten(),
    Linear(784,100),
    Swish(),
    Linear(100,100),
    Swish(),
    Linear(100,100),
    Swish(),
    Linear(100,10)
)

The above is a simple 4 layer MLP; notice that all the layers are from TorchFusion. Unlike plain PyTorch layers, TorchFusion layers come with well-chosen default initializations, and you can easily specify custom initializations for them. They remain 100% compatible with their equivalent PyTorch layers, and you can mix plain PyTorch and TorchFusion layers in the same model.
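
As a quick illustration (a sketch, not part of the tutorial), plain PyTorch and TorchFusion layers can live side by side in one model:

import torch.nn as nn
from torchfusion.layers import Flatten, Linear

mixed_model = nn.Sequential(
    Flatten(),          # TorchFusion layer
    Linear(784,100),    # TorchFusion Linear with its default initialization
    nn.ReLU(),          # plain PyTorch activation
    nn.Linear(100,10)   # plain PyTorch Linear with PyTorch's default initialization
)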

Define optimizer and loss

if cuda.is_available():
    model = model.cuda()

optimizer = Adam(model.parameters())

loss_fn = nn.CrossEntropyLoss()

train_metrics = [Accuracy()]
test_metrics = [Accuracy()]

Print Summary and Train the model

learner = StandardLearner(model)

if __name__ == "__main__":
    learner.summary((1,28,28))
    learner.train(train_loader,train_metrics=train_metrics,optimizer=optimizer,loss_fn=loss_fn,test_loader=test_loader,test_metrics=test_metrics,num_epochs=40,batch_log=False)

PUTTING IT ALL TOGETHER

from torchfusion.layers import *
from torchfusion.datasets import *
from torchfusion.metrics import *
import torch.nn as nn
import torch.cuda as cuda
from torch.optim import Adam
from torchfusion.learners import StandardLearner

train_loader = fashionmnist_loader(size=28,batch_size=32)
test_loader = fashionmnist_loader(size=28,train=False,batch_size=32)

model = nn.Sequential(
    Flatten(),
    Linear(784,100),
    Swish(),
    Linear(100,100),
    Swish(),
    Linear(100,100),
    Swish(),
    Linear(100,10)
)

if cuda.is_available():
    model = model.cuda()

optimizer = Adam(model.parameters())

loss_fn = nn.CrossEntropyLoss()

train_metrics = [Accuracy()]
test_metrics = [Accuracy()]

learner = StandardLearner(model)

if __name__ == "__main__":

    print(learner.summary((1,28,28)))
    learner.train(train_loader,train_metrics=train_metrics,optimizer=optimizer,loss_fn=loss_fn,test_loader=test_loader,test_metrics=test_metrics,num_epochs=40,batch_log=False)

Running the code above should reach an accuracy of about 90% after 30 epochs.

You can enable and disable the epoch-end visualizations with the boolean args display_metrics and save_metrics, as shown below.
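
For example (a sketch of the same train call; it assumes both flags default to True), turning the plots off looks like this:

if __name__ == "__main__":
    # disable the epoch-end visualizations; metrics are still computed and logged
    learner.train(train_loader,train_metrics=train_metrics,optimizer=optimizer,loss_fn=loss_fn,test_loader=test_loader,test_metrics=test_metrics,num_epochs=40,batch_log=False,display_metrics=False,save_metrics=False)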

PERFORMANCE METRICS

The Accuracy class measures top-K accuracy. The default is top-1, but you can easily specify any K.

Top K metric example:

train_metrics = [Accuracy(),Accuracy(topK=2),Accuracy(topK=5)]

Load the saved weights and evaluate performance on test set

We have just trained a classifier on Fashion MNIST and evaluated the performance at the end of each epoch. You can also use the evaluation function to evaluate the test performance separately.

Run evaluation

if __name__ == "__main__":
    top1_acc = Accuracy()
    top5_acc = Accuracy(topK=5)

    learner.load_model("best-models/model_3.pth")
    learner.evaluate(test_loader,[top1_acc,top5_acc])
    print("Top1 Acc: {} Top5 Acc: {}".format(top1_acc.getValue(),top5_acc.getValue()))

This produces Top1 Acc: 0.871399998664856 Top5 Acc: 0.996999979019165

Inference

The ultimate goal of training models is to use them to classify new images. Now that we have trained the model on fashion images, save the images below and use the code that follows to classify them.

_images/sample-1.jpg
_images/sample-2.jpg
_images/sample-3.jpg
_images/sample-4.jpg

Inference code

import torch
from torchfusion.layers import *
import torch.nn as nn
import torch.cuda as cuda
from torchfusion.learners import StandardLearner
from torchfusion.utils import load_image

model = nn.Sequential(
    Flatten(),
    Linear(784,100),
    Swish(),
    Linear(100,100),
    Swish(),
    Linear(100,100),
    Swish(),
    Linear(100,10)
)

if cuda.is_available():
    model = model.cuda()


learner = StandardLearner(model)
learner.load_model("best_models/model_20.pth")

if __name__ == "__main__":

    #map class indexes to class names
    class_map = {0:"T-Shirt",1:"Trouser",2:"Pullover",3:"Dress",4:"Coat",5:"Sandal",6:"Shirt",7:"Sneaker",8:"Bag",9:"Ankle Boot"}

    #Load the image
    image = load_image("sample-1.jpg",grayscale=True,target_size=28,mean=0.5,std=0.5)

    #add batch dimension
    image = image.unsqueeze(0)

    #run prediction
    pred = learner.predict(image)

    #convert prediction to probabilities
    pred = torch.softmax(pred,0)

    #get the predicted class
    pred_class = pred.argmax().item()

    #get confidence for the prediction
    pred_conf = pred.max().item()

    #Map class_index to name
    class_name = class_map[pred_class]
    print("Predicted Class: {}, Confidence: {}".format(class_name,pred_conf))
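
To classify all four sample images in one run, the same steps can be wrapped in a loop (a sketch that reuses learner, load_image and class_map from the code above):

# classify each of the saved sample images with the same pipeline
for path in ["sample-1.jpg","sample-2.jpg","sample-3.jpg","sample-4.jpg"]:
    image = load_image(path,grayscale=True,target_size=28,mean=0.5,std=0.5).unsqueeze(0)
    pred = torch.softmax(learner.predict(image),0)
    print("{}: {} (confidence {:.3f})".format(path,class_map[pred.argmax().item()],pred.max().item()))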

Training CIFAR10!

In this section, we shall be using convolutional neural networks to train an Image Classification model on the CIFAR10 Dataset. We shall also explore more advanced concepts such as custom data transformations, learning rate scheduling and metric visualization.

CIFAR10 DESCRIPTION

CIFAR10 is a dataset of 60,000 images in 10 categories, split into a train set of 50,000 images and a test set of 10,000 images. It was collected by Alex Krizhevsky in 2009 and is one of the most widely used datasets for image classification research.

_images/cifar10.png

To learn more, visit CIFAR10.

Import Classes

from torchfusion.layers import *
from torchfusion.datasets import *
from torchfusion.metrics import *
from torchfusion.initializers import Kaiming_Normal, Xavier_Normal
import torchvision.transforms as transforms
import torch.nn as nn
import torch.cuda as cuda
from torch.optim import Adam
from torch.optim.lr_scheduler import StepLR
from torchfusion.learners import StandardLearner

Load the dataset

train_transforms = transforms.Compose([
    transforms.RandomCrop(32,padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.5,0.5,0.5),(0.5,0.5,0.5))
])

test_transforms = transforms.Compose([
    transforms.CenterCrop(32),
    transforms.ToTensor(),
    transforms.Normalize((0.5,0.5,0.5),(0.5,0.5,0.5))
])

train_loader = cifar10_loader(transform=train_transforms,batch_size=32)
test_loader = cifar10_loader(transform=test_transforms,train=False,batch_size=32)

Data augmentation helps improve the performance of our models, so for the train set we overrode TorchFusion's default transformations with our own custom transforms. For the test set, we keep transforms equivalent to the defaults.

Define the model

class Unit(nn.Module):
    def __init__(self,in_channels,out_channels):
        super(Unit,self).__init__()
        self.conv = Conv2d(in_channels,out_channels,kernel_size=3,padding=1,weight_init=Kaiming_Normal())
        self.bn = BatchNorm2d(out_channels)
        self.activation = Swish()

    def forward(self,inputs):
        outputs = self.conv(inputs)
        outputs = self.bn(outputs)
        return self.activation(outputs)

model = nn.Sequential(
    Unit(3,64),
    Unit(64,64),
    Unit(64,64),
    nn.Dropout(0.25),

    nn.MaxPool2d(kernel_size=3,stride=2),

    Unit(64,128),
    Unit(128,128),
    Unit(128,128),
    nn.Dropout(0.25),

    nn.MaxPool2d(kernel_size=3,stride=2),

    Unit(128,256),
    Unit(256,256),
    Unit(256,256),

    GlobalAvgPool2d(),

    Linear(256, 10,weight_init=Xavier_Normal())
)

To make the code above more compact, we first defined a Unit module that we reuse throughout the model. Notice how we initialized the convolution layer with Kaiming_Normal: all TorchFusion convolution layers are initialized with Kaiming_Normal by default, and all Linear layers default to Xavier_Normal, but we defined the initialization explicitly here to demonstrate that you can use any of the many initializers TorchFusion provides. The bias_init argument likewise lets you initialize the bias however you want.
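
For example (a sketch that uses only the initializers already imported in this chapter; the choice is purely illustrative), a variant of Unit could initialize its convolution with Xavier_Normal instead:

class XavierUnit(nn.Module):
    def __init__(self,in_channels,out_channels):
        super(XavierUnit,self).__init__()
        # same block as Unit, but with Xavier_Normal weights for the convolution
        self.conv = Conv2d(in_channels,out_channels,kernel_size=3,padding=1,weight_init=Xavier_Normal())
        self.bn = BatchNorm2d(out_channels)
        self.activation = Swish()

    def forward(self,inputs):
        return self.activation(self.bn(self.conv(inputs)))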

Define optimizer, lr scheduler and loss

if cuda.is_available():
    model = model.cuda()

optimizer = Adam(model.parameters(),lr=0.001)

lr_scheduler = StepLR(optimizer,step_size=30,gamma=0.1)

loss_fn = nn.CrossEntropyLoss()

train_metrics = [Accuracy()]
test_metrics = [Accuracy()]

Above, we defined a learning rate scheduler that reduces the learning rate by a factor of 10 every 30 epochs. There are many learning rate schedulers in PyTorch's lr_scheduler package, and you can use any of them here.
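
For instance (a sketch with illustrative milestones), MultiStepLR from the same package drops the learning rate at specific epochs instead of at a fixed interval:

from torch.optim.lr_scheduler import MultiStepLR

# reduce the learning rate by a factor of 10 after epochs 60 and 120
lr_scheduler = MultiStepLR(optimizer,milestones=[60,120],gamma=0.1)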

Train the model

learner = StandardLearner(model)

if __name__ == "__main__":
    learner.train(train_loader,train_metrics=train_metrics,optimizer=optimizer,loss_fn=loss_fn,model_dir="./cifar10-models",test_loader=test_loader,test_metrics=test_metrics,num_epochs=200,batch_log=False,lr_scheduler=lr_scheduler,save_logs="cifar10-logs.txt",display_metrics=True,save_metrics=True)

Here we specified a number of additional arguments. First, the lr_scheduler we created earlier; next, save_logs, which writes all logs to the file we specified; finally, save_metrics and display_metrics, which display visualizations of the loss and metrics and save the generated plots. The saved plots, logs and models can all be found in the cifar10-models directory specified above.

PUTTING IT ALL TOGETHER

from torchfusion.layers import *
from torchfusion.datasets import *
from torchfusion.metrics import *
from torchfusion.initializers import Kaiming_Normal, Xavier_Normal
import torchvision.transforms as transforms
import torch.nn as nn
import torch.cuda as cuda
from torch.optim import Adam
from torch.optim.lr_scheduler import StepLR
from torchfusion.learners import StandardLearner

train_transforms = transforms.Compose([
    transforms.RandomCrop(32,padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.5,0.5,0.5),(0.5,0.5,0.5))
])

test_transforms = transforms.Compose([
    transforms.CenterCrop(32),
    transforms.ToTensor(),
    transforms.Normalize((0.5,0.5,0.5),(0.5,0.5,0.5))
])

train_loader = cifar10_loader(transform=train_transforms,batch_size=32)
test_loader = cifar10_loader(transform=test_transforms,train=False,batch_size=32)

class Unit(nn.Module):
    def __init__(self,in_channels,out_channels):
        super(Unit,self).__init__()
        self.conv = Conv2d(in_channels,out_channels,kernel_size=3,padding=1,weight_init=Kaiming_Normal())
        self.bn = BatchNorm2d(out_channels)
        self.activation = Swish()

    def forward(self,inputs):
        outputs = self.conv(inputs)
        outputs = self.bn(outputs)
        return self.activation(outputs)

model = nn.Sequential(
    Unit(3,64),
    Unit(64,64),
    Unit(64,64),
    nn.Dropout(0.25),

    nn.MaxPool2d(kernel_size=3,stride=2),

    Unit(64,128),
    Unit(128,128),
    Unit(128,128),
    nn.Dropout(0.25),

    nn.MaxPool2d(kernel_size=3,stride=2),

    Unit(128,256),
    Unit(256,256),
    Unit(256,256),

    GlobalAvgPool2d(),

    Linear(256, 10,weight_init=Xavier_Normal())
)


if cuda.is_available():
    model = model.cuda()

optimizer = Adam(model.parameters(),lr=0.001)

lr_scheduler = StepLR(optimizer,step_size=30,gamma=0.1)

loss_fn = nn.CrossEntropyLoss()

train_metrics = [Accuracy()]
test_metrics = [Accuracy()]

learner = StandardLearner(model)

if __name__ == "__main__":
    learner.train(train_loader,train_metrics=train_metrics,optimizer=optimizer,loss_fn=loss_fn,model_dir="./cifar10-models",test_loader=test_loader,test_metrics=test_metrics,num_epochs=30,batch_log=False,lr_scheduler=lr_scheduler,save_logs="cifar10-logs.txt",display_metrics=True,save_metrics=True)

Mixed Precision Training

Deep learning models are usually trained using standard 32 bit floating point arithmetic. To speed up training, a lot of research has gone into lower precision arithmetic such as 8 bit and 16 bit operations, which are much faster than 32 bit operations. NVIDIA Volta GPUs have tensor cores specialized for 16 bit arithmetic, and taking advantage of them can lead to significant speedups when training large deep learning models.
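
At the PyTorch level, 16 bit computation simply means keeping tensors in torch.float16; a minimal illustration in plain PyTorch (not the TorchFusion procedure described below):

import torch

x = torch.randn(4,3)          # 32 bit floating point by default
x_fp16 = x.half()             # cast to 16 bit floating point
print(x.dtype,x_fp16.dtype)   # torch.float32 torch.float16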

To learn more about mixed precision training, read NVIDIA's Mixed Precision Training guide: https://docs.nvidia.com/deeplearning/sdk/mixed-precision-training/index.html

The greatest challenge with training in mixed precision mode is a radical decrease in model accuracy; a naive implementation of mixed precision training will result in very poor performance. TorchFusion includes optimized procedures for training deep learning models in mixed precision without compromising accuracy.

PROCEDURE

The following steps are needed to train TorchFusion models in mixed precision.

Import Classes

from torchfusion.fp16_utils import half_model, FP16_Optimizer

CONVERT MODEL AND OPTIMIZER INTO FP16

model = half_model(model)
optimizer = FP16_Optimizer(Adam(model.parameters()))
learner = StandardLearner(model)
learner.half()

# if using lr_scheduler
lr_scheduler = StepLR(optimizer.optimizer,step_size=30,gamma=0.1)

PUTTING IT ALL TOGETHER

from torchfusion.layers import *
from torchfusion.datasets import *
from torchfusion.metrics import *
from torchfusion.initializers import Kaiming_Normal, Xavier_Normal
import torchvision.transforms as transforms
import torch.nn as nn
import torch.cuda as cuda
from torch.optim import Adam
from torch.optim.lr_scheduler import StepLR
from torchfusion.learners import StandardLearner

train_transforms = transforms.Compose([
    transforms.RandomCrop(32,padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.5,0.5,0.5),(0.5,0.5,0.5))
])

test_transforms = transforms.Compose([
    transforms.CenterCrop(32),
    transforms.ToTensor(),
    transforms.Normalize((0.5,0.5,0.5),(0.5,0.5,0.5))
])

train_loader = cifar10_loader(transform=train_transforms,batch_size=32)
test_loader = cifar10_loader(transform=test_transforms,train=False,batch_size=32)

class Unit(nn.Module):
    def __init__(self,in_channels,out_channels):
        super(Unit,self).__init__()
        self.conv = Conv2d(in_channels,out_channels,kernel_size=3,padding=1,weight_init=Kaiming_Normal())
        self.bn = BatchNorm2d(out_channels)
        self.activation = Swish()

    def forward(self,inputs):
        outputs = self.conv(inputs)
        outputs = self.bn(outputs)
        return self.activation(outputs)

model = nn.Sequential(
    Unit(3,64),
    Unit(64,64),
    Unit(64,64),
    nn.Dropout(0.25),

    nn.MaxPool2d(kernel_size=3,stride=2),

    Unit(64,128),
    Unit(128,128),
    Unit(128,128),
    nn.Dropout(0.25),

    nn.MaxPool2d(kernel_size=3,stride=2),

    Unit(128,256),
    Unit(256,256),
    Unit(256,256),

    GlobalAvgPool2d(),

    Linear(256, 10,weight_init=Xavier_Normal())
)


if cuda.is_available():
    model = model.cuda()
model = half_model(model)

optimizer = FP16_Optimizer(Adam(model.parameters(),lr=0.001))

lr_scheduler = StepLR(optimizer.optimizer,step_size=30,gamma=0.1)

loss_fn = nn.CrossEntropyLoss()

train_metrics = [Accuracy()]
test_metrics = [Accuracy()]

learner = StandardLearner(model)
learner.half()

if __name__ == "__main__":
    learner.train(train_loader,train_metrics=train_metrics,optimizer=optimizer,loss_fn=loss_fn,model_dir="./cifar10-models",test_loader=test_loader,test_metrics=test_metrics,num_epochs=30,batch_log=False,lr_scheduler=lr_scheduler,save_logs="cifar10-logs.txt",display_metrics=True,save_metrics=True)

Training With Custom Datasets!

While TorchFusion provides pre-defined loaders for popular standard datasets, you will very often need to train on your own custom datasets. TorchFusion provides loaders for any image dataset organized as a single folder with one subfolder per class. For example, if you are training a model to recognize cats and dogs, you should have a train folder with two subfolders, one for dogs and one for cats, as illustrated below.
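
For the cats and dogs example, the expected layout would look something like this (folder and file names are illustrative):

train/
    cat/
        cat001.jpg
        cat002.jpg
        ...
    dog/
        dog001.jpg
        dog002.jpg
        ...

The name of each subfolder becomes the class label when the folder is passed to imagefolder_loader via its root argument, as shown later in this section.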

Import Classes

from torchfusion.layers import *
from torchfusion.datasets import *
from torchfusion.metrics import *
from torchfusion.initializers import Kaiming_Normal, Xavier_Normal
import torchvision.transforms as transforms
import torch.nn as nn
import torch.cuda as cuda
from torch.optim import Adam
from torch.optim.lr_scheduler import StepLR
from torchfusion.learners import StandardLearner

Load the dataset

train_transforms = transforms.Compose([
    transforms.RandomCrop(224,padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.5,0.5,0.5),(0.5,0.5,0.5))
])

test_transforms = transforms.Compose([
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize((0.5,0.5,0.5),(0.5,0.5,0.5))
])

train_loader = imagefolder_loader(transform=train_transforms,batch_size=32,shuffle=True,root="path-to-train-folder")
test_loader = imagefolder_loader(transform=test_transforms,shuffle=False,batch_size=32,root="path-to-test-folder")

Define the model

class Unit(nn.Module):
    def __init__(self,in_channels,out_channels):
        super(Unit,self).__init__()
        self.conv = Conv2d(in_channels,out_channels,kernel_size=3,padding=1,weight_init=Kaiming_Normal())
        self.bn = BatchNorm2d(out_channels)
        self.activation = Swish()

    def forward(self,inputs):
        outputs = self.conv(inputs)
        outputs = self.bn(outputs)
        return self.activation(outputs)

model = nn.Sequential(
    Unit(3,64),
    Unit(64,64),
    Unit(64,64),
    nn.Dropout(0.25),

    nn.MaxPool2d(kernel_size=3,stride=2),

    Unit(64,128),
    Unit(128,128),
    Unit(128,128),
    nn.Dropout(0.25),

    nn.MaxPool2d(kernel_size=3,stride=2),

    Unit(128,256),
    Unit(256,256),
    Unit(256,256),

    GlobalAvgPool2d(),

    Linear(256, 10,weight_init=Xavier_Normal())
)

Define optimizer, lr scheduler and loss

if cuda.is_available():
    model = model.cuda()

optimizer = Adam(model.parameters(),lr=0.001)

lr_scheduler = StepLR(optimizer,step_size=30,gamma=0.1)

loss_fn = nn.CrossEntropyLoss()

train_metrics = [Accuracy()]
test_metrics = [Accuracy()]

Train the model

learner = StandardLearner(model)

if __name__ == "__main__":
    learner.train(train_loader,train_metrics=train_metrics,optimizer=optimizer,loss_fn=loss_fn,model_dir="./custom-models",test_loader=test_loader,test_metrics=test_metrics,num_epochs=200,batch_log=False,lr_scheduler=lr_scheduler,save_logs="custom-model-logs.txt",display_metrics=True,save_metrics=True)

PUTTING IT ALL TOGETHER

from torchfusion.layers import *
from torchfusion.datasets import *
from torchfusion.metrics import *
from torchfusion.initializers import Kaiming_Normal, Xavier_Normal
import torchvision.transforms as transforms
import torch.nn as nn
import torch.cuda as cuda
from torch.optim import Adam
from torch.optim.lr_scheduler import StepLR
from torchfusion.learners import StandardLearner

train_transforms = transforms.Compose([
    transforms.RandomCrop(224,padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.5,0.5,0.5),(0.5,0.5,0.5))
])

test_transforms = transforms.Compose([
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize((0.5,0.5,0.5),(0.5,0.5,0.5))
])

train_loader = imagefolder_loader(transform=train_transforms,batch_size=32,shuffle=True,root="path-to-train-folder")
test_loader = imagefolder_loader(transform=test_transforms,shuffle=False,batch_size=32,root="path-to-test-folder")

class Unit(nn.Module):
    def __init__(self,in_channels,out_channels):
        super(Unit,self).__init__()
        self.conv = Conv2d(in_channels,out_channels,kernel_size=3,padding=1,weight_init=Kaiming_Normal())
        self.bn = BatchNorm2d(out_channels)
        self.activation = Swish()

    def forward(self,inputs):
        outputs = self.conv(inputs)
        outputs = self.bn(outputs)
        return self.activation(outputs)

model = nn.Sequential(
    Unit(3,64),
    Unit(64,64),
    Unit(64,64),
    nn.Dropout(0.25),

    nn.MaxPool2d(kernel_size=3,stride=2),

    Unit(64,128),
    Unit(128,128),
    Unit(128,128),
    nn.Dropout(0.25),

    nn.MaxPool2d(kernel_size=3,stride=2),

    Unit(128,256),
    Unit(256,256),
    Unit(256,256),

    GlobalAvgPool2d(),

    Linear(256, 10,weight_init=Xavier_Normal())
)


if cuda.is_available():
    model = model.cuda()

optimizer = Adam(model.parameters(),lr=0.001)

lr_scheduler = StepLR(optimizer,step_size=30,gamma=0.1)

loss_fn = nn.CrossEntropyLoss()

train_metrics = [Accuracy()]
test_metrics = [Accuracy()]

learner = StandardLearner(model)

if __name__ == "__main__":
    learner.train(train_loader,train_metrics=train_metrics,optimizer=optimizer,loss_fn=loss_fn,model_dir="./custom-models",test_loader=test_loader,test_metrics=test_metrics,num_epochs=30,batch_log=False,lr_scheduler=lr_scheduler,save_logs="custom-models-logs.txt",display_metrics=True,save_metrics=True)

Logging and Visualizing the Training Process!

While TorchFusion allows you to easily visualize the training process using matplotlib-based charts, it also has built-in support for more advanced visualization of the training process in both Visdom and Tensorboard.

Logging with Visdom

Visdom is a visualization toolkit developed by Facebook AI Research; it was installed the first time you installed TorchFusion.

To visualize your training process in visdom, follow the steps below.

Step 1: Import the visdom logger:

from torchfusion.utils import VisdomLogger

Step 2: Specify the logger in your train func

visdom_logger = VisdomLogger()
if __name__ == "__main__":
    learner.train(train_loader,train_metrics=train_metrics,optimizer=optimizer,loss_fn=loss_fn,visdom_log=visdom_logger)

Step 3: Start the visdom server from the command prompt

python -m visdom.server

Ensure you are connected to the internet when you run this, as visdom needs to download a few scripts.

_images/visdom.png

Notice the output above. Open your browser and navigate to the URL given, in this case localhost:8097.

Run this to see visdom in action

from torchfusion.layers import *
from torchfusion.datasets import *
from torchfusion.metrics import *
import torch.nn as nn
import torch.cuda as cuda
from torch.optim import Adam
from torchfusion.learners import StandardLearner
from torchfusion.utils import VisdomLogger

train_loader = fashionmnist_loader(size=28,batch_size=32)
test_loader = fashionmnist_loader(size=28,train=False,batch_size=32)

model = nn.Sequential(
    Flatten(),
    Linear(784,100),
    Swish(),
    Linear(100,100),
    Swish(),
    Linear(100,100),
    Swish(),
    Linear(100,10)
)

if cuda.is_available():
    model = model.cuda()

optimizer = Adam(model.parameters())

loss_fn = nn.CrossEntropyLoss()

train_metrics = [Accuracy()]
test_metrics = [Accuracy()]

visdom_logger = VisdomLogger()

learner = StandardLearner(model)

if __name__ == "__main__":

    print(learner.summary((1,28,28)))
    learner.train(train_loader,train_metrics=train_metrics,visdom_log=visdom_logger,optimizer=optimizer,loss_fn=loss_fn,test_loader=test_loader,test_metrics=test_metrics,num_epochs=30,batch_log=False)

Generated Visuals

_images/visdom-log.png

Using Tensorboard

TorchFusion can also generate Tensorboard logs. Because it uses TensorboardX, TorchFusion does not require tensorflow or tensorboard to be installed to generate the logs, but you need to install both to view the generated logs.

Visit tensorflow.org for instructions on installing tensorflow and https://github.com/tensorflow/tensorboard for instructions on installing tensorboard.

Once installed, you can use tensorboard with just ONE line.

Specify the tensorboard_log in your train func

visdom_logger = VisdomLogger()
if __name__ == "__main__":
    learner.train(train_loader,train_metrics=train_metrics,optimizer=optimizer,loss_fn=loss_fn,tensorboard_log="./tboard-logs",visdom_log=visdom_logger)

Notice how we use both tensorboard and visdom here; we can use either independently or both if we want to.

Start the tensorboard server from the command prompt

tensorboard --logdir=./tboard-logs

_images/tboard-start.png

Notice the output above. Open your browser and navigate to the URL given, in this case specpal:6006.

Run this to see tensorboard in action

from torchfusion.layers import *
from torchfusion.datasets import *
from torchfusion.metrics import *
import torch.nn as nn
import torch.cuda as cuda
from torch.optim import Adam
from torchfusion.learners import StandardLearner
from torchfusion.utils import VisdomLogger

train_loader = fashionmnist_loader(size=28,batch_size=32)
test_loader = fashionmnist_loader(size=28,train=False,batch_size=32)

model = nn.Sequential(
    Flatten(),
    Linear(784,100),
    Swish(),
    Linear(100,100),
    Swish(),
    Linear(100,100),
    Swish(),
    Linear(100,10)
)

if cuda.is_available():
    model = model.cuda()

optimizer = Adam(model.parameters())

loss_fn = nn.CrossEntropyLoss()

train_metrics = [Accuracy()]
test_metrics = [Accuracy()]

visdom_logger = VisdomLogger()

learner = StandardLearner(model)

if __name__ == "__main__":

    print(learner.summary((1,28,28)))
    learner.train(train_loader,train_metrics=train_metrics,tensorboard_log="./tboard-logs",visdom_log=visdom_logger,optimizer=optimizer,loss_fn=loss_fn,test_loader=test_loader,test_metrics=test_metrics,num_epochs=30,batch_log=False)

Generated Visuals

_images/tboard-log.png

Building Custom Trainers!

While TorchFusion strives to provide very good trainers, researchers often need custom training logic. TorchFusion makes this easy: all you need to do is extend the Learners.

Sample Custom Trainer

#Extend the StandardLearner
class CustomLearner(StandardLearner):

    #Override the train logic
    def __train_func__(self, data):

        self.optimizer.zero_grad()

        train_x, train_y = data

        batch_size = train_x.size(0)

        train_x = Variable(train_x.cuda() if self.cuda else train_x)
        train_y = Variable(train_y.cuda() if self.cuda else train_y)

        outputs = self.model(train_x)
        loss = self.loss_fn(outputs, train_y)
        loss.backward()

        #clip gradients after backward, just before the optimizer step
        if self.clip_grads is not None:
            clip_grads(self.model, self.clip_grads[0], self.clip_grads[1])

        self.optimizer.step()

        self.train_running_loss.add_(loss.cpu() * batch_size)

        for metric in self.train_metrics:
            metric.update(outputs, train_y)

    #Override the evaluation logic
    def __eval_function__(self, data):

        test_x, test_y = data

        test_x = Variable(test_x.cuda() if self.cuda else test_x)
        test_y = Variable(test_y.cuda() if self.cuda else test_y)

        outputs = self.model(test_x)

        for metric in self.test_metrics:
            metric.update(outputs, test_y)

    #Override the validation logic
    def __val_function__(self, data):

        val_x, val_y = data

        val_x = Variable(val_x.cuda() if self.cuda else val_x)
        val_y = Variable(val_y.cuda() if self.cuda else val_y)

        outputs = self.model(val_x)

        for metric in self.val_metrics:
            metric.update(outputs, val_y)

    #Override the prediction logic
    def __predict_func__(self, inputs):

        inputs = Variable(inputs.cuda() if self.cuda else inputs)

        return self.model(inputs)
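
Once defined, the custom learner is used exactly like StandardLearner; a sketch (model, loaders, optimizer, loss and metrics are assumed to be set up as in the earlier chapters):

learner = CustomLearner(model)

if __name__ == "__main__":
    learner.train(train_loader,train_metrics=train_metrics,optimizer=optimizer,loss_fn=loss_fn,test_loader=test_loader,test_metrics=test_metrics,num_epochs=40,batch_log=False)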

Introduction to Generative Adversarial Networks

Classification and regression models are used for predictive tasks: they map diverse inputs to fixed outputs, and this class of models is called discriminative. Generative models do the opposite: they generate diverse outputs from fixed inputs. An example is a model that can generate new pictures of cars from nothing but a text description. Many kinds of generative models exist; the most successful are Generative Adversarial Networks, introduced by Goodfellow et al., 2014. These models consist of a generator, which is responsible for generating new outputs, and a discriminator, which attempts to tell whether the generated outputs are real or fake. During training, the discriminator is presented with both real and generated images and is trained to tell them apart, while the generator is trained to produce images so realistic that the discriminator classifies them as real. The two networks therefore compete, with the generator trying to fool the discriminator. While the logic of GANs can be slightly complicated, TorchFusion makes using them a breeze and provides a sophisticated framework for research with custom GAN logic.
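
In the standard formulation, the discriminator minimizes a binary cross-entropy loss that pushes real images towards a label of 1 and generated images towards 0, while the generator minimizes the loss of its images being labelled real. A compressed sketch of the two objectives in plain PyTorch (illustrative only; TorchFusion's GAN learners implement the training logic for you):

import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()

def discriminator_loss(real_preds,gen_preds):
    # real images should be scored as 1, generated images as 0
    return bce(real_preds,torch.ones_like(real_preds)) + bce(gen_preds,torch.zeros_like(gen_preds))

def generator_loss(gen_preds):
    # the generator is rewarded when its images are scored as real
    return bce(gen_preds,torch.ones_like(gen_preds))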

Below are two pictures generated by a GAN (Karras et al., 2017).

_images/gan.png

UNCONDITIONAL GAN EXAMPLE

Earlier on, we learnt to correctly classify grayscale fashion images; now we shall attempt to generate them instead.

Step 1: Imports!

from torchfusion.gan.learners import *
from torchfusion.gan.applications import StandardGenerator,StandardProjectionDiscriminator
from torch.optim import Adam
from torchfusion.datasets import fashionmnist_loader
import torch.cuda as cuda
import torch.nn as nn

Define Generator and Discriminator

G = StandardGenerator(output_size=(1,32,32),latent_size=128)
D = StandardProjectionDiscriminator(input_size=(1,32,32),apply_sigmoid=False)

if cuda.is_available():
    G = nn.DataParallel(G.cuda())
    D = nn.DataParallel(D.cuda())

Here we use the predefined Generator and Discriminator in TorchFusion. We set the size of the generated images to (1,32,32) and the latent_size to 128; the images will be generated from a latent code of size 128.

Setup optimizers

g_optim = Adam(G.parameters(),lr=0.0002,betas=(0.5,0.999))
d_optim = Adam(D.parameters(),lr=0.0002,betas=(0.5,0.999))

Since the generator and discriminator are trained separately, we need to specify a different optimizer for each. Try to stick to the hyper-parameters used here, as GANs can be very sensitive to these values.

Load the dataset

dataset = fashionmnist_loader(size=32,batch_size=64)

The image size here is set to be the same as the size of the images to be generated.

Define the learner

learner = RStandardGanLearner(G,D)

The Learner does all the heavy-lifting

Train the Models

if __name__ == "__main__":
    learner.train(dataset,gen_optimizer=g_optim,disc_optimizer=d_optim,save_outputs_interval=500,model_dir="./fashion-gan",latent_size=128,num_epochs=50,batch_log=False)

By setting save_outputs_interval to 500, sample generated images will be saved every 500 batch iterations. Note that this interval is counted in batch iterations, not epochs.

Putting it all Together

from torchfusion.gan.learners import *
from torchfusion.gan.applications import StandardGenerator,StandardProjectionDiscriminator
from torch.optim import Adam
from torchfusion.datasets import fashionmnist_loader
import torch.cuda as cuda
import torch.nn as nn

G = StandardGenerator(output_size=(1,32,32),latent_size=128)
D = StandardProjectionDiscriminator(input_size=(1,32,32),apply_sigmoid=False)

if cuda.is_available():
    G = nn.DataParallel(G.cuda())
    D = nn.DataParallel(D.cuda())

g_optim = Adam(G.parameters(),lr=0.0002,betas=(0.5,0.999))
d_optim = Adam(D.parameters(),lr=0.0002,betas=(0.5,0.999))

dataset = fashionmnist_loader(size=32,batch_size=64)

learner = RStandardGanLearner(G,D)

if __name__ == "__main__":
    learner.train(dataset,gen_optimizer=g_optim,disc_optimizer=d_optim,save_outputs_interval=500,model_dir="./fashion-gan",latent_size=128,num_epochs=50,batch_log=False)

Conditional Generative Adversarial Networks

In the previous chapter, images were generated at random, without respect to class. Here we shall generate images of specific classes. While conditional GANs are complex, TorchFusion makes this super easy: all you have to do is set num_classes in the Generator, the Discriminator and the Learner.

Use classes in Generator and Discriminator

G = StandardGenerator(output_size=(1,32,32),latent_size=128,num_classes=10)
D = StandardProjectionDiscriminator(input_size=(1,32,32),apply_sigmoid=False,num_classes=10)

Define num_classes in Learner

if __name__ == "__main__":
    learner.train(dataset,num_classes=10,gen_optimizer=g_optim,disc_optimizer=d_optim,save_outputs_interval=500,model_dir="./fashion-gan",latent_size=128,num_epochs=50,batch_log=False)

And that's it! The full code is below.

Putting it all Together

from torchfusion.gan.learners import *
from torchfusion.gan.applications import StandardGenerator,StandardProjectionDiscriminator
from torch.optim import Adam
from torchfusion.datasets import fashionmnist_loader
import torch.cuda as cuda
import torch.nn as nn

G = StandardGenerator(output_size=(1,32,32),latent_size=128,num_classes=10)
D = StandardProjectionDiscriminator(input_size=(1,32,32),apply_sigmoid=False,num_classes=10)

if cuda.is_available():
    G = nn.DataParallel(G.cuda())
    D = nn.DataParallel(D.cuda())

g_optim = Adam(G.parameters(),lr=0.0002,betas=(0.5,0.999))
d_optim = Adam(D.parameters(),lr=0.0002,betas=(0.5,0.999))

dataset = fashionmnist_loader(size=32,batch_size=64)

learner = RStandardGanLearner(G,D)

if __name__ == "__main__":
    learner.train(dataset,num_classes=10,gen_optimizer=g_optim,disc_optimizer=d_optim,save_outputs_interval=500,model_dir="./fashion-gan",latent_size=128,num_epochs=50,batch_log=False)

After just 17 epochs, this produces

_images/fmnist-conditional.png

GAN Inference

Now that we have learnt how to generate images of specific classes, here we shall use the trained generator for inference.

Putting it all Together

from torchfusion.gan.learners import *
from torchfusion.gan.applications import StandardGenerator
import torch.cuda as cuda
import torch.nn as nn
from torchvision.utils import save_image
import torch
from torch.distributions import Normal


G = StandardGenerator(output_size=(1,32,32),latent_size=128,num_classes=10)

if cuda.is_available():
    G = nn.DataParallel(G.cuda())

learner = RStandardGanLearner(G,None)
learner.load_generator("path-to-trained-gen")

if __name__ == "__main__":
    #Define an instance of the normal distribution
    dist = Normal(0,1)

    #Get a sample latent vector from the distribution
    latent_vector = dist.sample((1,128))

    #Define the class of the image you want to generate
    label = torch.LongTensor(1).fill_(5)

    #Run inference
    image = learner.predict([latent_vector,label])

    #Save generated image
    save_image(image, "image.jpg")

Building Custom Trainers!

TorchFusion provides a wide variety of GAN learners, which you will find in the torchfusion.gan.learners package. However, a lot of research is ongoing into improved techniques for GANs, so we provide multiple levels of abstraction to facilitate research.

Custom Loss

#Extend the StandardBaseGanLearner
class CustomGanLearner(StandardBaseGanLearner):
    #Override the __update_discriminator_loss__
    def __update_discriminator_loss__(self, real_images, gen_images, real_preds, gen_preds):

        pred_loss = -torch.mean(real_preds - gen_preds)

        return pred_loss

    #Override the __update_generator_loss__
    def __update_generator_loss__(self,real_images,gen_images,real_preds,gen_preds):

        pred_loss = -torch.mean(gen_preds - real_preds)
        return pred_loss
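
The custom learner then drops in wherever RStandardGanLearner was used before; a sketch (G, D, the optimizers and the dataset are assumed to be set up as in the earlier GAN examples):

learner = CustomGanLearner(G,D)

if __name__ == "__main__":
    learner.train(dataset,gen_optimizer=g_optim,disc_optimizer=d_optim,save_outputs_interval=500,model_dir="./fashion-gan",latent_size=128,num_epochs=50,batch_log=False)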

Custom Training Logic

#Extend BaseGanCore
class CustomGanLearner(BaseGanCore):

    #Extend train
    def train(self,train_loader, gen_optimizer,disc_optimizer,latent_size,loss_fn=nn.BCELoss(),**kwargs):

        self.latent_size = latent_size
        self.loss_fn = loss_fn
        super().__train_loop__(train_loader,gen_optimizer,disc_optimizer,**kwargs)

    #Extend __disc_train_func__
    def __disc_train_func__(self, data):

        super().__disc_train_func__(data)

        self.disc_optimizer.zero_grad()

        if isinstance(data, list) or isinstance(data, tuple):
            x = data[0]
        else:
            x = data

        batch_size = x.size(0)

        source = self.dist.sample((batch_size,self.latent_size))

        real_labels = torch.ones(batch_size,1)
        fake_labels = torch.zeros(batch_size,1)

        if self.cuda:
            x = x.cuda()
            source = source.cuda()
            real_labels = real_labels.cuda()
            fake_labels = fake_labels.cuda()

        x = Variable(x)
        source = Variable(source)

        outputs = self.disc_model(x)

        generated = self.gen_model(source)
        gen_outputs = self.disc_model(generated.detach())

        gen_loss = self.loss_fn(gen_outputs,fake_labels)

        real_loss = self.loss_fn(outputs,real_labels)

        loss = gen_loss + real_loss
        loss.backward()
        self.disc_optimizer.step()

        self.disc_running_loss.add_(loss.cpu() * batch_size)


    #Extend __gen_train_func__
    def __gen_train_func__(self, data):

        super().__gen_train_func__(data)

        self.gen_optimizer.zero_grad()

        if isinstance(data, list) or isinstance(data, tuple):
            x = data[0]
        else:
            x = data
        batch_size = x.size(0)

        source = self.dist.sample((batch_size,self.latent_size))

        real_labels = torch.ones(batch_size,1)

        if self.cuda:
            source = source.cuda()
            real_labels = real_labels.cuda()

        source = Variable(source)

        fake_images = self.gen_model(source)
        outputs = self.disc_model(fake_images)

        loss = self.loss_fn(outputs,real_labels)
        loss.backward()

        self.gen_optimizer.step()

        self.gen_running_loss.add_(loss.cpu() * batch_size)

Examples

Visit https://github.com/AICommons/TorchfusionExamples for example code in TorchFusion.