Augmentor

Augmentor is a Python package designed to aid the augmentation and artificial generation of image data for machine learning tasks. It is primarily a data augmentation tool, but will also incorporate basic image pre-processing functionality.

Tip

A Julia version of the package is also being actively developed. If you prefer to use Julia, you can find it here.

The documentation is organised as follows:

Main Features

In this section we will describe the main features of Augmentor with example code and output.

Augmentor is software package for image augmentation with an emphasis on providing operations that are typically used in the generation of image data for machine learning problems.

In principle, Augmentor consists of a number of classes for standard image manipulation functions, such as the Rotate class or the Crop class. You interact and use these classes using a large number of convenience functions, which cover most of the functions you might require when augmenting image datasets for machine learning problems.

Because image augmentation is often a multi-stage procedure, Augmentor uses a pipeline-based approach, where operations are added sequentially in order to generate a pipeline. Images are then passed through this pipeline, where each operation is applied to the image as it passes through.

Also, Augmentor applies operations to images stochastically as they pass through the pipeline, according to a user-defined probability value for each operation.

Therefore every operation has at minimum a probability parameter, which controls how likely the operation will be applied to each image that is seen as the image passes through the pipeline. Take for example a rotate operation, which is defined as follows:

rotate(probability=0.5, max_left_rotation=5, max_right_rotation=10)

The probability parameter controls how often the operation is applied. The max_left_rotation and max_right_rotation controls the degree by which the image is rotated, if the operation is applied. The value, in this case between -5 and 10 degrees, is chosen at random.

Therefore, Augmentor allows you to create an augmentation pipeline, which chains together operations that are applied stochastically, where the parameters of each of these operations are also chosen at random, within a range specified by the user. This means that each time an image is passed through the pipeline, a different image is returned. Depending on the number of operations in the pipeline, and the range of values that each operation has available, a very large amount of new image data can be created in this way.

All functions described in this section are made available by the Pipeline object. To begin using Augmentor, you always create a new Pipeline object by instantiating it with a path to a set of images or image that you wish to augment:

>>> import Augmentor
>>> p = Augmentor.Pipeline("/path/to/images")
Initialised with 100 images found in selected directory.

You can now add operations to this pipeline using the p Pipeline object. For example, to add a rotate operation:

>>> p.rotate(probability=1.0, max_left_rotation=5, max_right_rotation=10)

All pipeline operations have at least a probability parameter.

To see the status of the current pipeline:

>>> p.status()
There are 1 operation(s) in the current pipeline.
Index 0:
    Operation RotateRange (probability: 1):
        Attribute: max_right_rotation (10)
        Attribute: max_left_rotation (-5)
        Attribute: probability (1)

There are 1 image(s) in the source directory.
Dimensions:
    Width: 400 Height: 400
Formats:
    PNG

You can remove operations using the remove_operation(index) function and the appropriate index indicator from above.

Full documentation of all functions and operations can be found in the auto-generated documentation. This guide suffice as a rough guide to the major features of the package, however.

Perspective Skewing

Perspective skewing involves transforming the image so that it appears that you are looking at the image from a different angle.

The following main functions are used for skewing:

  • skew_tilt()
  • skew_left_right()
  • skew_top_bottom()
  • skew_corner()
  • skew()

To skew or tilt an image either left, right, forwards, or backwards, use the skew_tilt function. The image will be skewed by a random amount in the following directions:

Skew Tilt Left Skew Tilt Right Skew Tilt Forward Skew Tilt Backward
https://raw.githubusercontent.com/mdbloice/AugmentorFiles/master/UsageGuide/TiltLeft_s.png https://raw.githubusercontent.com/mdbloice/AugmentorFiles/master/UsageGuide/TiltRight_s.png https://raw.githubusercontent.com/mdbloice/AugmentorFiles/master/UsageGuide/TiltForward_s.png https://raw.githubusercontent.com/mdbloice/AugmentorFiles/master/UsageGuide/TiltBackward_s.png

Or, to skew an image by a random corner, use the skew_corner() function. The image will be skewed using one of the following 8 skew types:

Skew Type 0 Skew Type 1 Skew Type 2 Skew Type 3
https://raw.githubusercontent.com/mdbloice/AugmentorFiles/master/UsageGuide/Corner0_s.png https://raw.githubusercontent.com/mdbloice/AugmentorFiles/master/UsageGuide/Corner1_s.png https://raw.githubusercontent.com/mdbloice/AugmentorFiles/master/UsageGuide/Corner2_s.png https://raw.githubusercontent.com/mdbloice/AugmentorFiles/master/UsageGuide/Corner3_s.png
Skew Type 4 Skew Type 5 Skew Type 6 Skew Type 7
https://raw.githubusercontent.com/mdbloice/AugmentorFiles/master/UsageGuide/Corner4_s.png https://raw.githubusercontent.com/mdbloice/AugmentorFiles/master/UsageGuide/Corner5_s.png https://raw.githubusercontent.com/mdbloice/AugmentorFiles/master/UsageGuide/Corner6_s.png https://raw.githubusercontent.com/mdbloice/AugmentorFiles/master/UsageGuide/Corner7_s.png

If you only wish to skew either left or right, use skew_left_right(). To skew only forwards or backwards, use skew_top_bottom().

The function skew() will skew your image in a random direction of the 12 directions shown above.

Elastic Distortions

Elastic distortions allow you to make distortions to an image while maintaining the image’s aspect ratio.

  • random_distortion()

Here, we have taken a sample image and generated 50 samples, with a grid size of 16 and a distortion magnitude of 8:

Original Image Random distortions applied
https://raw.githubusercontent.com/mdbloice/AugmentorFiles/master/UsageGuide/orig.png https://raw.githubusercontent.com/mdbloice/AugmentorFiles/master/UsageGuide/distort.gif

To highlight how this might be useful in a real-world scenario, here is the distort function being applied to a single image of a figure 8.

Original Image Random distortions applied
https://cloud.githubusercontent.com/assets/16042756/23697279/79850d52-03e7-11e7-9445-475316b702a3.png https://cloud.githubusercontent.com/assets/16042756/23697283/802698a6-03e7-11e7-94b7-f0b61977ef33.gif

Realistic new samples can be created using this method.

See the auto-generated documentation for more details regarding this function’s parameters.

Rotating

Rotating can be performed in a number of ways. When rotating by modulo 90, the image is simply rotated and saved. To rotate by arbitrary degrees, then a crop is taken from the centre of the newly rotated image.

Rotate functions that are available are:

  • rotate()
  • rotate90()
  • rotate180()
  • rotate270()
  • rotate_random_90()

Most of these methods are self-explanatory. The rotate_random_90() function will rotate the image by either 90, 180, or 270 degrees.

However, the rotate() warrants more discussion and will be described here. When an image is rotated, and it is not a multiple of 90 degrees, the image must either be stretched to accommodate a now larger image, or some of the image must be cut, as demonstrated below:

Original Image Rotated 10 degrees
https://raw.githubusercontent.com/mdbloice/AugmentorFiles/master/UsageGuide/orig.png https://raw.githubusercontent.com/mdbloice/AugmentorFiles/master/UsageGuide/rotate.png

As can be seen above, an arbitrary, non-modulo 90, rotation will unfortunately result in the image being padded in each corner. To alleviate this, Augmentor’s default behaviour is to crop the image and retain the largest crop possible while maintaining the image’s aspect ratio:

Original Image Rotated 10 degrees, automatically cropped
https://raw.githubusercontent.com/mdbloice/AugmentorFiles/master/UsageGuide/orig.png https://raw.githubusercontent.com/mdbloice/AugmentorFiles/master/UsageGuide/rotate_aug_b.png

This will, of course, result in the image being zoomed in. For smaller rotations of between -5 and 5 degrees, this zoom effect is not particularly drastic.

Shearing

Shearing tilts an image along one of its sides. The can be in the x-axis or y-axis direction.

Functions available for shearing are:

  • shear()

If you shear in the x or y axis, you will normally get images that look as follows:

Original image Shear (x-axis) 20 degrees Shear (y-axis) 20 degrees
https://raw.githubusercontent.com/mdbloice/AugmentorFiles/master/UsageGuide/orig.png https://raw.githubusercontent.com/mdbloice/AugmentorFiles/master/UsageGuide/shear_x.png https://raw.githubusercontent.com/mdbloice/AugmentorFiles/master/UsageGuide/shear_y.png

However, as with rotations, you are left with image that are either larger in size, or are cropped to the original size but contain padding in at the sides of the images.

Augmentor automatically crops the largest area possible before returning the image, as follows:

Original image Shear (x-axis) 20 degrees Shear (y-axis) 20 degrees
https://raw.githubusercontent.com/mdbloice/AugmentorFiles/master/UsageGuide/orig.png https://raw.githubusercontent.com/mdbloice/AugmentorFiles/master/UsageGuide/shear_x_aug.png https://raw.githubusercontent.com/mdbloice/AugmentorFiles/master/UsageGuide/shear_y_aug.png

You can shear by random amounts, a fixed amount, in random directions, or in a fixed direction. See the auto-generated documentation for more details.

Cropping

Cropping functions which are available are:

  • crop_centre()
  • crop_by_size()
  • crop_random()

The crop_random() function warrants further explanation. Here a region of a size specified by the user is cropped at random from the original image:

Original image Random crops
https://raw.githubusercontent.com/mdbloice/AugmentorFiles/master/UsageGuide/orig.png https://raw.githubusercontent.com/mdbloice/AugmentorFiles/master/UsageGuide/crop.gif

You could combine this with a resize operation, so that the images returned are the same size as the images of the original, pre-augmented dataset:

Original image Random crops + resize operation
https://raw.githubusercontent.com/mdbloice/AugmentorFiles/master/UsageGuide/orig.png https://raw.githubusercontent.com/mdbloice/AugmentorFiles/master/UsageGuide/crop_resize.gif

Mirroring

The following functions are available for mirroring images (translating them through the x any y axes):

  • flip_left_right()
  • flip_top_bottom()
  • flip_random()

Of these, flip_random() can be used in situations where mirroring through both axes may make sense. We may, for example, combine random mirroring, with random distortions, to create new data:

Original image Random mirroring + random distortions
https://raw.githubusercontent.com/mdbloice/AugmentorFiles/master/UsageGuide/eight_200px.png https://raw.githubusercontent.com/mdbloice/AugmentorFiles/master/UsageGuide/flip_distort.gif

Notes

Checkerboard image obtained from WikiMedia Commons and is in the public domain. See https://commons.wikimedia.org/wiki/File:Checkerboard_pattern.svg

Skin lesion image obtained from the ISIC Archive:

See https://isic-archive.com/#images for further details.

Installation

Installation is via pip:

pip install Augmentor

If you have to use sudo it is recommended that you use the -H flag:

sudo -H pip install Augmentor

Requirements

Augmentor requires Pillow and tqdm. Note that Pillow is a fork of PIL, but both packages cannot exist simultaneously. Uninstall PIL before installing Pillow.

Building

If you prefer to build the package from source, first clone the repository:

git clone https://github.com/mdbloice/Augmentor.git

Then enter the Augmentor directory and build the package:

cd Augmentor
python setup.py install

Alternatively you can first run python setup.py build followed by python setup.py install. This can be useful for debugging.

Attention

If you are compiling from source you may need to compile the dependencies also, including Pillow. On Linux this means having libpng (libpng-dev) and zlib (zlib1g-dev) installed.

Usage

Here we describe the general usage of Augmentor.

Getting Started

To use Augmentor, the following general procedure is followed:

  1. You instantiate a Pipeline object pointing to a directory containing your initial image data set.
  2. You define a number of operations to perform on this data set using your Pipeline object.
  3. You execute these operations by calling the Pipeline’s sample() method.

We will go through each of these steps in order in the proceeding 3 sub-sections.

Step 1: Create a New Pipeline

Let us first create an empty pipeline. In other words, to begin any augmentation task, you must first initialise a Pipeline object, that points to a directory where your original image dataset is stored:

>>> import Augmentor
>>> p = Augmentor.Pipeline("/path/to/images")
Initialised with 100 images found in selected directory.

The variable p now contains a Pipeline object, and has been initialised with a list of images found in the source directory.

Step 2: Add Operations to the Pipeline

Once you have created a Pipeline, p, we can begin by adding operations to p. For example, we shall begin by adding a rotate() operation:

>>> p.rotate(probability=0.7, max_left_rotation=10, max_right_rotation=10)

In this case, we have added a rotate() operation, that will execute with a probability of 70%, and have defined the maximum range by which an image will be rotated from between -10 and 10 degrees.

Next, we add a further operation, in this case a zoom() operation:

>>> p.zoom(probability=0.3, min_factor=1.1, max_factor=1.6)

This time, we have specified that we wish the operation to be applied with a probability of 30%, while the scale should be randomly selected from between 1.1 and 1.6

Step 3: Execute and Sample From the Pipeline

Once you have added the operations that you require, you can generate new, augmented data by using the sample() function and specify the number of images you require, in this case 10,000:

>>> p.sample(10000)

A progress bar will appear providing a number of metrics while your samples are generated. Newly generated, augmented images will by default be saved into an directory named output, relative to the directory which contains your initial image data set.

Hint

A full list of operations can be found in the Operations module documentation.

Examples

A number of typical usage scenarios are described here.

Note

A full list of operations can be found in the Operations module documentation.

Initialising a pipeline

import Augmentor

path_to_data = "/home/user/images/dataset1/"

# Create a pipeline
p = Augmentor.Pipeline(path_to_data)

Adding operations to a pipeline

# Add some operations to an existing pipeline.

# First, we add a horizontal flip operation to the pipeline:
p.flip_left_right(probability=0.4)

# Now we add a vertical flip operation to the pipeline:
p.flip_top_bottom(probability=0.8)

# Add a rotate90 operation to the pipeline:
p.rotate90(probability=0.1)

Executing a pipeline

# Here we sample 100,000 images from the pipeline.

# It is often useful to use scientific notation for specify
# large numbers with trailing zeros.
num_of_samples = int(1e5)

# Now we can sample from the pipeline:
p.sample(num_of_samples)

Extending Augmentor

Extending Augmentor to add new functionality is quite simple, and is performed in two steps:

  1. Create a custom class which subclasses from the Operation base class, and
  2. Add an object of your new class to the pipeline using the add_operation() function.

This allows you to add custom functionality and extend Augmentor at run-time. Of course, if you have written an operation that may be of benefit to the community, you can make a pull request on the GitHub repository.

The following sections describe extending Augmentor in two steps. Step 1 involves creating a new Operation subclass, and step 2 involves using an object of your new custom operation in a pipeline.

Step 1: Create a New Operation Subclass

To create a custom operation and extend Augmentor:

  1. You create a new class that inherits from the Operation base class.
  2. You must overload the perform_operation() method belonging to the superclass.
  3. You must call the superclass’s __init__() constructor.
  4. You must return an object of type PIL.Image.

For example, to add a new operation called FoldImage, you would write this code:

# Create your new operation by inheriting from the Operation superclass:
class FoldImage(Operation):
    # Here you can accept as many custom parameters as required:
    def __init__(self, probability, num_of_folds):
        # Call the superclass's constructor (meaning you must
        # supply a probability value):
        Operation.__init__(self, probability)
        # Set your custom operation's member variables here as required:
        self.num_of_folds = num_of_folds

    # Your class must implement the perform_operation method:
    def perform_operation(self, image):
        # Start of code to perform custom image operation.
        for fold in range(self.num_of_folds):
            pass
        # End of code to perform custom image operation.

        # Return the image so that it can further processed in the pipeline:
        return image

You have seen that you need to implement the perform_operation() function and you must call the superclass’s constructor which requires a probability value to be set. Ensure you return a PIL Image as a return value.

If you wish to make these changes permanent, place your code in the Operations module.

Hint

You can also overload the superclass’s __str__() function to return a custom string for the object’s description text. This is useful for some methods that display information about the operation, such as the status() method.

Step 2: Add an Object to the Pipeline Manually

Once you have a new operation which is of type Operation, you can add an object of you new operation to an existing pipeline.

# Instantiate a new object of your custom operation
fold = Fold(probability = 0.75, num_of_folds = 4)

# Add this to the current pipeline
p.add_operation(fold)

# Executed the pipeline as normal, and your custom operation will be executed
p.sample(1000)

As you can see, adding custom operations at run-time is possible by subclassing the Operation class and adding an object of this class to the pipeline manually using the add_operation() function.

Using non-PIL Image Objects

Images can be converted to their raw formats for custom operations, for example by using NumPy:

import numpy

# Custom class declaration

def perform_operation(image):

    image_array = numpy.array(image).astype('uint8')

    # Perform your custom operations here

    image = PIL.Image.fromarray(image_array)

    return image

Auto Generated Documentation

Module by Module Documentation

Documentation of the Pipeline module

Documentation of the Operations module

Documentation of the ImageUtilities module

Licence

Augmentor Licence

The Augmentor package is licenced under the terms of the MIT licence.

The MIT License (MIT)

Copyright (c) 2016 Marcus D. Bloice

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Indices and tables