Smartmove documentation

Installation

Download Smartmove

First install Python version 3.5+.

You can get the smartmove code either by using git clone (shown below) or by downloading an archive file and extracting it to the location of your choice. Note that if the location of the code is not in your PYTHONPATH, you will only be able to import the code as a Python module from its parent directory (more on that here).
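If you keep the code outside your PYTHONPATH, one workaround is to add its parent directory to sys.path at the top of your scripts or interpreter session (a minimal sketch, assuming the ~/opt location used below):

import os
import sys

# Make the smartmove/ checkout importable from anywhere
sys.path.insert(0, os.path.expanduser('~/opt'))

import smartmove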

The publication release version can be found here: <git link>

The latest version of the code can be found here: <git link>

mkdir ~/opt
cd ~/opt
git clone <git link>

Virtual environment

Using virtualenv will ensure that you have the correct versions of the dependencies installed, but it is possible to just install directly in your native Python environment (in which case, skip to Installing dependencies).

With virtualenv already installed, create a virtual environment to ensure that you are using the correct dependency versions for smartmove:

cd ~/opt/smartmove
virtualenv --python=python3 venv

Then activate the virtual environment before installing the dependencies using the requirements.txt file.

source venv/bin/activate

Note

Anytime you run smartmove, either in a Python interpreter or when running a script from the command line, you'll need to activate the virtual environment first.
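If you are ever unsure whether the virtual environment is active, you can check from within Python; when it is, sys.prefix points into the venv directory:

import sys

# Prints e.g. /home/<user>/opt/smartmove/venv when the environment is active
print(sys.prefix)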

Installing dependencies

After you have installed and activated your virtual environment, or if you are skipping that step and installing everything to your local Python installation, install the dependencies using the requirements.txt file from the smartmove directory:

cd ~/opt/smartmove
source venv/bin/activate
pip3 install -r requirements.txt

Project setup

Creating the project directory

smartmove uses a series of configuration YAML files and a pre-defined directory structure for input and output data files. All of these are located in a project directory, which you define when you create your smartmove project.

First create a directory to use as your smartmove project directory:

mkdir /home/<user>/smartmove_project

Open the IPython interpreter:

cd ~/opt
ipython3

Then import smartmove and use the smartmove.create_project() function to create the necessary subdirectories and configuration files used in the smartmove analyses.

import smartmove

path_project = '/home/<user>/smartmove_project'
smartmove.create_project(path_project)

The configuration YAML files and necessary directories will then be created in your project directory, and you should receive a message instructing you to copy the necessary data files to their respective directories.

Example project directory structure

project directory
├── cfg_ann.yml
├── cfg_experiments.yml
├── cfg_glide.yml
├── cfg_project.yml
├── cfg_subglide-filter.yml
├── data_csv
├── data_ctd
├── data_glide
├── data_tag
└── model_ann
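You can confirm the layout from a Python interpreter (a quick check, reusing path_project from above):

import os

# Should print the configuration files and data directories shown above
print(sorted(os.listdir(path_project)))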

Copying data to the project directory

Note

The propeller_calibrations.csv file is for calibration of the propeller sensor data from rotations to units of m s^-1, and the cal.yml files in the Little Leonardo data directories are used by pylleo for calibrating the accelerometer sensor data to units of gravity.

See the pylleo documentation for more information on performing these calibrations and the naming of data files.

The CSV files for field experiments, isotope experiments, and propeller calibrations should be placed in the data_csv directory; the MATLAB file with CTD measurements should be placed in the data_ctd directory; and the directories containing Little Leonardo data should be placed in the data_tag directory.

project directory
├── data_csv
│   ├── field_experiments.csv
│   ├── isotope_experiments.csv
│   └── propeller_calibrations.csv
├── data_ctd
│   └── kaldfjorden2016_inner.mat
└── data_tag
    ├── 20150306_W190PD3GT_34839_Notag_Control
    ├── 20150310_W190PD3GT_34839_Notag_Control
    ├── 20150311_W190PD3GT_34839_Skinny_Control
    ├── 20150314_W190PD3GT_34839_Skinny_2neutralBlocks
    ├── 20150315_W190PD3GT_34839_Notag_Control
    ├── 20150316_W190PD3GT_34839_Skinny_4weighttubes_2Blocks
    ├── 20150317_W190PD3GT_34839_Skinny_4Floats
    ├── 20150318_W190PD3GT_34839_Notag_2neutrals
    ├── 20150320_W190PD3GT_34839_Skinny_4weights
    ├── 20150323_W190PD3GT_34839_Skinny_4NeutralBlocks
    ├── 20160418_W190PD3GT_34840_Skinny_2Neutral
    ├── 20160419_W190PD3GT_34840_Skinny_2Weighted
    ├── 20160422_W190PD3GT_34840_Skinny_4Floats
    └── 20160425_W190PD3GT_34840_Skinny_4Weights

Configuration files

The configuration files copied to the project directory from smartmove/_templates configure each stage of the analysis: the project itself, the experiment indices, glide identification, sub-glide filtering, and the ANN. The template contents of each are shown below.

Project configuration

# Datalogger calibration directories
cal:
  # Contains acceleration calibration YAMLs for associated tag ID
  # Sensors should be calibrated per month; the closest calibration is used for analysis
  w190pd3gt:
    34839:
      2015:
        03: 20150306_W190PD3GT_34839_Notag_Control

    34840:
      2016:
        04: 20160418_W190PD3GT_34840_Skinny_2Neutral

# Parameters for field experiment
experiment:
  # 69° 41′ 57.9″ North, 18° 39′ 4.5″ East
  coords:
    lon: 18.65125
    lat: 69.69942
  net_depth: 18 #meters
  fname_ctd: 'kaldfjorden2016_inner.mat'
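Like the other configuration files, cfg_project.yml can be read programmatically with yamlord (the same reader used later in this documentation), reusing path_project from the project setup:

import os
import yamlord

cfg_project = yamlord.read_yaml(os.path.join(path_project, 'cfg_project.yml'))

# Access nested values as dictionary keys, e.g. the experiment coordinates above
print(cfg_project['experiment']['coords']['lat'])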

Glide analysis

# Number of samples per frequency segment in PSD calculation
nperseg: 256

# Threshold above which to find peaks in PSD
peak_thresh: 0.10

# High/low pass cutoff frequency, determined from PSD plot
cutoff_frq: None

# Frequency of stroking, determined from PSD plot
stroke_frq: 0.4 # Hz

# fraction of `stroke_frq` to calculate cutoff frequency (Wn)
stroke_ratio: 0.4

# Maximum length of stroke signal
# 1/stroke_frq
t_max: 2.5 # seconds

# Minimum frequency for identifying strokes
# 2 * numpy.pi / 180 (Hz)
J: 0.0349

# For magnetic pry routine
alpha: 25

# Minimum depth at which to recognize a dive
min_depth: 0.4
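For reference, nperseg is the segment-length parameter of scipy's Welch PSD estimator, so a PSD like the one smartmove plots during glide identification could be computed along these lines (an illustrative sketch only; the sampling rate and signal are placeholders):

import numpy as np
from scipy.signal import welch

fs = 16.0                        # sampling rate in Hz (hypothetical; use your tag's rate)
acc_z = np.random.randn(10000)   # placeholder for a z-axis accelerometer signal

# PSD with 256 samples per frequency segment, matching `nperseg` above
freqs, psd = welch(acc_z, fs=fs, nperseg=256)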

Sub-glide filtering

# Pitch angle (degrees) to consider sgls
pitch_thresh: 30

# Minimum depth at which to recognize a dive (2. Define dives)
min_depth: 0.4

# Maximum cumulative change in depth over a glide
max_depth_delta: 8.0

# Minimum mean speed of sub-glide
min_speed: 0.3

# Maximum mean speed of sub-glide
max_speed: 10

# Maximum cumulative change in speed over a glide
max_speed_delta: 1.0
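To make the role of these thresholds concrete, here is an illustrative sketch (not smartmove's own code) of how a single sub-glide record might be accepted or rejected. The field names follow the feature names used in the ANN configuration below, and treating pitch_thresh as a minimum absolute pitch is an assumption:

def passes_subglide_filter(sgl, cfg):
    """Return True if a sub-glide record satisfies the filter thresholds."""
    return (cfg['min_speed'] <= sgl['mean_speed'] <= cfg['max_speed']
            and sgl['total_depth_change'] <= cfg['max_depth_delta']
            and sgl['total_speed_change'] <= cfg['max_speed_delta']
            and abs(sgl['mean_pitch']) >= cfg['pitch_thresh'])  # assumed interpretation

cfg = {'pitch_thresh': 30, 'max_depth_delta': 8.0,
       'min_speed': 0.3, 'max_speed': 10, 'max_speed_delta': 1.0}
sgl = {'mean_pitch': 42.0, 'mean_speed': 1.1,
       'total_depth_change': 1.8, 'total_speed_change': 0.2}
print(passes_subglide_filter(sgl, cfg))  # True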

Artificial neural network

# Parameters for compiling data
data:
    sgl_cols:
        - 'exp_id'
    glides:
      cutoff_frq: 0.3
      J: 0.05
    sgls:
      dur: 2
    filter:
      pitch_thresh: 30
      max_depth_delta: 8.0
      min_speed: 0.3
      max_speed: 10
      max_speed_delta: 1.0

# Data and network config common to all structures
net_all:
    features:
        - 'abs_depth_change'
        - 'dive_phase_int'
        - 'mean_a'
        - 'mean_depth'
        - 'mean_pitch'
        - 'mean_speed'
        - 'mean_swdensity'
        - 'total_depth_change'
        - 'total_speed_change'
    target: 'rho_mod'
    valid_frac: 0.6
    n_targets:  10

# Network tuning parameters, all permutations of these will be trained/validated
net_tuning:
    # Number of nodes in each hidden layer
    hidden_nodes:
        - 10
        - 20
        - 40
        - 60
        - 100
        - 500
    # Number of hidden layers
    hidden_layers:
        - 1
        - 2
        - 3
    # Trainers (optimizers)
    # https://theanets.readthedocs.io/en/stable/api/trainers.html
    # http://sebastianruder.com/optimizing-gradient-descent/
    algorithm:
        - adadelta
        - rmsprop
    hidden_l1:
        - 0.1
        - 0.001
        - 0.0001
    weight_l2:
        - 0.1
        - 0.001
        - 0.0001
    momentum:
        - 0.9
    patience:
        - 10
    min_improvement:
        - 0.999
    validate_every:
        - 10
    learning_rate:
        - 0.0001
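Since every combination of the values above is trained and validated, the size of the tuning run grows multiplicatively. A quick count of the grid as listed (the single-valued entries each contribute a factor of one):

# Sizes of the multi-valued tuning lists above
sizes = {'hidden_nodes': 6, 'hidden_layers': 3, 'algorithm': 2,
         'hidden_l1': 3, 'weight_l2': 3}

n_configs = 1
for n in sizes.values():
    n_configs *= n

print(n_configs)  # 324 network configurations to train and validate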

Using the Analysis helper class

When you set up your project directory with smartmove.create_project(), template YAML files were copied to it that configure the project information, experiment indices, glide identification parameters, sub-glide filtering parameters, and parameters for the ANN. Read more about these in Configuration files above.

Once your project directory is configured, you can use the Analysis helper class for running different parts of the analysis. The Analysis class object keeps track of the configuration files and of the ANN analysis from which to generate tables and figures. This allows you to easily run new ANN configurations at a later time and inspect the results of all models that have been run.

Create an analysis

Activate your virtual environment, and then launch a Python interpreter.

cd ~/opt/smartmove/
source venv/bin/activate
ipython3

Then create an Analysis class object from which to run the ANN and inspect its results. After initializing your analysis, you can execute the glide identification with the class method run_glides(), which will walk you through the procedure.

import smartmove

path_project = './'

a = smartmove.Analysis(path_project)

You can inspect the attributes of the object from within a Python interpreter, such as IPython:

# Show the names of attributes for `a`
vars(a).keys()

# Show all attributes and class methods available from `a`
dir(a)

# Print the glide configuration dictionary
a.cfg_glide

Glide identification

# Run with `cfg_glide` and splitting sub-glides into 2s segments
a.run_glides(sgl_dur=2)

See the Glide identification documentation for an overview of the procedure.

Run ANN

a.run_ann()

See the Artificial Neural Network Analysis documentation for an overview of the procedure.

Paper figures and tables

Running the following routines will create a paper/ subdirectory in the project directory passed when initializing your Analysis class object.

Be sure to load the ANN analysis you wish to produce figures for before running the commands.

[screenshot: select]

# Generate tables
a.make_tables()

# Generate figures
a.make_figures()

Glide identification

Plot tools

Throughout the glide identification process, plots will be generated from which you can inspect the data. These are figures created with the matplotlib library, whose default navigation tools you will use to determine the values you are prompted to enter manually, so we'll cover those tools now.

Icon   Matplotlib tool description
home   Reset to the original view of the plot
back   Go back to the previous plot view
fwd    Go forward to the next plot view
pan    Pan axes with the left mouse button, zoom with the right
zoom   Zoom in to the selected rectangle
cfg    Configure subplot attributes
save   Save the figure

Select tag data to process

At the start of the glide processing you are prompted to select the tag data directories to be processed. You can type all to process all tag directories, or type a comma-separated list of ID numbers for those you wish to process (e.g. 0,4,6).

[screenshot: term_select]

If the data has not previously been loaded by pylleo, you will then see output pertaining to the loading of the data, and a binary pandas pickle file will be saved in the data directory for subsequent loading.

Selecting a cutoff frequency

The Power Spectral Density (PSD) plot will then be shown, which is used for determining the cutoff frequency for splitting the accelerometry data into low- and high-frequency components.

[screenshot: plot_psd1]

Using the zoom tool, select the area to the left that includes the peak and the portion of the curve up to the point at which it flattens out. The frequency (x-axis value) used for the smartmove paper was selected to be the point past the falling inflection point, roughly half the distance between the maximum and the falling inflection point (pictured below).

Note

User input required

[screenshot: plot_psd2]

The frequency you determine from looking at these plots can then be entered in the terminal.

[screenshot: term_cutoff]

Review processed data

The accelerometer data will then be low- and high-pass filtered, and plots of the split accelerometry data will be shown with the original signal, the low-pass filtered signal, and the high-pass filtered signal.

[screenshot: plot_acc1]

You can use the zoom tool to inspect at higher resolution how the signals have been split.

[screenshot: plot_acc2]
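A low/high split like the one shown above can be produced with a standard forward-backward Butterworth filter in scipy (an illustrative sketch, not necessarily smartmove's exact implementation; the sampling rate and signal are placeholders):

import numpy as np
from scipy.signal import butter, filtfilt

fs = 16.0                       # hypothetical sampling rate in Hz
acc_z = np.random.randn(10000)  # placeholder accelerometer signal
cutoff = 0.3                    # Hz; the cutoff frequency entered at the prompt

# Design a 5th-order low-pass filter (Wn is normalized to the Nyquist frequency)
b, a = butter(5, cutoff / (fs / 2), btype='low')

low = filtfilt(b, a, acc_z)  # low-frequency component (body orientation)
high = acc_z - low           # high-frequency component (stroking)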

Plots of the identified dives are then shown with descent phases labeled in blue and ascent phases labeled in green. The subplot beneath the dives shows the pitch angle of the animal calculated from the accelerometer data, with the low-pass filtered signal (red) plotted on top of the original signal (green).

[screenshot: plot_dive1]

A zoomed-in view using the zoom tool:

[screenshot: plot_dive2]

Selecting a glide threshold

A diagnostic plot will then be displayed for determining the threshold below which portions of the accelerometer signal are considered gliding rather than active stroking. It shows the PSD of the high-frequency signals for the x and z axes, along with a plot of these signals over time. In the PSD plot, the peak in power (y-axis) should occur roughly at the frequency (x-axis) that characterizes stroking movements. Zooming into the acceleration subplot with the zoom tool, you can then look for areas that appear to have relatively steady activity below this threshold (y-axis).

[screenshots: plot_J1, plot_J2, plot_J3]

After determining the threshold, enter it when prompted in the terminal.

[screenshot: term_J]

Glide events will then be identified as the areas below this threshold, and sub-glides will be split from the glides using the sgl_dur value passed to run_glides() (e.g. a.run_glides(sgl_dur=2)).
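The splitting itself is straightforward; the following toy sketch (hypothetical, not smartmove's internal code) shows the idea of cutting a glide into consecutive sub-glides of sgl_dur seconds, here discarding any trailing partial segment:

def split_glide(glide_start, glide_dur, sgl_dur=2):
    """Split a glide into consecutive sub-glides of sgl_dur seconds."""
    n_sgls = int(glide_dur // sgl_dur)  # only whole sub-glides are kept
    return [(glide_start + i * sgl_dur, sgl_dur) for i in range(n_sgls)]

# A 7.3 s glide starting at t=120 s yields three 2 s sub-glides
print(split_glide(glide_start=120.0, glide_dur=7.3, sgl_dur=2))
# [(120.0, 2), (122.0, 2), (124.0, 2)]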

Reviewing split sub-glides

A plot of the depth data and the high-frequency z-axis acceleration will be shown, with each sub-glide highlighted and labeled with the number of the dive in which it occurred and its sequential sub-glide number.

[screenshot: plot_sgl1]

Zooming in with the zoom tool will give you a better view.

[screenshot: plot_sgl2]

Artificial Neural Network Analysis

Input data selection

When first running the ANN setup and training, a summary of the glide identification features to be used is displayed.

[screenshot: start1]

All experiments that have glide identification results matching those features will be listed for selection, to be compiled into a dataset for the ANN.

You can select all experiments by typing all, or type the experiment IDs individually, separated by commas.

[screenshot: start2]

After typing your selection and hitting enter, you will see the Theanets trainer initialize and begin the training process.

[screenshot: run]

The training and validation datasets will automatically be split into mini-batches by downhill, and you will see the training move on to the next mini-batch every so often.

[screenshot: test]

When the training is finished, the resulting accuracy from the dataset size test will be displayed at the end of the console output.

[screenshot: end]

Inspecting the results of the ANN

The Analysis object will automatically be updated with the results of the current ANN run. You can see a summary of the results by looking at the post attribute:

# Typing this and hitting enter will show you a summary of the ANN results
a.post

# To access data, concatenate the path to the data
import os
from smartmove.config import paths, fnames
path_output = os.path.join(a.path_project, paths['ann'], a.current_analysis)

# Get the tuning results
import pandas
filename_tune = os.path.join(path_output, fnames['ann']['tune'])
results_tune = pandas.read_pickle(filename_tune)

# Print all files and directories in the output directory
for f in os.listdir(path_output):
    print('{}'.format(f))
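Because the tuning results are stored as a pandas.DataFrame (see the file overview below), the usual pandas inspection tools apply:

# Dimensions and first few rows of the tuning results
print(results_tune.shape)
print(results_tune.head())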

Overview of ANN output

The following table gives an overview of all the files produced during the ANN tuning, dataset size test, and the post-processing. Note that the names for these files are set in smartmove.config.fnames and can be accessed in the ann field.

Filename                File description
cfg_ann.yml             Copy of the main ANN configuration, archived with the results
cms_data.p              Dictionary of confusion matrices for the dataset size test
cms_tune.p              Dictionary of confusion matrices for the tuning process
data_sgls.p             The compiled sub-glide dataset with rho_mod added, from which the datasets are split
data_sgls_norm.p        The compiled sub-glide dataset with each column unit normalized
data_test.p             The test dataset tuple, where test[0] are the input features and test[1] are the target values
data_train.p            The train dataset tuple, where train[0] are the input features and train[1] are the target values
data_valid.p            The validation dataset tuple, where valid[0] are the input features and valid[1] are the target values
postprocessing.yml      The summary of the ANN results saved as YAML
results_dataset_size.p  The results of the dataset size test saved as a pandas.DataFrame
results_tuning.p        The results of the tuning process saved as a pandas.DataFrame
stats_input_features.p  Summary statistics calculated for the input features
stats_input_values.p    Summary statistics calculated for the target values

YAML files can be easily read in any text editor, but you can also use yamlord to load them into a Python dictionary:

import yamlord

# Manually open a post-processing YAML file
post = yamlord.read_yaml(os.path.join(path_output, 'postprocessing.yml'))

The files ending in .p are Python pickle files, which you can open with the pandas helper function pandas.read_pickle():

import pandas

train = pandas.read_pickle(os.path.join(path_output, 'data_train.p'))
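As noted in the table above, each data_*.p pickle holds a (features, targets) tuple, so it can be unpacked directly:

# Unpack the training set into input features and target values
X_train, y_train = train
print(len(X_train), len(y_train))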

Smartmove API Documentation

glideid

ann

visuals

utils
