Smartmove documentation¶
Installation¶
Download Smartmove¶
First install Python version 3.5+.
You can get the smartmove code either by using git clone (shown below) or by downloading an archive file and extracting it to the location of your choice. Note that if the location of the code is not in your PYTHONPATH, you will only be able to import the code as a Python module from its parent directory (more on that here).
The publication release version can be found here: <git link>
The latest version of the code can be found here: <git link>
mkdir ~/opt
cd ~/opt
git clone <git link>
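If you cloned the code to ~/opt as above and want to import smartmove from any directory, one option is to add ~/opt to your PYTHONPATH (an illustrative bash example; adjust for your shell and install location):
export PYTHONPATH="$HOME/opt:$PYTHONPATH"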
Virtual environment¶
Using virtualenv will ensure that you have the correct versions of the dependencies installed, but it is possible to just install directly in your native Python environment (in which case, skip to Installing dependencies).
With virtualenv already installed, create a virtual environment to ensure that you are using the correct dependency versions for smartmove:
cd ~/opt/smartmove
virtualenv --python=python3 venv
Then activate the virtual environment before installing the dependencies using the requirements.txt file.
source venv/bin/activate
Note
Anytime you run smartmove, whether in a Python interpreter or by running a script from the command line, you'll need to activate the virtual environment first.
Installing dependencies¶
After you have installed and activated your virtual environment, or if you are skipping that step and installing everything to your local Python installation, install the dependencies using the requirements.txt file from the smartmove directory:
cd ~/opt/smartmove
source venv/bin/activate
pip3 install -r requirements.txt
Project setup¶
Creating the project directory¶
smartmove uses a series of configuration YAML files and a pre-defined directory structure for input and output data files. All of these are located in a project directory, which you define when you create your smartmove project.
First create a directory to use as your smartmove project directory:
mkdir /home/<user>/smartmove_project
Open the IPython interpreter:
cd ~/opt
ipython3
Then import smartmove and use the smartmove.create_project() function to create the necessary subdirectories and configuration files used in the smartmove analyses.
import smartmove
path_project = '/home/<user>/smartmove_project'
smartmove.create_project(path_project)
The configuration YAML files and necessary directories will then be created in your project directory, and you should receive a message instructing you to copy the necessary data files to their respective directories.
Example project directory structure
project directory
├── cfg_ann.yml
├── cfg_experiments.yml
├── cfg_glide.yml
├── cfg_project.yml
├── cfg_subglide-filter.yml
├── data_csv
├── data_ctd
├── data_glide
├── data_tag
└── model_ann
Copying data to the project directory¶
Note
The propeller_calibrations.csv file is for calibration of the propeller sensor data from rotations to speed in units of m s^-1, and the cal.yml files in the Little Leonardo data directories are used by pylleo for calibrating the accelerometer sensor data to units of gravity.
See the pylleo documentation for more information on performing these calibrations and the naming of data files.
The CSV files for the field experiments, isotope experiments, and propeller calibrations should be placed in the data_csv directory, the MATLAB file with CTD measurements should be placed in the data_ctd directory, and the directories containing Little Leonardo data should be placed in the data_tag directory.
project directory
├── data_csv
│ ├── field_experiments.csv
│ ├── isotope_experiments.csv
│ └── propeller_calibrations.csv
├── data_ctd
│ └── kaldfjorden2016_inner.mat
└── data_tag
├── 20150306_W190PD3GT_34839_Notag_Control
├── 20150310_W190PD3GT_34839_Notag_Control
├── 20150311_W190PD3GT_34839_Skinny_Control
├── 20150314_W190PD3GT_34839_Skinny_2neutralBlocks
├── 20150315_W190PD3GT_34839_Notag_Control
├── 20150316_W190PD3GT_34839_Skinny_4weighttubes_2Blocks
├── 20150317_W190PD3GT_34839_Skinny_4Floats
├── 20150318_W190PD3GT_34839_Notag_2neutrals
├── 20150320_W190PD3GT_34839_Skinny_4weights
├── 20150323_W190PD3GT_34839_Skinny_4NeutralBlocks
├── 20160418_W190PD3GT_34840_Skinny_2Neutral
├── 20160419_W190PD3GT_34840_Skinny_2Weighted
├── 20160422_W190PD3GT_34840_Skinny_4Floats
└── 20160425_W190PD3GT_34840_Skinny_4Weights
Configuration files¶
The configuration files copied to the project directory from smartmove/_templates are used for configuring the project data and calibration paths, the glide analysis, the sub-glide filtering, and the ANN.
Project configuration¶
# Datalogger calibration directories
cal:
  # Contains acceleration calibration YAMLs for associated tag ID
  # Sensors should be calibrated per month--closest calibration used for analysis
  w190pd3gt:
    34839:
      2015:
        03: 20150306_W190PD3GT_34839_Notag_Control
    34840:
      2016:
        04: 20160418_W190PD3GT_34840_Skinny_2Neutral

# Parameters for field experiment
experiment:
  # 69° 41′ 57.9″ North, 18° 39′ 4.5″ East
  coords:
    lon: 18.65125
    lat: 69.69942
  net_depth: 18 # meters
  fname_ctd: 'kaldfjorden2016_inner.mat'
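As a quick check, you can load the project configuration back into Python with yamlord (the same reader used later in this documentation); the path and key names below assume these values live in cfg_project.yml, as the section name suggests:
import os
import yamlord

path_project = '/home/<user>/smartmove_project'
cfg_project = yamlord.read_yaml(os.path.join(path_project, 'cfg_project.yml'))

# e.g. the experiment coordinates defined above
print(cfg_project['experiment']['coords'])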
Glide analysis¶
# Number of samples per frequency segment in PSD calculation
nperseg: 256
# Threshold above which to find peaks in PSD
peak_thresh: 0.10
# High/low pass cutoff frequency, determined from PSD plot
cutoff_frq: None
# Frequency of stroking, determined from PSD plot
stroke_frq: 0.4 # Hz
# Fraction of `stroke_frq` to calculate cutoff frequency (Wn)
stroke_ratio: 0.4
# Maximum length of stroke signal
# 1/stroke_frq
t_max: 2.5 # seconds
# Minimum frequency for identifying strokes
# 2 * numpy.pi/180 (Hz)
J: 0.0349
# For magnetic pry routine
alpha: 25
# Minimum depth at which to recognize a dive
min_depth: 0.4
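A minimal check of the derived values mentioned in the comments above (plain arithmetic; the variable names simply mirror the keys in cfg_glide.yml):
import numpy

stroke_frq = 0.4           # Hz, stroking frequency determined from the PSD plot
t_max = 1 / stroke_frq     # maximum stroke duration -> 2.5 s
J = 2 * numpy.pi / 180     # 2 degrees expressed in radians -> ~0.0349

print(t_max, round(J, 4))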
Sub-glide filtering¶
# Pitch angle (degrees) to consider sgls
pitch_thresh: 30
# Minimum depth at which to recognize a dive (2. Define dives)
min_depth: 0.4
# Maximum cumulative change in depth over a glide
max_depth_delta: 8.0
# Minimum mean speed of sub-glide
min_speed: 0.3
# Maximum mean speed of sub-glide
max_speed: 10
# Maximum cumulative change in speed over a glide
max_speed_delta: 1.0
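For intuition, here is a minimal sketch of how thresholds like these could be applied to a table of sub-glide summaries with pandas; the DataFrame and its column names are assumptions for illustration only, not the exact filtering code used by smartmove:
import pandas

# Hypothetical sub-glide summary table (column names assumed for illustration)
sgls = pandas.DataFrame({
    'mean_speed': [0.8, 1.2, 0.1],
    'total_depth_change': [3.0, 9.5, 1.0],
    'total_speed_change': [0.4, 0.2, 1.5],
})

# Keep only sub-glides that satisfy thresholds analogous to those above
mask = (
    (sgls['mean_speed'] >= 0.3) &
    (sgls['mean_speed'] <= 10) &
    (sgls['total_depth_change'].abs() <= 8.0) &
    (sgls['total_speed_change'].abs() <= 1.0)
)
print(sgls[mask])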
Artificial Neural Network¶
# Parameters for compiling data
data:
  sgl_cols:
    - 'exp_id'
  glides:
    cutoff_frq: 0.3
    J: 0.05
  sgls:
    dur: 2
  filter:
    pitch_thresh: 30
    max_depth_delta: 8.0
    min_speed: 0.3
    max_speed: 10
    max_speed_delta: 1.0

# Data and network config common to all structures
net_all:
  features:
    - 'abs_depth_change'
    - 'dive_phase_int'
    - 'mean_a'
    - 'mean_depth'
    - 'mean_pitch'
    - 'mean_speed'
    - 'mean_swdensity'
    - 'total_depth_change'
    - 'total_speed_change'
  target: 'rho_mod'
  valid_frac: 0.6
  n_targets: 10

# Network tuning parameters, all permutations of these will be trained/validated
net_tuning:
  # Number of nodes in each hidden layer
  hidden_nodes:
    - 10
    - 20
    - 40
    - 60
    - 100
    - 500
  # Number of hidden layers
  hidden_layers:
    - 1
    - 2
    - 3
  # Trainers (optimizers)
  # https://theanets.readthedocs.io/en/stable/api/trainers.html
  # http://sebastianruder.com/optimizing-gradient-descent/
  algorithm:
    - adadelta
    - rmsprop
  hidden_l1:
    - 0.1
    - 0.001
    - 0.0001
  weight_l2:
    - 0.1
    - 0.001
    - 0.0001
  momentum:
    - 0.9
  patience:
    - 10
  min_improvement:
    - 0.999
  validate_every:
    - 10
  learning_rate:
    - 0.0001
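Because all permutations of the net_tuning parameters are trained and validated, you can estimate how many networks will be run with a short sketch like the one below (the dictionary simply mirrors the values listed above):
import itertools

# Tuning values copied from the `net_tuning` section above
net_tuning = {
    'hidden_nodes': [10, 20, 40, 60, 100, 500],
    'hidden_layers': [1, 2, 3],
    'algorithm': ['adadelta', 'rmsprop'],
    'hidden_l1': [0.1, 0.001, 0.0001],
    'weight_l2': [0.1, 0.001, 0.0001],
    'momentum': [0.9],
    'patience': [10],
    'min_improvement': [0.999],
    'validate_every': [10],
    'learning_rate': [0.0001],
}

# Every combination of the values above, i.e. 6 * 3 * 2 * 3 * 3 = 324
combinations = list(itertools.product(*net_tuning.values()))
print(len(combinations))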
Using the Analysis helper class¶
When you set up your project directory with smartmove.create_project(), template YAML files were copied to it that configure the project information, the experiment indices, the glide identification parameters, the sub-glide filtering parameters, and the parameters for the ANN. Read more about these files in Configuration files.
Once your project directory is configured, you can use the Analysis helper class to run the different parts of the analysis. The Analysis class object keeps track of the configuration files and of the ANN analysis from which tables and figures are generated. This allows you to easily run new ANN configurations at a later time and inspect the results of all models that have been run.
Create an analysis¶
Activate your virtual environment, and then launch a Python interpreter.
cd ~/opt/smartmove/
source venv/bin/activate
ipython3
Then instantiate an Analysis class object from which to run and inspect results from the ANN. After initializing your analysis, you can execute the glide identification with the class method run_glides(), which will walk you through the procedure.
import smartmove
path_project = './'
a = smartmove.Analysis(path_project)
You can inspect the attributes of the object from within a Python interpreter, such as IPython:
# Show the names of attributes for `a`
vars(a).keys()
# Show all attributes and class methods available from `a`
dir(a)
# Print the glide configuration dictionary
a.cfg_glide
Glide identification¶
# Run with `cfg_glide` and splitting sub-glides into 2s segments
a.run_glides(sgl_dur=2)
See the Glide identification documentation for an overview of the procedure.
Run ANN¶
a.run_ann()
See the Artifical Neural Network Analysis documentation for an overview of the procedure.
Paper figures and tables¶
Running the following routine will create a paper/ subdirectory in the project directory passed when initializing your Analysis class object.
Be sure to load the ANN analysis you wish to produce figures for before running the commands.
# Generate tables
a.make_tables()
# Generate figures
a.make_figures()
Glide identification¶
Plot tools¶
Throughout the glide identification process, plots will be generated from which you can inspect the data. These are plot instances created with the matplotlib library, and you will use their default toolbar tools to determine the values that you then enter manually in the terminal, so we'll cover those tools now.
Tool | Matplotlib tool description |
Home | Reset the original view of the plot |
Back | Back to previous plot view |
Forward | Forward to next plot view |
Pan/Zoom | Pan axes with left mouse, zoom with right |
Zoom to rectangle | Zoom to the selected rectangle |
Configure subplots | Configure subplot attributes |
Save | Save the figure |
Select tag data to process¶
At the start of the glide processing you are prompted to select the tag data directories that should be processed. You can type all to process all tag directories, or type a comma-separated list of the ID numbers of those you wish to process (e.g. 0,4,6).
If the data has not previously been loaded by pylleo, you will then see output pertaining to the loading of the data, and a binary pandas pickle file will be saved in the data directory for subsequent loading.
Selecting a cutoff frequency¶
The Power Spectral Density (PSD) plot will then be shown, which is used for determining the cutoff frequency at which to split the accelerometry data into low- and high-frequency components.
Using the zoom tool, select the area to the left that includes the peak and the portion of the curve up to the point at which it flattens out. The frequency (x-axis value) used for the smartmove paper was selected to be the point past the falling inflection point, roughly half the distance between the maximum and the falling inflection point (pictured below).
Note
User input required
The frequency you determine from looking at these plots can then be entered in the terminal.
Review processed data¶
The accelerometer data will then be low and high-pass filtered, and plots of the split accelerometry data will be shown with the original signal, low-pass filtered signal, and high-pass filtered signal.
You can use the zoom tool to get a better idea of how the signals have been split at higher resolutions.
Plots of the identified dives are then shown with descent phases labeled in blue and ascent phases labeled in green. The subplot beneath the dives shows the pitch angle of the animal calculated from the accelerometer data, with the low-pass filtered signal (red) plotted on top of the original signal (green).
Selecting a glide threshold¶
A diagnostic plot for determining the threshold that separates active stroking from gliding in the accelerometer signal will then be displayed, showing the PSD of the high-frequency signals for the x and z axes, along with a plot of these signals over time. In the PSD plot, the peak in power (y-axis) should occur roughly at the frequency (x-axis) that characterizes stroking movements. Zooming into greater detail in the acceleration subplot using the zoom tool, you can then look at areas which appear to have relatively steady activity below this level (y-axis).
After determining the threshold, enter it when prompted in the terminal.
Glide events will then be identified as the areas below this threshold, and sub-glides will be split from the glides using the sgl_dur value passed to run_glides() (e.g. a.run_glides(sgl_dur=2)).
Reviewing split sub-glides¶
A plot of the depth data and the high-frequency acceleration of the z-axis will be shown, with each sub-glide highlighted and labeled with the number of the dive in which it occurred and the number of the sub-glide.
Zooming in with the zoom tool will give you a better view.
Artificial Neural Network Analysis¶
Input data selection¶
When first running the ANN setup and training, a summary of the glide identification features to be used is displayed.
All experiments that have glide identification results matching those features will be listed for selection, to be compiled into a dataset for the ANN.
You can select all experiments by typing all, or you can type the experiment IDs individually in a list separated by commas.
After typing your selection and hitting enter, you will see the Theanets trainer initialize and begin the training process.
The training and validation datasets will automatically be split into mini-batches by downhill, and you will see the training move on to the next mini-batch every so often.
When the training is finished, the resulting accuracy of the tuning dataset size test will be displayed at the end of the console output.
Inspecting the results of the ANN¶
The Analysis object will automatically be updated with the results of the current ANN run. You can see a summary of the results by looking at the post attribute:
# Typing this and hitting enter will show you a summary of the ANN results
a.post
# To access data, concatenate the path to the data
import os
from smartmove.config import paths, fnames
path_output = os.path.join(a.path_project, paths['ann'], a.current_analysis)
# Get the tuning results
import pandas
filename_tune = os.path.join(path_output, fnames['ann']['tune'])
results_tune = pandas.read_pickle(filename_tune)
# Print all files and directories in the output directory
for f in os.listdir(path_output):
    print('{}'.format(f))
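Continuing from the snippet above, the loaded tuning results are an ordinary pandas.DataFrame, so generic inspection methods work (an illustrative example; the exact columns depend on the smartmove version):
# Quick inspection of the tuning results loaded above
print(results_tune.shape)
print(results_tune.columns.tolist())
print(results_tune.head())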
Overview of ANN output¶
The following table gives an overview of all the files produced during the ANN tuning, dataset size test, and the post-processing. Note that the names for these files are set in smartmove.config.fnames and can be accessed in the ann field.
Filename | File description |
cfg_ann.yml | Copy of main ANN configuration for archiving with results |
cms_data.p | Dictionary of confusion matrices for the dataset size test |
cms_tune.p | Dictionary of confusion matrices for the tuning process |
data_sgls.p | The compiled sub-glide data set with rho_mod added, from which datasets are split |
data_sgls_norm.p | The compiled sub-glide data set with each column unit normalized |
data_test.p | The test dataset tuple, where test[0] are the input features, and test[1] are the target values |
data_train.p | The train dataset tuple, where train[0] are the input features, and train[1] are the target values |
data_valid.p | The validation dataset tuple, where valid[0] are the input features, and valid[1] are the target values |
postprocessing.yml | The summary of the ANN results saved as YAML |
results_dataset_size.p | The results of the dataset size test saved as a pandas.DataFrame |
results_tuning.p | The results of the tuning process saved as a pandas.DataFrame |
stats_input_features.p | Summary statistics calculated for the input features |
stats_input_values.p | Summary statistics calculated for the target values |
YAML files can be easily read in any text editor, but you can also use yamlord to load them into a Python dictionary:
import yamlord
# Manually open a post-processing YAML file
post = yamlord.read_yaml(os.path.join(path_output, 'postprocessing.yml'))
The files ending in .p are Python pickle files, which you can open with the pandas helper function read_pickle:
import pandas
train = pandas.read_pickle(os.path.join(path_output, 'data_train.p'))