tunacell: Time-lapse UNiCELLular Analyzer¶
Welcome to tunacell’s documentation. It is divided into four sections: introduction, user manual, advanced user manual, and API documentation generated from docstrings.
Introduction to tunacell¶
tunacell is a computational tool to study the dynamic properties of growing and dividing cells. It uses raw data extracted from time-lapse microscopy and provides the user with a set of functions to explore and visualize data, and to compute the statistics of the dynamics.
It has been designed with the purpose of studying growth and gene expression from fluorescent reporters in E. coli, but should be applicable to any growing and dividing micro-organism.
tunacell operates after segmentation and tracking have been performed on images. For this first step, there are many tools available on the web (see Segmentation and tracking tools for a short list), and we will try in the future to ease the connection between the segmentation output of some of these tools and tunacell’s input pipeline.
What it does¶
tunacell reads segmented/tracked output and is able to reconstruct lineages and colony structures, from which it can directly:
- plot the trajectories of user-defined observables for small samples, to gain qualitative intuition about the dynamic processes;
- perform statistical analysis of the dynamics, to gain quantitative insights about these processes.
It also provides a Python API for users to tailor their own analyses.
How it works¶
tunacell is not as smart as you are, but with the appropriate parameters it will compute information faster than you can.
In order to perform any analysis, the user has to define the following quantities:
- the particular observable you want to look at, whether it is a raw time-lapse value (e.g. length, total cell fluorescence), a differentiated quantity (e.g. growth rate, production rate), or a quantity defined at the cell-cycle level (e.g. birth length, cell-cycle increase in fluorescence);
- the set of filters that will define the ensemble over which statistics are computed;
- the set of conditions that may define subgroups of samples over which comparison of results is relevant.
These steps are described in Setting up your analysis.
After these quantities are defined, the highest-level API functions are designed to
- plot trajectories as visual examples (see Plotting samples);
- compute the statistics of the dynamics (see Statistics of the dynamics);
- visualize the results of such computations (idem).
Lower-level API-functions may be used by experienced users to tailor new, specific analyses.
Why should I use tunacell?¶
Because as a Python enthusiast, how cool is it to say to your colleagues at the next conference you’ll attend: “I use both Python and tuna to analyze data about how bacteria struggle in life” [1].
One of the novelties of tunacell is to provide a powerful tool to perform conditional analysis of the dynamics. By conditional, we mean performing statistical computations over user-defined subgroups of the original sample ensemble.
A first set of functions to build these subgroups is already defined. Although not exhaustive, this pre-defined set of subgrouping functions already covers a wide range of possibilities.
Import/export functions have been implemented to save analyses together with their parameters, allowing the user to keep a structured track of what has been done and making it easier to collaborate on the analysis step.
Finally, experienced users will find it useful to be able to extend tunacell’s framework, by designing new filtering functions or implementing statistical analyses tailored to their particular project.
Where to start then?¶
We encourage readers to start with the 10 minute tutorial that will present the features described above on a simple, numerically generated dataset.
Then plug in your data, check the documentation, and discover how cool micro-organisms are (from a dynamical point of view).
Contribute¶
If you find any bug, feel free to report it.
We also welcome contributions that fix bugs directly, or that implement other functions useful for analysing the dynamics of growing micro-organisms.
Segmentation and tracking tools¶
Before using tunacell you need to have segmented and tracked images from your time-lapse movie. Many different software tools exist, depending on your experimental setup. Some examples are listed (non-exhaustively):
- SuperSegger : a Matlab-based software, with GUI. Uses machine-learning principles to detect false divisions. Adapted for micro-colony growth in agar-pad-like setups. Segments brightfield images (it is possible to invert fluorescence images).
- oufti is also a Matlab-based software, following the previous microbetracker software developed by the same group.
- moma is a Java-based software particularly adapted to mother machine-like setups (and their paper).
- ieee_seg is another software adapted to mother machine-like setups.
Footnotes
[1] It is highly recommended to double-check the conference topic beforehand.
Install¶
The easiest way to install tunacell is from wheels:
pip install tunacell
However, some introductory tutorials and scripts are not shipped with the library. To get them you can visit the GitHub repository:
https://github.com/LeBarbouze/tunacell
where you can copy/paste these scripts (look into the scripts folder).
To get everything, a good solution is to fork the repository to your account and/or to clone the repository on your computer. Change directory into your local repo and do a local install:
pip install -e .
(the -e option stands for editable).
With such a clone install, the scripts are in the same place, and you can use the Makefile to run tutorials/demos.
Local install¶
If Python is installed system-wide, you may have to sudo the command above. When that is not possible, you may pass the option to install in the user directory:
pip install --user -e .
Virtual environment¶
A better solution, when Python is installed system-wide, is to create a virtual environment where you plan to work with tunacell. It requires pip and virtualenv to be installed on your machine. Then the Makefile does the job; run the command:
make virtualenv
that will set up the virtual environment and install pytest and flake8 locally. Activate the virtual environment with:
source venv/bin/activate
Then you can run pip install -e . (or the make pipinstall command) without worrying about permissions, since everything will be installed locally and will be accessible only when your virtual environment is active. When you finish working with tunacell, type:
deactivate
and that’s it.
Dependencies¶
tunacell depends on a few libraries that are automatically installed if you are using pip.
Numpy, Scipy, and matplotlib are classic libraries, as well as pandas, which is used to provide the user with DataFrame objects for some statistical analyses.
The tree-like structure arising from dividing cells has been implemented using the treelib library.
We use pyYAML to parse YAML files such as metadata or other library-created files, the tqdm package for progress bars, and tabulate for fancy tabular printing.
New to Python¶
Python is a computer programming language and to use it, you need a Python interpreter. To check whether you have the Python interpreter installed on your system, run the following command in a terminal:
python -V
If the answer shows something like Python 2.7.x or Python 3.6.y, you’re good to go.
Otherwise you should install it, either by downloading Python directly, or by using a friendlier package that will guide you, such as anaconda.
After that you should be ready, and pip should be automatically installed. Again try:
pip -V
If it is not installed, you may check how to install pip.
Then get back to the install instructions above.
10 minute tutorial¶
This tutorial is an introduction to tunacell’s features. A script is associated in the Github repository at scripts/tutorial.py. If you cloned the repo, you can use the Makefile recipe make tuto.
You do not need to plug in your data yet, as we will use numerically simulated data.
Generating data¶
To generate data, you can use the tunasimu command from a terminal:
$ tunasimu -s 42
where the seed option is set so as to generate data identical to what is shown on this page. The terminal output should resemble:
Path: /home/joachim/tmptunacell
Label: simutest
simulation: 100%|██████████████████████████████████████████████████████████████████████| 100/100 [00:00<00:00, 113.94it/s]
The progress bar indicates the time it takes to run the numerical simulations.
A new folder tmptunacell is created in your home directory; it contains a fresh new experiment called simutest, composed of numerically generated data, and everything needed for tunacell’s reader functions to work properly. The name simutest is set by default; it can be set otherwise using the -l option, though keep in mind other scripts are set to read simutest by default.
Note
If you are not familiar with the term “numerical simulations”, it means generating “fake” data that look like what would be observed in actual experiments, but are constructed with controlled, specific assumptions. Our assumptions are described in Numerical simulations in tunacell.
In a nutshell, we generate fake, controlled data about cells that grow and divide.
By typing:
$ cd
$ cd tmptunacell
$ ls
simutest
you should see this new subfolder. If you cd into this subfolder you will see:
$ cd simutest
$ ls
containers descriptor.csv metadata.yml
The containers folder contains data, descriptor.csv describes the columns in data files, and the metadata.yml file associates metadata with the current experiment.
We’ll discuss more details about the folder/files organization in Input file format.
Loading data¶
tunacell is able to read one time-lapse experiment at a time. Let’s open a Python session (or better, IPython) and open our recently simulated experiment. We are still in the tmptunacell folder when we start our session. To load the experiment simutest in tunacell, type in:
>>> from tunacell import Experiment
>>> exp = Experiment('simutest')
(note that if your Python console is launched elsewhere, you should rather provide the path to the simutest folder, e.g. /home/joachim/tmptunacell/simutest)
Ok, we’re diving in. Let’s inspect this object:
>>> exp
Experiment root: /home/joachim/tmptunacell/simutest
Containers:
container_001
container_002
container_003
container_004
container_005
...
(100 containers)
birth_size_params:
birth_size_mean: 1.0
birth_size_mode: fixed
birth_size_sd_to_mean: 0.1
date: '2018-01-27'
division_params:
div_lambda: 0
div_mode: gamma
div_sd_to_mean: 0.1
div_size_target: 2.0
use_growth_rate: parameter
label: simutest
level: top
ornstein_uhlenbeck_params:
noise: 8.897278035522248e-08
spring: 0.03333333333333333
target: 0.011552453009332421
period: 5.0
simu_params:
nbr_colony_per_container: 2
nbr_container: 100
period: 5.0
seed: 42
start: 0.0
stop: 180.0
The exp object shows:
- the absolute path to the folder corresponding to our experiment;
- the list of container files;
- the experiment metadata, which summarizes the content of the metadata.yml file. When data is numerically simulated, parameters from the simulation are automatically exported in the metadata file.
To visualize something we need to select/add some samples.
Selecting small samples¶
To work with hand-picked, or randomly selected samples, we use:
>>> from tunacell import Parser
>>> parser = Parser(exp)
Let’s add a couple of samples: a first one that we know exists if you used default settings (in particular the seed parameter), and another, random sample that will differ from the one shown below:
>>> parser.add_sample(('container_079', 4)) # this one works out on default settings
>>> parser.add_sample(1)  # add 1 random sample; may differ if default settings have not been used
index container cell
------- ------------- ------
0 container_079 4 # you should see this sample
1 container_019 21 # this one may differ
The particular container label and the cell number you’ve got on your screen are unlikely to be the same as the ones shown above. The container label indicates which container file has been opened, and the cell identifier indicates which cell has been randomly selected in this container file. This entry is associated with index 0, the starting index in Python.
Inspecting small samples¶
Cell¶
Type in:
>>> cell = parser.get_cell(0)
>>> print(cell)
4;p:2;ch:-
cell is the Cell instance associated with our cell sample. The print call shows three fields separated by semicolons. The first field is the cell’s identifier; the second field indicates the parent cell identifier if it exists (otherwise it is a - minus sign); the third field indicates the offspring identifiers (again, a - minus sign indicates this cell has no descendants).
Raw data is stored under the .data attribute:
>>> print(cell.data)
[(130., 0.01094928, 0.08573655, 1.08951925, 4, 2)
(135., 0.01019836, 0.13886413, 1.14896798, 4, 2)
(140., 0.01016952, 0.18872295, 1.20770631, 4, 2)
(145., 0.00969319, 0.23829265, 1.26908054, 4, 2)
(150., 0.01036414, 0.28900263, 1.33509524, 4, 2)
(155., 0.01128818, 0.34241417, 1.40834348, 4, 2)
(160., 0.01110128, 0.39900286, 1.49033787, 4, 2)
(165., 0.01161387, 0.45529213, 1.57663389, 4, 2)
(170., 0.0111819 , 0.5127363 , 1.66985418, 4, 2)
(175., 0.0117796 , 0.56991253, 1.76811239, 4, 2)]
This is a structured array with column names:
>>> print(cell.data.dtype.names)
('time', 'ou', 'ou_int', 'exp_ou_int', 'cellID', 'parentID')
We can spot three recognizable names: time, cellID, and parentID. They give the acquisition time of the current row’s frame, the cell identifier, and its parent cell identifier (0 is reserved to mean ‘no parent cell’). The other column names are a bit cryptic because they come from numerical simulations (see Numerical simulations in tunacell for more information). What we need to know so far is that exp_ou_int is synonymous with “cell size”, and ou is synonymous with “instantaneous cell size growth rate”. We intentionally keep these cryptic names to remember we are dealing with “fake data”.
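As a side note, cell.data is a plain NumPy structured array, so columns and rows can be sliced with standard NumPy indexing. A minimal sketch on a hypothetical two-row stand-in (values made up for illustration):

```python
import numpy as np

# hypothetical stand-in for cell.data, with made-up values
data = np.array(
    [(130.0, 0.0109, 4, 2), (135.0, 0.0102, 4, 2)],
    dtype=[("time", "f8"), ("ou", "f8"), ("cellID", "u2"), ("parentID", "u2")],
)

times = data["time"]   # column access by name
first = data[0]        # row access by index (one acquisition frame)
```

This means any NumPy routine (means, diffs, masking) can be applied directly to a column of raw data.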
Plotting small samples¶
To plot some quantity, we first need to define the observable from raw data. Raw data is presented as columns, and column names are what we call raw observables.
Defining the observable to plot¶
To define the observable to plot, we use the Observable object located in tunacell’s main scope:
>>> from tunacell import Observable
and we will choose the cryptic exp_ou_int raw column in our simulated data and associate it with the “size” variable:
>>> obs = Observable(name='size', raw='exp_ou_int')
>>> print(obs)
Observable(name='size', raw='exp_ou_int', scale='linear', differentiate=False, local_fit=False, time_window=0.0, join_points=3, mode='dynamics', timing='t', tref=None, )
The output of the print statement recapitulates all parameters of the observable. A more human-readable output is obtained by using the following method:
>>> print(obs.as_string_table())
parameter value
------------- ----------
name size
raw exp_ou_int
scale linear
differentiate False
local_fit False
time_window 0.0
join_points 3
mode dynamics
timing t
tref
We’ll review the details of the Observable object in the Observable section.
Calling the plotting function¶
First we need a colony: we can retrieve the colony to which our first sample cell belongs through the parser:
>>> colony = parser.get_colony(0)
Now that we have a colony, we would like to inspect the timeseries of our chosen observable in its different lineages. To do this, we import the main object to plot samples:
>>> from tunacell.plotting.samples import SamplePlot
And we instantiate it with our settings:
>>> myplot = SamplePlot([colony, ], parser=parser)
>>> myplot.make_plot(obs)
>>> myplot.save(user_bname='tutorial_sample', add_obs=False)
To print out the figure:
>>> myplot.fig.show()
or when inline plotting is active just type:
>>> myplot.fig
If none of these commands worked out (that would be fairly strange), you can open the file that has been saved in ~/tmptunacell/simutest/sampleplots/ as tutorial_sample-plot.png.
You should see something that looks like:
Our cryptic exp_ou_int raw data stands in fact for a quantifier of the size of our simulated growing cells, and this plot shows how the size of cells evolves through time and rounds of divisions in the same colony, as well as the tree structure the divisions are making.
Further exploration about plotting timeseries of small samples is described in Plotting samples.
Statistical analysis of the dynamics¶
The core of tunacell is to analyze the dynamics through statistics.
Warning
It gets a bit tougher to understand the following points if you’re not familiar with concepts of random processes.
Let’s briefly see how one can perform a pre-defined analysis. First, instead of looking at our previous observable, we will look at the basic ou observable:
>>> ou = Observable(name='growth-rate', raw='ou')
It describes a quantity that fluctuates in time around a given average value. One is then interested in inspecting three main things: what is the average value at each time-point of the experiment? How large are the typical deviations from this average value, at each time-point? And how far do these fluctuations propagate in time?
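These three quantities (time-point average, variance, and correlation across time-points) can be sketched by hand with NumPy on a toy ensemble of trajectories. Note this is plain white noise, not an actual Ornstein-Uhlenbeck process, and all names below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(42)
# toy ensemble: 500 trajectories (rows) observed at 20 time-points (columns)
x = rng.normal(loc=0.01, scale=0.001, size=(500, 20))

mean_t = x.mean(axis=0)  # average value at each time-point
var_t = x.var(axis=0)    # typical (squared) deviations at each time-point

# covariance between a reference time-point (index 5) and all others
d = x - mean_t
autocov = (d[:, 5:6] * d).mean(axis=0)
# at the reference time-point itself, the autocovariance equals the variance
```

tunacell computes exactly these kinds of ensemble statistics for you, with the extra bookkeeping needed for tree-structured, unevenly sampled data.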
We load a high-level API function that performs the pre-defined analysis on single observables in order to answer these 3 main questions:
>>> from tunacell.stats.api import compute_univariate
>>> univariate = compute_univariate(exp, ou)
The first time such a command is run on the current exp instance, tunacell will parse all data and count how many containers, cells, colonies, and lineages are present. Such a count is printed and should be:
Count summary:
- cells : 2834
- lineages : 1517
- colonies : 200
- containers : 100
After such a count is performed, a progress bar informs about the time needed to parse data in order to compute univariate statistics. Results can be exported in a structured folder using:
>>> univariate.export_text()
This univariate object stores our statistical quantifiers for our single observable ou. There are functions to generate plots of the results stored in such a univariate object:
>>> from tunacell.plotting.dynamics import plot_onepoint, plot_twopoints
We make the plots by typing:
>>> fig = plot_onepoint(univariate, show_ci=True, save=True)
>>> fig2 = plot_twopoints(univariate, save=True)
It generates two plots. If they have not been displayed automatically, you can open the files that have been saved by the last two lines. They have been saved into a new bunch of folders:
~/tmptunacell/simutest/analysis/filterset_01/growth-rate
The first one to look at is plot_onepoint_growth-rate_ALL.png, or type:
>>> fig.show()
It should print out like this:
This plot is divided into three panels. All panels share the same x-axis: time (expressed here in minutes).
- The top panel y-axis, Counts, is the number of samples at each time-point (number of cells at each time-point); through divisions, this number of cells should increase, roughly exponentially;
- The middle panel y-axis, Average, is the sample average of our observable ou (remember, this is our simulated stochastic process); the shadowed region is the 99% confidence interval; here the average value is stable, because our stochastic process is made like this;
- The bottom panel y-axis, Variance, is the sample variance of the data (this is the square of the standard deviation shadowed on the middle panel, replotted here for convenience); again the variance is stable, up to estimate fluctuations due to finite-size sampling.
The second plot to look at is plot_twopoints_growth-rate_ALL.png, or:
>>> fig2.show()
which should print like this:

Plot of two-point functions: counts, autocorrelation functions, and centered superimposition of autocorrelation functions vs.time.
This plot is again divided into three panels. In each panel, there are 4 curves that represent the autocorrelation function \(a(s, t)\) for four values of the first argument \(s = t_{\mathrm{ref}}\). The top two panels share the same x-axis:
- the top panel y-axis, Counts, is the number of independent lineages connecting \(t\) to \(t_{\mathrm{ref}}\) (one colour per \(t_{\mathrm{ref}}\));
- the middle panel y-axis, Autocorr., is the autocorrelation functions;
- the bottom panel superimposes the autocorrelation functions for the 4 different \(a(t_{\mathrm{ref}}, t)\).
Auto-correlation functions obtained directly by reading the auto-covariance
matrix, as represented above, are quite noisy since the number of samples,
i.e. the number of lineages connecting a cell at time \(t_{\mathrm{ref}}\)
to a cell at time \(t\) is experimentally limited (in our numerical
experiment we’re reaching \(10^3\) for the red curve when \(t\) is close
to \(t_{\mathrm{ref}}=150\) mins, which begins to be acceptable).
tunacell provides tools to compute smoother auto- and cross-correlation functions when certain conditions are met. It goes beyond the purpose of this introductory tutorial to expose these tools: you can learn more in the specific tutorial on how to compute the statistics of the dynamics, or in the paper.
What to do next?¶
If you are eager to explore your dataset, check first how your data should be structured so that tunacell can read it. Then
you may check how to set your analysis,
how to customize sample plots,
and finally how to compute the statistics of the dynamics,
in particular with respect to conditional analysis.
Enjoy!
Input file format¶
This section discusses how raw data should be organized so that tunacell is able to read it.
Two types of input are possible:
- plain text format (full compatibility): data output from any segmentation software can be translated to plain-text format; its format is explained thoroughly below
- SuperSegger output format (experimental): data output is read directly from the output of the software (stored in a number of Matlab .mat files under a specific folder structure).
A given experiment is stored in a main folder¶
The name of the main folder is taken as the label of the experiment, i.e. as a unique name that identifies the experiment.
The scaffold to be used in the main folder is:
<experiment_label>/
containers/
descriptor.csv
metadata.yml
If you executed the tunasimu script (see 10 minute tutorial), you can look in the newly created directory tmptunacell in your home directory: there should be a folder simutest storing data from the numerical simulations:
$ cd simutest
$ ls
and check that the structure matches the scaffold above.
There is a subfolder called containers where raw data files are stored, and two text files: descriptor.csv describes the column organization of raw data text files (see Raw data description), while metadata.yml stores metadata about the experiment (see Metadata description). Both files are needed for tunacell to run properly.
Data is stored in container files in the containers subfolder¶
Time-lapse data is stored in the containers folder. If you ran the 10 minute tutorial you can check what you find in this folder:
$ cd containers
$ ls
You should see a bunch of .txt files (exactly 100 such files if you stuck to default values for the simulation).
Each file in this containers folder recapitulates the raw data of cells observed in one field of view of your experiment, as reported by your image analysis process.
Your experiment may consist of multiple fields of view (or even subsets thereof), and we call each of these files a container file. Within a given container file, cell identifiers are univocal: there cannot be two different cells with the same identifier.
A container file contains tab-separated values, where each column corresponds to a cell quantifier exported by the image analysis process. Each row represents one acquisition frame for a given cell. Rows are grouped by cell: if cell ‘1’ was imaged on 5 successive frames, there should be 5 successive rows in the container file reporting the raw data about cell ‘1’.
Raw data description¶
The column name and the type of data for each column are reported in the descriptor.csv file, a comma-separated values file where each line entry consists of <column-name>,<column-type>.
The column name is arbitrary except for 3 mandatory quantifiers (see Mandatory raw data columns). The column type must be given as a Numpy datatype; the most used datatypes are:
- f8: floating-point numbers coded on 8 bytes (this should be your default datatype for most quantifiers, except cell identifiers);
- i4: integers coded on 4 bytes;
- u2: usually refers to the Irish band. For our purpose it also means unsigned integers coded on 2 bytes (this is the default for cell identifiers; it counts cells up to 65535, which can be upgraded to u4, pushing the limit to 4294967295 cells. After that, let me know if you still haven’t found what you’re looking for.)
Mandatory raw data columns¶
- cellID: the identifier of a given cell. In our example, cells are labeled numerically by integers, hence the type is u2 (Numpy shortname that means unsigned integer coded on 2 bytes);
- parentID: the identifier of the parent of a given cell. This is mandatory for tunacell to reconstruct lineages and colonies;
- time: time at which the acquisition was made. Its type should be f8, that means floating-point coded on 8 bytes. The unit is left to the user’s appreciation (minutes, hours, or it can even be the frame acquisition number, though this is discouraged since physical processes are independent of the period of acquisition).
All other fields are left to the user’s discretion.
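To see how the two files fit together, here is a sketch (not tunacell’s actual reader) that builds a NumPy dtype from descriptor.csv entries and uses it to parse tab-separated container rows; the file contents below are minimal made-up examples:

```python
import io
import numpy as np

# contents of a minimal descriptor.csv (mandatory columns only)
descriptor = "time,f8\ncellID,u2\nparentID,u2\n"
dtype = [tuple(line.split(",")) for line in descriptor.strip().splitlines()]

# two rows of a container file: cell '1' imaged at t=0 and t=5
container = "0.0\t1\t0\n5.0\t1\t0\n"
data = np.genfromtxt(io.StringIO(container), dtype=dtype, delimiter="\t")
```

Each parsed row becomes one entry of a structured array, with columns addressable by the names declared in descriptor.csv.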
Example¶
In our simutest experiment, one can inspect descriptor.csv:
time,f8
ou,f8
ou_int,f8
exp_ou_int,f8
cellID,u2
parentID,u2
In addition to the mandatory fields listed above, one can find the following cryptic names: ou, ou_int, exp_ou_int. These are explained in Numerical simulations in tunacell.
Metadata description¶
YAML format¶
Experiment metadata is stored in the metadata.yml file, which is parsed using the YAML syntax. The file can be separated into documents (documents are separated by ‘---’). Each document is organized as a list of parameters (parsed as a dictionary). There must be at least one document where the entry level is set to experiment (or, synonymously, top). It indicates the top-level experimental metadata (this can be the date of the experiment, the strain used, the medium, etc.). A minimal example would be:
level: experiment
period: 3
which indicates that the acquisition time period is 3 minutes. A more complete metadata file could be:
level: experiment
period: 3
strain: E. coli
medium: M9 Glucose
temperature: 37
author: John
date : 2018-01-20
When the experiment has been designed such that metadata is heterogeneous, i.e. some fields of view get a different set of parameters, and one later needs to distinguish these fields of view, insert as many new documents as there are different types of fields of view. For example, assume our experiment is designed to compare the growth of two strains, and that fields of view 01 and 02 get one strain while field of view 03 gets the other strain. One way to do it is:
level: experiment
period: 3
---
level:
- container_01
- container_02
strain: E. coli MG1655
---
level: container_03
strain: E. coli BW25113
A parameter given in a lower-level document overrides the same experiment-level parameter, which means that such a metadata file could be shortened:
level: experiment
period: 3
strain: E. coli MG1655
---
level: container_03
strain: E. coli BW25113
such that the strain is assumed to be E. coli MG1655 for all container files, unless indicated otherwise, which is the case here for container_03 that gets the BW25113 strain.
Tabular format (.csv)¶
Another option is to store metadata in a tabular file, such as comma-separated values. The header should contain at least level and period.
The first row after header is usually reserved for the experiment level metadata,
and following rows may be populated for different fields of view. For example
the csv file corresponding to our latter example reads:
level,period,strain
experiment,3,E. coli MG1655
container_03,,E.coli BW25113
Although more compact, it can be harder to read or fill from a text file.
Note
When a container is not listed, its metadata is read from the experiment metadata. Missing values for a container row are filled with experiment-level values.
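That fill-in behaviour can be sketched with the standard csv module (illustrative code, not tunacell’s implementation):

```python
import csv
import io

# the tabular metadata example from above
text = ("level,period,strain\n"
        "experiment,3,E. coli MG1655\n"
        "container_03,,E.coli BW25113\n")

rows = list(csv.DictReader(io.StringIO(text)))
top = rows[0]  # experiment-level row
# empty cells in a container row fall back to the experiment-level value
meta = {row["level"]: {k: (v if v else top[k]) for k, v in row.items()}
        for row in rows[1:]}
```

Here container_03 keeps its own strain but inherits the experiment-level period.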
Supersegger output¶
The SuperSegger output is stored in numerous subfolders under a main folder. The metadata file (see Metadata description) needs to be added as well under this main folder.
What to do next?¶
If you’d like to start analysing your dataset, your first task is to organize data in the presented structure. When it’s done, you can try to adapt the commands from the 10 minute tutorial to your dataset. When you want to get more control about your analysis, have a look at Setting up your analysis which presents you how to set up the analysis, in particular how to define the statistical ensemble and how to create subgroups for statistical analysis. Then you can refer to Plotting samples to customize your qualitative exploration of data, and then dive in Statistics of the dynamics to start the quantitative analysis.
Setting up your analysis¶
Once raw data files are organized following requirements in Input file format, analysis can get started. A first step is to follow the guidelines in 10 minute tutorial. Here we go into more detail about:
- how to parse your data,
- how to define the observable to look at, and
- how to define conditions.
Experiment and filters¶
To start the analysis, you need to tell tunacell which experiment to analyse, and whether to apply filters.
Loading an experiment¶
To set up the experiment, you have to give the path to the experiment folder on your computer. We will denote this path as <path-to-exp>, then use:
from tunacell import Experiment
exp = Experiment(<path-to-exp>)
By default, no filter is applied. But it is possible to associate a set of filters to an experiment, giving instructions on how data structures will be parsed.
Defining the statistical ensemble¶
The statistical ensemble is the set of cells, lineages, colonies, and containers that are parsed to compute statistics. In some cases, you may want to remove outliers, such as cells carrying anomalous values.
To do so, a FilterSet instance must be defined and associated with the Experiment object.
A detailed description of how to define filters and filter sets is in Filters. Here we give a simple, concrete example. Suppose you’d like to filter out cells that didn’t divide symmetrically. To do so, you first instantiate the FilterSymmetricDivision class:
from tunacell.filters.cells import FilterSymmetricDivision
myfilter = FilterSymmetricDivision(raw='length', lower_bound=0.4, upper_bound=0.6)
length is used as the raw quantifier (assuming you have a column length in your data files). This filter requires that the daughter cell’s length at birth be bound within 40 and 60 percent of the mother cell’s length at division. Then:
from tunacell import FilterSet
myfset = FilterSet(filtercells=myfilter)
In the last line, the keyword argument filtercells is specified since our filter myfilter acts on Cell instances. You can define one filter for each type of structure: Cell, Colony, Lineage, and Container.
Once a FilterSet instance is defined, load it with:
exp.set_filter(myfset)
Note
Filtering cell outliers may affect the tree structure, decomposing original tree in multiple subtrees where outlier node has been removed. Hence the number of trees generated from one container file depends on the filter applied to cells.
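The effect described in the note can be illustrated with a toy parent-link table (a sketch, not tunacell’s internals; the tree and outlier choice are made up):

```python
# toy tree: cell 1 divides into 2 and 3; cell 2 divides into 4 and 5
parent = {1: None, 2: 1, 3: 1, 4: 2, 5: 2}
outliers = {2}  # suppose cell 2 is filtered out

kept = [c for c in parent if c not in outliers]
# a kept cell roots a new subtree if its parent is absent or filtered out
roots = [c for c in kept if parent[c] is None or parent[c] in outliers]
```

Removing the single outlier cell 2 splits the original tree into three subtrees, rooted at cells 1, 4, and 5.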
Defining particular samples¶
All samples from an experiment are used for statistics, under the filtering assumptions discussed above. However, visualization of trajectories is performed over a subset of reasonable size: this is what we’ll call small samples.
Small samples can be chosen specifically by the user (“I am intrigued by this cell, let’s have a look at its trajectory”), or randomly. To do so:
from tunacell import Parser
parser = Parser(exp)
Note that a sample is identified by a couple of labels: the container label, and the cell identifier. For example:
parser.add_sample({'container_label': 'FOV_001', 'cellID': 12})
or synonymously:
parser.add_sample(('FOV_001', 12))
This information is stored under the samples attribute, and you can print the
registered samples with:
print(parser.info_samples())
You can also add randomly chosen samples:
parser.add_sample(10)
adds 10 such samples.
Please refer to Parser for more information about how to use it.
Iterating through samples¶
The exp object provides a set of iterators to parse data at each level,
with the appropriate filters applied:
- Container level, with the method iter_containers(), filtered at the container level;
- Colony level, with the method iter_colonies(), filtered at the container, cell, and colony levels;
- Lineage level, with the method iter_lineages(), filtered at the container, cell, colony, and lineage levels;
- Cell level, with the method iter_cells(), filtered at the container, cell, colony, and lineage levels.
The idea behind tunacell
is to decompose colonies into sets of lineages, i.e. into
sets of sequences of parentally linked cells. This way, it is possible
to extract time-series that span time ranges larger than single cell cycles.
Note
Decomposition into lineages is performed randomly: at each cell division, one daughter cell is chosen randomly as the next step in the lineage. This way, lineages are independent: a given cell belongs to one, and only one, lineage.
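This random decomposition can be sketched in plain Python (a hypothetical helper, not tunacell's actual code), with the colony given as a mapping from each cell to its daughters:

```python
import random

def decompose_lineages(children, roots, rng=None):
    """Randomly decompose trees into independent lineages.

    `children` maps a cell id to the list of its daughter cell ids;
    `roots` lists the root cells. At each division, one daughter is
    chosen at random to extend the current lineage; the other daughters
    seed new lineages. Each cell ends up in exactly one lineage.
    """
    rng = rng or random.Random()
    lineages = []
    stack = list(roots)  # cells that start a new lineage
    while stack:
        cell = stack.pop()
        lineage = [cell]
        while children.get(cell):
            daughters = list(children[cell])
            chosen = rng.choice(daughters)
            for d in daughters:
                if d != chosen:
                    stack.append(d)  # other daughters seed new lineages
            lineage.append(chosen)
            cell = chosen
        lineages.append(lineage)
    return lineages

# toy colony: cell 1 divides into 2 and 3; cell 2 divides into 4 and 5
tree = {1: [2, 3], 2: [4, 5]}
lineages = decompose_lineages(tree, roots=[1], rng=random.Random(0))
```

Whatever the random choices, each cell appears in exactly one lineage and each lineage ends at a leaf of the tree.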
Iterating over listed samples¶
Use above-mentioned methods on the Parser
instance.
See Parser for more details.
Iterating over all samples¶
Use above-mentioned methods on the Experiment
instance.
See Experiment for more details.
Defining the observable¶
To define an observable, i.e. a measurable quantity that evolves through time,
use the Observable
class:
from tunacell import Observable
and instantiate it with parameters to define a particular observable.
The first parameter is the name given to the observable (used to refer to it later in the analysis process).
The second, mandatory parameter is the column to use as raw data (e.g. ‘length’, ‘size’, ‘fluo’, …).
Then, it is possible to use time-lapse data (as stored in data files, or processed through a time-derivative estimate), or to determine the value of the raw observable at a particular cell-cycle stage, for example length at birth.
Indicating raw data¶
First, one needs to indicate which column of the raw data file to use, by
specifying raw='<column-name>'.
When the raw data is expected to be steady, or to be a linear function of time
within the cell cycle, use scale='linear' (the default setting). When it is
expected to be an exponential function of time within the cell cycle, use
scale='log'. We mention below how this parameter affects some procedures.
Raw data can be used as is, or further processed to provide a user-defined observable. Two main modes are used to process raw data:
- The dynamics mode is used to analyze observables at all time points; examples are: length, growth rate, …
- The cell-cycle mode indicates observables that are defined as a single value per cell cycle; examples are: length at birth, average growth rate, …
Dynamic mode¶
It corresponds to the parameter mode='dynamics'.
It automatically sets the timing parameter to timing='t', where t stands for
time-lapse timing. It is meant to study observables at all time points
(time-lapse, dynamic analysis).
Cell-cycle modes¶
Cell-cycle modes are used when an observable needs to be quantified at the cell-cycle level, i.e. once per cell cycle. There are a few cell-cycle modes:
- mode='birth': extrapolates values to estimate the observable at cell birth;
- mode='division': extrapolates values to estimate the observable at cell division;
- mode='net-increase-additive': returns the difference between division and birth values of the observable;
- mode='net-increase-multiplicative': returns the ratio between division and birth values of the observable;
- mode='average': returns the average value of the observable along the cell cycle;
- mode='rate': performs a linear/exponential fit of the observable depending on the chosen scale parameter. In fact, the procedure always performs linear fits; when scale='log' the log of the raw data is used, thereby performing an exponential fit on the raw data.
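As an illustration of what the 'rate' mode computes, here is a plain-numpy sketch (not tunacell's internal code): a linear fit of the scaled observable against time within the cell cycle, whose slope is the rate.

```python
import numpy as np

def cell_cycle_rate(times, values, scale='log'):
    """Slope of a linear fit of the (scaled) observable over one cell
    cycle; with scale='log' this amounts to an exponential fit on raw data."""
    y = np.log(values) if scale == 'log' else np.asarray(values, dtype=float)
    slope, _intercept = np.polyfit(times, y, 1)
    return slope

# synthetic cell growing exponentially at rate 0.02 per minute
t = np.arange(0., 40., 5.)
length = 2.0 * np.exp(0.02 * t)
rate = cell_cycle_rate(t, length, scale='log')
```

On this noiseless synthetic cell the fitted rate recovers the 0.02 growth rate used to generate the data.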
Choosing the timing¶
For the dynamic mode, the only associated timing is t (which stands for
“time-lapse”). The parameter tref may be used to align time points. When
provided as a number, it is subtracted from the acquisition time. The string
code 'root' can also be given: it aligns data with the colony root cell's
division time (caution: when filtering happens, some cells acquired in the
middle of your experiment can become root cells if their parent cell is an
outlier; this may dangerously affect the alignment of your time series).
For cell-cycle modes, the estimated observable is associated with a time point to be chosen between:
- b: time at birth, when known;
- d: time at division, when known;
- m: time at mid-point through the cell cycle;
- g: generation index, which can be used in conjunction with the parameter tref. When the latter is set to a floating-point number, the generation index is offset to the generation index of the cell's ancestor that lived at this time of reference, if it exists; otherwise, data from this lineage is discarded from the analysis. When tref=None, the generation index is relative to the colony to which the current cell belongs.
End-point values are estimated by extrapolation. This is because cell divisions
are recorded halfway between the parent cell's last frame and the daughter
cell's first frame. The extrapolation uses local fits over join_points points.
Warning
The generation index must be used with care in statistical estimates over the dynamics of dividing cells, since generation 0 of a given colony does not necessarily correspond to generation 0 of another colony.
Differentiation¶
In dynamics mode, differentiation is obtained either by finite differences over two consecutive points (the default), or by a sliding-window fit. For an observable \(x(t)\), depending on the chosen scale, linear or log, it returns an estimate of \(\frac{dx}{dt}\) or \(\frac{d}{dt} \log x(t)\) respectively.
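A minimal numpy sketch of the finite-difference estimate (illustrative only, not tunacell's code): depending on scale, consecutive differences of \(x\) or of \(\log x\) are divided by the time step.

```python
import numpy as np

def finite_diff_rate(times, values, scale='linear'):
    """Two-point finite-difference derivative of the (scaled) observable,
    one estimate per pair of consecutive acquisition times."""
    y = np.log(values) if scale == 'log' else np.asarray(values, dtype=float)
    return np.diff(y) / np.diff(times)

t = np.arange(0., 30., 5.)
x = 1.5 * np.exp(0.03 * t)                 # exponential growth, rate 0.03
growth_rate = finite_diff_rate(t, x, scale='log')
```

With scale='log' on exponential data, every finite-difference estimate recovers the growth rate 0.03; with noisy data these estimates become very jagged, which motivates the local-fit option described below.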
Local fit estimates¶
As finite-difference estimates of derivatives are very sensitive to measurement precision, the user can opt for a local fitting procedure.
This procedure can be applied to estimate derivatives, or values of the
observable, by performing a local linear fit of the scaled observable over
a given time window. To use this option, the user needs to provide the
time-window extent: e.g. time_window=15 proceeds to a local fit over
a time window of 15 units of time (usually minutes).
A local fit procedure restricted to cell-cycle time segments would lose
exploitable time points, up to the time-window extent, for each cell.
To deal with that, the procedure provides a way to use daughter-cell
information to “fill in” estimates towards the end of the cell cycle.
The parameter join_points=3 indicates that end-point values are
estimated using the first 3, or last 3, frames.
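The windowed fit can be sketched as follows (plain numpy, illustrative; tunacell's actual implementation additionally joins daughter-cell points as described above): for each time point, fit a line to all points within the window and keep the slope.

```python
import numpy as np

def local_fit_derivative(times, values, time_window, scale='linear'):
    """Derivative estimated by a linear fit of the (scaled) observable
    over a sliding time window centred on each time point."""
    t = np.asarray(times, dtype=float)
    y = np.log(values) if scale == 'log' else np.asarray(values, dtype=float)
    half = time_window / 2.
    out = np.full(t.size, np.nan)
    for i, ti in enumerate(t):
        mask = np.abs(t - ti) <= half
        if mask.sum() >= 2:            # need at least two points to fit
            slope, _ = np.polyfit(t[mask], y[mask], 1)
            out[i] = slope
    return out

t = np.arange(0., 60., 4.)             # 4-minute acquisition period
fluo = 100. * np.exp(0.01 * t)
k = local_fit_derivative(t, fluo, time_window=12., scale='log')
```

On this noiseless exponential, every windowed estimate equals the underlying rate 0.01; on real data the window averages out measurement noise, at the cost of the artificial correlations mentioned in the warning below.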
Warning
Using the local fitting procedure is likely to artificially correlate time points over the time-window range. This option can help with data visualization since it smooths measurement errors, but extreme caution is advised when this feature is used in statistical analysis.
Examples¶
Let’s assume that raw data column names include 'length'
and 'fluo'
.
Example 1: length vs. time¶
This is the trivial example. We stay in dynamic mode, and we do not associate any further processing to collected data:
>>> length = Observable(name='length', raw='length')
Example 2: length at birth¶
We go to the corresponding cell-cycle mode with the appropriate timing:
>>> length_at_birth = Observable(name='birth-length', raw='length', mode='birth', timing='b')
Note
one could associate the observable length at birth with another timing, e.g. time at mid cell cycle.
Example 3: Fluorescence production rate (finite differences)¶
>>> k = Observable(name='prod-rate', raw='fluo', differentiate=True)
Example 4: Fluorescence production rate (local fit)¶
We found that the latter led to really noisy time series, so we choose to produce local estimates over 3 points; in an experiment where the acquisition period is 4 minutes, this amounts to a 12-minute time window:
>>> kw = Observable(name='window-prod-rate', raw='fluo', differentiate=True, scale='linear',
local_fit=True, time_window=12.)
It computes \(\frac{d(\mathrm{fluo})}{dt}\) using 12-minute time windows.
Example 5: Fluorescence production rate (using local fits) at birth¶
And we want to have it as a function of generation index, setting 0 for cells that live at time 300 minutes:
>>> kw = Observable(name='window-prod-rate-at-birth', raw='fluo', differentiate=True, scale='linear',
                    local_fit=True, time_window=12.,
                    mode='birth', timing='g', tref=300.)
Conditional analysis¶
We saw in Defining the statistical ensemble that one can define filters that act on
cells, or colonies, and to group them in a FilterSet
instance that
essentially sets the statistical ensemble over which analysis is performed.
There is another utility of these FilterSet
objects: they may define
sub-ensembles over which analysis is performed in order to compare results
over chosen sub-populations. One example is to “gate” cell-cycle quantifiers
and observe the statistics of the different sub-populations. Here we extend
the gating procedure to analyse any dynamic observable.
To do so, a list of FilterSet instances, one per condition, can be
provided to our analysis functions. We refer to the following user pages for
further reading on how to use filters (Filters) and how to run
statistical analysis (Statistics of the dynamics).
Plotting samples¶
To gain qualitative intuition about a dataset, it is common to visualize
trajectories of a few samples. tunacell provides a
matplotlib-based framework to visualize timeseries as well as the underlying
colony/lineage structures arising from dividing cells.
Note
In order for the colour code to work properly, matplotlib version 2 or later is required.
In this document we will describe how to use the set of tools defined in
tunacell.plotting.samples
.
We already saw in the 10 minute tutorial a simple plot of length vs. time in a colony from our numerical simulations. Here we will review the basics of plotting small samples in few test cases.
Note
If you cloned the tunacell repository, there are two quick ways of running
the following tutorial.
You may run the script plotting-samples.py
with the following command:
python plotting-samples.py -i --seed 951
The seed is used to select identical samples as the one printed below.
Alternatively it can be run from the root folder using the Makefile:
make plotting-demo
If you executed one of the commands above, there is no need to run the commands below: follow the command-line explanations and cross-reference them with the commands shown here to understand how it works. Otherwise, you can run the commands below sequentially.
Setting up samples and observables¶
For plotting demonstration, we will create a numerically simulated experiment, where the dynamics is sampled on a time interval short enough for the colonies to be of reasonable size. Call from a terminal:
tunasimu -l simushort --stop 120 --seed 167389
In a Python script/shell, we load data with the usual:
from tunacell import Experiment, Parser, Observable, FilterSet
from tunacell.filters.cells import FilterCellIDparity
from tunacell.plotting.samples import SamplePlot
exp = Experiment('~/tmptunacell/simushort')
parser = Parser(exp)
import numpy as np
np.random.seed(seed=951)  # set the seed to reproduce the samples/plots below
parser.add_sample(10)
# define a condition
even = FilterCellIDparity('even')
condition = FilterSet(filtercell=even)
# define observable
length = Observable(name='length', raw='exp_ou_int')
ou = Observable(name='growth-rate', raw='ou')
We have defined two observables and one condition used as a toy example.
With these preliminary lines, we are ready to plot timeseries. The main object
to call is SamplePlot
, which accepts the following parameters:
- samples, an iterable over Colony or Lineage instances,
- the Parser instance used to parse data,
- the list of conditions (optional).
We already saw how to define instances of the class Observable
.
Samples can be specifically chosen, or random samples from the experiment. We
review below the different cases with concrete examples from our settings.
We have 10 samples in our parser, chosen randomly.
Remember that samples can also be specified explicitly with the container and
cell identifiers. Once stored in the parser object, they can be addressed by
their index in the table; to check the table of samples, call:
print(parser)
If you used the default settings, you should observe:
index container cell
------- ------------- ------
0 container_015 3
1 container_087 14
2 container_002 6
3 container_012 12
4 container_096 15
5 container_040 8
6 container_088 14
7 container_007 1
8 container_042 2
9 container_013 5
How to plot a colony sample¶
We start from the basic example initiated in the 10 minute tutorial:
colony = parser.get_colony(0) # any index between 0 and 9 would do
and we call our plotting environment:
colplt = SamplePlot([colony, ], parser=parser, conditions=[condition, ])
The first argument is the sample (or iterable of samples) to be plotted; the
parser and the list of conditions are passed as keyword arguments. Conditions
must be given as a list of FilterSet instances (the list can be left empty).
Using default settings¶
We start with the default settings and will inspect the role of each parameter:
colplt.make_plot(length)
The figure is stored as the fig attribute of colplt:
colplt.fig.show() # in non-interactive mode, colplt.fig in interactive mode
This kind of plot should be produced:
Timeseries of length vs time for one colony, default settings.
The default settings for a colony plot display:
- one lineage per row (from the keyword parameter superimpose='none'),
- cell identifiers on top of each cell (report_cids=True),
- container and colony root identifiers when they change,
- vertical lines to follow divisions (report_divisions=True).
Data points are represented by plain markers (show_markers=True),
with underlying transparent connecting lines as a visual aid
(show_lines=True).
The title of the plot is made from the Observable.as_latex_string() method.
Visualization of a given condition¶
The first feature we explore is to visualize whether samples verify a given
condition. To do so, use the report_condition keyword parameter:
colplt.make_plot(length, report_condition=repr(condition))
Conditions are labeled according to their representation, which is why we used
the repr() call.
Now the fig attribute should store the following result:
Timeseries of length vs time for one colony. Plain markers are used for samples that verify the condition (cell identifier is even), empty markers point to samples that do not verify the condition.
Colouring options¶
Colour can be changed for distinct cells, lineages, colonies, or containers (given in order of priority), or not changed at all.
Changing cell colour¶
colplt.make_plot(length, report_condition=repr(condition), change_cell_color=True)
Colour is changed for each cell, assigned according to the generation index of the cell in the colony. This makes it possible to investigate how generations desynchronize over time.
Changing lineage colour¶
colplt.make_plot(length, report_condition=repr(condition), change_lineage_color=True)
Colour is changed for each lineage, i.e. each row in this colony plot.
Superimposition options¶
The default setting is not to superimpose lineages. It is possible to change
this behaviour through the superimpose keyword parameter. Some
keywords are reserved:
- 'none': do not superimpose timeseries;
- 'all': superimpose all timeseries into a single-row plot;
- 'colony': superimpose all timeseries from the same colony, thereby making as many rows as there are different colonies in the list of samples;
- 'container': idem at the container level;
and when an integer is given, each row will be filled with at most that number of lineages.
For example, if we superimpose at most 3 lineages:
colplt.make_plot(length, report_condition=repr(condition), change_lineage_color=True,
superimpose=3)
Superimposition of at most 3 lineages with superimpose=3. Once superimpose differs from 'none' (or 1), the vertical lines showing cell divisions and the cell identifiers are not shown (the options report_cids and report_divisions are overridden to False).
Plotting few colonies¶
So far our sample was a unique colony. It is possible to plot multiple colonies in the same plot, given as an iterable over colonies:
splt = SamplePlot(parser.iter_colonies(mode='samples', size=2),
parser=parser, conditions=[condition, ])
splt.make_plot(length, report_condition=repr(condition), change_colony_color=True)
Here we iterated over colonies from the samples defined in parser.samples.
First two colonies from parser.samples, with the changing colony colour option.
Now we will switch to the other observable, ou
, which is the instantaneous
growth rate:
splt.make_plot(ou, report_condition=repr(condition), change_colony_color=True,
               superimpose=2)
Same samples as above, but we changed the observable to growth rate.
We can also iterate over unselected samples: iteration goes through container files:
splt = SamplePlot(parser.iter_colonies(size=5), parser=parser,
conditions=[condition, ])
splt.make_plot(ou, report_condition=repr(condition), change_colony_color=True,
superimpose=2)
Two lineages are superimposed on each row. Colour is changed for each new colony.
To get an idea of the divergence of growth rate, it is better to plot all timeseries in a single row plot. We mask markers and set the transparency to distinguish better individual timeseries:
splt.make_plot(ou, change_colony_color=True, superimpose='all', show_markers=False,
alpha=.6)
Lineages from the 5 colonies superimposed on a single row plot.
Plotting few lineages¶
Instead of a colony, or an iterable over colonies, one can use a lineage or an iterable over lineages as argument of the plotting environment:
splt = SamplePlot(parser.iter_lineages(size=10), parser=parser,
conditions=[condition, ])
splt.make_plot(ou, report_condition=repr(condition), change_lineage_color=True,
superimpose='all', alpha=.6)
10 lineages from an iterator on a single row plot.
Adding reference values¶
One can add reference values for the mean and the variance: the mean is plotted as a horizontal line, together with lines at +/- one standard deviation.
From the numerical simulation metadata, it is possible to compute the mean value and the variance of the process:
md = parser.experiment.metadata
# ou expectation values
ref_mean = float(md.target)
ref_var = float(md.noise)/(2 * float(md.spring))
and then to plot it to check how our timeseries compare to these theoretical values:
splt.make_plot(ou, report_condition=repr(condition), change_lineage_color=True,
superimpose='all', alpha=.5, show_markers=False,
ref_mean=ref_mean, ref_var=ref_var)
Timeseries from lineages are reported together with theoretical mean value (dash-dotted horizontal line) +/- one standard deviation (dotted lines).
Adding information from computed statistics¶
We will review the computation of the statistics in the next document; here we
assume it has been performed for our observable ou.
The data_statistics option is used to display the results of the statistics
computation, which is useful when no theoretical values exist (most of the time):
splt.make_plot(ou, report_condition=repr(condition), change_lineage_color=True,
superimpose='all', alpha=.5, show_markers=False,
data_statistics=True)
Data statistics have been added: grey line shows the estimated mean value and shadows show +/- one estimated standard deviation. Note that these values have been estimated over the entire statistical ensemble, not just the plotted timeseries.
Statistics of the dynamics¶
Once qualitative intuition has been gained by plotting time series from a few
samples (see Plotting samples) one can inspect quantitatively the
dynamics of a dataset by using tunacell
’s pre-defined tools.
Univariate and bivariate analysis tools are coded in tunacell to
describe the statistics of a single observable, or of a couple of observables,
respectively.
We start with a background introduction to those concepts. Then we set the
session, and show tunacell
’s procedures.
Note
This manual page is rather tedious. For a more practical approach, open, read,
and run the following scripts: univariate-analysis.py,
univariate-analysis-2.py, and bivariate-analysis.py. The background
information and figures on this page may be useful as side help.
Background¶
We consider a stochastic process \(x(t)\).
One-point functions¶
One-point functions are statistical estimates of functions of a single time point. A typical one-point function is the average at a given time

\(\langle x(t) \rangle\)

where the notation \(\langle \cdot \rangle\) means taking the ensemble average of the quantity; or the variance

\(\sigma^2(t) = \langle \left( x(t) - \langle x(t) \rangle \right)^2 \rangle\)

Inspecting these functions provides a first quantitative approach to the studied process.
In our case, time series are indexed to acquisition times \(t_i\), where \(i=0,\ 1,\ 2,\ \dots\). Usually

\(t_i = t_{\mathrm{offset}} + i \times \Delta t\)

where \(\Delta t\) is the time interval between two frame acquisitions, and \(t_{\mathrm{offset}}\) is an offset that sets the origin of times.
Then, denoting by \(s^{(1)}_i\) the number of cells acquired at time index
\(i\), the average value of observable \(x\) at that time is evaluated in
tunacell as:

\(\langle x(t_i) \rangle = \frac{1}{s^{(1)}_i} \sum_{k=1}^{s^{(1)}_i} x^{(k)}(t_i)\)

where \(x^{(k)}(t)\) is the value of observable \(x\) in cell \(k\) at time \(t\).
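As a numpy sketch of these estimators (illustrative only, with cells as rows on a common time grid and NaN marking times where a cell is not observed):

```python
import numpy as np

# x[k, i]: observable in cell k at time t_i; NaN when cell k is absent
x = np.array([[1.0, 1.2, 1.4],
              [0.8, 1.1, np.nan],
              [np.nan, 0.9, 1.3]])

s1 = (~np.isnan(x)).sum(axis=0)    # sample counts s^(1)_i
mean = np.nanmean(x, axis=0)       # ensemble averages <x(t_i)>
var = np.nanvar(x, axis=0)         # variances sigma^2(t_i)
```

Here `s1` is `[2, 3, 2]`: the estimators only average over the cells actually observed at each time point.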
Two-point functions¶
Two-point functions are statistical estimates of functions of two time points. The typical two-point function is the auto-correlation function, defined as:

\(a(t_i, t_j) = \langle \left( x(t_i) - \langle x(t_i) \rangle \right) \left( x(t_j) - \langle x(t_j) \rangle \right) \rangle\)

In tunacell it is estimated using:

\(a_{ij} = \frac{1}{s^{(2)}_{ij}} \sum_{k=1}^{s^{(2)}_{ij}} \left( x^{(k)}(t_i) - \langle x(t_i) \rangle \right) \left( x^{(k)}(t_j) - \langle x(t_j) \rangle \right)\)

where the sum over \(k\) runs over lineages connecting times \(t_i\) to \(t_j\) (there are \(s^{(2)}_{ij}\) such lineages).
With our method, for a cell living at time \(t_i\), there will be at most one associated descendant cell at time \(t_j\). There may be more descendants living at \(t_j\), but only one is picked at random according to our lineage decomposition procedure.
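Keeping the same convention as before (one row per independent lineage, NaN where the lineage has no cell at that time), the counts \(s^{(2)}_{ij}\) and coefficients \(a_{ij}\) can be sketched as:

```python
import numpy as np

# x[k, i]: observable along lineage k at time t_i; NaN = no cell there
x = np.array([[1.0, 1.2, 1.4],
              [0.8, 1.1, 1.5],
              [0.9, np.nan, 1.2]])

mean = np.nanmean(x, axis=0)             # one-point averages <x(t_i)>
dev = x - mean                           # deviations from the mean
valid = ~np.isnan(x)

n = x.shape[1]
s2 = np.zeros((n, n), dtype=int)         # s^(2)_ij lineage counts
acov = np.full((n, n), np.nan)           # a_ij auto-covariance coefficients
for i in range(n):
    for j in range(n):
        both = valid[:, i] & valid[:, j]  # lineages connecting t_i to t_j
        s2[i, j] = both.sum()
        if s2[i, j] > 0:
            acov[i, j] = np.mean(dev[both, i] * dev[both, j])
```

The diagonal of `acov` equals the per-time variance, as stated below for identical times.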
For identical times, the auto-correlation coefficient reduces to the variance:

\(a(t_i, t_i) = \sigma^2(t_i)\)

Under the stationarity hypothesis, the auto-correlation function depends only on time differences, such that:

\(a(t_i, t_j) = \tilde{a}(t_i - t_j)\)

(and the function \(\tilde{a}\) is symmetric: \(\tilde{a}(-u)=\tilde{a}(u)\)).
In tunacell, there is a special procedure to estimate \(\tilde{a}\): use the
lineage decomposition to generate time series, then for a given time interval
\(u\), collect all couples of points \((t_i, t_j)\) such that
\(u = t_i - t_j\), and perform the average over all these samples.
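The pooling step can be sketched as follows (illustrative, using a toy stationary auto-covariance matrix on a regular time grid):

```python
import numpy as np

times = np.array([0., 5., 10., 15.])
# toy stationary auto-covariance: a(t_i, t_j) = exp(-|t_i - t_j| / 10)
acov = np.exp(-np.abs(times[:, None] - times[None, :]) / 10.)

# pool all (t_i, t_j) couples by the time difference u = t_j - t_i
diffs = times[None, :] - times[:, None]
u_values = np.unique(diffs[diffs >= 0])
a_tilde = np.array([acov[np.isclose(diffs, u)].mean() for u in u_values])
```

Each entry of `a_tilde` averages every matrix element with the same time difference, which is exactly how pooling increases the sample size compared to a single time of reference.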
We can extend such a computation to two observables \(x(t)\) and \(y(t)\). The relevant quantity is the cross-correlation function

\(c(t_i, t_j) = \langle \left( x(t_i) - \langle x(t_i) \rangle \right) \left( y(t_j) - \langle y(t_j) \rangle \right) \rangle\)

which we estimate through cross-correlation coefficients

\(c_{ij} = \frac{1}{s^{(2)}_{ij}} \sum_{k=1}^{s^{(2)}_{ij}} \left( x^{(k)}(t_i) - \langle x(t_i) \rangle \right) \left( y^{(k)}(t_j) - \langle y(t_j) \rangle \right)\)

Again under the stationarity hypothesis, the cross-correlation function depends only on time differences:

\(c(t_i, t_j) = \tilde{c}(t_i - t_j)\)

though now the function \(\tilde{c}\) may not be symmetric.
Note
At this stage of development, extra care has not been taken to ensure ideal properties for our statistical estimates such as unbiasedness. Hence caution should be taken for the interpretation of such estimates.
Warming-up [1]¶
We start with:
from tunacell import Experiment, Observable, FilterSet
from tunacell.filters.cells import FilterCellIDparity
exp = Experiment('~/tmptunacell/simutest')
# define a condition
even = FilterCellIDparity('even')
condition = FilterSet(label='evenID', filtercell=even)
Note
The condition used in this example serves only as a test; we do not expect the subgroup of cells with even identifiers to differ from the whole population, but since it roughly halves the sample size we can appreciate finite-size effects.
In this example, we look at the following dynamic observables:
ou = Observable(name='exact-growth-rate', raw='ou')
The ou (Ornstein-Uhlenbeck) observable models the instantaneous
growth rate. As this is a numerical simulation, we have some knowledge of the
statistics of the process. We import some of them from the metadata:
md = exp.metadata
params = md['ornstein_uhlenbeck_params']
ref_mean = params['target']
ref_var = params['noise']/(2 * params['spring'])
ref_decayrate = params['spring']
Starting with the univariate analysis¶
To investigate the statistics of a single observable over time, tunacell uses
the lineage decomposition to parse samples and computes incrementally the one-
and two-point functions.
Estimated one-point functions are the number of samples and the average value at each time-point. Estimated two-point functions are the correlation matrix between any couple of time-points, which reduces to the variance for identical times.
The module tunacell.stats.api stores most of the functions to be used.
To perform the computations, we import
tunacell.stats.api.compute_univariate() and call it:
from tunacell.stats.api import compute_univariate
univ = compute_univariate(exp, ou, cset=[condition, ])
This function computes the one- and two-point functions described above
and stores the results in univ, a
tunacell.stats.single.Univariate instance. Results are reported for
the unconditioned data, under the master label, and for each of the
conditions provided in the cset list. Each individual group is an
instance of tunacell.stats.single.UnivariateConditioned, whose
attributes point directly to the estimated one- and two-point functions.
These items can be accessed as values of a dictionary:
result = univ['master']
result_conditioned = univ[repr(condition)]
As the master is always defined, one can alternatively use the attribute:
result = univ.master
Inspecting univariate results¶
The objects result and result_conditioned are instances of the
UnivariateConditioned class, where one can find the following
attributes: time, count_one, average, count_two, and autocorr; these are
Numpy arrays.
To be explicit, the time array is the array of each \(t_i\) at which
observables have been evaluated.
The count_one array stores the corresponding number of samples
\(s^{(1)}_i\) (see Background), and the average array
stores the \(\langle x(t_i) \rangle\) average values.
One can see an excerpt of the table of one-point functions by typing:
result.display_onepoint(10) # 10 lines excerpt
which should be like:
time counts average std_dev
0 0.0 200 0.011725 0.001101
1 5.0 207 0.011770 0.001175
2 10.0 225 0.011780 0.001201
3 15.0 253 0.011766 0.001115
4 20.0 265 0.011694 0.001119
5 25.0 286 0.011635 0.001149
6 30.0 301 0.011627 0.001147
7 35.0 318 0.011592 0.001173
8 40.0 337 0.011564 0.001189
9 45.0 354 0.011578 0.001150
The count_two 2d array stores the matrix elements \(s^{(2)}_{ij}\),
i.e. the number of independent lineages connecting time
\(t_i\) to \(t_j\), and the autocorr attribute stores the
matrix elements \(a_{ij}\) (auto-covariance coefficients).
The std_dev column of the table above is in fact computed as the
square root of the diagonal of this auto-covariance matrix (the diagonal
is the variance at each time point).
An excerpt of the auto-covariance function can be printed:
result.display_twopoint(10)
which should produce something like:
time-row time-col counts autocovariance
0 0.0 0.0 200 1.211721e-06
1 0.0 5.0 200 1.093628e-06
2 0.0 10.0 200 7.116838e-07
3 0.0 15.0 200 3.415255e-07
4 0.0 20.0 200 6.881773e-07
5 0.0 25.0 200 1.027559e-06
6 0.0 30.0 200 1.053278e-06
7 0.0 35.0 200 5.925049e-07
8 0.0 40.0 200 -7.884958e-08
9 0.0 45.0 200 -8.413113e-08
Examples¶
To fix ideas, if we want to plot the sample average as a function of time for the whole statistical ensemble, here is how to do it:
import matplotlib.pyplot as plt
plt.plot(univ.master.time, univ.master.average)
plt.show()
If one wants to plot the variance as a function of time for the condition
results:
import numpy as np
res = univ[repr(condition)]
plt.plot(res.time, np.diag(res.autocorr))
To obtain a representation of the auto-correlation function, we set a time of reference and find the closest index in the time array:
tref = 80.
iref = np.argmin(np.abs(res.time - tref))  # index of closest entry in time array
plt.plot(res.time, res.autocorr[iref, :])
Such a plot represents the autocorrelation \(a(t_{\mathrm{ref}}, t)\) as a function of \(t\).
We will see below some pre-defined plotting capabilities.
Computations can be exported as text files¶
To save the computations, just type:
univ.export_text()
This convenient function exports computations as text files, under a folder structure that stores the context of the computation such as the filter set, the various conditions that have been applied, and the different observables over which computation has been performed:
simutest/analysis/filterset/observable/condition
The advantage of such export is that it is possible to re-load parameters from an analysis in a different session.
Plotting results¶
tunacell
comes with the following plotting functions:
from tunacell.plotting.dynamics import plot_onepoint, plot_twopoints
These work with tunacell.stats.single.Univariate instances such
as our results stored in univ:
fig = plot_onepoint(univ, mean_ref=ref_mean, var_ref=ref_var, show_ci=True, save=True)
One-point plots are saved in the simutest/analysis/filterset/observable
folder, since all conditions are represented.
The figure, stored in fig, looks like:
Plot of one-point functions computed by tunacell. The first row shows the
sample counts vs. time, \(s^{(1)}_i\) vs. \(t_i\). The middle row
shows the sample average \(\langle x(t_i) \rangle\) vs. time.
Shadowed regions show the 99% confidence interval, computed in the
large-sample-size limit with the empirical standard deviation.
The bottom row shows the variance \(\sigma^2(t_i)\).
The blue line shows results for the whole statistical ensemble, whereas the
orange line shows results for the conditioned sub-population (cells with
even identifier).
We can represent two-point functions:
fig2 = plot_twopoints(univ, condition_label='master', trefs=[40., 80., 150.],
                      show_exp_decay=ref_decayrate)
The second figure, stored in fig2
, looks like so:
Plot of two-point functions. Three times of reference are chosen to display
the associated functions. The top row shows the sample counts, i.e. the
number of independent lineages used in the computation that connect
\(t_{\mathrm{ref}}\) to \(t\). The middle row shows the associated
auto-correlation functions
\(a(t_{\mathrm{ref}}, t)/\sigma^2(t_{\mathrm{ref}})\).
The bottom row shows the translated functions
\(a(t_{\mathrm{ref}}, t-t_{\mathrm{ref}})/\sigma^2(t_{\mathrm{ref}})\).
One can guess that they peak at \(t-t_{\mathrm{ref}} \approx 0\),
though the decay on both sides is quite irregular compared to the expected
behaviour, due to the low sample size.
The view proposed on auto-correlation functions for specific times of reference is not enough to quantify the decay and associate a correlation time. A clever trick to gain statistics is to pool all data where the process is stationary and numerically evaluate \(\tilde{a}\).
Computing the auto-correlation function under stationarity¶
By inspecting the average and variance in the one-point function figure above, the user can estimate whether the process is stationary and where (over the whole time course, or just over a subset of it). The user is prompted to define regions where the studied process is (or might be) stationary. These regions are saved automatically:
# %% define region(s) for steady-state analysis
from tunacell.stats.api import Regions

regs = Regions(exp)
# this call reads previously defined regions; show them with
print(regs)
# then use one of the defined regions
region = regs.get('ALL')  # we take the entire time course
Computation options need to be provided. They dictate how the mean value is subtracted: either the global mean over all time points within the region, or the local, time-dependent average value; and how segments are sampled: disjointly or not. The default settings use the global mean value and disjoint segments:
# define computation options
options = CompuParams()
To compute the stationary auto-correlation function \(\tilde{a}\) use:
from tunacell.stats.api import compute_stationary
stat = compute_stationary(univ, region, options)
The first argument is the Univariate
instance univ
, the second
argument is the time region over which samples are accepted, and the third is the set of
computation options.
Here our process is stationary by construction over the whole time period of the simulation, so we choose the ‘ALL’ region. Our options are to subtract the global average value of the process, and to accept only disjoint segments for a given time interval: this ensures that samples used for a given time interval are independent (as long as the process is Markovian), so that we can estimate the confidence interval from the standard deviation of all samples for a given time interval.
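The estimator just described can be sketched in plain NumPy (a toy illustration with a synthetic AR(1) process standing in for lineage data, not tunacell's implementation): subtract the global mean, then for each time interval pool products over disjoint segments drawn from every lineage.

```python
import numpy as np

rng = np.random.default_rng(0)

# toy stationary data: an AR(1) process standing in for lineage time series
n_lineages, n_times, phi = 200, 50, 0.9
x = np.empty((n_lineages, n_times))
x[:, 0] = rng.normal(size=n_lineages)
for t in range(1, n_times):
    x[:, t] = phi * x[:, t - 1] + np.sqrt(1 - phi**2) * rng.normal(size=n_lineages)

mu = x.mean()  # global mean over the whole region ('ALL')

def stationary_autocorr(x, mu, lag):
    """Estimate the stationary autocorrelation at `lag` by pooling
    disjoint segments of length `lag` from every lineage."""
    if lag == 0:
        samples = (x - mu).ravel() ** 2
    else:
        starts = np.arange(0, x.shape[1] - lag, lag)  # disjoint: step == lag
        samples = ((x[:, starts] - mu) * (x[:, starts + lag] - mu)).ravel()
    return samples.mean(), samples.size

variance, _ = stationary_autocorr(x, mu, 0)
a5, n5 = stationary_autocorr(x, mu, 5)   # expected near phi**5 * variance
```

With many more samples pooled per time interval than any single \((t_{\mathrm{ref}}, t)\) pair provides, the estimate is much smoother than the two-point functions above.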
stat
is an instance of tuna.stats.single.StationaryUnivariate
which is structured in the same way with respect to master
and conditions.
Each of its items (e.g. stat.master
, or stat[repr(condition)]
) is
an instance of tuna.stats.single.StationaryUnivariateConditioned
and stores information in the following attributes:
time
: the 1d array storing time interval values,counts
: the 1d array storing the corresponding sample counts,autocorr
: the 1d array storing the value of the auto-correlation function \(\tilde{a}\) for corresponding time intervals.dataframe
: aPandas.dataframe
instance that collects the data points used in the computation; each row corresponds to a single data point (in a single cell), with information on the acquisition time, the cell identifier, the value of the observable, and as many boolean columns as there are conditions, plus the master (no condition), indicating whether a sample has been taken or not. This dataframe is convenient for drawing e.g. marginal distributions.
Plotting results¶
tunacell
provides a plotting function that returns a Figure
instance:
from tunacell.plotting.dynamics import plot_stationary
fig = plot_stationary(stat, show_exp_decay=ref_decayrate, save=True)
The first argument must be a tuna.stats.single.StationaryUnivariate
instance. The second parameter displays an exponential decay (to compare with
data).

Plot of stationary autocorrelation function. Top row is the number of samples, i.e. the number of (disjoint) segments of size \(\Delta t\) found in the decomposed lineage time series. Middle row is the auto-correlation function \(\tilde{a}(\Delta t)/\sigma^2(0)\). Confidence intervals are computed independently for each time interval, in the large sample size limit.
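The confidence intervals mentioned in the caption can be sketched as follows (a minimal illustration of the large-sample-limit formula, assuming roughly independent samples per time interval):

```python
import numpy as np

def mean_with_ci(samples, z=1.96):
    """Sample mean and ~95% confidence half-width (large-sample limit)."""
    samples = np.asarray(samples, dtype=float)
    mean = samples.mean()
    sem = samples.std(ddof=1) / np.sqrt(samples.size)  # standard error
    return mean, z * sem

# toy pooled samples for one time interval
rng = np.random.default_rng(1)
samples = rng.normal(loc=0.5, scale=0.2, size=400)
mean, half_width = mean_with_ci(samples)
```

Each time interval gets its own interval, computed only from the samples pooled for that interval.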
Exporting results as text files¶
Again it is possible to export results as text files under the same folder structure by typing:
stat.export_text()
This will create a tab-separated text file called
stationary_<region.name>.tsv
that can be read with any spreadsheet reader.
In addition, the dataframe of single time point values is exported as a csv file under
the filterset folder as data_<region.name>_<observable.label()>.csv
.
A note on loading results¶
As described above, results can be saved in a specific folder structure that stores not only the numerical results but also the context (filterset, conditions, observables, regions).
Then it is possible to load results by parsing the folder structure and reading the text files. To do so, initialize an analysis object with some settings, and try to read results from files:
from tunacell.stats.api import load_univariate
# load univariate analysis of experiment defined in parser
univ = load_univariate(exp, ou, cset=[condition, ])
The last call will work only if the analysis has been performed and exported to text files before. Hence a convenient way to work is:
try:
    univ = load_univariate(exp, ou, cset=[condition, ])
except UnivariateIOError as uerr:
    print('Impossible to load univariate {}'.format(uerr))
    print('Launching computation')
    univ = compute_univariate(exp, ou, cset=[condition, ])
    univ.export_text()
Bivariate analysis: cross-correlations¶
Key questions are which observables correlate, and how they correlate in time. The appropriate quantities to look at are the cross-correlation function \(c(s, t)\) and the stationary cross-correlation function \(\tilde{c}(\Delta t)\) defined above (see Background).
To estimate these functions, one first needs to have run the univariate analyses
on the corresponding observables. We take the univariate objects corresponding
to the ou
and gr
observables:
# local estimate of growth rate by using the differentiation of size measurement
# (the raw column 'exp_ou_int' plays the role of cell size in our simulations)
gr = Observable(name='approx-growth-rate', raw='exp_ou_int',
                differentiate=True, scale='log',
                local_fit=True, time_window=15.)
univ_gr = compute_univariate(exp, gr, [condition, ])
# import api functions
from tunacell.stats.api import (compute_bivariate,
                                compute_stationary_bivariate)
# compute cross-correlation matrix
biv = compute_bivariate(univ, univ_gr)
biv.export_text()
# compute cross-correlation function under stationarity hypothesis
sbiv = compute_stationary_bivariate(univ, univ_gr, region, options)
sbiv.export_text()
These objects again point to items corresponding to the unconditioned data and each of the conditions.
Again, for cross-correlation functions of two time-points (results stored in biv), the low sample size prevents a smooth numerical estimate, so we turn to the estimate under the stationarity hypothesis in order to pool all samples.
Inspecting cross-correlation results¶
We can inspect the master
result:
master = biv.master
or any of the conditioned dataset:
cdt = biv[repr(condition)]
where condition
is an item of each of the cset
lists (one for each
single
object). Important attributes are:
- times: a couple of lists of time sequences, corresponding respectively to the times at which each item in singles has been evaluated, i.e. \(\{ \{s_i\}_i, \{t_j\}_j \}\), where \(\{s_i\}_i\) is the sequence of times for the first single item and \(\{t_j\}_j\) for the second. Note that the lengths \(p\) and \(q\) of these two vectors may not coincide.
- counts: the \((p, q)\) matrix giving for entry \((i, j)\) the number of samples in data where an independent lineage connects times \(s_i\) and \(t_j\).
- corr: the \((p, q)\) matrix giving for entry \((i, j)\) the value of the estimated correlation \(c(s_i, t_j)\).
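As a toy illustration of what counts and corr contain (synthetic data, not tunacell's implementation), here is a NumPy estimate of the \(c(s_i, t_j)\) matrix from paired observables sampled along the same lineages:

```python
import numpy as np

rng = np.random.default_rng(2)
n_lineages, p, q = 2000, 6, 6

# two observables along the same lineages; v correlates with u at equal times
u = rng.normal(size=(n_lineages, p))
v = 0.7 * u + np.sqrt(1 - 0.7**2) * rng.normal(size=(n_lineages, q))

mu_u = u.mean(axis=0)   # time-dependent average of first observable
mu_v = v.mean(axis=0)   # time-dependent average of second observable

# counts[i, j]: lineages connecting s_i to t_j (all of them in this toy case)
counts = np.full((p, q), n_lineages)

# corr[i, j] = < (u(s_i) - <u(s_i)>) (v(t_j) - <v(t_j)>) >
corr = (u - mu_u).T @ (v - mu_v) / n_lineages
```

In this synthetic case the diagonal entries sit near 0.7 and the off-diagonal ones near zero, since the two observables were built to correlate only at equal times.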
It is possible to export data in text format using:
biv.export_text()
It will create a new folder <obs1>_<obs2>
under each condition folder and
store the items listed above in text files.
Inspecting cross-correlation function at stationarity¶
In the same spirit:
master = sbiv.master
has, among its attributes, array, which stores time intervals, counts,
and correlation values as a Numpy structured array. The dataframe
attribute points to a Pandas.dataframe that recapitulates single
time point data in a table, with boolean columns for each condition.
It is possible to use the same plotting function used for stationary autocorrelation functions:
plot_stationary(sbiv, show_exp_decay=ref_decayrate)
which should plot something like:

Plot of the stationary cross-correlation function of the Ornstein-Uhlenbeck process with the local growth rate estimate obtained from the exponential of the integrated process. It is symmetric and not very informative, since it should more or less collapse onto the auto-correlation function of one of the two observables, the second observable being merely a local approximation of the first.
Other examples¶
If one performs a similar analysis with the two cell-cycle observables, for example:

Plot of the stationary cross-correlation function of the cell-cycle average growth rate with the cell length at division, with respect to the number of generations. We expect that a fluctuation in cell-cycle average growth rate influences length at division in the same, or in later generations. This is why we observe the highest values of correlation for \(\Delta t = 0,\ 1,\ 2\) generations, and nearly zero correlation for previous generations (there is no size control mechanism in this simulation).
Footnotes
[1] This document has been written during the Roland Garros tournament…
Filters¶
Outliers may have escaped the quality controls of segmentation/tracking tools, so further filtering may be needed when tunacell analyses their output data. For example, filamentous cells may have been reported in the data, but one might want to exclude them from a given analysis.
tunacell provides a set of user-customisable filters that allow the user to properly define the statistical ensemble of samples over which the analysis is performed.
In addition to removing outliers, filters are also used for conditional analysis, as they allow dividing the statistical ensemble into sub-populations of samples that satisfy certain rules.
Series of filters have already been defined for each of the following types:
cell, lineage, colony, and container. In addition boolean operations
AND, OR, NOT can be used within each type. Then filters of different types are
combined in FilterSet
instances: one is used to define the statistical
ensemble (remove outliers), and optionally, others may be used to create
sub-populations of samples for comparative analyses.
How individual filters work¶
Filters are instances of the FilterGeneral
class.
A given filter class is instantiated, possibly with parameters, which define
how the filter works.
Then the instantiated object is callable on the object to be filtered.
It returns either True
(the object is valid) or False
(the object is rejected).
Four main subclasses are derived from FilterGeneral
, one for each
structure that tuna recognizes: FilterCell
for Cell
objects,
FilterTree
for Colony
objects, FilterLineage
for
Lineage
objects, FilterContainer
for Container
objects.
Example: testing the parity of cell identifier¶
The filter FilterCellIDparity
has been designed for illustration:
it tests whether the cell identifier is even (or odd).
First we set the filter by instantiating its class with appropriate parameter:
>>> from tunacell.filters.cells import FilterCellIDparity
>>> filter_even = FilterCellIDparity(parity='even')
For this filter class, there is only one keyword parameter, parity
,
which we have set to 'even'
: accept cells with an even identifier, reject cells with an odd identifier.
First, we can print the string representation:
>>> print(str(filter_even))
CELL, Cell identifier is even
The first uppercase word in the message indicates the type of object the filter acts upon. The rest of the message is a label that has been defined in the class definition.
We set two Cell
instances, one with even identifier, and one odd:
>>> from tunacell.base.cell import Cell
>>> mygoodcell = Cell(identifier=12)
>>> mybadcell = Cell(identifier=47)
Then we can perform the test over both objects:
>>> print(filter_even(mygoodcell))
True
>>> print(filter_even(mybadcell))
False
We also mention another feature implemented in the representation of such filters:
>>> print(repr(filter_even))
FilterCellIDparity(parity='even', )
Such a representation is the string one would type to re-instantiate the filter. This representation is used by tuna when a data analysis is exported to text files. Indeed, when tuna reads back these exported files, it is able to load the objects defined in the exported session. Hence there is no need to remember the precise parameters used in a particular analysis: if it is exported, it can be loaded later on.
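This repr/eval round trip can be illustrated with a minimal self-contained stand-in (a sketch of the pattern, not tunacell's actual class):

```python
class ParityFilter:
    """Toy stand-in for a filter whose repr() re-instantiates it."""

    def __init__(self, parity='even'):
        self.parity = parity

    def __repr__(self):
        # the exact string one would type to re-create this object
        return "ParityFilter(parity={!r})".format(self.parity)

    def __call__(self, identifier):
        even = int(identifier) % 2 == 0
        return even if self.parity == 'even' else not even

f = ParityFilter(parity='odd')
g = eval(repr(f))       # rebuild an equivalent filter from its repr
```

This mirrors how tuna reloads the filters of an exported analysis from text files.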
Creating a new filter¶
A few filters are already defined in the following modules:
- tunacell.filters.cells for filters acting on cells,
- tunacell.filters.lineages for filters acting on lineages,
- tunacell.filters.trees for filters acting on colonies,
- tunacell.filters.containers for filters acting on containers.
Within each type, filters can be combined with boolean operations (see below), which allows the user to explore a range of filters. However, a user may need to define their own filter(s), and is encouraged to do so following these general guidelines:
- define a label attribute (a human-readable message, which was 'Cell identifier is even' in our previous example),
- define the func() method that performs the boolean test.
From the module tunacell.filters.cells
we copied below the class definition
of the filter used in our previous example:
class FilterCellIDparity(FilterCell):
    """Test whether identifier is odd or even"""

    def __init__(self, parity='even'):
        self.parity = parity
        self.label = 'Cell identifier is {}'.format(parity)

    def func(self, cell):
        # test if even
        try:
            even = int(cell.identifier) % 2 == 0
            if self.parity == 'even':
                return even
            elif self.parity == 'odd':
                return not even
            else:
                raise ValueError("Parity must be 'even' or 'odd'")
        except ValueError as ve:
            print(ve)
            return False
Although this filter may well be useless in actual analyses, it shows how to define a filter class. Also have a look at filters defined in the above-mentioned modules.
How to combine individual filters together with boolean operations¶
Filters already implemented are “atomic” filters, i.e. they perform one testing operation. It is possible to combine several atomic filters of the same type (type refers to the object type on which the filter is applied: cell, lineage, colony, container) by using boolean filter types.
There are 3 of them, defined in tuna.filters.main
: FilterAND
,
FilterOR
, FilterNOT
. The first two accept any number of filters, which are combined with AND/OR logic respectively; the third accepts a single filter as argument.
With these boolean operations, complex combinations of atomic filters can be created.
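The combination logic can be sketched with minimal stand-ins for the three boolean types (toy classes for illustration; the real ones live in tuna.filters.main):

```python
class AND:
    """Accept an object only if every child filter accepts it."""
    def __init__(self, *filters):
        self.filters = filters

    def __call__(self, obj):
        return all(f(obj) for f in self.filters)


class OR:
    """Accept an object if at least one child filter accepts it."""
    def __init__(self, *filters):
        self.filters = filters

    def __call__(self, obj):
        return any(f(obj) for f in self.filters)


class NOT:
    """Invert a single child filter."""
    def __init__(self, filt):
        self.filt = filt

    def __call__(self, obj):
        return not self.filt(obj)


# atomic filters written as plain callables for the sketch
is_even = lambda n: n % 2 == 0
is_small = lambda n: n < 10

keep = AND(is_even, NOT(is_small))   # even identifiers that are >= 10
```

Since each combinator is itself callable, combinators nest to arbitrary depth, which is what makes complex combinations possible.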
How to define a FilterSet
instance¶
So far we saw how to use filters for each type of structures, independently: cell, lineage, colony, and container.
The FilterSet
registers filters to be applied on each of these types.
It is used to define the statistical ensemble of valid samples, or to define
a condition (rules to define a sub-population from the statistical ensemble).
Explicitly, if we wanted to use our filter_even
from our
example above as the only filter to make the statistical ensemble, we would
define:
from tunacell.filters.main import FilterSet
fset = FilterSet(filtercell=filter_even)
(the other keyword parameters are filterlineage
, filtertree
, and
filtercontainer
)
Tunacell’s data model¶
tunacell’s top-level data structure mirrors the scaffold of the input files. Raw data is stored
in Cell
instances connected through a tree structure arising from
cell divisions.
Top level structures: Experiment
and Container
¶
tunacell’s top-level structure is Experiment; it handles the experiment as a whole.
We refer to the API documentation for details about attributes and methods.
In particular, it stores the list of container files, which allows opening and reading such containers.
These are stored under Container
instances, whose label is set by the file
name. Such objects get two major attributes: cells
and trees
.
The former is the list of Cell
instances imported from raw data file,
the latter is the list of reconstructed trees formed by dividing cells, stored
as Colony
instances.
Low-level structures: Cell
and Colony
¶
These classes are derived from the treelib package Node
and
Tree
classes respectively.
Raw data is stored under the data
attribute of Cell
instances.
Methods are defined at the Container
level to retrieve objects
corresponding to an identifier. More importantly there is an iterator over
colonies that can be used when parsing data for statistical analysis.
Tree decomposition: Lineage
¶
For studying dynamics over times longer than one or a few cell cycles, it is necessary to build time series of observables over sequences of more than one cell.
We use features from the treelib package to decompose trees into independent lineages. A lineage is a sequence \(\{c_i\}_i\) of cells related through successive divisions: cell \(c_i\) is a daughter of cell \(c_{i-1}\), and the mother of cell \(c_{i+1}\).
One way to decompose a tree into lineages is to build the set of lineages from the root to all leaves. Such a decomposition implies that some cells may belong to more than one lineage, which requires a statistical weighting procedure.
To avoid such a weighting procedure, we use a decomposition into independent lineages, which ensures that each cell is counted once and only once. More specifically, our method is to traverse the tree starting from the root, choosing one daughter cell at random at each division until a leaf is reached, and to repeat this procedure until every cell has been assigned to a lineage.
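The decomposition just described can be sketched on a toy division tree (a plain-Python illustration, not tunacell's implementation):

```python
import random

def decompose(tree, root, rng=random):
    """Split a division tree (dict: cell -> list of daughters) into
    independent lineages; each cell ends up in exactly one lineage."""
    lineages = []
    pending = [root]                      # cells not yet assigned
    while pending:
        cell = pending.pop()
        lineage = [cell]
        while tree.get(cell):             # walk down until a leaf
            daughters = list(tree[cell])
            cell = rng.choice(daughters)  # follow one daughter at random...
            daughters.remove(cell)
            pending.extend(daughters)     # ...and defer the other(s)
            lineage.append(cell)
        lineages.append(lineage)
    return lineages

# toy colony: cell 1 divides into 2 and 3; cell 2 divides into 4 and 5
tree = {1: [2, 3], 2: [4, 5], 3: [], 4: [], 5: []}
lineages = decompose(tree, 1)
```

Whatever the random choices, each cell appears in exactly one lineage, so no statistical re-weighting is needed.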
A lineage is defined as a Lineage
instance. Such objects provide a method to build the corresponding time series for a given observable.
Numerical simulations in tunacell
¶
Using the script¶
When installed using pip, a tunasimu
executable is provided to run
numerical simulations and save results on local directories. These
files can be used to try tunacell
’s analysis tools.
Such a command comes with various parameters, which are printed upon call:
$ tunasimu -h
There is a list of (optional) parameters.
What are these simulations about?¶
ou
is the value of the Ornstein-Uhlenbeck random process simulated
in each cell, int_ou
is the integrated value of the random process reset
to zero at each cell birth, exp_int_ou
is the exponential of the latter
value. One can think of the Ornstein-Uhlenbeck as instantaneous growth rate of
the cell, and thus exp_int_ou
can be associated to cell length.
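The three quantities can be reproduced in a few lines of NumPy (a sketch with assumed parameter values, not the tunasimu defaults):

```python
import numpy as np

rng = np.random.default_rng(0)
theta, mu, sigma = 1.0, 1.0, 0.1   # relaxation rate, mean, noise (assumed)
dt, n_steps = 0.05, 200

# 'ou': Euler-Maruyama integration of the Ornstein-Uhlenbeck SDE
ou = np.empty(n_steps)
ou[0] = mu
for t in range(1, n_steps):
    ou[t] = ou[t - 1] + theta * (mu - ou[t - 1]) * dt \
            + sigma * np.sqrt(dt) * rng.normal()

# 'int_ou': integrated process (tunacell resets it at each cell birth)
int_ou = np.cumsum(ou) * dt

# 'exp_int_ou': exponential of the integrated process, i.e. 'cell size'
exp_int_ou = np.exp(int_ou)
```

With growth rate fluctuating around a positive mean, exp_int_ou grows roughly exponentially, as a cell length would between birth and division.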
Experiment¶
This module implements the first core class: Experiment, and functions to parse containers, retrieve and build data.
Each experiment consists of multiple containers where data is stored under container folders. A container may correspond to a single field of view, or to a subset thereof (e.g. a single channel in microfluidic experiments).
- Such containers must meet two requirements:
- Cell identifiers are unique within a container;
- Lineage reconstruction is defined and performed within a single container.
- This module stores classes and functions that allow to:
- explore data structure (and metadata if provided)
- keep track of every container where to look for data
- extract data in a Container instance from text containers
- build cell filiation, store time-lapse microscopy data, build trees
-
class
tunacell.base.experiment.
Experiment
(path='.', filetype=None, filter_set=None, count_items=False)¶ General class that stores experiment details.
Creates an Experiment instance from reading a file, records path, filetype, reads metadata, stores the list of containers.
Parameters: - path (str) – path to experiment root file
- filetype (str {None, 'text', 'supersegger'}) – leave to None for automatic detection.
-
abspath
¶ absolute path on disk of main directory for text containers
Type: str
-
label
¶ experiment label
Type: str
-
filetype
¶ one of the available file types (‘simu’ is not a filetype per se…)
Type: str {‘text’, ‘supersegger’}
-
fset
¶ filterset to be applied when parsing data
Type: FilterSet
instance
-
datatype
¶ provides the datatype of raw data stored in each Cell instance’s .data attribute. This is defined only for the text filetype, when a descriptor file is associated with the experiment.
Type: Numpy.dtype instance
-
metadata
¶ experiment metadata
Type: Metadata instance
-
containers
¶ list of absolute paths to containers
Type: list of pathlib.Path
-
period
¶ time interval between two successive acquisitions (this should be defined in the experiment metadata)
Type: float
-
iter_containers(read=True, build=True, prefilt=None, extend_observables=False, report_NaNs=True, size=None, shuffle=False)
browse containers
-
analysis_path
¶ Get analysis path (with appropriate filterset path)
-
count_items
(independent=True, seed=None, read=True)¶ Parse data to count items: cells, colonies, lineages, containers
Parameters: - independent (bool {True, False}) – lineage decomposition parameter
- seed (int, or None) – lineage decomposition parameter
- read (bool {True, False}) – try to read it in analysis folder
-
fset
Get current FilterSet
-
get_container
(label, read=True, build=True, prefilt=None, extend_observables=False, report_NaNs=True)¶ Open specified container from this experiment.
Parameters: - label (str) – name of the container file to be opened
- read (bool (default True)) – whether to read data and extract Cell instances list
- build (bool (default True)) – when read option is active, whether to build Colony instances
- extend_observables (bool (default False)) – whether to compute secondary observables from raw data
- report_NaNs (bool (default True)) – whether to report for NaNs found in data
Returns: container
Return type: Container instance
Raises: - ParsingExperimentError – when no matching container exists in this experiment
- ParsingContainerError – when, despite an existing container filename, parsing of the container failed and nothing was loaded
-
info
()¶ Show information about the experiment
-
iter_cells
(size=None, shuffle=False)¶ Iterate through valid cells.
Applies all filters defined in fset.
Parameters: - size (int (default None)) – limit the number of lineages to size. Works only in mode=’all’
- shuffle (bool (default False)) – whether to shuffle the ordering of lineages when mode=’all’
Yields: cell (
Cell
instance) – filtering removed outlier cells, containers, colonies, and lineages
-
iter_colonies
(filter_for_colonies='from_fset', size=None, shuffle=False)¶ Iterate through valid colonies.
Parameters: - filter_for_colonies (FilterTree instance or str {'from_fset', 'none'}) –
- size (int (default None)) – limit the number of colonies to size. Works only in mode=’all’
- shuffle (bool (default False)) – whether to shuffle the ordering of colonies when mode=’all’
Yields: colony (
Colony
instance) – filtering removed outlier cells, containers, and colonies
-
iter_containers
(read=True, build=True, filter_for_cells='from_fset', filter_for_containers='from_fset', apply_container_filter=True, extend_observables=False, report_NaNs=True, size=None, shuffle=False)¶ Iterator over containers.
Parameters: - size (int (default None)) – number of containers to be parsed
- read (bool (default True)) – whether to read data and extract Cell instances
- build (bool (default True), called only if read is True) – whether to build colonies
- filter_for_cells (FilterCell instance, or str {'from_fset', 'none'}) – filter applied to cells when data files are parsed
- filter_for_containers (FilterContainer instance or str {'from_fset', 'none'}) – filter applied to containers when data files are parsed
- extend_observables (bool (default False)) – whether to construct secondary observables from raw data
- report_NaNs (bool (default True)) – whether to report for NaNs found in data
- shuffle (bool (default False)) – when size is set to a number, whether to randomize ordering of upcoming containers
Returns: Return type: iterator over Container instances of the current Experiment instance.
-
iter_lineages
(filter_for_lineages='from_fset', size=None, shuffle=False)¶ Iterate through valid lineages.
Parameters: - filter_for_lineages (FilterLineage instance or str {'from_fset', 'none'}) – filter lineages
- size (int (default None)) – limit the number of lineages to size. Works only in mode=’all’
- shuffle (bool (default False)) – whether to shuffle the ordering of lineages when mode=’all’
Yields: lineage (
Lineage
instance) – filtering removed outlier cells, containers, colonies, and lineages
-
period
Return the experimental level period
The experimental level period is defined as the smallest acquisition period over all containers.
-
raw_text_export
(path='.', metadata_extension='.yml')¶ Export raw data as text containers in correct directory structure.
Parameters: - path (str) – path to experiment root directory
- metadata_extension (str (default '.yml')) – type of metadata file (now only yaml file works)
-
exception
tunacell.base.experiment.
FiletypeError
¶
-
exception
tunacell.base.experiment.
ParsingExperimentError
¶
-
tunacell.base.experiment.
count_items
(exp, independent_decomposition=True, seed=None)¶ Parse the experiment, with associated FilterSet, and count items
Parser¶
This module provides elements for parsing data manually, i.e. getting a handful list of samples and extract specific structures (colony, lineage, cell) from such samples.
Parser
: handles how to parse an experiment with a given filterset
-
class
tunacell.base.parser.
Parser
(exp=None, filter_set=None)¶ Defines how user wants to parse data.
Parameters: - exp (
Experiment
instance) – - filter_set (
FilterSet
instance) – this is the set of filters used to read/build data, used for for iterators (usually, only .cell_filter and .container_filter are used)
-
add_sample
(*args)¶ Add sample to sample list.
Parameters: args (list) – list of items such as: integer, string, couple, and/or dict.
- An integer denotes the number of sample_ids to be chosen randomly (if several integers are given, only the first is used).
- A string results in loading the corresponding Container, with a randomly chosen cell identifier.
- A couple (str, cellID) denotes (container_label, cell_identifier).
- A dictionary should provide a ‘container’ key and a ‘cellID’ key.
-
clear_samples
()¶ Erase all samples.
-
get_cell
(sample_id)¶ Get
Cell
instance corresponding to sample_id.Parameters: sample_id (dict) – element of self.samples Returns: cell – corresponding to sample_id Return type: Cell
instance
-
get_colony
(sample_id)¶ Get
Colony
instance corresponding to sample_id.Parameters: sample_id (dict) – element of self.samples Returns: colony – corresponding to sample_id Return type: Colony
instance
-
get_container
(sample_id)¶ Get
Container
instance corresponding to sample_id.Parameters: sample_id (dict) – element of self.samples Returns: container Return type: Container
instance
-
get_lineage
(sample_id)¶ Get
Lineage
instance corresponding to sample_id. Parameters: sample_id (dict) – element of self.samples Returns: lineage – corresponding to sample_id Return type: Lineage instance
-
get_sample
(index, level='cell')¶ Return sample corresponding to index.
Parameters: - index (int) – index of sample id in self.samples
- level (str {'cell'|'lineage'|'colony'}) –
Returns: structure level corresponding to sample id
Return type: out
-
info_samples
()¶ Table output showing stored samples.
-
iter_cells
(mode='samples', size=None)¶ Iterate through valid cells.
Parameters: - mode (str {'samples'} (default 'samples')) – whether to iterate over all cells (up to number limitation), or over registered samples
- size (int (default None)) – limit the number of lineages to size. Works only in mode=’all’
Yields: cell (
Cell
instance) – filtering removed outlier cells, containers, colonies, and lineages
-
iter_colonies
(mode='samples', size=None)¶ Iterate through valid colonies.
Parameters: - mode (str {'samples'} (default 'samples')) – whether to iterate over all colonies (up to number limitation), or over registered samples
- size (int (default None)) – limit the number of colonies to size.
Yields: colony (
Colony
instance) – filtering removed outlier cells, containers, and colonies
-
iter_containers
(mode='samples', size=None)¶ Iterate through valid containers.
Parameters: - mode (str {'samples'} (default 'samples')) – iterates over containers pointed by parser.samples
- size (int (default None)) – number of containers to be parsed
Yields: container (
tunacell.core.Container
instance) – filtering removed outlier cells, containers
-
iter_lineages
(mode='samples', size=None)¶ Iterate through valid lineages.
Parameters: - mode (str {''samples'} (default 'samples')) – whether to iterate over all lineages (up to number limitation), or over registered samples
- size (int (default None)) – limit the number of lineages to size.
Yields: lineage (
Lineage
instance) – filtering removed outlier cells, containers, colonies, and lineages
-
load_experiment
(path, filetype='text')¶ Loads an experiment from path to file.
Parameters: - path (str) – path to root directory (‘text’), or to datafile (‘h5’)
- filetype (str {'text', 'h5'}) –
-
remove_sample
(index, verbose=True)¶ Remove sample of index in sample list.
-
set_filter
(fset)¶ Set build filter.
Parameters: fset ( FilterSet
instance) –
Observable¶
This module provides API class definition to define observables.
Classes¶
Observable
: main object to define observables.
-
class
tunacell.base.observable.
FunctionalObservable
(name=None, f=None, observables=[])¶ Combination of
Observable
instancesParameters: - name (str) – user defined name for this observable
- f (callable) – the function to apply to observables
- observables (list of
Observable
instances) – parameters of the function f to be applied
Warning
Contrary to
Observable
, instances ofFunctionalObservable
cannot be represented as a string usingrepr()
, that could be turned into a new instance with identical parameters usingeval()
. This is because the applied function is difficult to serialize as a string while keeping its definition human-readable.-
label
¶ get unique string identifier
-
latexify
(show_variable=True, plus_delta=False, shorten_time_variable=False, prime_time=False, as_description=False, use_name=None)¶ Latexify observable name
-
mode
¶ Returns mode depending on observables passed as parameters
-
timing
¶ Return timing depending on observables passed as parameters
-
class
tunacell.base.observable.
Observable
(name=None, from_string=None, raw=None, differentiate=False, scale='linear', local_fit=False, time_window=0.0, join_points=3, mode='dynamics', timing='t', tref=None)¶ Defines how to retrieve observables.
Parameters: - name (str) – user name for this observable (can be one of the raw observable)
- raw (str (default None)) – raw name of the observable: must be a column name of raw data, i.e. first element of one entry of Experiment.datatype
- differentiate (boolean (default False)) – whether to differentiate raw observable
- scale (str {'linear', 'log'}) – expected scaling form as a function of time (used for extrapolating values at boundaries, including in the local_fit procedure)
- local_fit (boolean (default False)) – whether to perform local fit procedure
- time_window (float (default 20.)) – Time window over which local fit procedure is applied (used only when local_fit is activated)
- join_points (int (default 3)) – number of points over which extrapolation procedure is led by fitting linearly [the scale of] the observable w.r.t time
- mode (str {'dynamics', 'birth', 'division', 'net-increase-additive', 'net-increase-multiplicative', 'rate', 'average'}) –
mode used to retrieve data:
- ’dynamics’: all timepoints are retrieved
- ’birth’: only birth value is retrieved
- ’division’: only division value is retrieved
- ’net-increase-additive’: difference between division value and birth value
- ’net-increase-multiplicative’: ratio between division value and birth value
- ’rate’: rate of linear fit of [scale of] observable
- ’average’: average of observable over cell cycle
- timing (str {'t', 'b', 'd', 'm', 'g'}) –
set the time at which cell cycle observable is associated:
- ’t’ : time-lapse timing (associated to mode ‘dynamics’)
- ’b’ : cell cycle birth time
- ’d’ : cell cycle division time
- ’m’ : cell cycle midpoint (half time)
- ’g’ : cell cycle generation index
- tref (float or 'root' (default None)) – when timing is set to ‘g’, sets the 0th generation to the cell that bounds this reference time; when timing is set to ‘t’ (time-lapse timing), allows translating time values by subtracting a floating point value (if given as a float), or aligning the origin to the last time value of the colony root cell (if ‘root’).
-
as_latex_string
¶ Export as LaTeX string. Old format, replaced by latexify
-
as_string_table
()¶ Human readable output as a table.
-
as_timelapse
()¶ Convert current observable to its dynamic counterpart
This is needed when computing cell-cycle observables.
-
label
¶ Label outputs a unique string representation
This method creates a string label that specifies each parameter needed to re-construct the Observable. The output string is (kind of) human readable. More importantly, it is suitable as a filename (only alphanumeric characters and underscores), and in fact serves to name directories in the analysis folder.
Note
__repr__() returns another string representation, which can be passed to the built-in eval() to instantiate a new object with identical functional parameters.
-
latexify
(show_variable=True, plus_delta=False, shorten_time_variable=False, prime_time=False, as_description=False, use_name=None)¶ Returns a latexified string for observable
Parameters: - show_variable (bool) – whether to print out time/generation variable
- plus_delta (bool) – whether to add a $Delta$ to time/generation variable; used for auto- and cross-correlation labeling
- shorten_time_variable (bool) – when active, will display only $t$/$g$
- prime_time (bool) – whether to set a prime on the time/generation variable
- as_description (bool (default False)) – sets up the description of the observable from rules to compute it (derivatives, log, and raw label)
- use_name (str (default None)) – when the observable name is too cumbersome to be printed, and the user wants to choose a specific name for such a printout
-
load_from_string
(codestring)¶ Set Observable instance from string code created by label method
Parameters: codestring (str) – must follow some rules for parsing
-
exception
tunacell.base.observable.
ObservableError
¶
-
exception
tunacell.base.observable.
ObservableNameError
¶
-
exception
tunacell.base.observable.
ObservableStringError
¶
-
tunacell.base.observable.
set_observable_list
(*args, **kwargs)¶ Make raw and functional observable lists for running analyses
Parameters: - *args – Variable length argument list of Observable or FunctionalObservable instances
- **kwargs – Accepted keyword arguments: ‘filters=[]’ with a list of FilterSet or FilterGeneral instances (each must have a .obs attribute)
Returns: lists of raw observables and functional observables (correctly ordered)
Return type: raw_obs, func_obs
-
tunacell.base.observable.
unroll_func_obs
(obs)¶ Returns a flattened list of FunctionalObservable instances
It inspects recursively the observable content of the argument to yield all nested FunctionalObservable instances, ordered from the innermost to the outermost layer of nesting. If you need to compute f(g(h(x))), where x is a raw Observable, the generator yields h, then g, and f last, so that evaluation can be performed in direct order.
Parameters: obs (FunctionalObservable instance) – the observable to inspect
Yields: FunctionalObservable instance – The generator yields FunctionalObservable instances in appropriate order (from the innermost to the outermost layer of nesting).
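The yield order can be sketched with a minimal stand-in class (FuncObs and its attributes are hypothetical, for illustration only, not tunacell's actual FunctionalObservable):

```python
class FuncObs:
    """Minimal stand-in for FunctionalObservable: a name plus the
    sub-observables it is built from."""
    def __init__(self, name, observables):
        self.name = name
        self.observables = observables

def unroll_func_obs(obs):
    """Yield nested FuncObs instances depth-first, innermost first."""
    if isinstance(obs, FuncObs):
        for sub in obs.observables:
            yield from unroll_func_obs(sub)
        yield obs

x = 'raw-observable'           # placeholder for a raw Observable
h = FuncObs('h', [x])
g = FuncObs('g', [h])
f = FuncObs('f', [g])
print([o.name for o in unroll_func_obs(f)])  # prints ['h', 'g', 'f']
```

Yielding the innermost observable first guarantees that each functional observable is evaluated only after its inputs are available.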
-
tunacell.base.observable.
unroll_raw_obs
(obs)¶ Returns a generator over flattened list of Observable instances
Parameters: obs ((list of) Observable or FunctionalObservable instances) –
Yields: flatten – Observable instances found in the argument, going into nested layers in the case of nested lists or of FunctionalObservable instances
Filters¶
This module defines the structure for filters objects (see FilterGeneral).
Various filters are then defined in submodules, by subclassing.
- The subclass needs to define at least two things:
- the attribute _label (string or unicode) which defines filter operation
- the func method, that performs and outputs the boolean test
Some useful functions are defined here as well.
-
class
tunacell.filters.main.
FilterAND
(*filters)¶ Defines boolean AND operation between same type filters.
Parameters: filters (sequence of FilterGeneral
instances) –Returns: will perform AND boolean operation between the various filters passed as arguments. Return type: FilterGeneral
instanceNotes
Defines a FilterTrue
-
func
(target)¶ This is the boolean operation to define in specific Filter instances
Default operation returns True.
-
-
exception
tunacell.filters.main.
FilterArgError
¶ Exception raised when Argument format is not suitable.
-
class
tunacell.filters.main.
FilterBoolean
¶ General class to implement Boolean operations between filters
-
label
¶ Get label of applied filter(s)
-
-
exception
tunacell.filters.main.
FilterError
¶ Superclass for errors while filtering
-
class
tunacell.filters.main.
FilterGeneral
¶ General class for filtering cell (i.e. tunacell.base.cell.Cell) instances.
Its important properties are that instances are callable, and that the action of the filter is described by a human-readable label.
-
func
(*args)¶ This is the boolean operation to define in specific Filter instances
Default operation returns True.
-
label
¶ Get label of applied filter(s)
-
obs
¶ Provides the list of hidden observables
-
-
exception
tunacell.filters.main.
FilterLabelError
¶ Exception when label is not (properly) set.
-
class
tunacell.filters.main.
FilterNOT
(filt)¶ Defines boolean NOT operation on filter.
Parameters: filter ( FilterGeneral
instance) –-
func
(target)¶ This is the boolean operation to define in specific Filter instances
Default operation returns True.
-
-
class
tunacell.filters.main.
FilterOR
(*filters)¶ Defines boolean OR operation between same type filters.
Parameters: filters (sequence of FilterGeneral
instances) –-
func
(target)¶ This is the boolean operation to define in specific Filter instances
Default operation returns True.
-
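The Boolean combinators above (AND, OR, NOT over callable filters returning booleans) can be sketched in a few lines; the classes below are illustrative stand-ins, not tunacell's implementations:

```python
class FilterTrue:
    """Always-true base filter; instances are callable (mirrors FilterTRUE)."""
    def func(self, target):
        return True
    def __call__(self, target):
        return self.func(target)

class FilterAND(FilterTrue):
    """True only if every wrapped filter accepts the target."""
    def __init__(self, *filters):
        self.filters = filters
    def func(self, target):
        return all(f(target) for f in self.filters)

class FilterOR(FilterTrue):
    """True if at least one wrapped filter accepts the target."""
    def __init__(self, *filters):
        self.filters = filters
    def func(self, target):
        return any(f(target) for f in self.filters)

class FilterNOT(FilterTrue):
    """Negates a single wrapped filter."""
    def __init__(self, filt):
        self.filt = filt
    def func(self, target):
        return not self.filt(target)

# two toy leaf filters for demonstration
class IsPositive(FilterTrue):
    def func(self, target):
        return target > 0

class IsEven(FilterTrue):
    def func(self, target):
        return target % 2 == 0

combined = FilterAND(IsPositive(), FilterNOT(IsEven()))
print(combined(3), combined(4), combined(-3))  # True False False
```

Because combined filters are themselves filters, they can be nested arbitrarily, e.g. FilterOR(combined, IsEven()).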
-
exception
tunacell.filters.main.
FilterParamsError
¶ Exception when parameters are not set for a given test.
-
class
tunacell.filters.main.
FilterSet
(label=None, filtercell=FilterTRUE(), filterlineage=FilterTRUE(), filtertree=FilterTRUE(), filtercontainer=FilterTRUE())¶ Collects filters of each type in a single object
-
obs
¶ Provides the list of hidden observables
-
-
class
tunacell.filters.main.
FilterTRUE
¶ Returns True for argument of any type, can be used in Boolean filters
We need this artificial filter to plug into default FilterSets.
-
func
(*args)¶ This is the boolean operation to define in specific Filter instances
Default operation returns True.
-
-
exception
tunacell.filters.main.
FilterTypeError
¶ Error raised when one tries to combine different types of filters
-
tunacell.filters.main.
bounded
(arg, lower_bound=None, upper_bound=None)¶ Function that tests whether the argument is bounded.
By convention, the lower bound is included and the upper bound is excluded: the test is lower_bound <= arg < upper_bound. If arg is an iterable, the test must be satisfied for every element (hence the minimal value must be greater than or equal to lower_bound and the maximal value must be strictly lower than upper_bound).
Parameters: - arg (int or float) – quantity to be tested for bounds
- lower_bound (int or float (default None)) –
- upper_bound (int or float (default None)) –
Returns: Return type: Boolean
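The bound convention (lower included, upper excluded, element-wise for iterables) can be sketched as follows; this is an illustrative re-implementation under the stated convention, not tunacell's own code:

```python
def bounded(arg, lower_bound=None, upper_bound=None):
    """Test lower_bound <= arg < upper_bound; for an iterable,
    every element must satisfy the test. None means no bound."""
    try:
        values = list(arg)   # iterable case
    except TypeError:
        values = [arg]       # scalar case
    for v in values:
        if lower_bound is not None and v < lower_bound:
            return False
        if upper_bound is not None and v >= upper_bound:
            return False
    return True

print(bounded(3, 1, 5))       # True
print(bounded(5, 1, 5))       # False (upper bound excluded)
print(bounded([1, 4], 1, 4))  # False (element 4 hits excluded upper bound)
```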
This module defines filters for Cell instances
-
class
tunacell.filters.cells.
FilterCell
¶ General class for filtering cell objects (reader.Cell instances)
-
class
tunacell.filters.cells.
FilterCellAny
¶ Class that does not filter anything.
-
func
(cell)¶ This is the boolean operation to define in specific Filter instances
Default operation returns True.
-
-
class
tunacell.filters.cells.
FilterCellIDbound
(lower_bound=None, upper_bound=None)¶ Test class
-
func
(cell)¶ This is the boolean operation to define in specific Filter instances
Default operation returns True.
-
-
class
tunacell.filters.cells.
FilterCellIDparity
(parity='even')¶ Test whether identifier is odd or even
-
func
(cell)¶ This is the boolean operation to define in specific Filter instances
Default operation returns True.
-
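The parity test is a convenient toy filter, often used to split a sample in two for testing. A sketch of how its func could work (assuming integer cell identifiers; this class and the cell stand-in are illustrative, not tunacell's code):

```python
from types import SimpleNamespace

class FilterCellIDparity:
    """Keep cells whose integer identifier matches the given parity."""
    def __init__(self, parity='even'):
        self.remainder = 0 if parity == 'even' else 1
    def func(self, cell):
        return int(cell.identifier) % 2 == self.remainder

even = FilterCellIDparity('even')
cell = SimpleNamespace(identifier='12')  # stand-in for a Cell instance
print(even.func(cell))  # True
```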
-
class
tunacell.filters.cells.
FilterCompleteCycle
(daughter_min=1)¶ Test whether a cell has a given parent and at least one daughter.
-
func
(cell)¶ This is the boolean operation to define in specific Filter instances
Default operation returns True.
-
-
class
tunacell.filters.cells.
FilterCycleFrames
(lower_bound=None, upper_bound=None)¶ Check whether the cell has a minimal number of datapoints.
-
func
(cell)¶ This is the boolean operation to define in specific Filter instances
Default operation returns True.
-
-
class
tunacell.filters.cells.
FilterCycleSpanIncluded
(lower_bound=None, upper_bound=None)¶ Check that cell cycle time interval is within valid bounds.
-
func
(cell)¶ This is the boolean operation to define in specific Filter instances
Default operation returns True.
-
-
class
tunacell.filters.cells.
FilterData
¶ Default filter: tests only whether the cell exists and cell.data is non-empty.
-
func
(cell)¶ This is the boolean operation to define in specific Filter instances
Default operation returns True.
-
-
class
tunacell.filters.cells.
FilterDaughters
(daughter_min=1, daughter_max=2)¶ Test whether a given cell has at least one daughter cell
-
func
(cell)¶ This is the boolean operation to define in specific Filter instances
Default operation returns True.
-
-
class
tunacell.filters.cells.
FilterHasParent
¶ Test whether a cell has an identified parent cell
-
func
(cell)¶ This is the boolean operation to define in specific Filter instances
Default operation returns True.
-
-
class
tunacell.filters.cells.
FilterLengthIncrement
(lower_bound=None, upper_bound=None)¶ Check increments are bounded.
-
func
(cell)¶ This is the boolean operation to define in specific Filter instances
Default operation returns True.
-
-
class
tunacell.filters.cells.
FilterObservableBound
(obs=Observable(name='undefined', raw='undefined', scale='linear', differentiate=False, local_fit=False, time_window=0.0, join_points=3, mode='dynamics', timing='t', tref=None, ), tref=None, lower_bound=None, upper_bound=None)¶ Check that a given observable is bounded.
Parameters: - obs (Observable instance) – observable that will be tested for bounds; works only for continuous observables (mode=’dynamics’)
- tref (float (default None)) – Time of reference at which to test dynamics observable value
- lower_bound (float (default None)) –
- upper_bound (float (default None)) –
-
func
(cell)¶ This is the boolean operation to define in specific Filter instances
Default operation returns True.
-
class
tunacell.filters.cells.
FilterSymmetricDivision
(raw='area', lower_bound=0.4, upper_bound=0.6)¶ Check that cell division is (roughly) symmetric.
Parameters: raw (str) – column label of raw observable to test for symmetric division (usually one of ‘length’, ‘area’). This quantity will be approximated -
func
(cell)¶ This is the boolean operation to define in specific Filter instances
Default operation returns True.
-
Analysis of the dynamics¶
This module sets up api functions for dynamical correlation analysis.
-
exception
tunacell.stats.api.
ParamsError
¶
-
tunacell.stats.api.
compute_bivariate
(row_univariate, col_univariate, size=None)¶ Computes cross-correlation between the observables defined in the two Univariate instances.
This functions handles conditions and time-window binning:
- all conditions provided in cset are applied independently, in addition to the computation with unconditioned data (labelled ‘master’)
- A time-binning window is provided with a given offset and a period. Explicitly, a given time value t found in the data will be rounded to the closest offset_t + k * delta_t, where k is an integer.
Parameters: - row_univariate, col_univariate (Univariate instances) –
- size (int (default None)) – limit the iterator to size Lineage instances (used for testing)
Returns: Return type: TwoObservable instance
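The time-binning rule described above (round t to the closest offset_t + k * delta_t) can be sketched as follows; this is an illustrative helper assuming round-to-nearest, not tunacell's implementation:

```python
def bin_time(t, offset=0.0, period=5.0):
    """Round t to the closest offset + k * period, with k an integer,
    as in the time-window binning described above."""
    k = round((t - offset) / period)
    return offset + k * period

print(bin_time(12.4))  # 10.0
print(bin_time(13.1))  # 15.0
```

Binning times this way lets observations acquired at slightly different acquisition times contribute to the same statistical time point.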
-
tunacell.stats.api.
compute_stationary
(univ, region, options, size=None)¶ Computes stationary autocorrelation. API level.
Parameters: - univ (
Univariate
instance) – the stationary autocorr is based on this object - region (
Region
instance) – - options (
CompuParams
instance) – set the ‘adjust_mean’ and ‘disjoint’ options - size (int (default None)) – limit number of parsed Lineages
- univ (
-
tunacell.stats.api.
compute_stationary_bivariate
(row_univariate, col_univariate, region, options, size=None)¶ Computes stationary cross-correlation function from couple of univs
The corresponding stationary univariates need to be computed as well.
-
tunacell.stats.api.
compute_univariate
(exp, obs, region='ALL', cset=[], times=None, size=None)¶ Computes one-point and two-point functions of statistical analysis.
This functions handles conditions and time-window binning:
- all conditions provided in cset are applied independently, in addition to the computation with unconditioned data (labelled ‘master’)
- A time-binning window is provided with a given offset and a period. Explicitly, a given time value t found in the data will be rounded to the closest offset_t + k * delta_t, where k is an integer.
Parameters: - exp (
Experiment
instance) – - obs (
Observable
instance) – - region (
Region
instance or str (default ‘ALL’)) – in case of str, must be the name of a registered region - cset (list of
FilterSet
instances) – - times (1d ndarray, or str (default None)) – array of times at which process is evaluated. Default is to use the ‘ALL’ region with the period taken from experiment metadata. User can opt for a specific time array, or for the label of a region as a string
- size (int (default None)) – limit the iterator to size Lineage instances (used for testing)
Returns: Return type: Univariate instance
-
tunacell.stats.api.
load_bivariate
(row_univariate, col_univariate)¶ Initialize a StationaryBivariate instance from its dynamical one.
Parameters: - row_univariate (
Univariate
instance) – - col_univariate (
Univariate
instance) –
Returns: set up with empty arrays
Return type: Bivariate
instance- row_univariate (
-
tunacell.stats.api.
load_stationary
(univ, region, options)¶ Initialize a StationaryUnivariate instance from its dynamical one.
Parameters: - univ (
Univariate
instance) – - region (
Region
instance) – - options (
CompuParams
instance) –
Returns: set up with empty arrays
Return type: StationaryUnivariate
instance- univ (
-
tunacell.stats.api.
load_stationary_bivariate
(row_univariate, col_univariate, region, options)¶ Initialize a StationaryBivariate instance from its dynamical one.
Parameters: - row_univariate (
Univariate
instance) – - col_univariate (
Univariate
instance) – - region (
Region
instance) – - options (
CompuParams
instance) –
Returns: set up with empty arrays
Return type: StationaryBivariate
instance- row_univariate (
-
tunacell.stats.api.
load_univariate
(exp, obs, region='ALL', cset=[])¶ Initialize an empty Univariate instance.
Such a Univariate instance is bound to an experiment (through exp), an observable, and a set of conditions.
Parameters: - exp (
Experiment
instance) – - obs (
Observable
instance) – - region (
Region
instance or str (default ‘ALL’)) – in case of str, must be the name of a registered region - cset (sequence of
FilterSet
instances) –
Returns: initialized, nothing computed yet
Return type: Univariate
instanceRaises: UnivariateIOError
– when importing fails (no data corresponds to input params)- exp (
Plotting dynamic analysis¶
This module defines plotting functions for the statistics of the dynamics.
-
tunacell.plotting.dynamics.
plot_onepoint
(univariate, show_cdts='all', show_ci=False, mean_ref=None, var_ref=None, axe_xsize=6.0, axe_ysize=2.0, time_range=(None, None), time_fractional_pad=0.1, counts_range=(None, None), counts_fractional_pad=0.1, average_range=(None, None), average_fractional_pad=0.1, variance_range=(None, None), variance_fractional_pad=0.1, show_legend=True, show_cdt_details_in_legend=False, use_obs_name=None, save=False, user_path=None, ext='.png', verbose=False)¶ Plot one point statistics: counts, average, and variance.
One point functions are plotted for each condition set up in the show_cdts argument: ‘all’ for all conditions, or the string representation (or label) of a particular condition (or a list thereof).
Parameters: - univariate (Univariate instance) –
- show_cdts (str (default 'all')) – must be either ‘all’, or ‘master’, or the repr of a condition, or a list thereof
- show_ci (bool {False, True}) – whether to show 99% confidence interval
- mean_ref (float) – reference mean value: what the user expects to see as the sample average, to compare with data
- var_ref (float) – reference variance value: what the user expects to see as the sample variance, to compare with data
- axe_xsize (float (default 6)) – size of the x-axis (inches)
- axe_ysize (float (default 2.)) – size of a single ax’s y-axis (inches)
- time_range (couple of floats (default (None, None))) – specifies (left, right) bounds
- time_fractional_pad (float (default .1)) – fraction of x-range to add as padding
- counts_range (couple of floats (default (None, None))) – specifies range for the Counts y-axis
- counts_fractional_pad (float (default .1)) – fractional amount of y-range to add as padding
- average_range (couple of floats (default (None, None))) – specifies range for the Average y-axis
- average_fractional_pad (float (default .1)) – fractional amount of y-range to add as padding
- variance_range (couple of floats (default (None, None))) – specifies range for the Variance y-axis
- variance_fractional_pad (float (default .1)) – fractional amount of y-range to add as padding
- show_legend (bool {True, False}) – print out legend
- show_cdt_details_in_legend (bool {False, True}) – show details about filters
- use_obs_name (str (default None)) – when filled, the plot title will use this observable name instead of looking for the observable registered name
- save (bool {False, True}) – whether to save plot
- user_path (str (default None)) – user defined path where to save figure; default is canonical path (encouraged)
- ext (str {'.png', '.pdf'}) – extension to be used when saving file
- verbose (bool {False, True}) –
-
tunacell.plotting.dynamics.
plot_stationary
(stationary, show_cdts='all', axe_xsize=6.0, axe_ysize=2.0, time_range=(None, None), time_fractional_pad=0.1, time_guides=[0.0], counts_range=(None, None), counts_fractional_pad=0.1, corr_range=(None, None), counts_logscale=False, corr_fractional_pad=0.1, corr_logscale=False, corr_guides=[0.0], show_exp_decay=None, show_legend=True, show_cdt_details_in_legend=False, use_obs_name=None, save=False, ext='.png', verbose=False)¶ Plot stationary autocorrelation.
Parameters: - stationary (StationaryUnivariate or StationaryBivariate instance) –
- axe_xsize (float (default 6)) – size (in inches) of the x-axis
- axe_ysize (float (default 2)) – size (in inches) of the individual y-axis
- time_range (couple of floats) – bounds for time (x-axis)
- time_fractional_pad (float) – fractional padding for x-axis
- counts_range (couple of ints) – bounds for counts axis
- counts_fractional_pad (float) – fractional padding for counts axis
- corr_range (couple of floats) – bounds for correlation values
- counts_logscale (bool {False, True}) – use logscale for counts axis
- corr_fractional_pad (float) – fractional padding for correlation values
- corr_logscale (bool {False, True}) – use logscale for correlation values (symlog is used to display symmetrically negative values)
- corr_guides (list of float) – values where to plot shaded grey horizontal lines
- show_exp_decay (float (default None)) – when given, plot an exponential decay exp(-rate * t) with this value as the rate
- save (bool {False, True}) – whether to save plot at canonical path
- use_obs_name (str (default None)) – when filled, the plot title will use this observable name instead of looking for the observable registered name
- ext (str {'.png', '.pdf'}) – extension used for file
Returns: fig
Return type: Figure instance
-
tunacell.plotting.dynamics.
plot_twopoints
(univariate, condition_label=None, trefs=[], ntrefs=4, axe_xsize=6.0, axe_ysize=2.0, time_range=(None, None), time_fractional_pad=0.1, counts_range=(None, None), counts_fractional_pad=0.1, corr_range=(None, None), corr_fractional_pad=0.1, delta_t_max=None, show_exp_decay=None, show_legend=True, show_cdt_details_in_legend=False, use_obs_name=None, save=False, ext='.png', verbose=False)¶ Plot two-point functions: counts and autocorrelation functions.
These plots can show only one condition in addition to ‘master’, and are plotted for a set of reference times.
Parameters: - univariate (
Univariate
instance) – - condition_label (str (default None)) – must be the repr of a given FilterSet
- trefs (list of floats) – indicate the times that you would like to have as references; if left empty, reference times will be computed automatically
- ntrefs (int) – if trefs is empty, number of times of reference to display
- axe_xsize (float (default 6)) – size of the x-axis (inches)
- axe_ysize (float (default 2.)) – size of a single ax’s y-axis (inches)
- time_range (couple of floats (default (None, None))) – specifies (left, right) bounds
- time_fractional_pad (float (default .1)) – fraction of x-range to add as padding
- counts_range (couple of floats (default (None, None))) – specifies range for the Counts y-axis
- counts_fractional_pad (float (default .1)) – fractional amount of y-range to add as padding
- corr_range (couple of floats (default (None, None))) – specifies range for the Correlation y-axis
- corr_fractional_pad (float (default .1)) – fractional amount of y-range to add as padding
- delta_t_max (float (default None)) – when given, the bottom plot will use this max range symmetrically; otherwise it will use the largest intervals found in data (often too large to see anything)
- show_exp_decay (float (default None)) – when a floating point number is passed, a light exponential decay curve is plotted for each tref
- show_legend (bool {True, False}) – print out legend
- show_cdt_details_in_legend (bool {False, True}) – show details about filters
- use_obs_name (str (default None)) – when filled, the plot title will use this observable name instead of looking for the observable registered name
- save (bool {False, True}) – whether to save figure at canonical path
- ext (str {'.png', '.pdf'}) – extension to be used when saving figure
- verbose (bool {False, True}) –
- univariate (