Welcome to the recipy documentation!¶
Installation¶
The easiest way to install is by simply running:
pip install recipy
Alternatively, you can clone this repository and run:
python setup.py install
If you want to install the dependencies manually (they should be installed automatically if you’re following the instructions above) then run:
pip install -r requirements.txt
You can upgrade from a previous release by running:
pip install -U recipy
To find out what has changed since the last release, see the changelog
Note: Previous (unreleased) versions of recipy required MongoDB to be installed and set up manually. This is no longer required, as a pure Python database (TinyDB) is used instead. Also, the GUI is now integrated fully into recipy and does not require installing separately.
User Manual¶
With the addition of a single line of code to the top of a Python script, recipy logs each run of your code to a database, keeping track of the input files, output files and the version of your code. It then lets you query this database to help you to recall the exact steps you took to create a certain output file.
Logging Provenance Information¶
To log provenance information, simply add the following line to the top of your code:
import recipy
Note that this must be the very top line of your script, before you import anything else.
Then just run your script as usual, and all of the data will be logged into the
TinyDB database (don’t worry, the database is automatically created if needed).
You can then use the recipy
command to quickly query the database to find out
what run of your code produced what output file. So, for example, if you run some
code like this:
import recipy
import numpy
arr = numpy.arange(10)
arr = arr + 500
numpy.save('test.npy', arr)
(Note the addition of import recipy
at the beginning of script - but there
are no other changes from a standard script.)
Alternatively, run an unmodified script with python -m recipy SCRIPT [ARGS ...]
to enable recipy logging. This invokes recipy’s module entry point, which takes
care of import recipy for you, before running your script.
Retrieving Information about Runs¶
it will produce an output called test.npy
. To find out the details of the
run which created this file you can search using
recipy search test.npy
and it will display information like the following:
Created by robin on 2015-05-25 19:00:15.631000
Ran /Users/robin/code/recipy/example_script.py using /usr/local/opt/python/bin/python2.7
Git: commit 91a245e5ea82f33ae58380629b6586883cca3ac4, in repo /Users/robin/code/recipy, with origin git@github.com:recipy/recipy.git
Environment: Darwin-14.3.0-x86_64-i386-64bit, python 2.7.9 (default, Feb 10 2015, 03:28:08)
Inputs:
Outputs:
/Users/robin/code/recipy/test.npy
An alternative way to view this is to use the GUI. Just run recipy gui
and
a browser window will open with an interface that you can use to search all of
your recipy ‘runs’:

Logging Files Using Built-In Open¶
If you want to log inputs and outputs of files read or written with built-in
open, you need to do a little more work. Either use recipy.open
(only requires import recipy
at the top of your script), or add
from recipy import open
and just use open
.
This workaround is required, because many libraries use built-in open internally, and you only want to record the files you explicitly opened yourself.
If you use Python 2, you can pass an encoding
parameter to recipy.open
.
In this case codecs
is used to open the file with proper encoding.
Annotating Runs¶
Once you’ve got some runs in your database, you can ‘annotate’ these runs with any notes that you want to keep about them. This can be particularly useful for recording which runs worked well, or particular problems you ran into. This can be done from the ‘details’ page in the GUI, or by running
recipy annotate [run-id]
which will open an editor to allow you to write notes that will be attached to the run. These will then be viewable via the command-line and the GUI when searching for runs.
Saving Custom Values¶
In your script, you can also add custom key-value pairs to the run:
recipy.log_values(key='value')
recipy.log_values({'key': 'value'})
Please note that, at the moment, these values are not displayed in the CLI or in the GUI.
Command Line Interface¶
There are other features in the command-line interface too: recipy --help
to see the other options. You can view diffs, see all runs that created a file
with a given name, search based on ids, show the latest entry and more:
recipy - a frictionless provenance tool for Python
Usage:
recipy search [options] <outputfile>
recipy latest [options]
recipy gui [options]
recipy annotate [<idvalue>]
recipy pm [--format <rst|plain>]
recipy (-h | --help)
recipy --version
Options:
-h --help Show this screen
--version Show version
-p --filepath Search based on filepath rather than hash
-f --fuzzy Use fuzzy searching on filename
-r --regex Use regex searching on filename
-i --id Search based on (a fragment of) the run ID
-a --all Show all results (otherwise just latest result given)
-v --verbose Be verbose
-d --diff Show diff
-j --json Show output as JSON
--no-browser Do not open browser window
--debug Turn on debugging mode
Configuration¶
By default, recipy stores all of its configuration and the database itself in
~/.recipy
. Recipy’s main configuration file is inside this folder, called
recipyrc
. The configuration file format is very simple, and is based on
Windows INI files - and having a configuration file is completely optional:
the defaults will work fine with no configuration file.
An example configuration is:
[ignored metadata]
diff
[general]
debug
This simply instructs recipy not to save git diff
information when it
records metadata about a run, and also to print debug messages (which can be
really useful if you’re trying to work out why certain functions aren’t
patched). At the moment, the only possible options are:
[general]
debug
- print debug messageseditor = vi
- Configure the default text editor that will be used when recipy needs you to type in a message. Use notepad if on Windows, for examplequiet
- don’t print any messagesport
- specify port to use for the GUI
[data]
file_diff_outputs
- store diff between the old output and new output file, if the output file exists before the script is executed
[database]
path = /path/to/file.json
- set the path to the database file
[ignored metadata]
diff
- don’t store the output ofgit diff
in the metadata for a recipy rungit
- don’t store anything relating to git (origin, commit, repo etc) in the metadata for a recipy runinput_hashes
- don’t compute and store SHA-1 hashes of input filesoutput_hashes
- don’t compute and store SHA-1 hashes of output files
[ignored inputs]
- List any module here (eg.
numpy
) to instruct recipy not to record inputs from this module, orall
to ignore inputs from all modules
- List any module here (eg.
[ignored outputs]
- List any module here (eg.
numpy
) to instruct recipy not to record outputs from this module, orall
to ignore outputs from all modules
- List any module here (eg.
By default all metadata is stored (ie. no metadata is ignored) and debug messages
are not shown. A .recipyrc
file in the current directory takes precedence over
the ~/.recipy/recipyrc
file, allowing per-project configurations to be easily
handled.
Note: No default configuration file is provided with recipy, so if you wish to configure anything you will need to create a properly-formatted file yourself.
Patched Modules¶
This table lists the modules recipy has patches for, and the input and output functions that are patched.
Are you missing a patch for your favourite Python package? Learn how to create a new patch. We are looking forward to your pull request!
modulename | input_functions | output_functions |
---|---|---|
pandas |
read_csv ,
read_table ,
read_excel ,
read_hdf ,
read_pickle ,
read_stata ,
read_msgpack |
DataFrame.to_csv ,
DataFrame.to_excel ,
DataFrame.to_hdf ,
DataFrame.to_msgpack ,
DataFrame.to_stata ,
DataFrame.to_pickle ,
Panel.to_excel ,
Panel.to_hdf ,
Panel.to_msgpack ,
Panel.to_pickle ,
Series.to_csv ,
Series.to_hdf ,
Series.to_msgpack ,
Series.to_pickle |
matplotlib.pyplot |
savefig |
|
numpy |
genfromtxt ,
loadtxt ,
fromfile |
save ,
savez ,
savez_compressed ,
savetxt |
lxml.etree |
parse ,
iterparse |
|
bs4 |
BeautifulSoup |
|
gdal |
Open |
Driver.Create ,
Driver.CreateCopy |
sklearn |
datasets.load_svmlight_file |
datasets.dump_svmlight_file |
nibabel |
nifti1.Nifti1Image.from_filename ,
nifti2.Nifti2Image.from_filename ,
freesurfer.mghformat.MGHImage.from_filename ,
spm99analyze.Spm99AnalyzeImage.from_filename ,
minc1.Minc1Image.from_filename ,
minc2.Minc2Image.from_filename ,
analyze.AnalyzeImage.from_filename ,
parrec.PARRECImage.from_filename ,
spm2analyze.Spm2AnalyzeImage.from_filename |
nifti1.Nifti1Image.to_filename ,
nifti2.Nifti2Image.to_filename ,
freesurfer.mghformat.MGHImage.to_filename ,
spm99analyze.Spm99AnalyzeImage.to_filename ,
minc1.Minc1Image.to_filename ,
minc2.Minc2Image.to_filename ,
analyze.AnalyzeImage.to_filename ,
parrec.PARRECImage.to_filename ,
spm2analyze.Spm2AnalyzeImage.to_filename |
tifffile |
imread |
imsave |
imageio |
core.functions.get_reader ,
core.functions.read |
core.functions.get_writer |
netCDF4 |
Dataset |
Dataset |
xarray |
open_dataset ,
open_mfdataset ,
open_rasterio ,
open_dataarray |
Dataset.to_netcdf ,
DataArray.to_netcdf |
iris |
iris.load ,
iris.load_cube ,
iris.load_cubes ,
iris.load_raw |
iris.save |
To generate this table do:
recipy pm --format rst
For Developers¶
How does it work?¶
For each run of your script, recipy logs information about which script was run, the command line arguments, Python version, git or svn repo, warnings, errors, and inputs and outputs (see Database Schema for a complete overview). Gathering most of this information is straightforward using the Python Standard Library and specialized packages (such as, GitPython or svn). Automatically logging inputs and outputs, recipy’s most important feature, is more complicated.
When a Python module that reads or writes files is imported, recipy wraps
methods for reading and writing files to log file paths to the database.
To make this happen, recipy contains a patch for every library that reads or
writes files. When you import recipy, the patches are added to
sys.meta_path
so they can be used to wrap a module’s functions that read
or write files when it is imported. This is why import recipy should be called
before importing other modules.
Currently, recipy contains patches for many popular (scientific) libraries,
including numpy
and pandas
. For an overview see Patched Modules.
Is your favourite module missing? Have a look at Creating Patches.
We are looking forward to your pull request!
Creating Patches¶
Patches are derived from PatchImporter
, often
using the easier interface provided by PatchSimple
.
To create a patch, you need to specify the module name, the input and output
functions, and the argument referencing the file path.
Simple patches¶
Because most of the complexity is hidden away, the actual code to wrap a module
is fairly simple. Using PatchSimple
, the patch
for numpy looks like:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 | from .PatchSimple import PatchSimple
from .log import log_input, log_output, add_module_to_db
from recipyCommon.utils import create_wrapper
# Inherit from PatchSimple
class PatchNumpy(PatchSimple):
# Specify the full name of the module
modulename = 'numpy'
# List functions that are involved in input/output
# these can be anything that can go after "modulename."
# so they could be something like "pyplot.savefig" for example
input_functions = ['genfromtxt', 'loadtxt', 'load', 'fromfile']
output_functions = ['save', 'savez', 'savez_compressed', 'savetxt']
# Define the functions that will be used to wrap the input/output
# functions.
# In this case we are calling the log_input function to log it to
# the DB and we are giving it the 0th argument from the function
# (because all of the functions above take the filename as the
# 0th argument), and telling it that it came from numpy.
input_wrapper = create_wrapper(log_input, 0, 'numpy')
output_wrapper = create_wrapper(log_output, 0, 'numpy')
# Add the module to the database, so we can list it.
add_module_to_db(modulename, input_functions, output_functions)
|
First, we import the required functionality (lines 1-3). In addition to
PatchSimple
, we need functions that do the actual
logging (log_input()
and log_output()
),
write details about the module to the database
(add_module_to_db()
), and a function to create wrappers
(create_wrapper()
). Next, we define a class
for the patch that inherits from PatchSimple
(line
6). In this class, we need to define the module name (line 8), and the input and
output functions (line 13 and 14).
Because the patches rely on class attributes, it is important to use the
variable names specified in the example code.
Then, we need to define the input and output
wrapper (line 22 and 23). The wrappers are created using a predefined function,
that needs as inputs a logging function (i.e., log_input()
or log_output()
), the index of the argument that specifies
the file path in the input or output function, and the name of the module.
Finally, the module is added to the database (line 26).
Enabling a patch¶
Patch objects for specific modules can be found in
PatchBaseScientific
and PatchScientific
. To enable
a new patch, one more step is required; the constructor of the new object should
be called inside multiple_insert()
. This function is
called at the bottom of PatchBaseScientific
and
PatchScientific
.
Patching more complex modules¶
Most of the patched modules are based on PatchSimple
.
However, some modules require more complexity. Some modules, e.g.,
netcdf4-python, use a file open like
method for reading and writing files; the method called is the same, and whether
the file is read or written depends on arguments:
1 2 3 4 | from netCDF4 import Dataset
ds = Dataset('test.nc', 'r') # read test.nc
ds = Dataset('test.nc', 'w') # write test.nc
|
For this situation a different patch type and wrapper creation function are
available: PatchFileOpenLike
in combination with
create_argument_wrapper()
.
The complete code example (PatchNetCDF4
):
1 2 3 4 5 6 7 8 9 10 11 12 13 | from .PatchFileOpenLike import PatchFileOpenLike
from .log import log_input, log_output, add_module_to_db
from recipyCommon.utils import create_argument_wrapper
class PatchNetCDF4(PatchFileOpenLike):
modulename = 'netCDF4'
functions = ['Dataset']
wrapper = create_argument_wrapper(log_input, log_output, 0, 'mode', 'ra',
'aw', 'r', 'netCDF4')
add_module_to_db(modulename, functions, functions)
|
Another instance where it isn’t possible to use
PatchSimple
is when not all input or output
functions have the file path in the same position. For example,
xarray
has three functions for writing files, i.e., to_netcdf()
,
to_netcdf()
, and save_mfdataset()
.
For to_netcdf()
and to_netcdf()
,
the file paths are argument 0, as in the previous examples. However, the
argument for the file paths of save_mfdataset()
(it is a method
for writing multiple files at once) is 1.
With PatchSimple
there is no way to represent this.
A patch class that allows specifying separate wrappers for different functions
is PatchMultipleWrappers
. Using this
class involves defining a WrapperList
to
which inputs and outputs can be added. As can be seen in the following code
example (PatchXarray
), you can specify a wrapper for a list
of functions (line 13 and 14) or for one function (line 15).
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 | from .PatchMultipleWrappers import PatchMultipleWrappers, WrapperList
from .log import log_input, log_output, add_module_to_db
class PatchXarray(PatchMultipleWrappers):
modulename = 'xarray'
wrappers = WrapperList()
input_functions = ['open_dataset', 'open_mfdataset', 'open_rasterio',
'open_dataarray']
output_functions = ['Dataset.to_netcdf', 'DataArray.to_netcdf']
wrappers.add_inputs(input_functions, log_input, 0, modulename)
wrappers.add_outputs(output_functions, log_output, 0, modulename)
wrappers.add_outputs('save_mfdataset', log_output, 1, modulename)
add_module_to_db(modulename, input_functions, output_functions)
|
Writing tests¶
If you make a new patch, please include tests (your pull request won’t be accepted without them)! Recipy has a testing framework that checks whether inputs and outputs are actually logged when a function is called (see Test Framework for more details).
If you create a patch for a module, follow these steps to create tests:
Prepare small data files for testing, and create a directory
integration_test/packages/data/<module name>
containing these files.- Small means kilobytes!
Create a test script for the patch. The name of the script should be
run_<module name>.py
.- It is probably easiest to copy one of the
existing scripts in
integration_test/packages/
, so you can reuse the set up (it is pretty self-explanatory).
- It is probably easiest to copy one of the
existing scripts in
Write a test method for each input/output method.
- Be sure to add docstrings!
Add the test configuration.
- Open
integration_test/config/test_packages.yml
- Add a new section by typing:
--- script: run_<module name>.py libraries: [ modulename ] test_cases:
- Add a test case for each method in
run_<module name>.py
, specifyingarguments
, a list containing the name of the test method,inputs
, a list of the input files this method needs (if any), andoutputs
, a list of the output files this method creates (if any):
- arguments: [ <test method name> ] inputs: [<input names>] outputs: [<output names>]
- Specified input files should exist in
integration_test/packages/data/<module name>
- Test cases can skipped by adding the line:
skip: "<reason for skipping>"
- If you need to skip a test, please add a description of the problem plus how to reproduce it in the Recipy and Third-Party Package Issues section
- If a test case should be skipped in certain Python versions, add
skip_py_version: [ <python versions> ]
- Open
Run the tests by typing:
py.test -v integration_test/
Database Schema¶
Field | Description | Example |
---|---|---|
unique_id |
ff129fee-c0d9-47cc-b0a9-997a530230a8 | |
author |
Username of the one running the script | |
description |
Currently not used? | |
inputs |
List of input files (can be empty) | |
outputs |
List of output files (can be empty) | |
script |
Path to the script that was run | |
command |
Path to the python executable that was used to run the script | /usr/bin/python2.7 |
environment |
Information about the environment in which the script was run (including Python version) | Linux-4.15.0-29-generic-x86_64-with-debian-stretch-sid python 2.7.15 -Anaconda, Inc.- (default, May 1 2018, 23:32:55) |
date |
Date and time the script was run (in UTC) | |
command_args |
Command line arguments of the script (can be empty) | |
warnings |
Warnings issued during execution of the script (if any) | |
exception |
Exception raised during execution of the script (if any) | |
libraries |
Names and versions of patched libraries used in this script | |
notes |
Notes added by the user by running recipy annotate or in the gui |
|
custom_values |
Values logged explicitly in the script by calling recipy.log_values({'name': 'value'}) |
|
gitrepo /svnrepo |
||
gitcommit /svncommit |
Current git/svn commit | |
gitorigin |
||
diff |
Test Framework¶
recipy’s test framework is in integration_test
. The test framework
has been designed to run under both Python 2.7+ and Python 3+.
Running Tests with py.test¶
The tests use the py.test test framework, and are
run using its py.test
command. Useful py.test
flags and
command-line options include:
-v
: increase verbosity. This shows the name each test function that is run.-s
: show any output to standard output.-rs
: show extra test summary information for tests that were skipped.--junit-xml=report.xml
: create a JUnit-style test report in the filereport.xml
.
Running General Tests¶
To run tests of recipy’s command-line functions, run:
py.test -v integration_test/test_recipy.py
To run tests of recipy’s .recipyrc
configuration file, run:
py.test -v integration_test/test_recipyrc.py
To run tests of recipy invocations using python -m recipy script.py
,
run:
py.test -v integration_test/test_m_flag.py
Running Package-Specific Tests¶
To run tests that check recipy logs information about packages it has been configured to log, run:
py.test -v -rs integration_test/test_packages.py
Note: this assumes that all the packages have been installed.
To run a single test, provide the name of the test, for example:
$ py.test -v -rs integration_test/test_packages.py::test_scripts\[run_numpy_py_loadtxt\]
Note: [
and ]
need to be prefixed by \
.
Package-specific tests use a test configuration file located in
integration_test/config/test_packages.yml
.
You can specify a different test configuration file using a
RECIPY_TEST_CONFIG
environment variable. For example:
RECIPY_TEST_CONFIG=test_my_package.yml py.test -v -rs \
integration_test/test_packages.py
Note: the above command should be typed on one line, omitting \
.
For Windows, run:
set RECIPY_TEST_CONFIG=test_my_package.yml
py.test -v -rs integration_test\test_packages.py
Test Configuration File¶
The test configuration file is written in YAML (YAML Ain’t Markup Language). YAML syntax is:
---
indicates the start of a document.:
denotes a dictionary.:
must be followed by a space.-
denotes a list.
The test configuration file has format:
---
script: SCRIPT
standalone: True | False
libraries: [LIBRARY, LIBRARY, ... ]
test_cases:
- libraries: [LIBRARY, LIBRARY, ... ]
arguments: [..., ..., ...]
inputs: [INPUT, INPUT, ...]
outputs: [OUTPUT, OUTPUT, ...]
- libraries: [LIBRARY, LIBRARY, ... ]
arguments: [..., ..., ...]
inputs: [INPUT, INPUT, ...]
outputs: [OUTPUT, OUTPUT, ...]
skip: "Known issue with recipy"
skip_py_version: [3.4, ...]
- etc
---
script: SCRIPT
etc
Each script to be tested is defined by:
SCRIPT
: Python script, with a relative or absolute path. For recipy sample scripts (see below), the script is assumed to be in a sub-directoryintegration_test/packages
.standalone
: is the script a standalone script? IfFalse
, or if omitted, then the script is assumed to be a recipy sample script (see below).libraries
: A list of zero or more Python libraries used by the script, which are expected to be logged by recipy when the script is run regardless of arguments (i.e. any libraries common to all test cases). If none, then this can be omitted.
Each script also has one or more test cases, each of which defines:
libraries
: A list of zero or more Python libraries used by the script, which are expected to be logged by recipy when the script is run with the given arguments for this test case. If none, then this can be omitted.arguments
: A list of arguments to be passed to the script. If none, then this can be omitted.inputs
: A list of zero or more input files which the script will read, and which are expected to be logged by recipy when running the script with the arguments. If none, then this can be omitted.outputs
: A list of zero or more output files which the script will write, and which are expected to be logged by recipy when running the script with the arguments. If none, then this can be omitted.skip
: An optional value. If present this test case is marked as skipped. The value is the reason for skipping the test case.skip_py_version
\ : An optional value. If present this test case is marked- as skipped if the current Python version is in the list of values. Should be used when a patched library does not support a Python version that is supported by recipy.
For example:
---
script: run_numpy.py
libraries: [numpy]
test_cases:
- arguments: [loadtxt]
inputs: [input.csv]
- arguments: [savetxt]
outputs: [output.csv]
- arguments: [load_and_save_txt]
inputs: [input.csv]
outputs: [output.csv]
---
script: "/home/users/user/run_my_script.py"
standalone: True
test_cases:
- arguments: [ ]
libraries: [ numpy ]
outputs: [ data.csv ]
---
script: run_nibabel.py
libraries: [ nibabel ]
test_cases:
- arguments: [ analyze_from_filename ]
inputs: [ analyze_image ]
- arguments: [ analyze_to_filename ]
outputs: [ out_analyze_image ]
- arguments: [ minc1_from_filename ]
inputs: [ minc1_image ]
- arguments: [ minc1_to_filename ]
outputs: [ out_minc1_image ]
skip: "nibabel.minc1.Minc1Image.to_filename raises NotImplementedError"
There may be a number of entries for a single script, if desired. For example:
---
script: run_numpy.py
libraries: [numpy]
test_cases:
- arguments: [loadtxt]
inputs: [input.csv]
- arguments: [savetxt]
outputs: [output.csv]
---
script: run_numpy.py
libraries: [numpy]
test_cases:
- arguments: [load_and_save_txt]
inputs: [input.csv]
outputs: [output.csv]
It is up to you to ensure the library
, input
and output
file
names record the libraries, input and output files used by the
associated script, and which you expect to be logged by recipy.
Comments can be added to configuration files, prefixed by #
, for
example:
# This is a comment
Issues¶
The sample scripts in integration_tests/packages
may fail to run
with older versions of third-party packages. Known package versions
that can cause failures are listed in Package versioning
problems.
Certain third-party packages gave rise to issues, when attempting to configure the test framework for these. The packages and issues, and how the test framework has been configured to currently skip these are described in recipy and third-party package issues.
How the Test Framework uses Test Configuration Files¶
A test configuration file is used to auto-generate test functions for each test case using py.test’s support for parameterization.
In the first example above, 8 test functions are created, 3 for
run_numpy.py
and 1 for run_my_scripy.py
and 4 for run_nibabel.py
(of which 1 is marked to be skipped. In the second example, 3 test
functions are created, 2 for the first group of run_numpy.py
test
cases and 1 for the second group.
Test function names are auto-generated according to the following template:
test_scripts[SCRIPT_ARGUMENTS]
where SCRIPT
is the script
value and arguments the argument
values, concatenated using underscores (_
) and with all forward
slashes, backslashes, colons, semi-colons and spaces also replaced by
_
. For example, test_scripts[run_nibabel_py_analyze_from_filename]
.
The test framework runs the script with its arguments as follows. For recipy sample scripts:
python -m integration_test.package.SCRIPT ARGUMENTS
For scripts marked standalone: True
:
python SCRIPT ARGUMENTS
Once the script has run, the test framework carries out the following checks on the recipy database:
- There is only one new run in the database i.e. number of logs has increased by 1.
script
refers to the same file as thescript
.command_args
matches the test case’sarguments
.libraries
matches all the test case’slibraries
and all thelibraries
common to all test case’s for a script, and the recorded versions match the versions used whenscript
was run.inputs
match the test case’sinputs
(in terms of local file names).outputs
match test case’soutputs
(in terms of local file names).date
is a valid date.exit_date
is a valid date and is <=date
.command
holds the current Python interpreter.environment
holds the operating system and version of the current Python interpreter.author
holds the current user.description
is empty.
Recipy Sample Scripts¶
integration_test/packages
has a collection of package-specific
scripts. Each script corresponds to one package logged by recipy. Each
script has a function that invokes each of the input/output functions
of a specific package logged by recipy. For example, run_numpy.py
has functions that invoke:
numpy.loadtxt
numpy.savetxt
numpy.fromfile
numpy.save
numpy.savez
numpy.savez_compressed
numpy.genfromtxt
Each function is expected to invoke input and/or output functions using one or more functions which recipy can log.
Input and output files are the responsibility of each script itself. It can either create its own input and output files, or read these in from somewhere (but it is not expected that the caller (i.e. the test framework) create these.
Each test class has access to its own directory, via a
self.current_dir
field. It can use this to access any files it
needs within the current directory or, by convention, within a
sub-directory of data
(for example run_numpy.py
assumes a
data/numpy
sub-directory).
These scripts consist of classes that inherit from
integration_test.base.Base
which provides sub-classes with a simple
command-line interface which takes a function name as argument and, if
that function is provided by the script’s class (and takes no
arguments beyond self
), invokes that function. For example:
python SCRIPT.py FUNCTION
A sample script can be run as follows:
python -m integration_test.packages.SCRIPT FUNCTION
For example:
python -m integration_test.packages.run_numpy loadtxt
test_packages.py
assumes that if it is given a relative path to a
script, then that script is in integration_test/packages
and will
create this form of invocation.
Running scripts as modules
Note that the script needs to be specified as a module that is run as
a script (the -m
flag). Running it directly as a script e.g.
$ python integration_test/packages/run_numpy.py loadtxt
will fail:
Traceback (most recent call last):
File "integration_test/packages/run_numpy.py", line 17, in <module>
from integration_test.packages.base import Base
ImportError: No module named 'integration_test.packages'
For the technical detail of why this is so, please see Execution of Python code with -m option or not.
Providing a Test Configuration for Any Script¶
A recipy test configuration can be written for any script that uses
recipy. For example, to test a script that uses numpy.loadtxt
you
could write a configuration file which specifies:
- Full path to your script.
- Command-line arguments to be passed to your script.
- Libraries you expect to be logged by recipy.
- Local names of input files you expect to be logged by recipy.
- Local names of output files you expect to be logged by recipy.
For example, my_tests.yml
:
---
script: "/home/ubuntu/sample/run_numpy.py"
standalone: True
test_cases:
- arguments: [ "/home/ubuntu/data/data.csv",
"/home/ubuntu/data/out.csv" ]
libraries: [ numpy ]
inputs: [ data.csv ]
outputs: [ out.csv ]
You can run this as follows:
RECIPY_TEST_CONFIG=my_tests.yml py.test -v -rs \
integration_test/test_packages.py
The output might look like
============================= test session starts ==============================
platform linux2 -- Python 2.7.12, pytest-2.9.2, py-1.4.31, pluggy-0.3.1 -- /home/ubuntu/anaconda2/bin/python
cachedir: .cache
rootdir: /home/ubuntu/recipy, inifile:
collected 1 items
integration_test/test_packages.py::test_scripts[run_numpy_py__home_ubuntu_data_data_csv__home_ubuntu_data_out_csv] PASSED
=========================== 1 passed in 4.39 seconds ===========================
If using Anaconda and Git Bash on Windows, the file might look like:
---
script: "c:/Users/mjj/Local\ Documents/sample/run_numpy.py"
standalone: True
test_cases:
- arguments: [ "c:/Users/mjj/Local\ Documents/data/data.csv",
"c:/Users/mjj/Local\ Documents/data/out.csv" ]
libraries: [ numpy ]
inputs: [ data.csv ]
outputs: [ out.csv ]
Note the escaped spaces in the path.
Test Framework Limitations¶
The test framework does not support filtering tests depending upon which versions of packages are being tested e.g. specific versions of numpy or matplotlib. The test framework is designed to run tests within a single execution environment: a Python interpreter and a set of installed libraries.
If wishing to test different versions of packages then this could be done by:
- Writing a Python script that invokes input/output functions of that package.
- Writing a test configuration file that just runs that script.
- Setting up a test environment (e.g. as part of a Travis CI or
AppVeyor configuration file) that installs the specific package and
runs
py.test integration_test/test_packages.py
using the test configuration file.
test_recipy.py
does not validate whether multiple test results are
returned by recipy search -i
.
Recipy Dependencies¶
The test framework has no dependencies on any other part of the recipy repository: it uses recipy as if it were a stand-alone tool and queries the recipy database directly.
Known Problems¶
Recipy and Third-Party Package Issues¶
Issues encountered during test framework development and how the test framework has been configured to get round these issues.
No information logged by recipy (recipy code is commented-out)¶
All these tests are marked as skipped.
PIL
operations¶
To replicate:
python -m integration_test.packages.run_pil image_open
recipy latest
Run ID: cf695a6c-e9f0-4f4b-896f-9fa64127e2e8
Created by ubuntu on 2016-10-26 11:59:04 UTC
Ran /home/ubuntu/recipy/integration_test/packages/run_pil.py using /home/ubuntu/anaconda2/bin/python
Using command-line arguments: image_open
Git: commit 5dc58a3cf3432441c83fd9899768eb9b63583208, in repo /home/ubuntu/recipy, with origin https://mikej888@github.com/mikej888/recipy
Environment: Linux-3.19.0-25-generic-x86_64-with-debian-jessie-sid, python 2.7.12 |Anaconda custom (64-bit)| (default, Jul 2 2016, 17:42:40)
Libraries: recipy v0.3.0
Inputs: none
Outputs: none
skimage
operations¶
To replicate:
python -m integration_test.packages.run_skimage io_imread
recipy latest
Run ID: 10c803f7-3741-4008-8548-2f8d7ba4462c
Created by ubuntu on 2016-10-26 11:57:07 UTC
Ran /home/ubuntu/recipy/integration_test/packages/run_skimage.py using /home/ubuntu/anaconda2/bin/python
Using command-line arguments: io_imread
Git: commit 5dc58a3cf3432441c83fd9899768eb9b63583208, in repo /home/ubuntu/recipy, with origin https://mikej888@github.com/mikej888/recipy
Environment: Linux-3.19.0-25-generic-x86_64-with-debian-jessie-sid, python 2.7.12 |Anaconda custom (64-bit)| (default, Jul 2 2016, 17:42:40)
Libraries: recipy v0.3.0
Inputs: none
Outputs: none
Inaccurate Information Logged by Recipy¶
None.
Bugs Arising within Recipy Logging¶
All these tests are marked as skipped.
recipy.open
¶
To replicate:
python -m integration_test.packages.run_python open
Under Python 3 this fails with:
recipy run inserted, with ID 9abe4113-5158-4dcb-8fe8-afd1b1f6505e
Traceback (most recent call last):
File "c:\Users\mjj\AppData\Local\Continuum\Anaconda3\lib\runpy.py", line 170,in _run_module_as_main
"__main__", mod_spec)
File "c:\Users\mjj\AppData\Local\Continuum\Anaconda3\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "c:\Users\mjj\Local Documents\recipy\recipy\integration_test\packages\run_python.py", line 45, in <module>
PythonSample().invoke(sys.argv)
File "c:\Users\mjj\Local Documents\recipy\recipy\integration_test\packages\base.py", line 57, in invoke
function()
File "c:\Users\mjj\Local Documents\recipy\recipy\integration_test\packages\run_python.py", line 39, in open
with recipy.open('out.txt', 'w') as f:
File "c:\Users\mjj\Local Documents\recipy\recipy\recipy\utils.py", line 20, in open
mode = kwargs['mode']
KeyError: 'mode'
Under Python 2 this fails with:
recipy run inserted, with ID 5d80b88b-0d56-428d-b9e0-d95eca423044
Traceback (most recent call last):
File "/home/ubuntu/anaconda2/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/home/ubuntu/anaconda2/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/home/ubuntu/recipy/integration_test/packages/run_python.py", line 42, in <module>
python_sample.invoke(sys.argv)
File "integration_test/packages/base.py", line 57, in invoke
function()
File "/home/ubuntu/recipy/integration_test/packages/run_python.py", line 35, in open
with recipy.open('out.txt', 'w') as f:
File "recipy/utils.py", line 35, in open
log_output(args[0], 'recipy.open')
File "recipy/log.py", line 153, in log_output
db.update(append("libraries", get_version(source), no_duplicates=True), eids=[RUN_ID])
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/tinydb/database.py", line 377, in update
cond, eids
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/tinydb/database.py", line 230, in process_elements
data = self._read()
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/tinydb/database.py", line 277, in _read
return self._storage.read()
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/tinydb/database.py", line 31, in read
raw_data = (self._storage.read() or {})[self._table_name]
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/tinydb_serialization/__init__.py", line 139, in read
data = self.storage.read()
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/tinydb/storages.py", line 93, in read
self._handle.seek(0, 2)
ValueError: I/O operation on closed file
bs4.beautifulsoup.prettify
¶
To replicate:
python -m integration_test.packages.run_bs4 beautifulsoup
Traceback (most recent call last):
File "/home/ubuntu/anaconda2/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/home/ubuntu/anaconda2/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/home/ubuntu/recipy/integration_test/packages/run_bs4.py", line 53, in <module>
Bs4Sample().invoke(sys.argv)
File "integration_test/packages/base.py", line 57, in invoke
function()
File "/home/ubuntu/recipy/integration_test/packages/run_bs4.py", line 49, in beautifulsoup
print((soup.prettify()))
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/bs4/element.py", line 1160, in prettify
return self.decode(True, formatter=formatter)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/bs4/__init__.py", line 439, in decode
return prefix + super(BeautifulSoup, self).decode(
TypeError: super() argument 1 must be type, not FunctionWrapper
pandas.Panel.to_excel
¶
To replicate:
python -m integration_test.packages.run_pandas panel_to_excel
Traceback (most recent call last):
File "/home/ubuntu/anaconda2/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/home/ubuntu/anaconda2/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/home/ubuntu/recipy/integration_test/packages/run_pandas.py", line 355, in <module>
PandasSample().invoke(sys.argv)
File "integration_test/packages/base.py", line 57, in invoke
function()
File "/home/ubuntu/recipy/integration_test/packages/run_pandas.py", line 195, in panel_to_excel
panel.to_excel(file_name)
File "recipyCommon/utils.py", line 91, in f
return wrapped(*args, **kwargs)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/pandas/core/panel.py", line 460, in to_excel
df.to_excel(writer, name, **kwargs)
File "recipyCommon/utils.py", line 90, in f
function(args[arg_loc], source)
File "recipy/log.py", line 139, in log_output
filename = os.path.abspath(filename)
File "/home/ubuntu/anaconda2/lib/python2.7/posixpath.py", line 360, in abspath
if not isabs(path):
File "/home/ubuntu/anaconda2/lib/python2.7/posixpath.py", line 54, in isabs
return s.startswith('/')
AttributeError: '_XlwtWriter' object has no attribute 'startswith'
nibabel.minc2.Minc2Image.from_filename
¶
To replicate:
python -m integration_test.packages.run_nibabel minc2_from_filename
Traceback (most recent call last):
File "/home/ubuntu/anaconda2/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/home/ubuntu/anaconda2/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/home/ubuntu/recipy/integration_test/packages/run_nibabel.py", line 302, in <module>
NibabelSample().invoke(sys.argv)
File "integration_test/packages/base.py", line 57, in invoke
function()
File "/home/ubuntu/recipy/integration_test/packages/run_nibabel.py", line 143, in minc2_from_filename
data = nib.minc2.Minc2Image.from_filename(file_name)
File "recipyCommon/utils.py", line 91, in f
return wrapped(*args, **kwargs)
File "recipyCommon/utils.py", line 91, in f
return wrapped(*args, **kwargs)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/nibabel/spatialimages.py", line 699, in from_filename
return klass.from_file_map(file_map)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/nibabel/minc1.py", line 299, in from_file_map
minc_file = Minc1File(netcdf_file(fobj))
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/nibabel/externals/netcdf.py", line 230, in __init__
self._read()
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/nibabel/externals/netcdf.py", line 513, in _read
self.filename)
TypeError: Error: None is not a valid NetCDF 3 file
nibabel.Nifti2Image.from_filename
¶
To replicate:
python -m integration_test.packages.run_nibabel nifti2_from_filename
sizeof_hdr should be 348; set sizeof_hdr to 348
data code 0 not supported; not attempting fix
Traceback (most recent call last):
File "/home/ubuntu/anaconda2/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/home/ubuntu/anaconda2/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/home/ubuntu/recipy/integration_test/packages/run_nibabel.py", line 302, in <module>
NibabelSample().invoke(sys.argv)
File "integration_test/packages/base.py", line 57, in invoke
function()
File "/home/ubuntu/recipy/integration_test/packages/run_nibabel.py", line 182, in nifti2_from_filename
data = nib.Nifti2Image.from_filename(file_name)
File "recipyCommon/utils.py", line 91, in f
return wrapped(*args, **kwargs)
File "recipyCommon/utils.py", line 91, in f
return wrapped(*args, **kwargs)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/nibabel/keywordonly.py", line 16, in wrapper
return func(*args, **kwargs)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/nibabel/analyze.py", line 986, in from_filename
return klass.from_file_map(file_map, mmap=mmap)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/nibabel/keywordonly.py", line 16, in wrapper
return func(*args, **kwargs)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/nibabel/analyze.py", line 947, in from_file_map
header = klass.header_class.from_fileobj(hdrf)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/nibabel/nifti1.py", line 594, in from_fileobj
hdr = klass(raw_str, endianness, check)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/nibabel/nifti1.py", line 577, in __init__
check)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/nibabel/analyze.py", line 252, in __init__
super(AnalyzeHeader, self).__init__(binaryblock, endianness, check)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/nibabel/wrapstruct.py", line 176, in __init__
self.check_fix()
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/nibabel/wrapstruct.py", line 361, in check_fix
report.log_raise(logger, error_level)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/nibabel/batteryrunners.py", line 275, in log_raise
raise self.error(self.problem_msg)
nibabel.spatialimages.HeaderDataError: data code 0 not supported
sklearn.load_svmlight_file
and sklearn.dump_svmlight_file
¶
To replicate:
python -m integration_test.packages.run_sklearn load_svmlight_file
Under Python 3 this fails with:
Traceback (most recent call last):
File "/home/ubuntu/anaconda3/lib/python3.5/runpy.py", line 184, in _run_module_as_main
"__main__", mod_spec)
File "/home/ubuntu/anaconda3/lib/python3.5/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/ubuntu/recipy/integration_test/packages/run_sklearn.py", line 16, in <module>
from sklearn import datasets
File "<frozen importlib._bootstrap>", line 969, in _find_and_load
File "<frozen importlib._bootstrap>", line 958, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 664, in _load_unlocked
File "<frozen importlib._bootstrap>", line 634, in _load_backward_compatible
File "/home/ubuntu/recipy/recipy/PatchImporter.py", line 52, in load_module
mod = self.patch(mod)
File "/home/ubuntu/recipy/recipy/PatchSimple.py", line 25, in patch
patch_function(mod, f, self.input_wrapper)
File "/home/ubuntu/recipy/recipyCommon/utils.py", line 82, in patch_function
setattr(mod, old_f_name, recursive_getattr(mod, function))
File "/home/ubuntu/recipy/recipyCommon/utils.py", line 54, in recursive_getattr
prev_part = getattr(prev_part, part)
AttributeError: module 'sklearn' has no attribute 'datasets'
Under Python 2 this fails with:
Traceback (most recent call last):
File "recipy/log.py", line 165, in log_exception
db.update({"exception": exception}, eids=[RUN_ID])
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/tinydb/database.py", line 382, in update
cond, eids
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/tinydb/database.py", line 235, in process_elements
func(data, eid)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/tinydb/database.py", line 381, in <lambda>
lambda data, eid: data[eid].update(fields),
KeyError: 316
Original exception was:
Traceback (most recent call last):
File "/home/ubuntu/anaconda2/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/home/ubuntu/anaconda2/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/home/ubuntu/recipy/integration_test/packages/run_sklearn.py", line 16, in <module>
from sklearn import datasets
File "recipy/PatchImporter.py", line 52, in load_module
mod = self.patch(mod)
File "recipy/PatchSimple.py", line 25, in patch
patch_function(mod, f, self.input_wrapper)
File "recipyCommon/utils.py", line 82, in patch_function
setattr(mod, old_f_name, recursive_getattr(mod, function))
File "recipyCommon/utils.py", line 54, in recursive_getattr
prev_part = getattr(prev_part, part)
AttributeError: 'module' object has no attribute 'datasets'
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
File "/home/ubuntu/anaconda2/lib/python2.7/atexit.py", line 24, in _run_exitfuncs
func(*targs, **kargs)
File "recipy/log.py", line 244, in hash_outputs
for filename in run.get('outputs')]
AttributeError: 'NoneType' object has no attribute 'get'
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
File "/home/ubuntu/anaconda2/lib/python2.7/atexit.py", line 24, in _run_exitfuncs
func(*targs, **kargs)
File "recipy/log.py", line 231, in log_exit
db.update({'exit_date': exit_date}, eids=[RUN_ID])
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/tinydb/database.py", line 382, in update
cond, eids
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/tinydb/database.py", line 235, in process_elements
func(data, eid)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/tinydb/database.py", line 381, in <lambda>
lambda data, eid: data[eid].update(fields),
KeyError: 316
Error in sys.exitfunc:
Error in sys.excepthook:
Traceback (most recent call last):
File "recipy/log.py", line 165, in log_exception
db.update({"exception": exception}, eids=[RUN_ID])
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/tinydb/database.py", line 382, in update
cond, eids
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/tinydb/database.py", line 235, in process_elements
func(data, eid)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/tinydb/database.py", line 381, in <lambda>
lambda data, eid: data[eid].update(fields),
KeyError: 316
Original exception was:
Traceback (most recent call last):
File "/home/ubuntu/anaconda2/lib/python2.7/atexit.py", line 24, in _run_exitfuncs
func(*targs, **kargs)
File "recipy/log.py", line 231, in log_exit
db.update({'exit_date': exit_date}, eids=[RUN_ID])
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/tinydb/database.py", line 382, in update
cond, eids
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/tinydb/database.py", line 235, in process_elements
func(data, eid)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/tinydb/database.py", line 381, in <lambda>
lambda data, eid: data[eid].update(fields),
KeyError: 316
Operations Not Implemented by Packages¶
All these tests are marked as skipped.
nibabel.minc1.Minc1Image.to_filename
¶
To replicate:
python -m integration_test.packages.run_nibabel minc1_to_filename
Traceback (most recent call last):
File "/home/ubuntu/anaconda2/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/home/ubuntu/anaconda2/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/home/ubuntu/recipy/integration_test/packages/run_nibabel.py", line 302, in <module>
NibabelSample().invoke(sys.argv)
File "integration_test/packages/base.py", line 57, in invoke
function()
File "/home/ubuntu/recipy/integration_test/packages/run_nibabel.py", line 134, in minc1_to_filename
img.to_filename(file_name)
File "recipyCommon/utils.py", line 91, in f
return wrapped(*args, **kwargs)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/nibabel/spatialimages.py", line 781, in to_filename
self.to_file_map()
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/nibabel/spatialimages.py", line 790, in to_file_map
raise NotImplementedError
NotImplementedError
nibabel.minc2.Minc2Image.to_filename
¶
To replicate:
python -m integration_test.packages.run_nibabel minc2_to_filename
Traceback (most recent call last):
File "/home/ubuntu/anaconda2/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/home/ubuntu/anaconda2/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/home/ubuntu/recipy/integration_test/packages/run_nibabel.py", line 302, in <module>
NibabelSample().invoke(sys.argv)
File "integration_test/packages/base.py", line 57, in invoke
function()
File "/home/ubuntu/recipy/integration_test/packages/run_nibabel.py", line 154, in minc2_to_filename
img.to_filename(file_name)
File "recipyCommon/utils.py", line 91, in f
return wrapped(*args, **kwargs)
File "recipyCommon/utils.py", line 91, in f
return wrapped(*args, **kwargs)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/nibabel/spatialimages.py", line 781, in to_filename
self.to_file_map()
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/nibabel/spatialimages.py", line 790, in to_file_map
raise NotImplementedError
NotImplementedError
nibabel.parrec.PARRECImage.to_filename
¶
To replicate:
python -m integration_test.packages.run_nibabel parrec_to_filename
Traceback (most recent call last):
File "/home/ubuntu/anaconda2/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/home/ubuntu/anaconda2/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/home/ubuntu/recipy/integration_test/packages/run_nibabel.py", line 302, in <module>
NibabelSample().invoke(sys.argv)
File "integration_test/packages/base.py", line 57, in invoke
function()
File "/home/ubuntu/recipy/integration_test/packages/run_nibabel.py", line 217, in parrec_to_filename
img.to_filename(par_file_name)
File "recipyCommon/utils.py", line 91, in f
return wrapped(*args, **kwargs)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/nibabel/spatialimages.py", line 781, in to_filename
self.to_file_map()
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/nibabel/spatialimages.py", line 790, in to_file_map
raise NotImplementedError
NotImplementedError
Using py.test and recipy¶
An issue that does not affect the test framework, but may affect future test development is that recipy and py.test do not integrate. For example, given test_sample.py:
class TestSample:
def test_sample(self):
pass
Running:
py.test test_sample.py
gives:
============================= test session starts =============================
platform win32 -- Python 3.5.1, pytest-3.0.2, py-1.4.31, pluggy-0.3.1
rootdir: c:\Users\mjj\Local Documents, inifile:
collected 1 items
test_sample.py .
========================== 1 passed in 0.02 seconds ===========================
Adding:
import recipy
Running py.test gives:
============================= test session starts =============================
platform win32 -- Python 3.5.1, pytest-3.0.2, py-1.4.31, pluggy-0.3.1
rootdir: c:\Users\mjj\Local Documents, inifile:
collected 0 items / 1 errors
=================================== ERRORS ====================================
_______________________ ERROR collecting test_sample.py _______________________
..\appdata\local\continuum\anaconda3\lib\site-packages\_pytest\python.py:209: in fget
return self._obj
E AttributeError: 'Module' object has no attribute '_obj'
During handling of the above exception, another exception occurred:
test_sample.py:1: in <module>
import recipy
..\appdata\local\continuum\anaconda3\lib\site-packages\recipy-0.3.0-py3.5.egg\recipy\__init__.py:12: in <module>
log_init()
..\appdata\local\continuum\anaconda3\lib\site-packages\recipy-0.3.0-py3.5.egg\recipy\log.py:74: in log_init
add_git_info(run, scriptpath)
..\appdata\local\continuum\anaconda3\lib\site-packages\recipy-0.3.0-py3.5.egg\recipyCommon\version_control.py:30: in add_git_info
repo = Repo(scriptpath, search_parent_directories=True)
..\appdata\local\continuum\anaconda3\lib\site-packages\gitpython-2.0.8-py3.5.egg\git\repo\base.py:139: in __init__
raise NoSuchPathError(epath)
E git.exc.NoSuchPathError: c:\Users\mjj\AppData\Local\Continuum\Anaconda3\Scripts\py.test
!!!!!!!!!!!!!!!!!!! Interrupted: 1 errors during collection !!!!!!!!!!!!!!!!!!!
=========================== 1 error in 4.55 seconds ===========================
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
File "c:\users\mjj\appdata\local\continuum\anaconda3\lib\site-packages\recipy-0.3.0-py3.5.egg\recipy\log.py", line 242, in hash_outputs
run = db.get(eid=RUN_ID)
File "c:\users\mjj\appdata\local\continuum\anaconda3\lib\site-packages\tinydb-3.2.1-py3.5.egg\tinydb\database.py", line 432, in get
TypeError: unhashable type: 'dict'
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
File "c:\users\mjj\appdata\local\continuum\anaconda3\lib\site-packages\recipy-0.3.0-py3.5.egg\recipy\log.py", line 231, in log_exit
db.update({'exit_date': exit_date}, eids=[RUN_ID])
File "c:\users\mjj\appdata\local\continuum\anaconda3\lib\site-packages\tinydb-3.2.1-py3.5.egg\tinydb\database.py", line 382, in update
File "c:\users\mjj\appdata\local\continuum\anaconda3\lib\site-packages\tinydb-3.2.1-py3.5.egg\tinydb\database.py", line 235, in process_elements
File "c:\users\mjj\appdata\local\continuum\anaconda3\lib\site-packages\tinydb-3.2.1-py3.5.egg\tinydb\database.py", line 381, in <lambda>
TypeError: unhashable type: 'dict'
Package Versioning Problems¶
The sample scripts in integration_tests/packages
may fail to run
with older versions of third-party packages. Known package versions
that can cause failures are listed below.
These can arise due to differences in versions of packages (especially
their APIs) installed by conda
, pip
, easy_install
or within
Python-related packages installed via apt-get
or yum
under Linux.
NiBabel AttributeError
¶
$ python -m integration_test.packages.run_nibabel nifti2_from_filename
data = nib.Nifti2Image.from_filename(file_name)
AttributeError: 'module' object has no attribute 'Nifti2Image'
$ python -m integration_test.packages.run_nibabel nifti2_to_filename
img = nib.Nifti2Image(self.get_data(), self,get_affine())
AttributeError: 'module' object has no attribute 'Nifti2Image'
$ python -m integration_test.packages.run_nibabel minc1_from_filename
data = nib.minc1.Minc1Image.from_filename(file_name)
AttributeError: 'module' object has no attribute 'minc1'
$ python -m integration_test.packages.run_nibabel minc1_to_filename
img = nib.minc1.Minc1Image(self.get_data(), np.eye(4))
AttributeError: 'module' object has no attribute 'minc1'
$ python -m integration_test.packages.run_nibabel minc2_from_filename
data = nib.minc2.Minc2Image.from_filename(file_name)
AttributeError: 'module' object has no attribute 'minc2'
$ python -m integration_test.packages.run_nibabel minc2_to_filename
img = nib.minc2.Minc2Image(self.get_data(), np.eye(4))
AttributeError: 'module' object has no attribute 'minc2'
$ python -m integration_test.packages.run_nibabel parrec_from_filename
data = nib.parrec.PARRECImage.from_filename(file_name)
AttributeError: 'module' object has no attribute 'parrec'
$ python -m integration_test.packages.run_nibabel parrec_to_filename
img = nib.parrec.PARRECImage.from_filename(file_name)
AttributeError: 'module' object has no attribute 'parrec'
`
Fails on nibabel 1.2.2 (bundled in Ubuntu package python-nibabel
)
due to change in package API.
Succeeds on 2.0.2.
PIL AttributeError
¶
$ python -m integration_test.packages.run_pil image_open
File "/usr/lib/python2.7/dist-packages/PIL/Image.py", line 528, in __getattr__
raise AttributeError(name)
AttributeError: __exit__
$ python -m integration_test.packages.run_pil image_save
...as above...
Fails on PIL/pillow 2.3.0 (bundled in Ubuntu package python-pillow
)
due to change in package API.
Succeeds on 3.2.0+.
pandas TypeError
¶
$ python -m integration_test.packages.run_pandas read_excel
TypeError: read_excel() takes exactly 2 arguments (1 given)
$ python3 -m integration_test.packages.run_pandas read_excel
TypeError: read_excel() missing 1 required positional argument: 'sheetname'
$ python -m integration_test.packages.run_pandas read_hdf
TypeError: read_hdf() takes exactly 2 arguments (1 given)
$ python3 -m integration_test.packages.run_pandas read_hdf
TypeError: read_hdf() missing 1 required positional argument: 'key'
Fails on pandas 0.13.1 (bundled in Ubuntu package python-pandas
) due
to change in package API.
Succeeds on 0.18.1.
pandas ImportError
¶
$ python -m integration_test.packages.run_pandas read_pickle
ImportError: No module named indexes.base
$ python3 -m integration_test.packages.run_pandas read_pickle
ImportError: No module named 'pandas.indexes'
During handling of the above exception, another exception occurred:
...
Fails on pandas 0.13.1 (bundled in Ubuntu package python-pandas
) due
to change in package API.
Succeeds on 0.18.1.
pandas ValueError
¶
$ python -m integration_test.packages.run_pandas read_msgpack
ValueError: Unpack failed: error = -1
$ python3 -m integration_test.packages.run_pandas read_msgpack
...as above...
Fails on pandas 0.13.1 (bundled in Ubuntu package python-pandas
) due
to change in file format. The scripts work if run using data files
created by pandas 0.13.1.
Succeeds on 0.18.1.
skimage NameError
¶
$ python -m integration_test.packages.run_skimage io_load_sift
NameError: name 'file' is not defined
$ python -m integration_test.packages.run_skimage io_load_surf
NameError: name 'file' is not defined
Fails on Python 3 as file()
built-in removed in Python -M
Integration_Test.Packages.3 (see
Builtins).
skimage ImportError
¶
run_skimage.py
examples fail with:
Traceback (most recent call last):
File "run_skimage.py", line 13, in <module>
from skimage import external
ImportError: cannot import name 'external'
Commenting out:
from skimage import external
allows non-external.tifffile examples to run.
Fails on skimage 0.9.3 (bundled in Ubuntu package python-skimage
)
due to change in package API.
Succeeds on 0.12.3.