Reliure documentation

Welcome to the reliure documentation. Basic information on reliure is presented right after the table of contents.

Contents

Pipeline and components

Simple reliure pipeline

Here is a very simple pipeline of two components:

>>> from reliure import Composable
>>> plus_two = Composable(lambda x: x+2)
>>> times_three = Composable(lambda x: x*3)
>>> pipeline = plus_two | times_three
>>> # one can then run it :
>>> pipeline(3)
15
>>> pipeline(10)
36

Note that we wrap simple functions into Composable objects; the next section explains this in more detail.

Build a more complex component

A reliure component is basically a function with a special wrapper around it that makes it “pipeline-able”. In other words, it is a callable object that inherits from reliure.Composable.

You can build it using reliure.Composable as a decorator:

>>> @Composable
... def my_processing(value):
...     return value**2
...
>>> my_processing(2)
4
>>> my_processing.name      # this has been added by Composable
'my_processing'

Or you can build a class that inherits from reliure.Composable:

>>> class MyProcessing(Composable):
...     def __init__(self, pow):
...         super(MyProcessing, self).__init__()
...         self.pow = pow
...
...     def __call__(self, intdata):
...         return intdata**self.pow
...
>>> my_processing = MyProcessing(4)
>>> my_processing.name
'MyProcessing'
>>> my_processing(2)
16
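
To make the idea of a “pipeline-able” callable more concrete, here is a minimal, self-contained sketch of how such a wrapper could work. It is not reliure's implementation: the names MiniComposable and Power are hypothetical, and the naming rule (function name, else class name, joined with "|") is an assumption based on the examples in this documentation.

```python
class MiniComposable:
    """Hypothetical stand-in for a pipeline-able callable (not reliure's code)."""

    def __init__(self, func=None, name=None):
        self._func = func
        # Fall back to the function name, then to the class name.
        self.name = name or getattr(func, "__name__", None) or type(self).__name__

    def __call__(self, *args, **kwargs):
        if self._func is None:
            raise NotImplementedError("subclasses must override __call__")
        return self._func(*args, **kwargs)

    def __or__(self, other):
        # `a | b` builds a new component that feeds a's output into b.
        first, second = self, other
        return MiniComposable(
            lambda *args, **kwargs: second(first(*args, **kwargs)),
            name="%s|%s" % (first.name, second.name),
        )


class Power(MiniComposable):
    """Configurable component: the exponent is fixed at construction time."""

    def __init__(self, exponent):
        super().__init__()
        self.exponent = exponent

    def __call__(self, value):
        return value ** self.exponent


plus_two = MiniComposable(lambda x: x + 2, name="plus_two")
pipeline = plus_two | Power(2)
result = pipeline(3)  # computes (3 + 2) ** 2
```

The `|` operator only needs `__or__` to return another callable of the same kind, which is why function-based and class-based components can be mixed freely in one pipeline.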

Tip

Defining a component as an object (with a __call__ method) has the advantage of making it configurable: parameters can be given in __init__ and then used in __call__.

Add options to components

Todo

better presentation of options

Another key feature of reliure is reliure.Optionable components, whose options are declared with a type, a default value and validation rules:

>>> from reliure import Optionable
>>> from reliure.types import Numeric
>>>
>>> class ProcessWithOption(Optionable):
...     def __init__(self):
...         super(ProcessWithOption, self).__init__()
...         self.add_option("pow", Numeric(default=2, help="power to apply", min=0))
...
...     @Optionable.check
...     def __call__(self, intdata, pow=None):
...         return intdata**pow
...
>>> my_processing = ProcessWithOption()
>>> my_processing.name
'ProcessWithOption'
>>> my_processing(2)
4
>>> my_processing(2, pow=4)
16
>>> my_processing(2, pow=-2)
Traceback (most recent call last):
reliure.exceptions.ValidationError: ['Ensure this value ("-2") is greater than or equal to 0.']
>>> 2
2
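
The behaviour shown above (default value applied when the option is omitted, error raised when the value is out of bounds) can be sketched in plain Python. This is only an illustration under stated assumptions: NumericOption, check and PowerWithOption are hypothetical names, and the real Optionable.check decorator may work differently.

```python
import functools


class NumericOption:
    """Hypothetical numeric option with a default and optional bounds."""

    def __init__(self, default, min=None, max=None):
        self.default, self.min, self.max = default, min, max

    def validate(self, value):
        if self.min is not None and value < self.min:
            raise ValueError("value %r is lower than %r" % (value, self.min))
        if self.max is not None and value > self.max:
            raise ValueError("value %r is greater than %r" % (value, self.max))
        return value


def check(call):
    """Fill missing options with their defaults and validate every value."""
    @functools.wraps(call)
    def wrapper(self, *args, **options):
        values = {}
        for opt_name, opt in self.options.items():
            values[opt_name] = opt.validate(options.pop(opt_name, opt.default))
        if options:
            raise TypeError("unknown options: %s" % sorted(options))
        return call(self, *args, **values)
    return wrapper


class PowerWithOption:
    def __init__(self):
        self.options = {"pow": NumericOption(default=2, min=0)}

    @check
    def __call__(self, value, pow=None):
        return value ** pow


comp = PowerWithOption()
default_result = comp(2)          # uses the default exponent
custom_result = comp(2, pow=4)    # overrides it for this call only
try:
    comp(2, pow=-2)
    rejected = False
except ValueError:
    rejected = True
```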

Processing engine

Engine example

Here is a simple example of Engine usage. First you need to set up your engine:

>>> from reliure.engine import Engine
>>> egn = Engine()
>>> egn.requires('foo', 'bar', 'boo')

One can make some imaginary components:

>>> from reliure.pipeline import Pipeline, Optionable, Composable
>>> from reliure.types import Numeric
>>> class One(Optionable):
...     def __init__(self):
...         super(One, self).__init__(name="one")
...         self.add_option("val", Numeric(default=1))
...
...     @Optionable.check
...     def __call__(self, input, val=None):
...         return input + val
...
>>> one = One()
>>> two = Composable(name="two", func=lambda x: x*2)
>>> three = Composable(lambda x: x - 2) | Composable(lambda x: x/2.)
>>> three.name = "three"

One can configure a block with these three components:

>>> foo_comps = [one, two, three]
>>> foo_options = {'defaults': 'two'}
>>> egn.set('foo', *foo_comps, **foo_options)

or

>>> egn['bar'].setup(multiple=True)
>>> egn['bar'].append(two, default=True)
>>> egn['bar'].append(three, default=True)

or

>>> egn["boo"].set(two, three)
>>> egn["boo"].setup(multiple=True)
>>> egn["boo"].defaults = [comp.name for comp in (two, three)]

One can get the list of all configurations:

>>> from pprint import pprint
>>> pprint(egn.as_dict())
{'args': ['input'],
 'blocks': [{'args': None,
             'components': [{'default': False,
                             'name': 'one',
                             'options': [{'name': 'val',
                                          'otype': {'choices': None,
                                                    'default': 1,
                                                    'help': '',
                                                    'max': None,
                                                    'min': None,
                                                    'multi': False,
                                                    'type': 'Numeric',
                                                    'uniq': False,
                                                    'vtype': 'int'},
                                          'type': 'value',
                                          'value': 1}]},
                            {'default': True,
                             'name': 'two',
                             'options': None},
                            {'default': False,
                             'name': 'three',
                             'options': []}],
             'multiple': False,
             'name': 'foo',
             'required': True,
             'returns': 'foo'},
            {'args': None,
             'components': [{'default': True,
                             'name': 'two',
                             'options': None},
                            {'default': True,
                             'name': 'three',
                             'options': []}],
             'multiple': True,
             'name': 'bar',
             'required': True,
             'returns': 'bar'},
            {'args': None,
             'components': [{'default': True,
                             'name': 'two',
                             'options': None},
                            {'default': True,
                             'name': 'three',
                             'options': []}],
             'multiple': True,
             'name': 'boo',
             'required': True,
             'returns': 'boo'}]}

And then you can configure and run it:

>>> request_options = {
...     'foo':[
...         {
...             'name': 'one',
...             'options': {
...                 'val': 2
...             }
...        },     # input + 2
...     ],
...     'bar':[
...         {'name': 'two'},
...     ],     # input * 2
...     'boo':[
...         {'name': 'two'},
...         {'name': 'three'},
...     ], # (input - 2) / 2.
... }
>>> egn.configure(request_options)
>>> # test before running:
>>> egn.validate()

One can then run only one block:

>>> egn['boo'].play(10)
{'boo': 4.0}

or all blocks:

>>> res = egn.play(4)
>>> res['foo']      # 4 + 2
6
>>> res['bar']      # 6 * 2
12
>>> res['boo']      # (12 - 2) / 2.0
5.0
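
The commented results above suggest that, by default, each block receives the output of the previous block. A minimal sketch of that chaining, assuming this default and using a hypothetical play_blocks helper (not part of reliure), could look like this:

```python
def play_blocks(blocks, value):
    """Run named blocks in order, feeding each block's output to the next one."""
    results = {}
    for name, component in blocks:
        value = component(value)
        results[name] = value
    return results


blocks = [
    ("foo", lambda x: x + 2),          # like component "one" with val=2
    ("bar", lambda x: x * 2),          # like component "two"
    ("boo", lambda x: (x - 2) / 2.0),  # like component "three"
]
res = play_blocks(blocks, 4)
```

With the input 4 this reproduces the values shown in the example: 6 for foo, 12 for bar and 5.0 for boo.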

HTTP/JSON API

Reliure lets you build a JSON API for a simple processing function (or an Optionable) as well as for a more complex Engine. The idea behind the reliure API mechanism is that you manage the data processing logic and reliure manages the “glue”.

Reliure web APIs are based on Flask. A reliure API (ReliureAPI) is a Flask Blueprint on which you plug views of your processing modules.

Let’s see how it works on some simple examples!

Component or simple function

Expose a simple function

Let’s imagine that we have the following hyper-complex data processing method:

>>> def count_a_and_b(chaine):
...     return chaine.count("a") + chaine.count("b")

and you want it to be accessible through a supercool HTTP/JSON API. In other words, we just want a GET on http://myapi.me.com/api/count_ab/totocotata to return 2, and possibly some other metadata (processing time for instance).

Here is how we can do that with reliure.

First you need to build a “view” (a ComponentView) on this function:

>>> from reliure.web import ComponentView
>>> view = ComponentView(count_a_and_b)

Then you have to define the type of the input (the type will manage parsing from string/json):

>>> from reliure.types import Text
>>> view.add_input("in", Text())
>>> # Note that, by default, the output will be named with the function name

You can also specify a short URL pattern to reach your function, using Flask’s route pattern syntax. Here we simply indicate that the URL (note that there will be a URL prefix) should match our unique input:

>>> view.play_route("<in>")

Then you can create a ReliureAPI object and register this view on it:

>>> from reliure.web import ReliureAPI
>>> api = ReliureAPI("api")
>>> api.register_view(view, url_prefix="count_ab")

This api object can be plugged into a Flask app (it is a Flask Blueprint):

>>> from flask import Flask
>>> app = Flask("my_app")
>>> app.register_blueprint(api, url_prefix="/api")

To illustrate API calls, let’s use the Flask test client:

>>> import json
>>> from pprint import pprint
>>> client = app.test_client()
>>> resp = client.get("/api/count_ab/abcdea")    # call our API
>>> results = json.loads(resp.data.decode("utf-8"))
>>> pprint(results["results"])
{'count_a_and_b': 3}
>>>
>>> resp = client.get("/api/count_ab/abcdea__bb_aaa")
>>> results = json.loads(resp.data.decode("utf-8"))
>>> pprint(results["results"])
{'count_a_and_b': 8}

Note that meta information is also available:

>>> pprint(results["meta"])         
{'details': [{'errors': [],
              'name': 'count_a_and_b',
              'time': 3.314018249511719e-05,
              'warnings': []}],
 'errors': [],
 'name': 'count_a_and_b:[count_a_and_b]',
 'time': 3.314018249511719e-05,
 'warnings': []}

Managing options and multiple inputs

Let’s move on to a more complex example...

First, write your processing component

One can imagine the following component, which merges two strings using one of two possible methods (the choice is made with an option):

>>> from reliure import Optionable
>>> from reliure.types import Text
>>>
>>> class StringMerge(Optionable):
...     """ Stupid component that merges two strings together
...     """
...     def __init__(self):
...         super(StringMerge, self).__init__()
...         self.add_option("method", Text(
...             choices=[u"concat", u"altern"],
...             default=u"concat",
...             help="How to merge the inputs"
...         ))
...
...     @Optionable.check
...     def __call__(self, left, right, method=None):
...         if method == u"altern":
...             merge = "".join("".join(each) for each in zip(left, right))
...         else:
...             merge = left + right
...         return merge

One can use this directly in python:

>>> merge_component = StringMerge()
>>> merge_component("aaa", "bbb")
'aaabbb'
>>> merge_component("aaa", "bbb", method=u"altern")
'ababab'

Then create a view on it, and register it on your API

If you want to expose this component on an HTTP API, as in our first example, you need to build a “view” (a ComponentView) on it:

>>> view = ComponentView(merge_component)
>>> # you need to define the type of the input
>>> from reliure.types import Text
>>> view.add_input("in_lft", Text())
>>> view.add_input("in_rgh", Text(default=u"ddd"))
>>> # ^ Note that it is possible to give default value for inputs
>>> view.add_output("merge")
>>> # we specify two short urls to reach the function:
>>> view.play_route("<in_lft>/<in_rgh>", "<in_lft>")

Warning

Note that for a ComponentView the order of the inputs matters: inputs are matched to the component (or function) arguments by position, not by name.

Warning

When you define a default value for an input, None cannot be used as the default value.

Then we can register this new view on a reliure API object:

>>> api.register_view(view, url_prefix="merge")

Finally, just use it!

The view can now be called:

>>> resp = client.get("/api/merge/aaa/bbb")
>>> results = json.loads(resp.data.decode("utf-8"))
>>> results["results"]
{'merge': 'aaabbb'}

As we have specified a route that requires only one argument, and a default value for the second input (in_rgh), it is also possible to do:

>>> resp = client.get("/api/merge/aaa")
>>> results = json.loads(resp.data.decode("utf-8"))
>>> results["results"]
{'merge': 'aaaddd'}
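
The interplay between several short routes, input defaults and positional inputs can be sketched without Flask. The helpers compile_route and resolve_inputs below are hypothetical and only mimic the behaviour described here; the real routing is done by Flask and reliure.

```python
import re


def compile_route(pattern):
    """Turn a pattern such as "<a>/<b>" into a regex with one named group per input."""
    parts = []
    for part in pattern.split("/"):
        if part.startswith("<") and part.endswith(">"):
            parts.append("(?P<%s>[^/]+)" % part[1:-1])
        else:
            parts.append(re.escape(part))
    return re.compile("^" + "/".join(parts) + "$")


def resolve_inputs(path, routes, input_names, defaults):
    """Return input values in declaration order, using defaults when a route omits one."""
    for pattern in routes:
        match = compile_route(pattern).match(path)
        if match:
            found = match.groupdict()
            return [found.get(name, defaults.get(name)) for name in input_names]
    raise LookupError("no route matches %r" % path)


routes = ["<in_lft>/<in_rgh>", "<in_lft>"]
names = ["in_lft", "in_rgh"]          # declaration order = argument order
defaults = {"in_rgh": "ddd"}

both = resolve_inputs("aaa/bbb", routes, names, defaults)
only_left = resolve_inputs("aaa", routes, names, defaults)
```

The returned list is ordered like the declared inputs, so it can be passed to the component as positional arguments.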

It is also possible to call the API with options:

>>> resp = client.get("/api/merge/aaa/bbb?method=altern")
>>> results = json.loads(resp.data.decode("utf-8"))
>>> results["results"]
{'merge': 'ababab'}

Alternatively you can use a POST to send inputs. There are two ways to provide inputs and options. The first is direct form encoding:

>>> resp = client.post("/api/merge", data={"in_lft":"ee", "in_rgh":"hhhh"})
>>> results = json.loads(resp.data.decode("utf-8"))
>>> results["results"]
{'merge': 'eehhhh'}

And with options in the url:

>>> resp = client.post("/api/merge?method=altern", data={"in_lft":"ee", "in_rgh":"hhhh"})
>>> results = json.loads(resp.data.decode("utf-8"))
>>> results["results"]
{'merge': 'eheh'}

The second option is to use a json payload:

>>> data = {
...     "in_lft":"eeee",
...     "in_rgh":"gg",
...     "options": {
...         "name": "StringMerge",
...         "options": {
...             "method": "altern",
...         }
...     }
... }
>>> json_data = json.dumps(data)
>>> resp = client.post("/api/merge", data=json_data, content_type='application/json')
>>> # note that it is important to specify content_type to 'application/json'
>>> results = json.loads(resp.data.decode("utf-8"))
>>> results["results"]
{'merge': 'egeg'}
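
For readers who want to see how such a payload breaks down, here is a small sketch using only the standard json module. The split_payload function is hypothetical; it simply separates the declared inputs from the nested "options" entry, following the payload layout shown above.

```python
import json


def split_payload(raw_body, input_names):
    """Separate declared inputs from the nested component selection and options."""
    payload = json.loads(raw_body)
    inputs = {name: payload[name] for name in input_names if name in payload}
    selection = payload.get("options", {})
    return inputs, selection.get("name"), selection.get("options", {})


body = json.dumps({
    "in_lft": "eeee",
    "in_rgh": "gg",
    "options": {"name": "StringMerge", "options": {"method": "altern"}},
})
inputs, component_name, options = split_payload(body, ["in_lft", "in_rgh"])
```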

Note that a GET call on the root /api/merge returns a JSON document that describes the API. With this, it is possible to list all the options of the component:

>>> resp = client.get("/api/merge")
>>> results = json.loads(resp.data.decode("utf-8"))
>>> pprint(results)
{'args': ['in_lft', 'in_rgh'],
 'components': [{'default': True,
                 'name': 'StringMerge',
                 'options': [{'name': 'method',
                              'otype': {'choices': ['concat', 'altern'],
                                        'default': 'concat',
                                        'encoding': 'utf8',
                                        'help': 'How to merge the inputs',
                                        'multi': False,
                                        'type': 'Text',
                                        'uniq': False,
                                        'vtype': 'unicode'},
                              'type': 'value',
                              'value': 'concat'}]}],
 'multiple': False,
 'name': 'StringMerge',
 'required': True,
 'returns': ['merge']}

Complex processing engine

Define your engine

Here is a simple reliure engine that we will expose as an HTTP API.

>>> from reliure.engine import Engine
>>> engine = Engine("vowel", "consonant", "concat")
>>> engine.vowel.setup(in_name="text")
>>> engine.consonant.setup(in_name="text")
>>> engine.concat.setup(in_name=["vowel", "consonant"], out_name="merge")
>>>
>>> from reliure import Composable
>>> vowels = u"aiueoéèàùêôûîï"
>>> @Composable
... def extract_vowel(text):
...     return "".join(char for char in text if char in vowels)
>>> engine.vowel.set(extract_vowel)
>>>
>>> @Composable
... def extract_consonant(text):
...     return "".join(char for char in text if char not in vowels)
>>> engine.consonant.set(extract_consonant)
>>>
>>> # for the merge we re-use the component defined in previous section:
>>> engine.concat.set(StringMerge())

The figure “Engine schema” shows the processing schema of this small engine.

[Figure: Engine schema. Example of an engine that we will expose as an API.]
(See engine_schema() to see how to generate such a schema from an engine)

Create a view and register it on your api

As for a simple component, we need to create a view over our engine:

>>> from reliure.web import EngineView
>>> view = EngineView(engine)

And then to define the input and output types:

>>> view.add_input("text", Text())
>>> view.add_output("merge", Text())

We can also specify a short URL pattern to run the engine:

>>> view.play_route("<text>")

Then we can create a ReliureAPI object and register this view on it:

>>> api = ReliureAPI("api")
>>> api.register_view(view, url_prefix="process")
>>> # and register this api on our flask app:
>>> app.register_blueprint(api, url_prefix="/api")

Use it!

>>> resp = client.get("/api/process/abcdea")
>>> results = json.loads(resp.data.decode("utf-8"))
>>> pprint(results["results"])
{'merge': 'aeabcd'}
>>>
>>> resp = client.get("/api/process/abcdea__bb_aaa")
>>> results = json.loads(resp.data.decode("utf-8"))
>>> pprint(results["results"])
{'merge': 'aeaaaabcd__bb_'}
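
The data flow of this engine (two blocks reading "text", one block merging their outputs) can be sketched with a shared dictionary of named values. This is a simplified illustration, not reliure's code: the play function and the tuple layout are assumptions, and only unaccented vowels are handled.

```python
VOWELS = "aiueo"


def play(blocks, **inputs):
    """Run blocks in order; each reads its named inputs and stores one named output."""
    data = dict(inputs)
    for in_names, out_name, component in blocks:
        args = [data[name] for name in in_names]
        data[out_name] = component(*args)
    return data


blocks = [
    (["text"], "vowel", lambda text: "".join(c for c in text if c in VOWELS)),
    (["text"], "consonant", lambda text: "".join(c for c in text if c not in VOWELS)),
    (["vowel", "consonant"], "merge", lambda left, right: left + right),
]
data = play(blocks, text="abcdea")
```

Naming inputs and outputs explicitly is what allows the last block to take two inputs, instead of only the output of the block just before it.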

Note that meta information is also available:

>>> pprint(results["meta"])     
{'details': [{'details': [{'errors': [],
                           'name': 'extract_vowel',
                           'time': 3.695487976074219e-05,
                           'warnings': []}],
              'errors': [],
              'name': 'vowel:[extract_vowel]',
              'time': 3.695487976074219e-05,
              'warnings': []},
             {'details': [{'errors': [],
                           'name': 'extract_consonant',
                           'time': 3.0040740966796875e-05,
                           'warnings': []}],
              'errors': [],
              'name': 'consonant:[extract_consonant]',
              'time': 3.0040740966796875e-05,
              'warnings': []},
             {'details': [{'errors': [],
                           'name': 'StringMerge',
                           'time': 5.507469177246094e-05,
                           'warnings': []}],
              'errors': [],
              'name': 'concat:[StringMerge]',
              'time': 5.507469177246094e-05,
              'warnings': []}],
 'errors': [],
 'name': 'engine:[vowel:[extract_vowel], consonant:[extract_consonant], concat:[StringMerge]]',
 'time': 0.0001220703125,
 'warnings': []}

Offline processing

Reliure provides some helpers to run offline batch processing.

Run a component, or pipeline of components

To illustrate how you can easily run a pipeline of components with reliure, let’s consider that we have a sequence of “documents” we want to process:

>>> documents = ["doc1", "doc2", "doc3", "doc4"]

For that we have these two components, which we pipe together:

>>> from reliure.pipeline import Composable
>>> @Composable
... def doc_analyse(docs):
...     for doc in docs:
...         yield {
...             "title": doc,
...             "id": int(doc[3:]),
...             "url": "http://lost.com/%s" % doc,
...         }
>>>
>>> @Composable
... def print_ulrs(docs):
...     for doc in docs:
...         print(doc["url"])
...         yield doc
>>>
>>> pipeline = doc_analyse | print_ulrs

To run this pipeline on our documents, you just have to do:

>>> from reliure.offline import run
>>> res = run(pipeline, documents)
http://lost.com/doc1
http://lost.com/doc2
http://lost.com/doc3
http://lost.com/doc4
>>> pprint(res)
[{'id': 1, 'title': 'doc1', 'url': 'http://lost.com/doc1'},
 {'id': 2, 'title': 'doc2', 'url': 'http://lost.com/doc2'},
 {'id': 3, 'title': 'doc3', 'url': 'http://lost.com/doc3'},
 {'id': 4, 'title': 'doc4', 'url': 'http://lost.com/doc4'}]
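
Because both components are generators, the pipeline is lazy: nothing happens until something consumes it. The sketch below illustrates this with plain Python; pipe and mini_run are hypothetical stand-ins (the real run() may do more), and a log list replaces the print calls so that the order of operations can be inspected.

```python
log = []


def doc_analyse(docs):
    for doc in docs:
        log.append("analyse %s" % doc)
        yield {"title": doc, "id": int(doc[3:])}


def collect_ids(docs):
    for doc in docs:
        log.append("collect %s" % doc["id"])
        yield doc


def pipe(*steps):
    """Chain generator functions: each step consumes the previous one's output."""
    def piped(items):
        for step in steps:
            items = step(items)
        return items
    return piped


def mini_run(pipeline, items):
    """Consume the lazy pipeline and materialise the results as a list."""
    return list(pipeline(items))


res = mini_run(pipe(doc_analyse, collect_ids), ["doc1", "doc2"])
```

After the call, log shows that each document goes through both steps before the next document is read, so the whole collection never needs to be held between two steps.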

The exact same pipeline can now be run in parallel by using run_parallel() instead of run():

>>> from reliure.offline import run_parallel
>>> res = run_parallel(pipeline, documents, ncpu=2, chunksize=5)

Command line interface

reliure.utils.cli provides some helpers to create or populate an argument parser from an Optionable. Let’s look at a simple example.

First you have an Optionable component with an option:

>>> class PowerBy(Optionable):
...    def __init__(self):
...        super(PowerBy, self).__init__("testOptionableName")
...        self.add_option("alpha", Numeric(default=4, min=1, max=20,
...                        help='power exponent'))
...
...    @Optionable.check
...    def __call__(self, value, alpha=None):
...        return value**alpha
>>>

Note that it could also be a pipeline of components.

Next you want to build a script with a __main__ and map your component options to script options using argparse. Here is how to do it (file reliure_cli.py):

#!/usr/bin/env python
import sys

from reliure import Optionable
from reliure.types import Numeric

# definition of your component here
class PowerBy(Optionable):
    def __init__(self):
        super(PowerBy, self).__init__("testOptionableName")
        self.add_option("alpha", Numeric(default=4, min=1, max=20,
                        help='power exponent'))

    @Optionable.check
    def __call__(self, value, alpha=None):
        return value**alpha

# creation of your processing component
mycomp = PowerBy()

def main():
    from argparse import ArgumentParser
    from reliure.utils.cli import arguments_from_optionable, get_config_for

    # build the option parser
    parser = ArgumentParser()
    # Add options from your component
    arguments_from_optionable(parser, mycomp, prefix="power_")

    # Add other options, here the input
    parser.add_argument('INPUT', type=int, help='the number to process !')

    # Parse the options and get the config for the component
    args = parser.parse_args()
    config = get_config_for(args, mycomp, prefix="power_")

    result = mycomp(args.INPUT, **config)
    print(result)

    return 0

if __name__ == '__main__':
    sys.exit(main())

With that you get a nicely generated help message:

$ python reliure_cli.py -h
usage: reliure_cli.py [-h] [--power_alpha POWER_ALPHA] INPUT

positional arguments:
  INPUT                 the number to process !

optional arguments:
  -h, --help            show this help message and exit
  --power_alpha POWER_ALPHA
                        power exponent

$ python reliure_cli.py --power_alpha 3 3
27
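
The mapping between component options and prefixed command line arguments can be sketched with argparse alone. The OPTIONS dictionary and the add_arguments and config_from helpers below are hypothetical; they only mimic what arguments_from_optionable and get_config_for are described as doing.

```python
from argparse import ArgumentParser

# Hypothetical description of the component options (name -> argparse settings).
OPTIONS = {"alpha": {"default": 4, "type": int, "help": "power exponent"}}


def add_arguments(parser, options, prefix=""):
    """Add one prefixed command line argument per option."""
    for name, spec in options.items():
        parser.add_argument("--%s%s" % (prefix, name), **spec)


def config_from(args, options, prefix=""):
    """Read the parsed values back, removing the prefix from the names."""
    return {name: getattr(args, prefix + name) for name in options}


parser = ArgumentParser()
add_arguments(parser, OPTIONS, prefix="power_")
parser.add_argument("INPUT", type=int, help="the number to process")

args = parser.parse_args(["--power_alpha", "3", "3"])
config = config_from(args, OPTIONS, prefix="power_")
result = args.INPUT ** config["alpha"]
```

The prefix keeps the option names of several components from clashing when they share one parser.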

API reference

See reliure.pipeline for example, or even reliure.pipeline.Optionable!

reliure

class reliure.Composable(func=None, name=None)

Bases: object

Basic composable element

Composable is abstract; you need to implement the __call__() method, or wrap an existing function as in the following example:

>>> e1 = Composable(lambda element: element**2, name="e1")
>>> e2 = Composable(lambda element: element + 10, name="e2")

Composables can then be pipelined this way:

>>> chain = e1 | e2
>>> # so you get:
>>> chain(2)
14
>>> # which is equivalent to :
>>> e2(e1(2))
14
>>> # note that by default the pipeline aggregates the component names
>>> chain.name
'e1|e2'
>>> # however you can override it
>>> chain.name = "chain"
>>> chain.name
'chain'

It is also possible to ‘map’ the composables:

>>> cmap = e1 & e2
>>> # so you get:
>>> cmap(2)
[4, 12]

__call__(*args, **kwargs)
__init__(func=None, name=None)

You can create a Composable from a simple function:

>>> def square(val, pow=2):
...     return val ** pow
>>> cfct = Composable(square)
>>> cfct.name
'square'
>>> cfct(2)
4
>>> cfct(3, 3)
27
init_logger(reinit=False)
logger
name

Name of the optionable component

class reliure.Optionable(name=None)

Bases: reliure.pipeline.Composable

Abstract class for an optionable component

__init__(name=None)
Parameters:name (str) – name of the component
add_option(opt_name, otype, hidden=False)

Add an option to the object

Parameters:
  • opt_name (str) – option name
  • otype (subclass of GenericType) – option type
  • hidden (bool) – if True the option will be hidden
change_option_default(opt_name, default_val)

Change the default value of an option

Parameters:
  • opt_name (str) – option name
  • default_val – new default option value
static check(call_fct)

Decorator for an Optionable __call__ method. It checks the given option values.

clear_option_value(opt_name)

Clear the stored option value (so the default will be used)

Parameters:opt_name (str) – option name
clear_options_values()

Clear all stored option values (so the defaults will be used)

force_option_value(opt_name, value)

Force the (default) value of an option. The option is then no longer listed by get_options().

Parameters:
  • opt_name (str) – option name
  • value – option value
get_option_default(opt_name)

Return the default value of a given option

Parameters:opt_name (str) – option name
Returns:the default value of the option
get_option_value(opt_name)

Return the value of a given option

Parameters:opt_name (str) – option name
Returns:the value of the option
get_options(hidden=False)
Parameters:hidden (bool) – whether to return hidden options
Returns:dictionary of all options (with option’s information)
Return type:dict
get_options_values(hidden=False)

Return a dictionary of option values

Parameters:hidden (bool) – whether to return hidden options
Returns:dictionary of all option values
Return type:dict
get_ordered_options(hidden=False)
Parameters:hidden (bool) – whether to return hidden options
Returns:ordered list of options pre-serialised (as_dict)
Return type:list [opt_dict, ...]
has_option(opt_name)

Whether the component has a given option

option_is_hidden(opt_name)

Whether the given option is hidden

options
parse_options(option_values)

Set the options (with parsing) and returns a dict of all options values

print_options()

Print a description of the component options

set_option_value(opt_name, value, parse=False)

Set the value of one option.

# TODO/FIXME
  • add force/hide argument
  • add default argument
  • remove methods force option value
  • remove change_option_default
Parameters:
  • opt_name (str) – option name
  • value – the new value
  • parse (bool) – if True the value is converted from string to the correct type
set_options_values(options, parse=False, strict=False)

Set the options from a dict of values (in string).

Parameters:
  • options (dict) – the values of options (in format {“opt_name”: “new_value”})
  • parse (bool) – whether to parse the given value
  • strict (bool) – if True the given options dict should only contain existing options (no other key)
class reliure.Map(comp, as_list=False)

Bases: reliure.pipeline.OptionableSequence

Apply a composable to each element of a generator-like input.

>>> item_process = Composable(lambda x: x+2) | Composable(lambda x: x if x > 3 else 0)
>>> flux_process = Map(item_process)
>>>
>>> inputs = range(5)
>>> [e for e in flux_process(inputs)]
[0, 0, 4, 5, 6]
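
As an illustration of the idea, a lazy element-wise map can be sketched in a few lines. MiniMap is a hypothetical stand-in, and the meaning given to as_list here (return a list instead of a generator) is an assumption based on the parameter name.

```python
class MiniMap:
    """Hypothetical sketch: apply a callable to each element of an iterable."""

    def __init__(self, comp, as_list=False):
        self.comp = comp
        self.as_list = as_list

    def __call__(self, items):
        # A generator expression keeps the processing lazy.
        mapped = (self.comp(item) for item in items)
        return list(mapped) if self.as_list else mapped


def item_process(x):
    x = x + 2
    return x if x > 3 else 0


lazy_result = MiniMap(item_process)(range(5))
eager_result = MiniMap(item_process, as_list=True)(range(5))
lazy_values = list(lazy_result)  # the generator is consumed only here
```
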
__call__(*args, **kwargs)
__init__(comp, as_list=False)

reliure.engine

see Processing engine for documentation

class reliure.engine.BasicPlayMeta(component)

Bases: object

Object to store and manage metadata for one component execution

Here is a typical usage:

>>> import time
>>> comp = Composable(name="TheComp", func=lambda x: x)
>>> # create the meta result before using the component
>>> meta = BasicPlayMeta(comp)
>>> # imagine some input and options for the component
>>> args, kwargs = [12], {}
>>> # store these data:
>>> meta.run_with(args, kwargs)
>>> # run the component
>>> start = time.time()     # starting time
>>> try:
...     output = comp(*args, **kwargs)
... except Exception as error:
...     # store the exception if any
...     meta.add_error(error)
...     # one can raise a custom error (or not)
...     #raise RuntimeError()
... finally:
...     # this will always be executed (even if the exception is not caught)
...     meta.time = time.time() - start
...     # for testing purpose we put a fixed time
...     meta.time = 9.2e-5
>>> # one can get a pre-serialization of the collected meta data
>>> from pprint import pprint
>>> pprint(meta.as_dict())
{'errors': [], 'name': 'TheComp', 'time': 9.2e-05, 'warnings': []}
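
The try/except/finally pattern above can be packaged once and reused. The sketch below does this with a context manager from the standard library; record_meta is a hypothetical helper that mirrors the keys of as_dict(), not a reliure function.

```python
import time
from contextlib import contextmanager


@contextmanager
def record_meta(name):
    """Collect execution time and errors for the code run inside the block."""
    meta = {"name": name, "time": 0.0, "errors": [], "warnings": []}
    start = time.time()
    try:
        yield meta
    except Exception as error:
        # Store the error message instead of propagating the exception.
        meta["errors"].append(str(error))
    finally:
        meta["time"] = time.time() - start


with record_meta("TheComp") as meta:
    output = 1 / 0  # the failure is recorded, not raised

with record_meta("Identity") as ok_meta:
    value = (lambda x: x)(12)
```

Swallowing the exception is a design choice made for this sketch; re-raising a custom error after recording it is just as valid.
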
__init__(component)
add_error(error)

Register an error that occurs while the component is running

>>> comp = Composable(name="TheComp", func=lambda x: x)
>>> meta = BasicPlayMeta(comp)
>>> try:
...     output = 1/0
... except Exception as error:
...     # store the exception if any
...     meta.add_error(error)
>>> from pprint import pprint
>>> pprint(meta.as_dict())                           
{'errors': ['...division ...by zero'],
 'name': 'TheComp',
 'time': 0.0,
 'warnings': []}
as_dict()

Pre-serialisation of the meta data

errors
has_error

Whether any error happened

has_warning

Whether there was a warning during play

name

Name of the component

run_with(inputs, options)

Store the run parameters (inputs and options)

time

Execution time (walltime)

>>> comp = Composable(name="TheComp", func=lambda x: x)
>>> meta = BasicPlayMeta(comp)
>>> meta.time = 453.6
>>> meta.time
453.6
warnings
class reliure.engine.Block(name)

Bases: object

A block is a processing step realised by one component.

A component is a callable object that has a name attribute. Often it is also a reliure.Optionable object or a pipeline, which is a reliure.Composable.

Block object provides methods to discover and parse components options (if any).

Warning

You should not have to use a Block directly, but always through an Engine.

__init__(name)

Initialise a block. This should be done only from the Engine.

Parameters:name (str) – name of the Block
all_outputs()

Returns a set of output names

append(component, default=False)

Add one component to the block

Parameters:default (bool) – if True this component will be used by default
as_dict()

Returns a dictionary representation of the block and of all component options

clear_selections()

Reset the current selections and reset option values to default for all components

Warning

This method also resets the components’ option values to their defaults.

component_names()

Returns the list of component names.

Component names are in the same order as the components.

configure(config)

Configure the block from a (horrible) configuration dictionary (or list). This data comes from a JSON client request and has to be parsed. Default values are used for anything missing (component selection and options).

Parameters:config (dict or list) – component to use and the associated options

config format

{
    'name': name_of_the_comp_to_use,
    'options': {
        name: value,
        name: va...
    }
}

or for multiple selection

[
    {
        'name': name_of_the_comp_to_use,
        'options': {
            name: value,
            name: va...
        }
    },
    {...}
]
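
Since the configuration may be either a single dictionary or a list of dictionaries, a first step is usually to bring both shapes to one form. The normalize_block_config function below is a hypothetical sketch of that step, not part of reliure.

```python
def normalize_block_config(config):
    """Return a list of (component_name, options) pairs from a dict or a list."""
    entries = [config] if isinstance(config, dict) else list(config)
    return [(entry["name"], dict(entry.get("options") or {})) for entry in entries]


single = normalize_block_config({"name": "one", "options": {"val": "2"}})
several = normalize_block_config([{"name": "two"}, {"name": "three", "options": {}}])
```

A missing "options" key is treated as an empty dictionary, so defaults can be applied later in one place.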

Warning

Values of options in this dictionary are strings.

defaults

Default component(s) of the block

Note

Default components are just an indication for the user and the views, unless the Block is required. If it is required, the default is selected when nothing is explicitly selected.

name

Name of the optionable component

needed_inputs()

Return a list of (needed) inputs names

play(*inputs, **named_inputs)

Run the selected components of the block. The selected components are run with the options that have already been set.

Warning

The default ‘multiple’ behavior is a pipeline!

Parameters:*inputs – arguments (i.e. inputs) to give to the components

reset()

Removes all the components of the block

select(comp_name, options=None)

Select the components that will be played (with the given options).

options will be passed to Optionable.parse_options() if the component is a subclass of Optionable.

Warning

This function also sets up the options (if given) of the selected component. Use clear_selections() to restore both the selection and the component’s options.

This method may be called at play ‘time’, before calling play() to run all selected components.

Parameters:
  • comp_name – name of the component to select
  • options (dict) – options to set on the component
selected()

Returns the list of selected component names.

If no component is selected, returns the ones marked as default. If the block is required and no component was marked as default, then the first component is selected.

set(*components)

Set the possible components of the block

Parameters:components – components to append (Optionables or Composables)
setup(in_name=None, out_name=None, required=None, hidden=None, multiple=None, defaults=None)

Set the options of the block. Only the given options that are not None are set.

Note

A block may have multiple inputs but has only one output.

Parameters:
  • in_name (str or list of str) – name(s) of the block input data
  • out_name (str) – name of the block output data
  • required (bool) – whether the block will be required or not
  • hidden (bool) – whether the block will be hidden to the user or not
  • multiple (bool) – if True, more than one component may be selected/run
  • defaults (list of str, or str) – names of the selected components
validate()

check that the block can be run

class reliure.engine.Engine(*names)

Bases: object

The Reliure engine.

DEFAULT_IN_NAME = 'input'
__init__(*names)

Create the engine

Parameters:names – names of the engine blocks
all_outputs()

Returns a list of all possible engine outputs (note that inputs are also possible outputs)

>>> engine = Engine("op1", "op2")
>>> engine.op1.setup(in_name="in", out_name="middle", required=False)
>>> engine.op2.setup(in_name="middle", out_name="out")
>>> sorted(list(engine.all_outputs()))
['in', 'middle', 'out']

More complex example:

>>> engine = Engine("op1", "op2")
>>> engine.op1.setup(in_name="in", out_name="middle")
>>> engine.op2.setup(in_name=["middle", "in2"], out_name="out")
>>> sorted(list(engine.all_outputs()))
['in', 'in2', 'middle', 'out']

Note that by default the needed input is ‘input’:

>>> engine = Engine("op1", "op2")
>>> engine.op1.append(lambda x:x+2)
>>> engine.op2.append(lambda x:x*2)
>>> sorted(list(engine.all_outputs()))
['input', 'op1', 'op2']
as_dict()

dict repr of the components

configure(config)

Configure all the blocks from a (horrible) configuration dictionary. This data comes from a JSON client request and has to be parsed. Default values are used where something is missing (for component selection and options).

Parameters:config (dict) – dictionary that gives the component to use for each step and the associated options

config format

 {
     block_name: [{
         'name': name_of_the_comp_to_use,
         'options': {
                 name: value,
                 name: va...
             }
         },
         {...}
     ]
}

Warning

Values of options in this dictionary are strings.
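
As an illustration of the format above, here is a hedged sketch that decodes a JSON payload of that shape and walks through it. The block, component and option names are invented for the example, and iter_selections is a hypothetical helper, not a reliure function.

```python
import json

# Hypothetical client payload: every name below is made up for illustration.
raw_request = """
{
    "tokenizer": [
        {"name": "simple_split", "options": {"min_length": "3"}}
    ],
    "filter": [
        {"name": "stopwords", "options": {"lang": "en"}},
        {"name": "lowercase", "options": {}}
    ]
}
"""
config = json.loads(raw_request)


def iter_selections(config):
    """Yield (block_name, component_name, options) triples from a config dict."""
    for block_name, selections in config.items():
        for selection in selections:
            yield block_name, selection["name"], selection.get("options", {})


for block_name, comp_name, options in iter_selections(config):
    # option values are still strings at this point and need parsing
    print(block_name, comp_name, options)
```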

in_name

Give the input name of the first block.

If this first block is not required, or if other blocks need some inputs, then you had better look at needed_inputs().

names()

Returns the list of block names

needed_inputs()

List all the needed inputs of a configured engine

>>> engine = Engine("op1", "op2")
>>> engine.op1.setup(in_name="in", out_name="middle", required=False)
>>> engine.op2.setup(in_name="middle", out_name="out")
>>> engine.op1.append(lambda x:x+2)
>>> engine.op2.append(lambda x:x*2)
>>> engine.op1.select('<lambda>')
>>> list(engine.needed_inputs())
['in']

But now if we deactivate the first component:

>>> engine.op1.clear_selections()
>>> list(engine.needed_inputs())
['middle']

More complex example:

>>> engine = Engine("op1", "op2")
>>> engine.op1.setup(in_name="in", out_name="middle")
>>> engine.op2.setup(in_name=["middle", "in2"], out_name="out")
>>> engine.op1.append(lambda x:x+2)
>>> engine.op2.append(lambda x, y:x*y)
>>> engine.needed_inputs() == {'in', 'in2'}
True

Note that by default the needed input is ‘input’:

>>> engine = Engine("op1", "op2")
>>> engine.op1.append(lambda x:x+2)
>>> engine.op2.append(lambda x:x*2)
>>> list(engine.needed_inputs())
['input']
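
The behaviour shown in these examples can be approximated with a short standalone sketch: an input is needed when no earlier active block produces it. The dictionary layout and the function below are assumptions made for illustration, not the engine's actual implementation.

```python
def needed_inputs(blocks):
    """List the input names that no earlier active block produces.

    blocks is an ordered list of dicts with the hypothetical keys
    'in_names', 'out_name' and 'active'.
    """
    available = set()
    needed = []
    for block in blocks:
        if not block["active"]:
            # an inactive block neither consumes nor produces anything
            continue
        for name in block["in_names"]:
            if name not in available and name not in needed:
                needed.append(name)
        available.add(block["out_name"])
    return needed


op1 = {"in_names": ["in"], "out_name": "middle", "active": True}
op2 = {"in_names": ["middle", "in2"], "out_name": "out", "active": True}
print(needed_inputs([op1, op2]))

op1_off = dict(op1, active=False)
print(needed_inputs([op1_off, op2]))
```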
play(*inputs, **named_inputs)

Run the engine (that should have been configured first)

If the inputs are given without names, they should be the inputs of the first block; if named_inputs are used, they may be the inputs of any block.

Note

Either inputs or named_inputs should be provided, not both

Parameters:
  • inputs – the data to give as input to the first block
  • named_inputs – named input data should match with needed_inputs() result.
requires(*names)

Declare which blocks will be used in this engine.

It should be called before adding or setting any component. Block order will be preserved when running tasks.

set(name, *components, **parameters)

Set available components and the options of one block.

Parameters:
  • name – block name
  • components – the components (see Block.set())
  • parameters – block configuration (see Block.setup())

for example :

>>> engine = Engine("op1")
>>> engine.set("op1", Composable(), required=True, in_name="query", out_name="holygrail")
validate(inputs=None)

Check that the blocks configuration is ok

Parameters:inputs (list of str) – the names of the play inputs
class reliure.engine.PlayMeta(name)

Bases: reliure.engine.BasicPlayMeta

Object to store and manage metadata for the play of a set of components or blocks

>>> gres = PlayMeta("operation")
>>> res_plus = BasicPlayMeta(Composable(name="plus"))
>>> res_plus.time = 1.6
>>> res_moins = BasicPlayMeta(Composable(name="moins"))
>>> res_moins.time = 5.88
>>> gres.append(res_plus)
>>> gres.append(res_moins)
>>> from pprint import pprint
>>> pprint(gres.as_dict())
{'details': [{'errors': [], 'name': 'plus', 'time': 1.6, 'warnings': []},
             {'errors': [], 'name': 'moins', 'time': 5.88, 'warnings': []}],
 'errors': [],
 'name': 'operation:[plus, moins]',
 'time': 7.48,
 'warnings': []}
__init__(name)
add_error(error)

It is not possible to add an error here; you should add it on a BasicPlayMeta

append(meta)

Add a BasicPlayMeta

as_dict()

Pre-serialisation of the meta data

errors

get all the errors

>>> gres = PlayMeta("operation")
>>> res_plus = BasicPlayMeta(Composable(name="plus"))
>>> gres.append(res_plus)
>>> res_plus.add_error(ValueError("invalid data"))
>>> res_moins = BasicPlayMeta(Composable(name="moins"))
>>> gres.append(res_moins)
>>> res_plus.add_error(RuntimeError("server not answering"))
>>> gres.errors
[ValueError('invalid data',), RuntimeError('server not answering',)]
name

Compute a name according to sub meta results names

>>> gres = PlayMeta("operation")
>>> res_plus = BasicPlayMeta(Composable(name="plus"))
>>> res_moins = BasicPlayMeta(Composable(name="moins"))
>>> gres.append(res_plus)
>>> gres.append(res_moins)
>>> gres.name
'operation:[plus, moins]'
time

Compute the total time (walltime)

>>> gres = PlayMeta("operation")
>>> res_plus = BasicPlayMeta(Composable(name="plus"))
>>> res_plus.time = 1.6
>>> res_moins = BasicPlayMeta(Composable(name="moins"))
>>> res_moins.time = 5.88
>>> gres.append(res_plus)
>>> gres.append(res_moins)
>>> gres.time
7.48
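
The aggregation shown in these examples (composed name, summed time, collected errors) can be sketched with two small standalone classes. StepMeta and GroupMeta are hypothetical stand-ins written for this illustration; they are not the reliure classes.

```python
import math


class StepMeta:
    """Metadata of one component play (stand-in, not BasicPlayMeta)."""

    def __init__(self, name, time=0.0):
        self.name = name
        self.time = time
        self.errors = []


class GroupMeta:
    """Aggregates the metadata of several steps (stand-in, not PlayMeta)."""

    def __init__(self, name):
        self._name = name
        self._children = []

    def append(self, meta):
        self._children.append(meta)

    @property
    def name(self):
        # group name followed by the names of the sub-results
        children = ", ".join(child.name for child in self._children)
        return "%s:[%s]" % (self._name, children)

    @property
    def time(self):
        # total time is the sum of the sub-result times
        return sum(child.time for child in self._children)

    @property
    def errors(self):
        # errors live on the steps; the group only collects them
        return [err for child in self._children for err in child.errors]


plus = StepMeta("plus", 1.6)
moins = StepMeta("moins", 5.88)
plus.errors.append(ValueError("invalid data"))
group = GroupMeta("operation")
group.append(plus)
group.append(moins)
print(group.name)
print(math.isclose(group.time, 7.48))
```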
warnings

reliure.exceptions

exception reliure.exceptions.ReliureError

Bases: exceptions.Exception

Basic reliure error

exception reliure.exceptions.ReliurePlayError(msg)

Bases: exceptions.Exception

Error occurring at engine ‘play’ time

These errors can be shown to the user

>>> error = ReliurePlayError("an error message")
>>> error.msg
'an error message'
__init__(msg)
Parameters:msg – the message for the user
exception reliure.exceptions.ReliureTypeError

Bases: reliure.exceptions.ReliureError

Error in a reliure Type

exception reliure.exceptions.ReliureValueError

Bases: reliure.exceptions.ReliureError, exceptions.ValueError

Reliure value error: one value (attribute) was wrong

exception reliure.exceptions.ValidationError(message, params=None)

Bases: reliure.exceptions.ReliureTypeError

An error while validating data of a given type.

It may be either a single validation error or a list of validation errors

>>> from reliure.utils.i18n import _
>>> error = ValidationError(_("a message with a value : %(value)s"), {'value': 42})
>>> for err in error: print(err)
a message with a value : 42
__init__(message, params=None)

reliure.offline

reliure.offline.main()

Small run usage example

reliure.offline.run(pipeline, input_gen, options={})

Run a pipeline over an input generator

>>> # if we have a simple component
>>> from reliure.pipeline import Composable
>>> @Composable
... def print_each(letters):
...     for letter in letters:
...         print(letter)
...         yield letter
>>> # that we want to run over a given input:
>>> input = "abcde"
>>> # we just have to do :
>>> res = run(print_each, input)
a
b
c
d
e

It is also possible to run any reliure pipeline this way:

>>> import string
>>> pipeline = Composable(lambda letters: (l.upper() for l in letters)) | print_each
>>> res = run(pipeline, input)
A
B
C
D
E

reliure.offline.run_parallel(pipeline, input_gen, options={}, ncpu=4, chunksize=200)

Run a pipeline in parallel over an input generator, cutting it into small chunks.

>>> # if we have a simple component
>>> from reliure.pipeline import Composable
>>> # that we want to run over a given input:
>>> input = "abcde"
>>> import string
>>> pipeline = Composable(lambda letters: (l.upper() for l in letters))
>>> res = run_parallel(pipeline, input, ncpu=2, chunksize=2)
>>> #Note: res should be equals to [['C', 'D'], ['A', 'B'], ['E']]
>>> #but it seems that there is a bug with py.test and mp...
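
The chunking step can be sketched without any multiprocessing. The helpers below are hypothetical and run the chunks one after the other; a parallel runner could hand each chunk to a worker instead, in which case the chunk results may come back in a different order.

```python
from itertools import islice


def chunks(iterable, chunksize):
    """Cut any iterable (including a generator) into lists of chunksize items."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, chunksize))
        if not chunk:
            return
        yield chunk


def run_by_chunks(pipeline, input_gen, chunksize=200):
    """Apply the pipeline to each chunk in turn (sequential stand-in)."""
    return [list(pipeline(chunk)) for chunk in chunks(input_gen, chunksize)]


def upper_all(letters):
    return (letter.upper() for letter in letters)


print(run_by_chunks(upper_all, "abcde", chunksize=2))
```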

reliure.options

Option objects used in reliure.Optionable.

inheritance diagrams

Inheritance diagram of ValueOption

Class
class reliure.options.ListOption(name, otype)

Bases: reliure.options.Option

Option with multiple values

parse(values)
validate(values)
class reliure.options.Option(name, otype)

Bases: object

Abstract option value

static FromType(name, otype)

ValueOption subclasses factory, creates a convenient option to store data from a given Type.

attribute precedence :

  • |attrs| > 0 (multi and uniq are implicit) => NotImplementedError
  • uniq (multi is implicit) => NotImplementedError
  • multi and not uniq => NotImplementedError
  • not multi => ValueOption
Parameters:
  • name (str) – Name of the option
  • otype (subclass of GenericType) – the desired type of field
__init__(name, otype)
Parameters:
  • name (str) – option name
  • otype (subclass of GenericType) – option type
as_dict()

returns a dictionary view of the option

Returns:the option converted in a dict
Return type:dict
clear()

Reset the option value (default will be used)

default

Default value of the option

Warning

Changing the default value also changes the current value.

name

Name of the option.

parse(value)

Convert the value from data just decoded from json.

Parameters:value – a potential value for the option
Returns:the value converted to the correct type
set(value, parse=False)

Set the value of the option.

One can also set the ‘value’ property:

>>> opt = ValueOption("oname", Numeric(default=1,help="an option exemple"))
>>> opt.value = 12
Parameters:value – the new value
summary()
validate(value)

Raises ValidationError if the value is not correct, else just returns the given value.

It is called when a new value is set.

Parameters:value – the value to validate
Returns:the value
value

Value of the option

class reliure.options.SetOption(name, otype)

Bases: reliure.options.Option

Option with multiple values

parse(values)
validate(values)
class reliure.options.ValueOption(name, otype)

Bases: reliure.options.Option

Single value option

parse(value)
validate(value)

reliure.pipeline

inheritance diagrams

Inheritance diagram of reliure.pipeline

Class
class reliure.pipeline.Composable(func=None, name=None)

Bases: object

Basic composable element

Composable is abstract; you need to implement the __call__() method

>>> e1 = Composable(lambda element: element**2, name="e1")
>>> e2 = Composable(lambda element: element + 10, name="e2")

Then Composable can be pipelined this way :

>>> chain = e1 | e2
>>> # so you get:
>>> chain(2)
14
>>> # which is equivalent to :
>>> e2(e1(2))
14
>>> # note that by default the pipeline aggregates the component names
>>> chain.name
'e1|e2'
>>> # however you can override it
>>> chain.name = "chain"
>>> chain.name
'chain'

It is also possible to ‘map’ the composables:

>>> cmap = e1 & e2
>>> # so you get:
>>> cmap(2)
[4, 12]
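
To show how such operators can be supported, here is a minimal standalone sketch built on Python's __or__ and __and__ special methods. MiniComposable is a hypothetical class written for this illustration; it does not reproduce the actual reliure implementation, and the name given to an ‘&’ combination is an arbitrary choice of this sketch.

```python
class MiniComposable:
    """Tiny stand-in for a composable element (not the reliure class)."""

    def __init__(self, func, name=None):
        self.func = func
        self.name = name or getattr(func, "__name__", "anonymous")

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)

    @staticmethod
    def _wrap(other):
        # accept plain callables on the right-hand side
        return other if isinstance(other, MiniComposable) else MiniComposable(other)

    def __or__(self, other):
        # a | b: feed the output of a into b
        other = self._wrap(other)
        return MiniComposable(
            lambda *args, **kwargs: other(self(*args, **kwargs)),
            name="%s|%s" % (self.name, other.name),
        )

    def __and__(self, other):
        # a & b: give the same input to a and b, collect both outputs
        other = self._wrap(other)
        return MiniComposable(
            lambda *args, **kwargs: [self(*args, **kwargs), other(*args, **kwargs)],
            name="%s&%s" % (self.name, other.name),  # naming is an assumption
        )


e1 = MiniComposable(lambda element: element ** 2, name="e1")
e2 = MiniComposable(lambda element: element + 10, name="e2")
chain = e1 | e2
both = e1 & e2
print(chain.name, chain(2))
print(both(2))
```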

__call__(*args, **kwargs)
__init__(func=None, name=None)

You can create a Composable from a simple function:

>>> def square(val, pow=2):
...     return val ** pow
>>> cfct = Composable(square)
>>> cfct.name
'square'
>>> cfct(2)
4
>>> cfct(3, 3)
27
init_logger(reinit=False)
logger
name

Name of the optionable component

class reliure.pipeline.Map(comp, as_list=False)

Bases: reliure.pipeline.OptionableSequence

Apply a composable to each element of a generator-like input.

>>> item_process = Composable(lambda x: x+2) | Composable(lambda x: x if x > 3 else 0)
>>> flux_process = Map(item_process)
>>>
>>> inputs = range(5)
>>> [e for e in flux_process(inputs)]
[0, 0, 4, 5, 6]
__call__(*args, **kwargs)
__init__(comp, as_list=False)
class reliure.pipeline.MapReduce(reduce, *composables)

Bases: reliure.pipeline.MapSeq

MapReduce implementation for components

One can pass a simple function:

>>> mapseq = MapReduce(sum, lambda x: x+1, lambda x: x+2, lambda x: x+3)
>>> mapseq(10)
36

Or implement a subclass of MapReduce:

>>> class MyReduce(MapReduce):
...     def __init__(self, *composables):
...         super(MyReduce, self).__init__(None, *composables)
...     def reduce(self, array, *args, **kwargs):
...         return  list(args) + [sum(array)]
>>> mapreduce = MyReduce(lambda x: x+1, lambda x: x+2, lambda x: x+3)
>>> mapreduce(10)
[10, 36]
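
The map-then-reduce flow used in these examples can be written as one plain function: every function receives the same input, and the reduce function receives the list of results. map_reduce is a hypothetical helper for illustration only, not the class documented here.

```python
def map_reduce(reduce, functions, *args):
    """Apply every function to the same arguments, then reduce the results."""
    mapped = [function(*args) for function in functions]
    return reduce(mapped)


steps = [lambda x: x + 1, lambda x: x + 2, lambda x: x + 3]
print(map_reduce(sum, steps, 10))
print(map_reduce(max, steps, 10))
```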
__call__(*args, **kwargs)
__init__(reduce, *composables)
reduce(array, *args, **kwargs)
class reliure.pipeline.MapSeq(*composants, **kwargs)

Bases: reliure.pipeline.OptionableSequence

Map implementation for components

>>> mapseq = MapSeq(lambda x: x+1, lambda x: x+2, lambda x: x+3)
>>> mapseq(10)
[11, 12, 13]
>>> sum(mapseq(10))
36
__call__(*args, **kwargs)
map(*args, **kwargs)
class reliure.pipeline.Optionable(name=None)

Bases: reliure.pipeline.Composable

Abstract class for an optionable component

__init__(name=None)
Parameters:name (str) – name of the component
add_option(opt_name, otype, hidden=False)

Add an option to the object

Parameters:
  • opt_name (str) – option name
  • otype (subclass of GenericType) – option type
  • hidden (bool) – if True the option will be hidden
change_option_default(opt_name, default_val)

Change the default value of an option

Parameters:
  • opt_name (str) – option name
  • default_val – new default option value
static check(call_fct)

Decorator for the optionable __call__ method. It checks the given option values.

clear_option_value(opt_name)

Clear the stored option value (so the default will be used)

Parameters:opt_name (str) – option name
clear_options_values()

Clear all stored option values (so the defaults will be used)

force_option_value(opt_name, value)

Force the (default) value of an option. The option is then no longer listed by get_options().

Parameters:
  • opt_name (str) – option name
  • value – option value
get_option_default(opt_name)

Return the default value of a given option

Parameters:opt_name (str) – option name
Returns:the default value of the option
get_option_value(opt_name)

Return the value of a given option

Parameters:opt_name (str) – option name
Returns:the value of the option
get_options(hidden=False)
Parameters:hidden (bool) – whether to return hidden options
Returns:dictionary of all options (with option’s information)
Return type:dict
get_options_values(hidden=False)

return a dictionary of options values

Parameters:hidden (bool) – whether to return hidden options
Returns:dictionary of all option values
Return type:dict
get_ordered_options(hidden=False)
Parameters:hidden (bool) – whether to return hidden option
Returns:ordered list of options pre-serialised (as_dict)
Return type:list [opt_dict, ...]
has_option(opt_name)

Whether the component has a given option

option_is_hidden(opt_name)

Whether the given option is hidden

options
parse_options(option_values)

Set the options (with parsing) and returns a dict of all options values

print_options()

print description of the component options

set_option_value(opt_name, value, parse=False)

Set the value of one option.

# TODO/FIXME
  • add force/hide argument
  • add default argument
  • remove methods force option value
  • remove change_option_default
Parameters:
  • opt_name (str) – option name
  • value – the new value
  • parse (bool) – if True the value is converted from string to the correct type
set_options_values(options, parse=False, strict=False)

Set the options from a dict of values (in string).

Parameters:
  • options (dict) – the values of options (in the format {“opt_name”: “new_value”})
  • parse (bool) – whether to parse the given values
  • strict (bool) – if True, the given options dict should only contain existing options (no other keys)
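
The roles of parse and strict can be illustrated with a small standalone function. Everything here is an assumption made for the sketch: apply_option_values, the parsers mapping, and in particular the choice to silently ignore unknown keys when strict is False.

```python
def apply_option_values(current, parsers, new_values, parse=False, strict=False):
    """Update a dict of option values from a dict of (possibly string) values."""
    unknown = set(new_values) - set(current)
    if strict and unknown:
        raise ValueError("Unknown option(s): %s" % ", ".join(sorted(unknown)))
    for name, value in new_values.items():
        if name in current:
            # with parse=True the string is converted to the option's type
            current[name] = parsers[name](value) if parse else value
    return current


values = {"pow": 2, "label": "x"}       # hypothetical options
parsers = {"pow": int, "label": str}    # one parser per option
print(apply_option_values(values, parsers, {"pow": "3"}, parse=True))
```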
class reliure.pipeline.OptionableSequence(*composants, **kwargs)

Bases: reliure.pipeline.Optionable

Abstract class to manage a component made of a sequence of callables (Optionable or not).

This object is an Optionable which has all the options of its components

__call__(*args, **kwargs)
__init__(*composants, **kwargs)

Create an optionable sequence

Parameters:shared_option – allow two components to share an option, if the two options have the same name (False by default)

add_option(opt_name, otype, hidden=False)
call_item(item, *args, **kwargs)
change_option_default(opt_name, default_val)
clear_option_value(opt_name)
clear_options_values()
close()

Close all the nested components

force_option_value(opt_name, value)
get_option_default(opt_name)
get_option_value(opt_name)
get_options_values(hidden=False)
has_option(opt_name)
option_is_hidden(opt_name)
options
set_option_value(opt_name, value, parse=False)
set_options_values(option_values, parse=True, strict=False)
class reliure.pipeline.Pipeline(*composables, **kwargs)

Bases: reliure.pipeline.OptionableSequence

A Pipeline is a sequence of functions called sequentially.

It may be created explicitly:

>>> step1 = lambda x: x**2
>>> step2 = lambda x: x-1
>>> step3 = lambda x: min(x, 22)
>>> processing = Pipeline(step1, step2, step3)
>>> processing(4)
15
>>> processing(40)
22

Or it can be created implicitly with the pipe operator (__or__) if the first function is a Composable:

>>> step1 = Composable(step1)
>>> processing = step1 | step2 | step3
>>> processing(3)
8
>>> processing(0)
-1
__call__(*args, **kwargs)
__init__(*composables, **kwargs)

reliure.schema

copyright:
  (c) 2013 - 2014 by Yannick Chudy, Emmanuel Navarro.

inheritance diagrams

Inheritance diagram of Schema

Inheritance diagram of Doc

Inheritance diagram of DocField, VectorField, ValueField, SetField

Class
class reliure.schema.Doc(schema=None, **data)

Bases: dict

Document object

Here is an example of document construction from a simple text. First we define the document’s schema:

>>> from reliure.types import Text, Numeric
>>> term_field = Text(attrs={'tf':Numeric(default=1), 'positions':Numeric(multi=True)})
>>> schema = Schema(docnum=Numeric(), text=Text(), terms=term_field)

Here is the simple text from which the document is built:

>>> text = """i have seen chicken passing the street and i believed
... how many chicken must pass in the street before you
... believe"""

Then we can create the document:

>>> doc = Doc(schema, docnum=1, text=text)
>>> doc.text[:6]
'i have'
>>> len(doc.text)
113
>>> doc["docnum"]
1

Then we can analyse the text:

>>> tokens = text.split(' ')
>>> from collections import OrderedDict
>>> text_terms =  list(OrderedDict.fromkeys(tokens))
>>> terms_tf = [ tokens.count(k) for k in text_terms ]
>>> terms_pos = [[i for i, tok in enumerate(tokens) if tok == k ] for k in text_terms]

and one can store the result in the field “terms”:

>>> doc.terms = text_terms
>>> doc.terms.tf.values()   # here we got only '1', it's the default value
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
>>> doc.terms.tf = terms_tf
>>> doc.terms.positions = terms_pos

One can access the information, for example, for the term “chicken”:

>>> key = "chicken"
>>> doc.terms[key].tf
2
>>> doc.terms[key].positions
[3, 11]
>>> doc.terms.get_attr_value(key, 'positions')
[3, 11]
>>> doc.terms._keys[key]
3
>>> doc.terms.positions[3]
[3, 11]

#TODO: the value of docnum should be passed as an argument to __init__

__init__(schema=None, **data)

Document initialisation

Warning

a copy of the given schema is stored in the document

Simple example:

>>> from reliure.types import Text, Numeric
>>> doc = Doc(Schema(titre=Text()), titre='Un titre')

Note that a “docnum” field is always present, i.e. it is added if not given in the schema:

>>> doc = Doc(docnum="42")
>>> doc.docnum
'42'

add_field(name, ftype, docfield=None)

Add a field to the document (and to the underlying schema)

Parameters:
  • name (str) – name of the new field
  • ftype (subclass of GenericType) – type of the new field
export(exclude=[])

returns a dictionary representation of the document

get_field(name)

Returns the DocField field for the given name

set_field(name, value, parse=False)

Set the value of a field

class reliure.schema.DocField(ftype)

Bases: object

Abstract document field

These objects are containers of the document’s data.

static FromType(ftype)

DocField subclasses factory, creates a convenient field to store data from a given Type.

attribute precedence :

  • |attrs| > 0 (multi and uniq are implicit) => VectorField
  • uniq (multi is implicit) => SetField
  • multi and not uniq => ListField
  • not multi => ValueField
Parameters:ftype (subclass of GenericType) – the desired type of field
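
The precedence rules listed above amount to a short chain of tests. The function below is a hypothetical illustration that returns the name of the field class as a string; it is not the factory itself.

```python
def field_kind(multi=False, uniq=False, attrs=None):
    """Return the name of the field class the precedence rules would pick."""
    if attrs:
        # attributes win over everything (multi and uniq are implicit)
        return "VectorField"
    if uniq:
        # uniq implies multi
        return "SetField"
    if multi:
        return "ListField"
    return "ValueField"


print(field_kind(multi=True, uniq=True, attrs={"tf": "numeric"}))
print(field_kind(multi=True, uniq=True))
print(field_kind(multi=True))
print(field_kind())
```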
__init__(ftype)
Parameters:ftype (subclass of GenericType) – the type for the field
export()

Returns a serialisable representation of the field

ftype
get_value()

return the value of the field.

parse(value)
exception reliure.schema.FieldValidationError(field, value, errors)

Bases: exceptions.Exception

Error in a field validation

__init__(field, value, errors)
class reliure.schema.ListField(fieldtype)

Bases: reliure.schema.DocField, list

list container for non-uniq field

usage example:

>>> from reliure.types import Text
>>> schema = Schema(tags=Text(multi=True, uniq=False))
>>> doc = Doc(schema, docnum='abc42')
>>> doc.tags.add('boo')
>>> doc.tags.add('foo')
>>> doc.tags.add('foo')
>>> len(doc.tags)
3
>>> doc.tags.export()
['boo', 'foo', 'foo']
__init__(fieldtype)
add(value)

Adds a value to the list (like append). Convenience method, to have the same signature as SetField and VectorField.

append(value)
export()

Returns a list pre-serialisation of the field

>>> from reliure.types import Text
>>> doc = Doc(docnum='1')
>>> doc.terms = Text(multi=True) 
>>> doc.terms.add('rat')
>>> doc.terms.add('chien')
>>> doc.terms.add('chat')
>>> doc.terms.add('léopart')
>>> doc.terms.export()
['rat', 'chien', 'chat', 'l\xe9opart']
get_value()
parse(value)
set(values)

set new values (values have to be iterable)

class reliure.schema.Schema(**fields)

Bases: object

Schema definition for documents (Doc). Class inspired from Matt Chaput’s Whoosh.

Creating a schema:

>>> from reliure.types import Text, Numeric
>>> schema = Schema(title=Text(), score=Numeric())
>>> sorted(schema.field_names())
['score', 'title']
__init__(**fields)

Create a schema from pairs of field name and field type

For example:

>>> from reliure.types import Text, Numeric
>>> schema = Schema(tags=Text(multi=True), score=Numeric(vtype=float, min=0., max=1.))
add_field(name, field)

Add a named field to the schema.

Warning

The field name should not contain spaces and should not start with an underscore.

Parameters:
  • name (str) – name of the new field
  • field (subclass of GenericType) – type instance for the field
copy()

Returns a copy of the schema

field_names()
has_field(name)
iter_fields()
remove_field(field_name)
exception reliure.schema.SchemaError

Bases: exceptions.Exception

Error

class reliure.schema.SetField(fieldtype)

Bases: reliure.schema.DocField, set

Document field for a set of values (i.e. the fieldtype is “multi” and “uniq”)

usage example:

>>> from reliure.types import Text
>>> schema = Schema(tags=Text(multi=True, uniq=True))
>>> doc = Doc(schema, docnum='abc42')
>>> doc.tags.add('boo')
>>> doc.tags.add('foo')
>>> len(doc.tags)
2
>>> sorted(doc.tags.export())
['boo', 'foo']
__init__(fieldtype)
add(value)
export()
get_value()
parse(value)
set(values)
class reliure.schema.ValueField(fieldtype)

Bases: reliure.schema.DocField

Stores only one value

usage example:

>>> from reliure.types import Text
>>> schema = Schema(title=Text(), like=Numeric(default=45))
>>> doc = Doc(schema, docnum='abc42')
>>> # 'title' field
>>> doc.title = 'Un titre cool !'
>>> doc.title
'Un titre cool !'
>>> doc.get_field('title').export()
'Un titre cool !'
>>> doc.get_field('title').ftype
Text(multi=False, uniq=False, default=, attrs=None)
>>> # 'like' field
>>> doc.like
45
__init__(fieldtype)
export()
get_value()
set(value)
class reliure.schema.VectorAttr(vector, attr)

Bases: object

Internal class used to access an attribute of a VectorField

>>> from reliure.types import Text, Numeric
>>> doc = Doc(docnum='1')
>>> doc.terms = Text(multi=True, uniq=True, attrs={'tf': Numeric()}) 
>>> doc.terms.add('chat')
>>> type(doc.terms.tf)
<class 'reliure.schema.VectorAttr'>
__init__(vector, attr)
export()
values()
class reliure.schema.VectorField(ftype)

Bases: reliure.schema.DocField

More complex document field container

Hide:
>>> from pprint import pprint

usage:

>>> from reliure.types import Text, Numeric
>>> doc = Doc(docnum='1')
>>> doc.terms = Text(multi=True, uniq=True, attrs={'tf': Numeric()}) 
>>> doc.terms.add('chat')
>>> doc.terms['chat'].tf = 12
>>> doc.terms['chat'].tf
12
>>> doc.terms.add('dog', tf=55)
>>> doc.terms['dog'].tf
55

One can also add an attribute after the field is created:

>>> doc.terms.add_attribute('foo', Numeric(default=42))
>>> doc.terms.foo.values()
[42, 42]
>>> doc.terms['dog'].foo = 20
>>> doc.terms.foo.values()
[42, 20]

It is also possible to delete elements from the field:

>>> pprint(doc.terms.export())
{'foo': [42, 20], 'keys': {'chat': 0, 'dog': 1}, 'tf': [12, 55]}
>>> del doc.terms['chat']
>>> pprint(doc.terms.export())
{'foo': [20], 'keys': {'dog': 0}, 'tf': [55]}
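
The exported structure suggests a layout where each key maps to an index and each attribute is stored as a parallel list. The class below is a hedged standalone sketch of that layout, including the re-indexing needed when a key is deleted; MiniVector is hypothetical and is not the VectorField implementation.

```python
class MiniVector:
    """Keys mapped to positions, one parallel list per attribute (sketch)."""

    def __init__(self, attr_defaults):
        self._defaults = dict(attr_defaults)
        self._keys = {}
        self._attrs = {name: [] for name in attr_defaults}

    def add(self, key, **values):
        if key in self._keys:
            return  # do nothing if the key is already present
        unknown = set(values) - set(self._attrs)
        if unknown:
            raise ValueError("Invalid attribute name: %r" % sorted(unknown)[0])
        self._keys[key] = len(self._keys)
        for name, column in self._attrs.items():
            column.append(values.get(name, self._defaults[name]))

    def remove(self, key):
        index = self._keys.pop(key)
        for column in self._attrs.values():
            del column[index]
        # shift the positions of the keys stored after the removed one
        for other in list(self._keys):
            if self._keys[other] > index:
                self._keys[other] -= 1

    def export(self):
        data = {"keys": dict(self._keys)}
        data.update({name: list(column) for name, column in self._attrs.items()})
        return data


terms = MiniVector({"tf": None})
terms.add("chat", tf=12)
terms.add("dog", tf=55)
print(terms.export())
terms.remove("chat")
print(terms.export())
```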
__init__(ftype)
add(key, **kwargs)

Add a key to the vector, do nothing if the key is already present

>>> doc = Doc(docnum='1')
>>> doc.terms = Text(multi=True, attrs={'tf': Numeric(default=1, min=0)}) 
>>> doc.terms.add('chat')
>>> doc.terms.add('dog', tf=2)
>>> doc.terms.tf.values()
[1, 2]
>>> doc.terms.add('mouse', comment="a small mouse")
Traceback (most recent call last):
...
ValueError: Invalid attribute name: 'comment'
>>> doc.terms.add('mouse', tf=-2)
Traceback (most recent call last):
ValidationError: ['Ensure this value ("-2") is greater than or equal to 0.']
add_attribute(name, ftype)

Add a data attribute. Note that the field type will be modified!

Parameters:
  • name (str) – name of the new attribute
  • ftype (subclass of GenericType) – type of the new attribute
attribute_names()

returns the names of field’s data attributes

Returns:set of attribute names
Return type:frozenset
clear_attributes()

removes all attributes

export()

Returns a dictionary pre-serialisation of the field

Hide:
>>> from pprint import pprint
>>> from reliure.types import Text, Numeric
>>> doc = Doc(docnum='1')
>>> doc.terms = Text(multi=True, uniq=True, attrs={'tf': Numeric(default=1)}) 
>>> doc.terms.add('chat')
>>> doc.terms.add('rat', tf=5)
>>> doc.terms.add('chien', tf=2)
>>> pprint(doc.terms.export())
{'keys': {'chat': 0, 'chien': 2, 'rat': 1}, 'tf': [1, 5, 2]}
get_attr_value(key, attr)

returns the value of a given attribute for a given key

>>> doc = Doc(docnum='1')
>>> doc.terms = Text(multi=True, uniq=True, attrs={'tf': Numeric()}) 
>>> doc.terms.add('chat', tf=55)
>>> doc.terms.get_attr_value('chat', 'tf')
55
get_attribute(name)
get_value()

from DocField, convenient method

has(key)
keys()

list of keys in the vector

set(keys)

Set new keys. Note that this will clear all attributes and keys before adding the new keys.

>>> doc = Doc(docnum='1')
>>> doc.terms = Text(multi=True, attrs={'tf': Numeric(default=1)}) 
>>> doc.terms.add('copmputer', tf=12)
>>> doc.terms.tf.values()
[12]
>>> doc.terms.set(['keyboard', 'mouse'])
>>> list(doc.terms)
['keyboard', 'mouse']
>>> doc.terms.tf.values()
[1, 1]
set_attr_value(key, attr, value)

set the value of a given attribute for a given key

class reliure.schema.VectorItem(vector, key)

Bases: object

Internal class used to access an item (= a value) of a VectorField

>>> from reliure.types import Text, Numeric
>>> doc = Doc(docnum='1')
>>> doc.terms = Text(multi=True, uniq=True, attrs={'tf': Numeric()}) 
>>> doc.terms.add('chat')
>>> type(doc.terms['chat'])
<class 'reliure.schema.VectorItem'>
__init__(vector, key)
as_dict()
attribute_names()

reliure.types

inheritance diagrams

Inheritance diagram of reliure.types

Class
class reliure.types.Boolean(**kwargs)

Bases: reliure.types.GenericType

TRUE_VALUES = set([True, 'oui', 'o', '1', 'yes', 'true'])
__init__(**kwargs)
default_validators = [<reliure.validators.TypeValidator object at 0x7efc62433090>]
parse(value)
class reliure.types.Datetime(**kwargs)

Bases: reliure.types.GenericType

datetime type

__init__(**kwargs)
as_dict()
parse(value)
class reliure.types.GenericType(default=None, help='', multi=None, uniq=None, choices=None, attrs=None, validators=[], parse=None, serialize=None)

Bases: object

Define a type.

__init__(default=None, help='', multi=None, uniq=None, choices=None, attrs=None, validators=[], parse=None, serialize=None)
Parameters:
  • default – default value for the field
  • help (str) – description of what the data is
  • multi (bool) – field is a list or a set
  • uniq (bool) – whether the values are unique; only applies if multi is True
  • choices (list) – if set, the value should be one of the given choices
  • attrs – field attributes, dictionary of {“name”: AbstractType()}
  • validators – list of additional validators
  • parse – a parsing function
  • serialize – a pre-serialization function
as_dict()

returns a dictionary view of the option

Returns:the option converted in a dict
Return type:dict
default

Default value of the type

default_validators = []
parse(value)

parsing from string

serialize(value, **kwargs)

pre-serialize value

validate(value)

Abstract method; checks whether a value is correct (type). Should raise TypeError if the validation fails.

Parameters:value – the value to validate
Returns:the given value (that may have been converted)
class reliure.types.Numeric(vtype=<type 'int'>, min=None, max=None, **kwargs)

Bases: reliure.types.GenericType

Numerical type (int or float)

__init__(vtype=<type 'int'>, min=None, max=None, **kwargs)
Parameters:
  • vtype – the type of numbers that can be stored in this field, either int, float.
  • signed (bool) – whether the value may be negative (True by default)
  • min – if not None, the minimal possible value
  • max – if not None, the maximal possible value
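
The combined effect of vtype, min and max can be sketched as a parse step followed by bound checks. parse_bounded is a hypothetical standalone function; it raises a plain ValueError, whereas the examples in this documentation show a ValidationError being raised.

```python
def parse_bounded(raw, vtype=int, minimum=None, maximum=None):
    """Convert raw to vtype, then check the optional bounds."""
    value = vtype(raw)  # raises ValueError on text such as "abc"
    if minimum is not None and value < minimum:
        raise ValueError(
            'Ensure this value ("%s") is greater than or equal to %s.' % (raw, minimum)
        )
    if maximum is not None and value > maximum:
        raise ValueError(
            'Ensure this value ("%s") is less than or equal to %s.' % (raw, maximum)
        )
    return value


print(parse_bounded("2", maximum=12))
print(parse_bounded("0.5", vtype=float, minimum=0.0, maximum=1.0))
```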
as_dict()
parse(value)
class reliure.types.Text(encoding=None, **kwargs)

Bases: reliure.types.GenericType

Text type (in Python 2, take care that it is unicode)

If not set, the default value is an empty string.

__init__(encoding=None, **kwargs)
as_dict()
default_encoding = 'utf8'
parse(value)

reliure.utils

reliure.utils.log

Helper function to setup a basic logger for a reliure app

class reliure.utils.log.ColorFormatter(*args, **kwargs)

Bases: logging.Formatter

BLACK = 0
BLUE = 4
BOLD_SEQ = '\x1b[1m'
COLORS = {'BLUE': 4, 'YELLOW': 3, 'GREEN': 2, 'ERROR': 1, 'DEBUG': 4, 'CYAN': 6, 'WHITE': 7, 'RED': 1, 'INFO': 7, 'WARNING': 3, 'CRITICAL': 3, 'MAGENTA': 5}
COLOR_SEQ = '\x1b[1;%dm'
CYAN = 6
GREEN = 2
MAGENTA = 5
RED = 1
RESET_SEQ = '\x1b[0m'
WHITE = 7
YELLOW = 3
__init__(*args, **kwargs)
format(record)
class reliure.utils.log.SpeedLogger(each=1000, elements='documents')

Bases: reliure.pipeline.Composable

Pipeline element that does nothing but log the processing speed every K elements and at the end.

if you have a processing pipe, for example:

>>> from reliure.pipeline import Composable
>>> pipeline = Composable(lambda data: (x**3 for x in data))

you can add a SpeedLogger in it:

>>> pipeline |= SpeedLogger(each=30000, elements="numbers")

And then when you run your pipeline:

>>> import logging
>>> from reliure.utils.log import get_basic_logger
>>> logger = get_basic_logger(logging.INFO)
>>> from reliure.offline import run
>>> results = run(pipeline, range(100000))

You will get log output like this:

2014-09-26 15:19:56,139:INFO:reliure.SpeedLogger:Process 30000 numbers in 0.010 sec (2894886.12 numbers/sec)
2014-09-26 15:19:56,148:INFO:reliure.SpeedLogger:Process 30000 numbers in 0.008 sec (3579572.14 numbers/sec)
2014-09-26 15:19:56,156:INFO:reliure.SpeedLogger:Process 30000 numbers in 0.008 sec (3719343.80 numbers/sec)
2014-09-26 15:19:56,159:INFO:reliure.SpeedLogger:Process 9997 numbers in 0.003 sec (3367096.85 numbers/sec)
2014-09-26 15:19:56,159:INFO:reliure.SpeedLogger:In total: 100000 numbers proceded in 0.030 sec (3307106.53 numbers/sec)
__call__(inputs)
__init__(each=1000, elements='documents')
Parameters:
  • each – number of elements between two speed log lines
  • elements – name of the elements in the produced log lines
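The behaviour described above (pass the data through unchanged, log the throughput every each elements and once at the end) can be sketched with a plain generator. This is only an illustration under stated assumptions: speed_logged is a hypothetical function, the logger name is arbitrary, and the message wording is not guaranteed to match reliure's.

```python
import logging
import time


def speed_logged(inputs, each=1000, elements="documents", logger=None):
    """Hypothetical generator: yields inputs unchanged and logs throughput."""
    if logger is None:
        logger = logging.getLogger("speed_sketch")
    start = batch_start = time.time()
    count = 0
    for count, item in enumerate(inputs, 1):
        yield item
        if count % each == 0:
            now = time.time()
            elapsed = max(now - batch_start, 1e-9)  # avoid division by zero
            logger.info("Process %d %s in %.3f sec (%.2f %s/sec)",
                        each, elements, elapsed, each / elapsed, elements)
            batch_start = now
    end = time.time()
    remaining = count % each
    if remaining:
        # last, incomplete batch
        elapsed = max(end - batch_start, 1e-9)
        logger.info("Process %d %s in %.3f sec (%.2f %s/sec)",
                    remaining, elements, elapsed, remaining / elapsed, elements)
    total = max(end - start, 1e-9)
    logger.info("In total: %d %s processed in %.3f sec (%.2f %s/sec)",
                count, elements, total, count / total, elements)


cubes = list(speed_logged((x**3 for x in range(10)), each=4, elements="numbers"))
```

Because the sketch is a generator, nothing is logged until the output is actually consumed, which matches the lazy style of the pipeline example above.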
reliure.utils.log.get_app_logger_color(appname, app_log_level=20, log_level=30, logfile=None)

Configure the logging for an app using reliure (it logs both the app and the reliure lib)

Parameters:
  • appname – the name of the application to log
  • app_log_level – log level for the app
  • log_level – log level for the reliure lib
  • logfile – file that stores the log, as a time-rotating file (by day); no log file if None
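As a rough picture of what such a configuration involves, the sketch below sets two loggers to different levels and optionally adds a daily rotating file, using only the standard library. setup_app_logging is a hypothetical stand-in, not the reliure function; the log format and the returned value are assumptions.

```python
import logging
from logging.handlers import TimedRotatingFileHandler


def setup_app_logging(appname, app_log_level=logging.INFO,
                      log_level=logging.WARNING, logfile=None):
    """Hypothetical stand-in: same handlers for the app and the library."""
    formatter = logging.Formatter(
        "%(asctime)s:%(levelname)s:%(name)s:%(message)s")
    stream_handler = logging.StreamHandler()
    stream_handler.setFormatter(formatter)
    handlers = [stream_handler]
    if logfile is not None:
        # rotate the log file once a day, as described for `logfile`
        file_handler = TimedRotatingFileHandler(logfile, when="D")
        file_handler.setFormatter(formatter)
        handlers.append(file_handler)
    app_logger = logging.getLogger(appname)
    lib_logger = logging.getLogger("reliure")
    for logger, level in ((app_logger, app_log_level), (lib_logger, log_level)):
        logger.setLevel(level)
        for handler in handlers:
            logger.addHandler(handler)
    return app_logger


app_logger = setup_app_logging("my_app")
```

The default values 20 and 30 in the documented signature are the numeric values of logging.INFO and logging.WARNING.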

reliure.utils.log.get_basic_logger(level=30, scope='reliure')

Return a basic logger that prints messages from the reliure lib on stdout

reliure.utils.i18n

helpers for internationalisation

reliure.utils.cli
copyright:
  2013 - 2014 by Yannick Chudy, Emmanuel Navarro.
license:

${LICENSE}

Helper functions to set up an argparse parser from reliure components and engines

reliure.utils.cli.argument_from_option(parser, component, opt_name, prefix='')

Add an argparse argument to a parser from one option of one Optionable

>>> import argparse
>>> from reliure import Optionable
>>> from reliure.types import Numeric
>>> from reliure.utils.cli import argument_from_option
>>> comp = Optionable()
>>> comp.add_option("num", Numeric(default=1, max=12, help="An example of option"))
>>> parser = argparse.ArgumentParser(prog="PROG")
>>> argument_from_option(parser, comp, "num")
>>> parser.print_help()
usage: PROG [-h] [--num NUM]

optional arguments:
  -h, --help  show this help message and exit
  --num NUM   An example of option
>>> parser.parse_args(["--num", "2"])
Namespace(num=2)
>>> parser.parse_args(["--num", "20"])
Traceback (most recent call last):
...
ValidationError: [u'Ensure this value ("20") is less than or equal to 12.']
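The idea of validating an option value while parsing the command line can be reproduced with argparse alone. The sketch below is an illustration, not what argument_from_option does internally: bounded_int is a hypothetical helper, and unlike the doctest above argparse reports the error itself and exits instead of letting a ValidationError propagate.

```python
import argparse


def bounded_int(max_value):
    """Build an argparse `type` callable rejecting values above max_value."""
    def parse(text):
        number = int(text)
        if number > max_value:
            raise argparse.ArgumentTypeError(
                'Ensure this value ("%s") is less than or equal to %s.'
                % (number, max_value))
        return number
    return parse


parser = argparse.ArgumentParser(prog="PROG")
parser.add_argument("--num", type=bounded_int(12), default=1,
                    help="An example of option")
args = parser.parse_args(["--num", "2"])
```

With this sketch, parsing ["--num", "20"] makes argparse print a usage error and raise SystemExit.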
reliure.utils.cli.arguments_from_optionable(parser, component, prefix='')

Add argparse arguments from all options of one Optionable

>>> import argparse
>>> from reliure import Optionable
>>> from reliure.types import Numeric, Text, Boolean
>>> from reliure.utils.cli import arguments_from_optionable
>>> # Let's build a dummy optionable component:
>>> comp = Optionable()
>>> comp.add_option("num", Numeric(default=1, max=12, help="An example of option"))
>>> comp.add_option("title", Text(help="The title of the title"))
>>> comp.add_option("ok", Boolean(help="is it ok ?", default=True))
>>> comp.add_option("cool", Boolean(help="is it cool ?", default=False))
>>>
>>> # one can then register all the options of this component to an argument parser
>>> parser = argparse.ArgumentParser(prog="PROG")
>>> arguments_from_optionable(parser, comp)
>>> parser.print_help()
usage: PROG [-h] [--num NUM] [--title TITLE] [--not-ok] [--cool]

optional arguments:
  -h, --help     show this help message and exit
  --num NUM      An example of option
  --title TITLE  The title of the title
  --not-ok       is it ok ?
  --cool         is it cool ?

The option values for a component can then be retrieved with get_config_for():

>>> args = parser.parse_args() 
>>> config = get_config_for(args, comp)
>>> comp("input", **config) 
"comp_result"
reliure.utils.cli.get_config_for(args, component, prefix='')

Returns a dictionary of option values for a given component

See arguments_from_optionable() documentation
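The help output shown for arguments_from_optionable() suggests that a boolean option defaulting to True is exposed as a negative flag (--not-ok), while one defaulting to False is a plain flag (--cool). The sketch below reproduces that pattern with argparse only and then collects the values in a dictionary. config_for is a hypothetical helper, and the way it applies prefix (simply prepended to the option name) is an assumption.

```python
import argparse

parser = argparse.ArgumentParser(prog="PROG")
# option defaulting to True: a negative flag switches it off
parser.add_argument("--not-ok", dest="ok", action="store_false", default=True)
# option defaulting to False: a plain flag switches it on
parser.add_argument("--cool", dest="cool", action="store_true", default=False)
parser.add_argument("--title", dest="title", default="")


def config_for(args, option_names, prefix=""):
    """Hypothetical helper: pick the option values out of parsed args."""
    return {name: getattr(args, prefix + name) for name in option_names}


args = parser.parse_args(["--not-ok", "--title", "hello"])
config = config_for(args, ["ok", "cool", "title"])
```

The resulting dictionary can then be passed as keyword arguments to a component call, in the same way as the config value in the doctest above.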

reliure.utils.deprecated(new_fct_name, logger=None)

Decorator to notify that a function is deprecated
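A decorator with this signature is usually a decorator factory. The following minimal sketch shows the general pattern; deprecated_sketch is a hypothetical name, and the warning text and default logger are assumptions rather than reliure's actual behaviour.

```python
import functools
import logging


def deprecated_sketch(new_fct_name, logger=None):
    """Hypothetical decorator factory: warn on each call, then run the function."""
    if logger is None:
        logger = logging.getLogger("deprecation_sketch")

    def decorator(fct):
        @functools.wraps(fct)  # keep the name and docstring of the old function
        def wrapper(*args, **kwargs):
            logger.warning("%s is deprecated, use %s instead",
                           fct.__name__, new_fct_name)
            return fct(*args, **kwargs)
        return wrapper
    return decorator


@deprecated_sketch("new_add")
def old_add(a, b):
    return a + b
```

Calling old_add(1, 2) still returns the sum, and a warning naming new_add is logged on each call.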

reliure.utils.engine_schema(engine, out_names=None, filename=None, format='pdf')

Build a graphviz schema of a reliure Engine.

It depends on graphviz.

Parameters:
  • engine (Engine) – reliure engine to graph
  • out_names – list of block outputs to consider as engine outputs (all by default)
  • filename – output filename, the file is created if given
  • format – output file format (pdf, svg, png, ...)
>>> from reliure.engine import Engine
>>> egn = Engine("preproc", "proc1", "proc2")
>>> egn.preproc.setup(in_name="input", out_name="data")
>>> egn.proc1.setup(in_name="data", out_name="gold_data")
>>> egn.proc2.setup(in_name="input", out_name="silver_data")
>>> # you can then create the schema
>>> schema = engine_schema(egn, filename='docs/img/engine_schema', format='png')

It creates the following image:

../_static/engine_schema.png

You can specify which block outputs will be considered as engine outputs:

>>> schema = engine_schema(egn, ["gold_data", "silver_data"], filename='docs/img/engine_schema_simple', format='png')
../_static/engine_schema_simple.png

Note that it is also possible to draw a PDF:

>>> schema = engine_schema(egn, ["gold_data", "silver_data"], filename='docs/img/engine_schema_simple', format='pdf')
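To give an idea of the kind of graph such a schema describes, the sketch below writes the three blocks of the example as text in the graphviz DOT language, using the standard library only. It is not how engine_schema is implemented: blocks_to_dot is a hypothetical helper and the node styling is arbitrary.

```python
def blocks_to_dot(blocks):
    """Hypothetical helper: render (block, in_name, out_name) triples as DOT."""
    lines = ["digraph engine {"]
    for block, in_name, out_name in blocks:
        lines.append('    "%s" [shape=box];' % block)
        # one edge from the input data to the block, one from the block to its output
        lines.append('    "%s" -> "%s";' % (in_name, block))
        lines.append('    "%s" -> "%s";' % (block, out_name))
    lines.append("}")
    return "\n".join(lines)


dot_source = blocks_to_dot([
    ("preproc", "input", "data"),
    ("proc1", "data", "gold_data"),
    ("proc2", "input", "silver_data"),
])
```

The resulting text can be saved to a file and rendered with the graphviz dot command line tool.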
reliure.utils.parse_bool(value)

Convert a string to a boolean

>>> parse_bool(None)
False
>>> parse_bool("true")
True
>>> parse_bool("TRUE")
True
>>> parse_bool("yes")
True
>>> parse_bool("1")
True
>>> parse_bool("false")
False
>>> parse_bool("sqdf")
False
>>> parse_bool(False)
False
>>> parse_bool(True)
True
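The examples above are consistent with a very small conversion rule. The sketch below is a re-implementation guessed from those examples only: parse_bool_sketch is a hypothetical name, and the set of strings accepted as true is an assumption (the real function may accept more).

```python
def parse_bool_sketch(value):
    """Hypothetical re-implementation matching the documented examples."""
    if isinstance(value, bool):
        return value
    if value is None:
        return False
    # only a few affirmative strings count as True; anything else is False
    return str(value).strip().lower() in ("true", "yes", "1")


flags = [parse_bool_sketch(v) for v in ("TRUE", "yes", "sqdf", None)]
```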

reliure.validators

class reliure.validators.ChoiceValidator(ref_value)

Bases: reliure.validators.compareValidator

compare(value, ref)
message = 'Ensure this value ("%(show_value)s") is in %(ref_value)s.'
class reliure.validators.MaxValueValidator(ref_value)

Bases: reliure.validators.compareValidator

compare(value, ref)
message = 'Ensure this value ("%(show_value)s") is less than or equal to %(ref_value)s.'
class reliure.validators.MinValueValidator(ref_value)

Bases: reliure.validators.compareValidator

compare(value, ref)
message = 'Ensure this value ("%(show_value)s") is greater than or equal to %(ref_value)s.'
class reliure.validators.TypeValidator(vtype, message=None)

Bases: object

Validate that a value has a given type

__call__(value)
__init__(vtype, message=None)
message = "The value '%(show_value)s' is not of type %(value_type)s"
class reliure.validators.compareValidator(ref_value)

Bases: object

Validate a value by comparing it to a reference value

__call__(value)
__init__(ref_value)
compare(value, ref)
message = "The value '%(show_value)s' is not %(ref_value)s"
preprocess(value)
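The validators above share one pattern: a base class holds the reference value and the message template, and each subclass overrides compare(). Here is a minimal standalone sketch of that pattern. The class names are hypothetical, the convention that compare() returns True for a rejected value is an assumption, and ValueError stands in for the library's own validation error.

```python
class CompareValidatorSketch(object):
    """Hypothetical base: compare a value to a reference, raise on failure."""
    message = "The value '%(show_value)s' is not %(ref_value)s"

    def __init__(self, ref_value):
        self.ref_value = ref_value

    def compare(self, value, ref):
        # assumption of this sketch: True means the value is rejected
        return value != ref

    def preprocess(self, value):
        return value

    def __call__(self, value):
        cleaned = self.preprocess(value)
        if self.compare(cleaned, self.ref_value):
            params = {"show_value": cleaned, "ref_value": self.ref_value}
            raise ValueError(self.message % params)


class MaxValueSketch(CompareValidatorSketch):
    message = ('Ensure this value ("%(show_value)s") is less than '
               'or equal to %(ref_value)s.')

    def compare(self, value, ref):
        return value > ref


check_max = MaxValueSketch(12)
check_max(2)  # passes silently
```

Calling check_max(20) raises an error whose text is built from the message template of the subclass.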

reliure.web

Helpers to build an HTTP/JSON API from reliure engines

reliure.web.app_routes(app)

List of the routes of an app

class reliure.web.EngineView(engine, name=None)

Bases: object

View over an Engine or a Block

>>> from reliure.engine import Engine
>>> from reliure.web import EngineView, ReliureAPI
>>> engine = Engine("count_a", "count_b")
>>>
>>> engine.count_a.setup(in_name='in')
>>> engine.count_a.set(lambda chaine: chaine.count("a"))
>>>
>>> engine.count_b.setup(in_name='in')
>>> engine.count_b.set(lambda chaine: chaine.count("b"))
>>> 
>>> 
>>> # we can build a view on this engine
>>> egn_view = EngineView(engine, name="count")
>>> egn_view.add_output("count_a")  # note that by default block outputs are named by block's name
>>> egn_view.add_output("count_b")
>>> # one can specify a short url for this engine
>>> egn_view.play_route("/q/<in>")
>>>
>>> # this view can be added to a reliure API
>>> api = ReliureAPI("api")
>>> api.register_view(egn_view)
__init__(engine, name=None)
add_input(in_name, type_or_parse=None)

Declare a possible input

add_output(out_name, type_or_serialize=None, **kwargs)

Declare an output

options()

HTTP entry point to discover the engine options

parse_request()

Parse request for play()

play()

Main http entry point: run the engine

play_route(*routes)

Define routes for GET play.

This uses the Flask route syntax, see: http://flask.pocoo.org/docs/0.10/api/#url-route-registrations

run(inputs_data, options)

Run the engine/block according to some inputs data and options

It is called from play()

Parameters:
  • inputs_data – dict of input data
  • options – engine/block configuration dict
set_input_type(type_or_parse)

Set a unique input type.

If you use this, the view has only one input for play().

set_outputs(*outputs)

Set the outputs of the view

short_play(**kwargs)

Main http entry point: run the engine

class reliure.web.ComponentView(component)

Bases: reliure.web.EngineView

View over a simple component (Composable or simple function)

__init__(component)
add_input(in_name, type_or_parse=None)
add_output(out_name, type_or_serialize=None, **kwargs)
class reliure.web.ReliureAPI(name='api', url_prefix=None, expose_route=True, **kwargs)

Bases: flask.blueprints.Blueprint

Standard Flask JSON API view over a reliure Engine.

This is a Flask Blueprint (see http://flask.pocoo.org/docs/blueprints/)

Here is a simple usage example:

>>> from reliure.engine import Engine
>>> from reliure.web import EngineView, ReliureAPI
>>> engine = Engine("process")
>>> engine.process.setup(in_name="in", out_name="out")
>>> # setup the block's component
>>> engine.process.set(lambda x: x**2)
>>> 
>>> # configure a view for the engine
>>> egn_view = EngineView(engine)
>>> # configure view input/output
>>> from reliure.types import Numeric
>>> egn_view.set_input_type(Numeric())
>>> egn_view.add_output("out")
>>> 
>>> ## create the API blueprint
>>> api = ReliureAPI()
>>> api.register_view(egn_view, url_prefix="egn")
>>>
>>> # here you get your blueprint
>>> # you can register it to your app with
>>> app.register_blueprint(api, url_prefix="/api")    

Then you will have three routes:

  • [GET] /api/: returns a JSON that describes your api routes
  • [GET] /api/egn: returns a JSON that describes your engine
  • [POST] /api/egn: run the engine itself

To use the “POST” entry point you can do:

>>> import json
>>> import requests
>>> request = {
...     "in": 5,        # this is the name of my input
...     "options": {}   # this is the api/engine configuration
... }
>>> res = requests.post(
...     SERVER_URL+"/api/egn",
...     data=json.dumps(request),
...     headers={"content-type": "application/json"}
... )
>>> data = res.json()
{
    meta: {...}
    results: {
        "out": 25
    }
}
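If the requests package is not available, the same POST call can be prepared with the standard library. This is a sketch under stated assumptions: SERVER_URL is a hypothetical address, and the request is only built here, not sent, because sending requires a running server.

```python
import json
import urllib.request

SERVER_URL = "http://localhost:5000"  # hypothetical address of the Flask app

payload = {
    "in": 5,        # name of the engine input
    "options": {},  # api/engine configuration
}
req = urllib.request.Request(
    SERVER_URL + "/api/egn",
    data=json.dumps(payload).encode("utf8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# sending is left commented out because it needs a running server:
# with urllib.request.urlopen(req) as res:
#     data = json.loads(res.read().decode("utf8"))
```

Setting method="POST" explicitly makes the intent clear; the [GET] route on the same URL only describes the engine.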
__init__(name='api', url_prefix=None, expose_route=True, **kwargs)

Build the Blueprint view over an Engine.

Parameters:
  • name – the name of this api (used as url prefix by default)
  • expose_route – whether / returns all api routes (True by default)
register(app, options, first_registration=False)
register_view(view, url_prefix=None)

Associate an EngineView with this api

class reliure.web.RemoteApi(url, **kwargs)

Bases: flask.blueprints.Blueprint

Proxy to a remote ReliureJsonAPI

__init__(url, **kwargs)

Parameters:
  • url – engine api url

add_url_rule(path, endpoint, *args, **kwargs)
forward(**kwargs)

Remote HTTP call to an api endpoint; accepts ONLY GET and POST

register(app, options, first_registration=False)

Reliure: minimal framework for data processing

reliure is a minimal and basic framework to manage pipelines of data processing components in Python.

Documentation

In case you are not reading it already, the full documentation is available on Read the Docs: http://reliure.readthedocs.org

Install

It should be as simple as a pip command:

$ pip install reliure

License

reliure source code is available under the LGPL Version 3 license, unless otherwise indicated.

Requirements

reliure works with both Python 2 and Python 3. It depends on:

All these dependencies may be installed with:

$ pip install -r requirements.txt

To develop reliure you will also need:

Development dependencies may be installed with:

$ pip install -r requirements.dev.txt

Contribute

The following should create a pretty good development environment:

$ git clone https://github.com/kodexlab/reliure.git
$ cd reliure
$ virtualenv ENV
$ source ENV/bin/activate
$ pip install -r requirements.txt
$ pip install -r requirements.dev.txt

To run tests:

$ make testall

To build the doc:

$ make doc

then open: docs/_build/html/index.html

Warning

You need to have reliure accessible in the Python path to run the tests (and to build the doc). For that you can install reliure as a link in your local virtualenv:

$ pip install -e .

(Note: this is indicated in the pytest good practices.)

If you are developing, at the same time, another package that needs your latest reliure version, you can use:

$ pip install -e the_good_path/reliure  # link reliure in local python packages

Indices and tables