Reliure documentation¶
Welcome to the reliure documentation. Basic information about reliure is presented just after the table of contents.
Contents¶
Pipeline and components¶
Simple reliure pipeline¶
Here is a very simple pipeline of two components:
>>> from reliure import Composable
>>> plus_two = Composable(lambda x: x+2)
>>> times_three = Composable(lambda x: x*3)
>>> pipeline = plus_two | times_three
>>> # one can then run it :
>>> pipeline(3)
15
>>> pipeline(10)
36
Note that we wrap simple functions into Composable; the next section details how this works.
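To make the `|` operator less magical, here is a minimal plain-Python sketch of how such a composition could work. The `MiniComposable` class is a hypothetical stand-in written for this illustration; it is not reliure's actual implementation and only assumes that `a | b` means "apply a, then b", as the example above suggests.

```python
class MiniComposable:
    """Hypothetical stand-in for a composable wrapper (not reliure's code)."""

    def __init__(self, func, name=None):
        self.func = func
        self.name = name or getattr(func, "__name__", "anonymous")

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)

    def __or__(self, other):
        # a | b builds a new component that applies a first, then b
        return MiniComposable(lambda x: other(self(x)),
                              name="%s|%s" % (self.name, other.name))


plus_two = MiniComposable(lambda x: x + 2, name="plus_two")
times_three = MiniComposable(lambda x: x * 3, name="times_three")
mini_pipeline = plus_two | times_three
print(mini_pipeline(3))     # (3 + 2) * 3
print(mini_pipeline.name)
```

The point of the wrapper is that the result of `|` is itself composable, so pipelines of any length can be built the same way.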
Build a more complex component¶
A reliure component is basically a function with a special wrapper around it that makes it “pipeline-able”. In other words, it is a callable object that inherits from reliure.Composable.
You can build one using reliure.Composable as a decorator:
>>> @Composable
... def my_processing(value):
...     return value**2
...
>>> my_processing(2)
4
>>> my_processing.name # this has been added by Composable
'my_processing'
Or you can build a class that inherits from reliure.Composable:
>>> class MyProcessing(Composable):
...     def __init__(self, pow):
...         super(MyProcessing, self).__init__()
...         self.pow = pow
...
...     def __call__(self, intdata):
...         return intdata**self.pow
...
>>> my_processing = MyProcessing(4)
>>> my_processing.name
'MyProcessing'
>>> my_processing(2)
16
Tip
Defining a component as an object (with a __call__ method) has the advantage of making it configurable: parameters can be given in __init__ and then used in __call__.
Add options to components¶
Todo
better presentation of options
Another key feature of reliure is the ability to define reliure.Optionable components:
>>> from reliure import Optionable
>>> from reliure.types import Numeric
>>>
>>> class ProcessWithOption(Optionable):
...     def __init__(self):
...         super(ProcessWithOption, self).__init__()
...         self.add_option("pow", Numeric(default=2, help="power to apply", min=0))
...
...     @Optionable.check
...     def __call__(self, intdata, pow=None):
...         return intdata**pow
...
>>> my_processing = ProcessWithOption()
>>> my_processing.name
'ProcessWithOption'
>>> my_processing(2)
4
>>> my_processing(2, pow=4)
16
>>> my_processing(2, pow=-2)
Traceback (most recent call last):
reliure.exceptions.ValidationError: ['Ensure this value ("-2") is greater than or equal to 0.']
>>> 2
2
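The behaviour shown above (a default used when the option is omitted, and an error when the value is below the minimum) can be pictured with the following plain-Python sketch. `MiniNumeric` and `checked_power` are hypothetical names introduced for illustration only; reliure's own types and its ValidationError are not reproduced here, and a plain ValueError is used instead.

```python
class MiniNumeric:
    """Hypothetical numeric option type with a default and a lower bound."""

    def __init__(self, default, min=None):
        self.default = default
        self.min = min

    def validate(self, value):
        # fall back on the default when no value is given
        if value is None:
            return self.default
        if self.min is not None and value < self.min:
            raise ValueError(
                'Ensure this value ("%s") is greater than or equal to %s.'
                % (value, self.min))
        return value


POW_OPTION = MiniNumeric(default=2, min=0)


def checked_power(intdata, pow=None):
    # validate the option before doing the actual processing
    pow = POW_OPTION.validate(pow)
    return intdata ** pow


print(checked_power(2))
print(checked_power(2, pow=4))
```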
Processing engine¶
Simple example¶
TODO !
Worked example¶
Here is a simple example of Engine usage.
First you need to set up your engine:
>>> from reliure.engine import Engine
>>> egn = Engine()
>>> egn.requires('foo', 'bar', 'boo')
One can then define some imaginary components:
>>> from reliure.pipeline import Pipeline, Optionable, Composable
>>> from reliure.types import Numeric
>>> class One(Optionable):
...     def __init__(self):
...         super(One, self).__init__(name="one")
...         self.add_option("val", Numeric(default=1))
...
...     @Optionable.check
...     def __call__(self, input, val=None):
...         return input + val
...
>>> one = One()
>>> two = Composable(name="two", func=lambda x: x*2)
>>> three = Composable(lambda x: x - 2) | Composable(lambda x: x/2.)
>>> three.name = "three"
One can configure a block with these three components:
>>> foo_comps = [one, two, three]
>>> foo_options = {'defaults': 'two'}
>>> egn.set('foo', *foo_comps, **foo_options)
or
>>> egn['bar'].setup(multiple=True)
>>> egn['bar'].append(two, default=True)
>>> egn['bar'].append(three, default=True)
or
>>> egn["boo"].set(two, three)
>>> egn["boo"].setup(multiple=True)
>>> egn["boo"].defaults = [comp.name for comp in (two, three)]
One can get the list of all configurations:
>>> from pprint import pprint
>>> pprint(egn.as_dict())
{'args': ['input'],
 'blocks': [{'args': None,
             'components': [{'default': False,
                             'name': 'one',
                             'options': [{'name': 'val',
                                          'otype': {'choices': None,
                                                    'default': 1,
                                                    'help': '',
                                                    'max': None,
                                                    'min': None,
                                                    'multi': False,
                                                    'type': 'Numeric',
                                                    'uniq': False,
                                                    'vtype': 'int'},
                                          'type': 'value',
                                          'value': 1}]},
                            {'default': True,
                             'name': 'two',
                             'options': None},
                            {'default': False,
                             'name': 'three',
                             'options': []}],
             'multiple': False,
             'name': 'foo',
             'required': True,
             'returns': 'foo'},
            {'args': None,
             'components': [{'default': True,
                             'name': 'two',
                             'options': None},
                            {'default': True,
                             'name': 'three',
                             'options': []}],
             'multiple': True,
             'name': 'bar',
             'required': True,
             'returns': 'bar'},
            {'args': None,
             'components': [{'default': True,
                             'name': 'two',
                             'options': None},
                            {'default': True,
                             'name': 'three',
                             'options': []}],
             'multiple': True,
             'name': 'boo',
             'required': True,
             'returns': 'boo'}]}
And then you can configure and run it:
>>> request_options = {
...     'foo':[
...         {
...             'name': 'one',
...             'options': {
...                 'val': 2
...             }
...         }, # input + 2
...     ],
...     'bar':[
...         {'name': 'two'},
...     ], # input * 2
...     'boo':[
...         {'name': 'two'},
...         {'name': 'three'},
...     ], # (input - 2) / 2.
... }
>>> egn.configure(request_options)
>>> # test before running:
>>> egn.validate()
One can then run only one block:
>>> egn['boo'].play(10)
{'boo': 4.0}
or all blocks:
>>> res = egn.play(4)
>>> res['foo'] # 4 + 2
6
>>> res['bar'] # 6 * 2
12
>>> res['boo'] # (12 - 2) / 2.0
5.0
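The arithmetic in the comments above (each block receiving the output of the previous one) can be checked with a small plain-Python sketch. `play_blocks` is a hypothetical helper written for this illustration; it only mirrors the chaining of results between blocks and does not model component selection, options, or the `multiple` setting of a real Engine.

```python
from collections import OrderedDict


def play_blocks(blocks, value):
    """Run named steps in order, feeding each result to the next step."""
    results = OrderedDict()
    for name, func in blocks:
        value = func(value)
        results[name] = value
    return results


blocks = [
    ("foo", lambda x: x + 2),          # like component "one" with val=2
    ("bar", lambda x: x * 2),          # like component "two"
    ("boo", lambda x: (x - 2) / 2.),   # like component "three"
]
res = play_blocks(blocks, 4)
print(dict(res))
```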
HTTP/Json API¶
Reliure lets you build a JSON API for a simple processing function (or an Optionable) as well as for a more complex Engine. The idea of the reliure API mechanism is that you manage the data processing logic and reliure manages the “glue”.
Reliure web APIs are based on Flask. A reliure API (ReliureAPI) is a Flask Blueprint onto which you plug views of your processing modules.
Let’s see how it works with some simple examples!
Component or simple function¶
Expose a simple function¶
Let’s imagine that we have the following hyper-complex data processing method:
>>> def count_a_and_b(chaine):
...     return chaine.count("a") + chaine.count("b")
and you want it to be accessible through a supercool HTTP/JSON API...
In other words, we want a GET on http://myapi.me.com/api/count_ab/totocotata to return 2, and possibly some other metadata (processing time for instance).
Here is how we can do that with reliure.
First you need to build a “view” (a ComponentView) on this function:
>>> from reliure.web import ComponentView
>>> view = ComponentView(count_a_and_b)
Then you have to define the type of the input (the type will manage parsing from string/json):
>>> from reliure.types import Text
>>> view.add_input("in", Text())
>>> # Note that, by default, the output will be named with the function name
You can also specify a short URL pattern to reach your function, using Flask's route pattern syntax. Here we simply indicate that the URL (note that there will be a URL prefix) should match our single input:
>>> view.play_route("<in>")
Then you can create a ReliureAPI
object and register this view on it:
>>> from reliure.web import ReliureAPI
>>> api = ReliureAPI("api")
>>> api.register_view(view, url_prefix="count_ab")
This api object can be plugged into a Flask app (it is a Flask Blueprint):
>>> from flask import Flask
>>> app = Flask("my_app")
>>> app.register_blueprint(api, url_prefix="/api")
To illustrate API calls, let’s use Flask’s testing mechanism:
>>> import json
>>> client = app.test_client()
>>> resp = client.get("/api/count_ab/abcdea") # call our API
>>> results = json.loads(resp.data.decode("utf-8"))
>>> pprint(results["results"])
{'count_a_and_b': 3}
>>>
>>> resp = client.get("/api/count_ab/abcdea__bb_aaa")
>>> results = json.loads(resp.data.decode("utf-8"))
>>> pprint(results["results"])
{'count_a_and_b': 8}
Note that meta information is also available:
>>> pprint(results["meta"])
{'details': [{'errors': [],
              'name': 'count_a_and_b',
              'time': 3.314018249511719e-05,
              'warnings': []}],
 'errors': [],
 'name': 'count_a_and_b:[count_a_and_b]',
 'time': 3.314018249511719e-05,
 'warnings': []}
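The shape of such a response (a "results" part keyed by the function name, plus a "meta" part with timing) can be sketched in plain Python. `call_with_meta` is a hypothetical helper written for this illustration; it is not part of reliure and only mimics the structure shown in the outputs above.

```python
import time


def count_a_and_b(chaine):
    return chaine.count("a") + chaine.count("b")


def call_with_meta(func, *args):
    """Hypothetical helper: wrap a result with timing metadata."""
    start = time.time()
    value = func(*args)
    elapsed = time.time() - start
    return {
        # by default the output is named after the function
        "results": {func.__name__: value},
        "meta": {"name": func.__name__, "time": elapsed,
                 "errors": [], "warnings": []},
    }


response = call_with_meta(count_a_and_b, "abcdea")
print(response["results"])
```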
Managing options and multiple inputs¶
Let’s move on to a more complex example...
First, write your processing component¶
Consider the following component, which merges two strings using one of two possible methods (the choice is made with an option):
>>> from reliure import Optionable
>>> from reliure.types import Text
>>>
>>> class StringMerge(Optionable):
...     """ Stupid component that merges two strings together
...     """
...     def __init__(self):
...         super(StringMerge, self).__init__()
...         self.add_option("method", Text(
...             choices=[u"concat", u"altern"],
...             default=u"concat",
...             help="How to merge the inputs"
...         ))
...
...     @Optionable.check
...     def __call__(self, left, right, method=None):
...         if method == u"altern":
...             merge = "".join("".join(each) for each in zip(left, right))
...         else:
...             merge = left + right
...         return merge
One can use this directly in Python:
>>> merge_component = StringMerge()
>>> merge_component("aaa", "bbb")
'aaabbb'
>>> merge_component("aaa", "bbb", method=u"altern")
'ababab'
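One detail of the "altern" method is worth spelling out: because it relies on zip(), the result is truncated to the length of the shorter input. The following sketch restates the merge logic as a plain function (`merge_strings` is a name chosen for this illustration, not a reliure function) so that the behaviour can be tried without reliure.

```python
def merge_strings(left, right, method="concat"):
    """Plain-function version of the merge logic shown above."""
    if method == "altern":
        # zip() stops at the end of the shorter string
        return "".join(a + b for a, b in zip(left, right))
    return left + right


print(merge_strings("aaa", "bbb", method="altern"))
print(merge_strings("ee", "hhhh", method="altern"))
print(merge_strings("ee", "hhhh"))
```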
Then create a view on it, and register it on your API¶
If you want to expose this component through an HTTP API, as in our first example, you need to build a “view” (a ComponentView) on it:
>>> view = ComponentView(merge_component)
>>> # you need to define the type of the input
>>> from reliure.types import Text
>>> view.add_input("in_lft", Text())
>>> view.add_input("in_rgh", Text(default=u"ddd"))
>>> # ^ Note that it is possible to give default value for inputs
>>> view.add_output("merge")
>>> # we specify two short urls to reach the function:
>>> view.play_route("<in_lft>/<in_rgh>", "<in_lft>")
Warning
For a ComponentView, the order of the inputs matters: inputs are matched to the component (or function) arguments by position, not by name.
Warning
When you define a default value for an input, None cannot be used as the default.
Then we can register this new view to a reliure API object:
>>> api.register_view(view, url_prefix="merge")
Finally, just use it!¶
We can now call it:
>>> resp = client.get("/api/merge/aaa/bbb")
>>> results = json.loads(resp.data.decode("utf-8"))
>>> results["results"]
{'merge': 'aaabbb'}
As we have specified a route that requires only one argument, and a default value for the second input (in_rgh), it is also possible to do:
>>> resp = client.get("/api/merge/aaa")
>>> results = json.loads(resp.data.decode("utf-8"))
>>> results["results"]
{'merge': 'aaaddd'}
It is also possible to call the API with options:
>>> resp = client.get("/api/merge/aaa/bbb?method=altern")
>>> results = json.loads(resp.data.decode("utf-8"))
>>> results["results"]
{'merge': 'ababab'}
Alternatively you can use a POST to send inputs. There are two ways to provide inputs and options. The first is direct form encoding:
>>> resp = client.post("/api/merge", data={"in_lft":"ee", "in_rgh":"hhhh"})
>>> results = json.loads(resp.data.decode("utf-8"))
>>> results["results"]
{'merge': 'eehhhh'}
And with options in the url:
>>> resp = client.post("/api/merge?method=altern", data={"in_lft":"ee", "in_rgh":"hhhh"})
>>> results = json.loads(resp.data.decode("utf-8"))
>>> results["results"]
{'merge': 'eheh'}
The second option is to use a json payload:
>>> data = {
...     "in_lft":"eeee",
...     "in_rgh":"gg",
...     "options": {
...         "name": "StringMerge",
...         "options": {
...             "method": "altern",
...         }
...     }
... }
>>> json_data = json.dumps(data)
>>> resp = client.post("/api/merge", data=json_data, content_type='application/json')
>>> # note that it is important to specify content_type to 'application/json'
>>> results = json.loads(resp.data.decode("utf-8"))
>>> results["results"]
{'merge': 'egeg'}
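The nesting of the JSON payload (inputs at the top level, component options under "options" then "options" again) is easy to get wrong, so here is a small sketch that builds the same payload and pulls it apart. `split_payload` is a hypothetical helper for illustration; how reliure actually parses the request is not shown in this document.

```python
import json

payload = json.dumps({
    "in_lft": "eeee",
    "in_rgh": "gg",
    "options": {"name": "StringMerge", "options": {"method": "altern"}},
})


def split_payload(raw, input_names):
    """Hypothetical helper: separate inputs from component options."""
    data = json.loads(raw)
    inputs = [data[name] for name in input_names]
    # the outer "options" names the component, the inner one holds its options
    options = data.get("options", {}).get("options", {})
    return inputs, options


inputs, options = split_payload(payload, ["in_lft", "in_rgh"])
print(inputs, options)
```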
Note that a GET call on the root /api/merge returns a JSON description of the API. With this, it is possible to list all the options of the component:
>>> resp = client.get("/api/merge")
>>> results = json.loads(resp.data.decode("utf-8"))
>>> pprint(results)
{'args': ['in_lft', 'in_rgh'],
 'components': [{'default': True,
                 'name': 'StringMerge',
                 'options': [{'name': 'method',
                              'otype': {'choices': ['concat', 'altern'],
                                        'default': 'concat',
                                        'encoding': 'utf8',
                                        'help': 'How to merge the inputs',
                                        'multi': False,
                                        'type': 'Text',
                                        'uniq': False,
                                        'vtype': 'unicode'},
                              'type': 'value',
                              'value': 'concat'}]}],
 'multiple': False,
 'name': 'StringMerge',
 'required': True,
 'returns': ['merge']}
Complex processing engine¶
Define your engine¶
Here is a simple reliure engine that we will expose as an HTTP API.
>>> from reliure.engine import Engine
>>> engine = Engine("vowel", "consonant", "concat")
>>> engine.vowel.setup(in_name="text")
>>> engine.consonant.setup(in_name="text")
>>> engine.concat.setup(in_name=["vowel", "consonant"], out_name="merge")
>>>
>>> from reliure import Composable
>>> vowels = u"aiueoéèàùêôûîï"
>>> @Composable
... def extract_vowel(text):
...     return "".join(char for char in text if char in vowels)
>>> engine.vowel.set(extract_vowel)
>>>
>>> @Composable
... def extract_consonant(text):
...     return "".join(char for char in text if char not in vowels)
>>> engine.consonant.set(extract_consonant)
>>>
>>> # for the merge we re-use the component defined in previous section:
>>> engine.concat.set(StringMerge())
The figure “Engine schema” shows the processing schema of this small engine.
[Figure: Engine schema.]
(See engine_schema() for how to generate such a schema from an engine.)
Create a view and register it on your API¶
As for a simple component, we need to create a view over our engine:
>>> from reliure.web import EngineView
>>> view = EngineView(engine)
And then to define the input and output types:
>>> view.add_input("text", Text())
>>> view.add_output("merge", Text())
We can also specify a short URL pattern to run the engine:
>>> view.play_route("<text>")
Then we can create a ReliureAPI object and register this new view on it:
>>> api = ReliureAPI("api")
>>> api.register_view(view, url_prefix="process")
>>> # and register this api on our flask app:
>>> app.register_blueprint(api, url_prefix="/api")
Use it !¶
>>> resp = client.get("/api/process/abcdea")
>>> results = json.loads(resp.data.decode("utf-8"))
>>> pprint(results["results"])
{'merge': 'aeabcd'}
>>>
>>> resp = client.get("/api/process/abcdea__bb_aaa")
>>> results = json.loads(resp.data.decode("utf-8"))
>>> pprint(results["results"])
{'merge': 'aeaaaabcd__bb_'}
Note that meta information is also available:
>>> pprint(results["meta"])
{'details': [{'details': [{'errors': [],
                           'name': 'extract_vowel',
                           'time': 3.695487976074219e-05,
                           'warnings': []}],
              'errors': [],
              'name': 'vowel:[extract_vowel]',
              'time': 3.695487976074219e-05,
              'warnings': []},
             {'details': [{'errors': [],
                           'name': 'extract_consonant',
                           'time': 3.0040740966796875e-05,
                           'warnings': []}],
              'errors': [],
              'name': 'consonant:[extract_consonant]',
              'time': 3.0040740966796875e-05,
              'warnings': []},
             {'details': [{'errors': [],
                           'name': 'StringMerge',
                           'time': 5.507469177246094e-05,
                           'warnings': []}],
              'errors': [],
              'name': 'concat:[StringMerge]',
              'time': 5.507469177246094e-05,
              'warnings': []}],
 'errors': [],
 'name': 'engine:[vowel:[extract_vowel], consonant:[extract_consonant], concat:[StringMerge]]',
 'time': 0.0001220703125,
 'warnings': []}
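The data flow of this engine (the "vowel" and "consonant" blocks both read the input text, and "concat" joins their two outputs) can be reproduced in plain Python to check the expected results. The `process` function below is a hypothetical stand-in for the engine, using the default "concat" merge method; it does not use reliure.

```python
VOWELS = u"aiueoéèàùêôûîï"


def extract_vowel(text):
    return "".join(char for char in text if char in VOWELS)


def extract_consonant(text):
    # everything that is not a listed vowel, including "_" and digits
    return "".join(char for char in text if char not in VOWELS)


def process(text):
    # both extractions read the same input, then their outputs are joined
    return {"merge": extract_vowel(text) + extract_consonant(text)}


print(process("abcdea"))
```

Note that the "consonant" step keeps any character that is not in the vowel list, which is why the underscores appear in the second result above.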
Offline processing¶
Reliure provides some helpers to run offline batch processing.
Run a component, or pipeline of components¶
To illustrate how you can easily run a pipeline of components with reliure, let’s consider that we have a sequence of “documents” we want to process:
>>> documents = ["doc1", "doc2", "doc3", "doc4"]
For that we have these two components, which we pipe together:
>>> from reliure.pipeline import Composable
>>> @Composable
... def doc_analyse(docs):
...     for doc in docs:
...         yield {
...             "title": doc,
...             "id": int(doc[3:]),
...             "url": "http://lost.com/%s" % doc,
...         }
>>>
>>> @Composable
... def print_ulrs(docs):
...     for doc in docs:
...         print(doc["url"])
...         yield doc
>>>
>>> pipeline = doc_analyse | print_ulrs
To run this pipeline on our documents, you just have to do:
>>> from reliure.offline import run
>>> res = run(pipeline, documents)
http://lost.com/doc1
http://lost.com/doc2
http://lost.com/doc3
http://lost.com/doc4
>>> pprint(res)
[{'id': 1, 'title': 'doc1', 'url': 'http://lost.com/doc1'},
 {'id': 2, 'title': 'doc2', 'url': 'http://lost.com/doc2'},
 {'id': 3, 'title': 'doc3', 'url': 'http://lost.com/doc3'},
 {'id': 4, 'title': 'doc4', 'url': 'http://lost.com/doc4'}]
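Both components above are generators, so nothing is computed until the output is consumed; the output above suggests that run() consumes the pipeline into a list. The following plain-Python sketch shows this laziness without reliure. `collect_urls` is a hypothetical variant of the printing step that records URLs in a list so the effect can be observed.

```python
def doc_analyse(docs):
    for doc in docs:
        yield {"title": doc, "id": int(doc[3:]),
               "url": "http://lost.com/%s" % doc}


def collect_urls(docs, seen):
    for doc in docs:
        seen.append(doc["url"])
        yield doc


seen_urls = []
lazy = collect_urls(doc_analyse(["doc1", "doc2"]), seen_urls)
before = list(seen_urls)     # nothing has run yet: generators are lazy
results = list(lazy)         # consuming the generator runs both steps
print(before, seen_urls)
```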
The exact same pipeline can now be run in parallel by using run_parallel() instead of run():
>>> from reliure.offline import run_parallel
>>> res = run_parallel(pipeline, documents, ncpu=2, chunksize=5)
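To give an idea of what chunked parallel processing involves, here is a minimal sketch using only the standard library. It is not reliure's implementation: it uses threads rather than processes to stay simple, and the names `chunks`, `analyse_chunk`, `run_chunked`, and `nworkers` are assumptions made for this illustration. The assumed meaning of a chunk size is "number of inputs handed to a worker at once".

```python
from concurrent.futures import ThreadPoolExecutor


def chunks(items, chunksize):
    """Split a list into consecutive chunks of at most chunksize items."""
    return [items[i:i + chunksize] for i in range(0, len(items), chunksize)]


def analyse_chunk(docs):
    return [{"title": doc, "id": int(doc[3:])} for doc in docs]


def run_chunked(func, items, nworkers=2, chunksize=5):
    with ThreadPoolExecutor(max_workers=nworkers) as pool:
        # map() yields the chunk results in their original order
        parts = pool.map(func, chunks(items, chunksize))
        return [result for part in parts for result in part]


documents = ["doc%d" % i for i in range(1, 8)]
res = run_chunked(analyse_chunk, documents, nworkers=2, chunksize=3)
print([r["id"] for r in res])
```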
Command line interface¶
reliure.utils.cli provides some helpers to create or populate an argument parser from an Optionable.
Let’s look at a simple example.
First you have an Optionable component with an option:
>>> class PowerBy(Optionable):
...     def __init__(self):
...         super(PowerBy, self).__init__("testOptionableName")
...         self.add_option("alpha", Numeric(default=4, min=1, max=20,
...                         help='power exponent'))
...
...     @Optionable.check
...     def __call__(self, value, alpha=None):
...         return value**alpha
>>>
Note that it could also be a pipeline of components.
Next you want to build a script with a __main__ entry point and map your component options to script options using argparse. Here is how to do it (file reliure_cli.py):
#!/usr/bin/env python
import sys
from reliure import Optionable
from reliure.types import Numeric

# definition of your component here
class PowerBy(Optionable):
    def __init__(self):
        super(PowerBy, self).__init__("testOptionableName")
        self.add_option("alpha", Numeric(default=4, min=1, max=20,
                        help='power exponent'))

    @Optionable.check
    def __call__(self, value, alpha=None):
        return value**alpha

# creation of your processing component
mycomp = PowerBy()

def main():
    from argparse import ArgumentParser
    from reliure.utils.cli import arguments_from_optionable, get_config_for
    # build the option parser
    parser = ArgumentParser()
    # Add options from your component
    arguments_from_optionable(parser, mycomp, prefix="power_")
    # Add other options, here the input
    parser.add_argument('INPUT', type=int, help='the number to process !')
    # Parse the options and get the config for the component
    args = parser.parse_args()
    config = get_config_for(args, mycomp, prefix="power_")
    result = mycomp(args.INPUT, **config)
    print(result)
    return 0

if __name__ == '__main__':
    sys.exit(main())
With that you get a nicely generated help message:
$ python reliure_cli.py -h
usage: reliure_cli.py [-h] [--power_alpha POWER_ALPHA] INPUT

positional arguments:
  INPUT                 the number to process !

optional arguments:
  -h, --help            show this help message and exit
  --power_alpha POWER_ALPHA
                        power exponent
$ python reliure_cli.py --power_alpha 3 3
27
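The idea behind the prefix (one prefixed command-line flag per component option, then stripping the prefix to rebuild the component configuration) can be sketched with argparse alone. `build_parser` and `config_from_args` are hypothetical helpers written for this illustration; they are not the reliure.utils.cli functions and do not apply the min/max validation of the Numeric option.

```python
from argparse import ArgumentParser


def build_parser(prefix="power_"):
    parser = ArgumentParser()
    # one prefixed flag per component option (here only "alpha")
    parser.add_argument("--%salpha" % prefix, type=int, default=4,
                        help="power exponent")
    parser.add_argument("INPUT", type=int, help="the number to process !")
    return parser


def config_from_args(args, prefix="power_"):
    # strip the prefix to recover the component's option names
    return {key[len(prefix):]: value
            for key, value in vars(args).items()
            if key.startswith(prefix)}


args = build_parser().parse_args(["--power_alpha", "3", "3"])
config = config_from_args(args)
print(args.INPUT ** config["alpha"])
```

Prefixing the flags avoids name clashes when several components contribute options to the same parser.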
API reference¶
See reliure.pipeline for example, or even reliure.pipeline.Optionable!
reliure¶
class reliure.Composable(func=None, name=None)¶
    Bases: object
    Basic composable element
    Composable is abstract, you need to implement the __call__() method
    >>> e1 = Composable(lambda element: element**2, name="e1")
    >>> e2 = Composable(lambda element: element + 10, name="e2")
    Then Composable objects can be pipelined this way:
    >>> chain = e1 | e2
    >>> # so you get:
    >>> chain(2)
    14
    >>> # which is equivalent to:
    >>> e2(e1(2))
    14
    >>> # note that by default the pipeline aggregates the component names
    >>> chain.name
    'e1|e2'
    >>> # however you can override it
    >>> chain.name = "chain"
    >>> chain.name
    'chain'
    It is also possible to ‘map’ the composables:
    >>> cmap = e1 & e2
    >>> # so you get:
    >>> cmap(2)
    [4, 12]
    __call__(*args, **kwargs)¶
    __init__(func=None, name=None)¶
        You can create a Composable from a simple function:
        >>> def square(val, pow=2):
        ...     return val ** pow
        >>> cfct = Composable(square)
        >>> cfct.name
        'square'
        >>> cfct(2)
        4
        >>> cfct(3, 3)
        27
    init_logger(reinit=False)¶
    logger¶
    name¶
        Name of the optionable component
class reliure.Optionable(name=None)¶
    Bases: reliure.pipeline.Composable
    Abstract class for an optionable component
    __init__(name=None)¶
        Parameters: name (str) – name of the component
    add_option(opt_name, otype, hidden=False)¶
        Add an option to the object
        Parameters:
        - opt_name (str) – option name
        - otype (subclass of GenericType) – option type
        - hidden (bool) – if True the option will be hidden
    change_option_default(opt_name, default_val)¶
        Change the default value of an option
        Parameters:
        - opt_name (str) – option name
        - default_val – new default option value
    static check(call_fct)¶
        Decorator for the optionable __call__ method. It checks the given option values.
    clear_option_value(opt_name)¶
        Clear the stored option value (so the default will be used)
        Parameters: opt_name (str) – option name
    clear_options_values()¶
        Clear all stored option values (so the defaults will be used)
    force_option_value(opt_name, value)¶
        Force the (default) value of an option. The option is then no longer listed by get_options().
        Parameters:
        - opt_name (str) – option name
        - value – option value
    get_option_default(opt_name)¶
        Return the default value of a given option
        Parameters: opt_name (str) – option name
        Returns: the default value of the option
    get_option_value(opt_name)¶
        Return the value of a given option
        Parameters: opt_name (str) – option name
        Returns: the value of the option
    get_options(hidden=False)¶
        Parameters: hidden (bool) – whether to return hidden options
        Returns: dictionary of all options (with option’s information)
        Return type: dict
    get_options_values(hidden=False)¶
        Return a dictionary of option values
        Parameters: hidden (bool) – whether to return hidden options
        Returns: dictionary of all option values
        Return type: dict
    get_ordered_options(hidden=False)¶
        Parameters: hidden (bool) – whether to return hidden options
        Returns: ordered list of options pre-serialised (as_dict)
        Return type: list [opt_dict, ...]
    has_option(opt_name)¶
        Whether the component has a given option
        Whether the given option is hidden
    options¶
    parse_options(option_values)¶
        Set the options (with parsing) and return a dict of all option values
    print_options()¶
        Print a description of the component options
    set_option_value(opt_name, value, parse=False)¶
        Set the value of one option.
        # TODO/FIXME
        - add force/hide argument
        - add default argument
        - remove methods force option value
        - remove change_option_default
        Parameters:
        - opt_name (str) – option name
        - value – the new value
        - parse (bool) – if True the value is converted from string to the correct type
    set_options_values(options, parse=False, strict=False)¶
        Set the options from a dict of values (in string).
        Parameters:
        - option_values (dict) – the values of options (in format {“opt_name”: “new_value”})
        - parse (bool) – whether to parse the given value
        - strict (bool) – if True the given option_values dict should only contain existing options (no other key)
class reliure.Map(comp, as_list=False)¶
    Bases: reliure.pipeline.OptionableSequence
    Apply a composable to each element of a generator-like input.
    >>> item_process = Composable(lambda x: x+2) | Composable(lambda x: x if x > 3 else 0)
    >>> flux_process = Map(item_process)
    >>>
    >>> inputs = range(5)
    >>> [e for e in flux_process(inputs)]
    [0, 0, 4, 5, 6]
    __call__(*args, **kwargs)¶
    __init__(comp, as_list=False)¶
reliure.engine¶
See Processing engine for documentation.
class reliure.engine.BasicPlayMeta(component)¶
    Bases: object
    Object to store and manage meta data for one component execution
    Here is a typical usage:
    >>> import time
    >>> comp = Composable(name="TheComp", func=lambda x: x)
    >>> # create the meta result before using the component
    >>> meta = BasicPlayMeta(comp)
    >>> # imagine some input and options for the component
    >>> args, kwargs = [12], {}
    >>> # store these data:
    >>> meta.run_with(args, kwargs)
    >>> # run the component
    >>> start = time.time() # starting time
    >>> try:
    ...     output = comp(*args, **kwargs)
    ... except Exception as error:
    ...     # store the exception if any
    ...     meta.add_error(error)
    ...     # one can raise a custom error (or not)
    ...     #raise RuntimeError()
    ... finally:
    ...     # this will always be executed (even if the exception is not caught)
    ...     meta.time = time.time() - start
    ...     # for testing purpose we put a fixed time
    ...     meta.time = 9.2e-5
    >>> # one can get a pre-serialization of the collected meta data
    >>> from pprint import pprint
    >>> pprint(meta.as_dict())
    {'errors': [], 'name': 'TheComp', 'time': 9.2e-05, 'warnings': []}
    __init__(component)¶
    add_error(error)¶
        Register an error that occurs while the component is running
        >>> comp = Composable(name="TheComp", func=lambda x: x)
        >>> meta = BasicPlayMeta(comp)
        >>> try:
        ...     output = 1/0
        ... except Exception as error:
        ...     # store the exception if any
        ...     meta.add_error(error)
        >>> from pprint import pprint
        >>> pprint(meta.as_dict())
        {'errors': ['...division ...by zero'], 'name': 'TheComp', 'time': 0.0, 'warnings': []}
    as_dict()¶
        Pre-serialisation of the meta data
    errors¶
    has_error¶
        Whether any error happened
    has_warning¶
        Whether there was a warning during play
    name¶
        Name of the component
    run_with(inputs, options)¶
        Store the run parameters (inputs and options)
    time¶
        Execution time (walltime)
        >>> comp = Composable(name="TheComp", func=lambda x: x)
        >>> meta = BasicPlayMeta(comp)
        >>> meta.time = 453.6
        >>> meta.time
        453.6
    warnings¶
class reliure.engine.Block(name)¶
    Bases: object
    A block is a processing step realised by one component.
    A component is a callable object that has a name attribute; often it is also a reliure.Optionable object or a pipeline, being a reliure.Composable.
    The Block object provides methods to discover and parse component options (if any).
    __init__(name)¶
        Initialise a block. This should be done only from the Engine.
        Parameters: name (str) – name of the Block
    all_outputs()¶
        Returns a set of output names
    append(component, default=False)¶
        Add one component to the block
        Parameters: default (bool) – if True this component will be used by default
    as_dict()¶
        Returns a dictionary representation of the block and of all component options
    clear_selections()¶
        Reset the current selections and reset option values to default for all components
        Warning
        This method also resets the component option values to their defaults.
    component_names()¶
        Returns the list of component names.
        Component names are in the same order as the components.
    configure(config)¶
        Configure the block from a (horrible) configuration dictionary (or list). This data comes from a JSON client request and has to be parsed. Default values are used when missing (for component selection and options).
        Parameters: config (dict or list) – component to use and the associated options
        config format:
        {
            'name': name_of_the_comp_to_use,
            'options': {
                name: value,
                name: va...
            }
        }
        or for multiple selection:
        [
            {
                'name': name_of_the_comp_to_use,
                'options': {
                    name: value,
                    name: va...
                }
            },
            {...}
        ]
        Warning
        Values of options in this dictionary are strings
    defaults¶
        Default component(s)
        Note
        The default components are just an indication for the user and the views, except if the Block is required. If it is required, the default is selected when nothing is explicitly selected.
    name¶
        Name of the optionable component
    needed_inputs()¶
        Return a list of (needed) input names
    play(*inputs, **named_inputs)¶
        Run the selected components of the block. The selected components are run with the options already set.
        Warning
        Default ‘multiple’ behavior is a pipeline!
        Parameters: *inputs – arguments (i.e. inputs) to give to the components
    reset()¶
        Removes all the components of the block
    select(comp_name, options=None)¶
        Select the components that will be played (with the given options).
        options will be passed to Optionable.parse_options() if the component is a subclass of Optionable.
        Warning
        This function also sets up the options (if given) of the selected component. Use clear_selections() to restore both the selection and the component’s options.
        This method may be called at play ‘time’, before calling play() to run all selected components.
        Parameters:
        - name – name of the component to select
        - options (dict) – options to set on the components
    selected()¶
        Returns the list of selected component names.
        If no component is selected, returns the one marked as default. If the block is required and no component was indicated as default, then the first component is selected.
    set(*components)¶
        Set the possible components of the block
        Parameters: components – components to append (Optionables or Composables)
    setup(in_name=None, out_name=None, required=None, hidden=None, multiple=None, defaults=None)¶
        Set the options of the block. Only the given options that are not None are set.
        Note
        A block may have multiple inputs but has only one output
        Parameters:
        - in_name (str or list of str) – name(s) of the block input data
        - out_name (str) – name of the block output data
        - required (bool) – whether the block will be required or not
        - hidden (bool) – whether the block will be hidden to the user or not
        - multiple (bool) – if True more than one component may be selected/run
        - defaults (list of str, or str) – names of the selected components
    validate()¶
        Check that the block can be run
class reliure.engine.Engine(*names)¶
    Bases: object
    The Reliure engine.
    DEFAULT_IN_NAME = 'input'¶
    __init__(*names)¶
        Create the engine
        Parameters: names – names of the engine blocks
    all_outputs()¶
        Returns a list of all possible engine outputs (note that inputs are also possible outputs)
        >>> engine = Engine("op1", "op2")
        >>> engine.op1.setup(in_name="in", out_name="middle", required=False)
        >>> engine.op2.setup(in_name="middle", out_name="out")
        >>> sorted(list(engine.all_outputs()))
        ['in', 'middle', 'out']
        More complex example:
        >>> engine = Engine("op1", "op2")
        >>> engine.op1.setup(in_name="in", out_name="middle")
        >>> engine.op2.setup(in_name=["middle", "in2"], out_name="out")
        >>> sorted(list(engine.all_outputs()))
        ['in', 'in2', 'middle', 'out']
        Note that by default the needed input is ‘input’:
        >>> engine = Engine("op1", "op2")
        >>> engine.op1.append(lambda x:x+2)
        >>> engine.op2.append(lambda x:x*2)
        >>> sorted(list(engine.all_outputs()))
        ['input', 'op1', 'op2']
    as_dict()¶
        dict repr of the components
    configure(config)¶
        Configure all the blocks from a (horrible) configuration dictionary. This data comes from a JSON client request and has to be parsed. Default values are used when missing (for component selection and options).
        Parameters: config (dict) – dictionary that gives the component to use for each step and the associated options
        config format:
        {
            block_name: [
                {
                    'name': name_of_the_comp_to_use,
                    'options': {
                        name: value,
                        name: va...
                    }
                },
                {...}
            ]
        }
        Warning
        Values of options in this dictionary are strings
    in_name¶
        Give the input name of the first block.
        If this first block is not required, or if other blocks need some inputs, you had better look at needed_inputs().
    names()¶
        Returns the list of block names
-
needed_inputs
()¶ List all the needed inputs of a configured engine
>>> engine = Engine("op1", "op2") >>> engine.op1.setup(in_name="in", out_name="middle", required=False) >>> engine.op2.setup(in_name="middle", out_name="out") >>> engine.op1.append(lambda x:x+2) >>> engine.op2.append(lambda x:x*2) >>> engine.op1.select('<lambda>') >>> list(engine.needed_inputs()) ['in']
But now if we unactivate the first component:
>>> engine.op1.clear_selections() >>> list(engine.needed_inputs()) ['middle']
More complex example:
>>> engine = Engine("op1", "op2") >>> engine.op1.setup(in_name="in", out_name="middle") >>> engine.op2.setup(in_name=["middle", "in2"], out_name="out") >>> engine.op1.append(lambda x:x+2) >>> engine.op2.append(lambda x, y:x*y) >>> engine.needed_inputs() == {'in', 'in2'} True
Note that by default the needed input is ‘input’:
>>> engine = Engine("op1", "op2") >>> engine.op1.append(lambda x:x+2) >>> engine.op2.append(lambda x:x*2) >>> list(engine.needed_inputs()) ['input']
-
play
(*inputs, **named_inputs)¶ Run the engine (that should have been configured first)
if the inputs are given without name it should be the inputs of the first block, ig named_inputs are used it may be the inputs of any block.
Note
Either inputs or named_inputs should be provided, not both
Parameters: - inputs – the data to give as input to the first block
- named_inputs – named input data should match with
needed_inputs()
result.
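The rule stated in the note (positional or named inputs, never both) could be checked along these lines. This is a sketch under assumptions: resolve_inputs and its arguments are hypothetical names introduced for the example and do not describe how play() is implemented.

```python
def resolve_inputs(first_block_in_names, inputs=(), named_inputs=None):
    """Map the given inputs to input names (hypothetical helper)."""
    named_inputs = dict(named_inputs or {})
    if inputs and named_inputs:
        raise ValueError("Either inputs or named_inputs should be provided, not both")
    if inputs:
        if len(inputs) != len(first_block_in_names):
            raise ValueError("Wrong number of inputs for the first block")
        # unnamed inputs are matched, in order, to the first block's input names
        return dict(zip(first_block_in_names, inputs))
    return named_inputs


print(resolve_inputs(["in"], inputs=(3,)))
print(resolve_inputs(["in"], named_inputs={"middle": 5}))
```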
-
requires
(*names)¶ Declare which blocks will be used in this engine.
It should be called before adding or setting any component. Block order will be preserved when running tasks.
-
set
(name, *components, **parameters)¶ Set available components and the options of one block.
Parameters: - name – block name
- components – the components (see
Block.set()
) - parameters – block configuration (see
Block.setup()
)
For example:
>>> engine = Engine("op1")
>>> engine.set("op1", Composable(), required=True, in_name="query", out_name="holygrail")
-
validate
(inputs=None)¶ Check that the blocks configuration is ok
Parameters: inputs (list of str) – the names of the play inputs
-
-
class
reliure.engine.
PlayMeta
(name)¶ Bases:
reliure.engine.BasicPlayMeta
Object to store and manage metadata for a set of component or block plays
>>> gres = PlayMeta("operation")
>>> res_plus = BasicPlayMeta(Composable(name="plus"))
>>> res_plus.time = 1.6
>>> res_moins = BasicPlayMeta(Composable(name="moins"))
>>> res_moins.time = 5.88
>>> gres.append(res_plus)
>>> gres.append(res_moins)
>>> from pprint import pprint
>>> pprint(gres.as_dict())
{'details': [{'errors': [], 'name': 'plus', 'time': 1.6, 'warnings': []},
             {'errors': [], 'name': 'moins', 'time': 5.88, 'warnings': []}],
 'errors': [],
 'name': 'operation:[plus, moins]',
 'time': 7.48,
 'warnings': []}
-
__init__
(name)¶
-
add_error
(error)¶ It is not possible to add an error here, you should add it on a
BasicPlayMeta
-
append
(meta)¶ Add a
BasicPlayMeta
-
as_dict
()¶ Pre-serialisation of the meta data
-
errors
¶ Get all the errors
>>> gres = PlayMeta("operation")
>>> res_plus = BasicPlayMeta(Composable(name="plus"))
>>> gres.append(res_plus)
>>> res_plus.add_error(ValueError("invalid data"))
>>> res_moins = BasicPlayMeta(Composable(name="moins"))
>>> gres.append(res_moins)
>>> res_plus.add_error(RuntimeError("server not answering"))
>>> gres.errors
[ValueError('invalid data',), RuntimeError('server not answering',)]
-
name
¶ Compute a name according to sub meta results names
>>> gres = PlayMeta("operation")
>>> res_plus = BasicPlayMeta(Composable(name="plus"))
>>> res_moins = BasicPlayMeta(Composable(name="moins"))
>>> gres.append(res_plus)
>>> gres.append(res_moins)
>>> gres.name
'operation:[plus, moins]'
-
time
¶ Compute the total time (walltime)
>>> gres = PlayMeta("operation")
>>> res_plus = BasicPlayMeta(Composable(name="plus"))
>>> res_plus.time = 1.6
>>> res_moins = BasicPlayMeta(Composable(name="moins"))
>>> res_moins.time = 5.88
>>> gres.append(res_plus)
>>> gres.append(res_moins)
>>> gres.time
7.48
-
warnings
¶
-
reliure.exceptions
¶
-
exception
reliure.exceptions.
ReliureError
¶ Bases:
exceptions.Exception
Basic reliure error
-
exception
reliure.exceptions.
ReliurePlayError
(msg)¶ Bases:
exceptions.Exception
Error occurring at engine ‘play’ time
These errors can be shown to the user
>>> error = ReliurePlayError("an error message")
>>> error.msg
'an error message'
-
__init__
(msg)¶ Parameters: msg – the message for the user
-
-
exception
reliure.exceptions.
ReliureTypeError
¶ Bases:
reliure.exceptions.ReliureError
Error in a reliure Type
-
exception
reliure.exceptions.
ReliureValueError
¶ Bases:
reliure.exceptions.ReliureError
,exceptions.ValueError
Reliure value error: one value (attribute) was wrong
-
exception
reliure.exceptions.
ValidationError
(message, params=None)¶ Bases:
reliure.exceptions.ReliureTypeError
An error while validating data of a given type.
It may be either a single validation error or a list of validation errors
>>> from reliure.utils.i18n import _
>>> error = ValidationError(_("a message with a value : %(value)s"), {'value': 42})
>>> for err in error: print(err)
a message with a value : 42
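To make the "single error or list of errors" behaviour concrete, here is a minimal sketch of an exception that can be iterated over and that interpolates params into its message. ValidationErrorSketch is a hypothetical class written for this example; the real ValidationError may be organised differently.

```python
class ValidationErrorSketch(Exception):
    """Hypothetical stand-in: holds one message or a flat list of sub-errors."""

    def __init__(self, message, params=None):
        super().__init__(message)
        if isinstance(message, (list, tuple)):
            # a list of errors: flatten it into single errors
            self.errors = []
            for item in message:
                if not isinstance(item, ValidationErrorSketch):
                    item = ValidationErrorSketch(item)
                self.errors.extend(item.errors)
        else:
            self.message = message
            self.params = params
            self.errors = [self]

    def __iter__(self):
        for error in self.errors:
            message = error.message
            if error.params:
                message = message % error.params
            yield message


single = ValidationErrorSketch("a message with a value : %(value)s", {"value": 42})
several = ValidationErrorSketch([single, "another problem"])
for text in several:
    print(text)
```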
-
__init__
(message, params=None)¶
-
reliure.offline
¶
-
reliure.offline.
main
()¶ Small usage example for run
-
reliure.offline.
run
(pipeline, input_gen, options={})¶ Run a pipeline over an input generator
>>> # if we have a simple component
>>> from reliure.pipeline import Composable
>>> @Composable
... def print_each(letters):
...     for letter in letters:
...         print(letter)
...         yield letter
>>> # that we want to run over a given input:
>>> input = "abcde"
>>> # we just have to do :
>>> res = run(print_each, input)
a
b
c
d
e
It is also possible to run any reliure pipeline this way:
>>> import string
>>> pipeline = Composable(lambda letters: (l.upper() for l in letters)) | print_each
>>> res = run(pipeline, input)
A
B
C
D
E
-
reliure.offline.
run_parallel
(pipeline, input_gen, options={}, ncpu=4, chunksize=200)¶ Run a pipeline in parallel over an input generator, cutting it into small chunks.
>>> # if we have a simple component
>>> from reliure.pipeline import Composable
>>> # that we want to run over a given input:
>>> input = "abcde"
>>> import string
>>> pipeline = Composable(lambda letters: (l.upper() for l in letters))
>>> res = run_parallel(pipeline, input, ncpu=2, chunksize=2)
>>> #Note: res should be equal to [['C', 'D'], ['A', 'B'], ['E']]
>>> #but it seems that there is a bug with py.test and mp...
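The "cutting it into small chunks" step can be sketched with the standard library alone. iter_chunks is a hypothetical helper written for this illustration; it processes the chunks sequentially and does not reproduce the multiprocessing part of run_parallel.

```python
from itertools import islice


def iter_chunks(input_gen, chunksize):
    """Yield successive lists of at most `chunksize` items from an iterable."""
    iterator = iter(input_gen)
    while True:
        chunk = list(islice(iterator, chunksize))
        if not chunk:
            return
        yield chunk


chunks = list(iter_chunks("abcde", 2))
# Each chunk could be handed to a worker; here they are processed in order.
results = [[letter.upper() for letter in chunk] for chunk in chunks]
print(results)
```

With parallel workers the chunks may complete in any order, which is consistent with the unordered result mentioned in the note above.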
reliure.option
¶
Option objects used in reliure.Optionable
.
inheritance diagrams¶
Class¶
-
class
reliure.options.
ListOption
(name, otype)¶ Bases:
reliure.options.Option
Option with multiple values
-
parse
(values)¶
-
validate
(values)¶
-
-
class
reliure.options.
Option
(name, otype)¶ Bases:
object
Abstract option value
-
static
FromType
(name, otype)¶ ValueOption subclasses factory, creates a convenient option to store data from a given Type.
attribute precedence:
- |attrs| > 0 (multi and uniq are implicit) => NotImplementedError
- uniq (multi is implicit) => NotImplementedError
- multi and not uniq => NotImplementedError
- not multi => ValueOption
Parameters: - name (str) – Name of the option
- otype (subclass of
GenericType
) – the desired type of field
-
__init__
(name, otype)¶ Parameters: - name (str) – option name
- otype (subclass of
GenericType
) – option type
-
as_dict
()¶ returns a dictionary view of the option
Returns: the option converted in a dict Return type: dict
-
clear
()¶ Reset the option value (default will be used)
-
default
¶ Default value of the option
Warning
changing the default value also changes the current value
-
name
¶ Name of the option.
-
parse
(value)¶ Convert the value from data just decoded from json.
Parameters: value – a potential value for the option Returns: the value converted to the correct type
-
set
(value, parse=False)¶ Set the value of the option.
One can also set the ‘value’ property:
>>> opt = ValueOption("oname", Numeric(default=1,help="an option example"))
>>> opt.value = 12
Parameters: value – the new value
-
summary
()¶
-
validate
(value)¶ Raises
ValidationError
if the value is not correct, else just returns the given value. It is called when a new value is set.
Parameters: value – the value to validate Returns: the value
-
value
¶ Value of the option
-
-
class
reliure.options.
SetOption
(name, otype)¶ Bases:
reliure.options.Option
Option with multiple values
-
parse
(values)¶
-
validate
(values)¶
-
-
class
reliure.options.
ValueOption
(name, otype)¶ Bases:
reliure.options.Option
Single value option
-
parse
(value)¶
-
validate
(value)¶
-
reliure.pipeline
¶
inheritance diagrams¶
Class¶
-
class
reliure.pipeline.
Composable
(func=None, name=None)¶ Bases:
object
Basic composable element
Composable is abstract, you need to implement the __call__() method:
>>> e1 = Composable(lambda element: element**2, name="e1")
>>> e2 = Composable(lambda element: element + 10, name="e2")
Then Composable can be pipelined this way:
>>> chain = e1 | e2
>>> # so you get:
>>> chain(2)
14
>>> # which is equivalent to:
>>> e2(e1(2))
14
>>> # note that by default the pipeline aggregates the component names
>>> chain.name
'e1|e2'
>>> # however you can override it
>>> chain.name = "chain"
>>> chain.name
'chain'
It is also possible to ‘map’ the composables:
>>> cmap = e1 & e2
>>> # so you get:
>>> cmap(2)
[4, 12]
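For readers curious about how the | and & operators can be supported, here is a minimal sketch based on Python's __or__ and __and__ special methods. MiniComposable is a hypothetical class written for this example (including the name it gives to an & combination); it is not reliure's Composable.

```python
class MiniComposable:
    """Hypothetical, stripped-down composable wrapper around a function."""

    def __init__(self, func, name=None):
        self.func = func
        self.name = name or func.__name__

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)

    @staticmethod
    def _wrap(other):
        return other if isinstance(other, MiniComposable) else MiniComposable(other)

    def __or__(self, other):
        # a | b: feed the output of a into b
        other = self._wrap(other)
        return MiniComposable(lambda x: other(self(x)),
                              name="%s|%s" % (self.name, other.name))

    def __and__(self, other):
        # a & b: apply both to the same input and collect the results
        other = self._wrap(other)
        return MiniComposable(lambda x: [self(x), other(x)],
                              name="%s&%s" % (self.name, other.name))


e1 = MiniComposable(lambda element: element ** 2, name="e1")
e2 = MiniComposable(lambda element: element + 10, name="e2")
chain = e1 | e2
cmap = e1 & e2
print(chain(2), chain.name, cmap(2))
```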
-
__call__
(*args, **kwargs)¶
-
__init__
(func=None, name=None)¶ You can create a Composable from a simple function:
>>> def square(val, pow=2):
...     return val ** pow
>>> cfct = Composable(square)
>>> cfct.name
'square'
>>> cfct(2)
4
>>> cfct(3, 3)
27
-
init_logger
(reinit=False)¶
-
logger
¶
-
name
¶ Name of the optionable component
-
-
class
reliure.pipeline.
Map
(comp, as_list=False)¶ Bases:
reliure.pipeline.OptionableSequence
Apply a composable to each element of a generator-like input.
>>> item_process = Composable(lambda x: x+2) | Composable(lambda x: x if x > 3 else 0)
>>> flux_process = Map(item_process)
>>>
>>> inputs = range(5)
>>> [e for e in flux_process(inputs)]
[0, 0, 4, 5, 6]
-
__call__
(*args, **kwargs)¶
-
__init__
(comp, as_list=False)¶
-
-
class
reliure.pipeline.
MapReduce
(reduce, *composables)¶ Bases:
reliure.pipeline.MapSeq
MapReduce implementation for components
One can pass a simple function:
>>> mapseq = MapReduce(sum, lambda x: x+1, lambda x: x+2, lambda x: x+3)
>>> mapseq(10)
36
Or implement a subclass of MapReduce:
>>> class MyReduce(MapReduce):
...     def __init__(self, *composables):
...         super(MyReduce, self).__init__(None, *composables)
...     def reduce(self, array, *args, **kwargs):
...         return list(args) + [sum(array)]
>>> mapreduce = MyReduce(lambda x: x+1, lambda x: x+2, lambda x: x+3)
>>> mapreduce(10)
[10, 36]
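The pattern shown in these examples (apply every function to the same input, then reduce the list of results) can be written in a few lines of plain Python. map_reduce_sketch is a hypothetical helper for illustration and ignores options handling.

```python
def map_reduce_sketch(reduce, *functions):
    """Build a callable that maps all functions over one input, then reduces."""
    def run(value):
        mapped = [function(value) for function in functions]
        return reduce(mapped)
    return run


mapseq = map_reduce_sketch(sum, lambda x: x + 1, lambda x: x + 2, lambda x: x + 3)
print(mapseq(10))  # the intermediate list is [11, 12, 13]
```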
-
__call__
(*args, **kwargs)¶
-
__init__
(reduce, *composables)¶
-
reduce
(array, *args, **kwargs)¶
-
-
class
reliure.pipeline.
MapSeq
(*composants, **kwargs)¶ Bases:
reliure.pipeline.OptionableSequence
Map implementation for components
>>> mapseq = MapSeq(lambda x: x+1, lambda x: x+2, lambda x: x+3)
>>> mapseq(10)
[11, 12, 13]
>>> sum(mapseq(10))
36
-
__call__
(*args, **kwargs)¶
-
map
(*args, **kwargs)¶
-
-
class
reliure.pipeline.
Optionable
(name=None)¶ Bases:
reliure.pipeline.Composable
Abstract class for an optionable component
-
__init__
(name=None)¶ Parameters: name (str) – name of the component
-
add_option
(opt_name, otype, hidden=False)¶ Add an option to the object
Parameters: - opt_name (str) – option name
- otype (subclass of
GenericType
) – option type - hidden (bool) – if True the option will be hidden
-
change_option_default
(opt_name, default_val)¶ Change the default value of an option
Parameters: - opt_name (str) – option name
- default_val – new default option value
-
static
check
(call_fct)¶ Decorator for an optionable __call__ method. It checks the given option values.
-
clear_option_value
(opt_name)¶ Clear the stored option value (so the default will be used)
Parameters: opt_name (str) – option name
-
clear_options_values
()¶ Clear all stored option values (so the defaults will be used)
-
force_option_value
(opt_name, value)¶ Force the (default) value of an option. The option is then no longer listed by get_options().
Parameters: - opt_name (str) – option name
- value – option value
-
get_option_default
(opt_name)¶ Return the default value of a given option
Parameters: opt_name (str) – option name Returns: the default value of the option
-
get_option_value
(opt_name)¶ Return the value of a given option
Parameters: opt_name (str) – option name Returns: the value of the option
-
get_options
(hidden=False)¶ Parameters: hidden (bool) – whether to return hidden options Returns: dictionary of all options (with option’s information) Return type: dict
-
get_options_values
(hidden=False)¶ return a dictionary of options values
Parameters: hidden (bool) – whether to return hidden options Returns: dictionary of all option values Return type: dict
-
get_ordered_options
(hidden=False)¶ Parameters: hidden (bool) – whether to return hidden option Returns: ordered list of options pre-serialised (as_dict) Return type: list [opt_dict, ...]
-
has_option
(opt_name)¶ Whether the component has a given option
Whether the given option is hidden
-
options
¶
-
parse_options
(option_values)¶ Set the options (with parsing) and returns a dict of all options values
-
print_options
()¶ print description of the component options
-
set_option_value
(opt_name, value, parse=False)¶ Set the value of one option.
- # TODO/FIXME
- add force/hide argument
- add default argument
- remove methods force option value
- remove change_option_default
Parameters: - opt_name (str) – option name
- value – the new value
- parse (bool) – if True the value is converted from string to the correct type
-
set_options_values
(options, parse=False, strict=False)¶ Set the options from a dict of values (in string).
Parameters: - option_values (dict) – the values of options (in format {“opt_name”: “new_value”})
- parse (bool) – whether to parse the given value
- strict (bool) – if True the given option_values dict should only contain existing options (no other key)
-
-
class
reliure.pipeline.
OptionableSequence
(*composants, **kwargs)¶ Bases:
reliure.pipeline.Optionable
Abstract class to manage a component made of a sequence of callables (Optionable or not).
This object is an Optionable which has all the options of its components
-
__call__
(*args, **kwargs)¶
-
__init__
(*composants, **kwargs)¶ Create an optionable sequence
Parameters: shared_option – allow two components to share an option, if the two options have the same name (False by default)
-
add_option
(opt_name, otype, hidden=False)¶
-
call_item
(item, *args, **kwargs)¶
-
change_option_default
(opt_name, default_val)¶
-
clear_option_value
(opt_name)¶
-
clear_options_values
()¶
-
close
()¶ Close all the nested components
-
force_option_value
(opt_name, value)¶
-
get_option_default
(opt_name)¶
-
get_option_value
(opt_name)¶
-
get_options_values
(hidden=False)¶
-
has_option
(opt_name)¶
-
options
¶
-
set_option_value
(opt_name, value, parse=False)¶
-
set_options_values
(option_values, parse=True, strict=False)¶
-
-
class
reliure.pipeline.
Pipeline
(*composables, **kwargs)¶ Bases:
reliure.pipeline.OptionableSequence
A Pipeline is a sequence of functions called sequentially.
It may be created explicitly:
>>> step1 = lambda x: x**2
>>> step2 = lambda x: x-1
>>> step3 = lambda x: min(x, 22)
>>> processing = Pipeline(step1, step2, step3)
>>> processing(4)
15
>>> processing(40)
22
Or it can be created implicitly with the pipe operator (__or__) if the first function is a Composable:
>>> step1 = Composable(step1)
>>> processing = step1 | step2 | step3
>>> processing(3)
8
>>> processing(0)
-1
-
__call__
(*args, **kwargs)¶
-
__init__
(*composables, **kwargs)¶
-
reliure.schema
¶
Class¶
-
class
reliure.schema.
Doc
(schema=None, **data)¶ Bases:
dict
Document object
Here is an example of document construction from a simple text. First we define the document’s schema:
>>> from reliure.types import Text, Numeric
>>> term_field = Text(attrs={'tf':Numeric(default=1), 'positions':Numeric(multi=True)})
>>> schema = Schema(docnum=Numeric(), text=Text(), terms=term_field)
Here is the simple text from which the document is built:
>>> text = """i have seen chicken passing the street and i believed
... how many chicken must pass in the street before you
... believe"""
Then we can create the document:
>>> doc = Doc(schema, docnum=1, text=text)
>>> doc.text[:6]
'i have'
>>> len(doc.text)
113
>>> doc["docnum"]
1
Then we can analyse the text:
>>> tokens = text.split(' ')
>>> from collections import OrderedDict
>>> text_terms = list(OrderedDict.fromkeys(tokens))
>>> terms_tf = [ tokens.count(k) for k in text_terms ]
>>> terms_pos = [[i for i, tok in enumerate(tokens) if tok == k ] for k in text_terms]
and one can store the result in the field “terms”:
>>> doc.terms = text_terms
>>> doc.terms.tf.values() # here we got only '1', it's the default value
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
>>> doc.terms.tf = terms_tf
>>> doc.terms.positions = terms_pos
One can access the information, for example, for the term “chicken”:
>>> key = "chicken"
>>> doc.terms[key].tf
2
>>> doc.terms[key].positions
[3, 11]
>>> doc.terms.get_attr_value(key, 'positions')
[3, 11]
>>> doc.terms._keys[key]
3
>>> doc.terms.positions[3]
[3, 11]
#TODO: the value of docnum should be passed as an argument to __init__
-
__init__
(schema=None, **data)¶ Document initialisation
Warning
a copy of the given schema is stored in the document
Simple example:
>>> from reliure.types import Text, Numeric
>>> doc = Doc(Schema(titre=Text()), titre='Un titre')
Note that a “docnum” field is always present, i.e. it is added if not given in the schema:
>>> doc = Doc(docnum="42")
>>> doc.docnum
'42'
-
add_field
(name, ftype, docfield=None)¶ Add a field to the document (and to the underlying schema)
Parameters: - name (str) – name of the new field
- ftype (subclass of
GenericType
) – type of the new field
-
export
(exclude=[])¶ returns a dictionary representation of the document
-
set_field
(name, value, parse=False)¶ Set the value of a field
-
-
class
reliure.schema.
DocField
(ftype)¶ Bases:
object
Abstract document field
These objects are containers for a document’s data.
-
static
FromType
(ftype)¶ DocField subclasses factory, creates a convenient field to store data from a given Type.
attribute precedence:
- |attrs| > 0 (multi and uniq are implicit) => VectorField
- uniq (multi is implicit) => SetField
- multi and not uniq => ListField
- not multi => ValueField
Parameters: ftype (subclass of GenericType
) – the desired type of field
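The precedence rules listed above amount to a short chain of tests. The function below is a sketch of that decision only: field_kind is a hypothetical name, and it returns the class names as strings rather than building real fields.

```python
def field_kind(multi=False, uniq=False, attrs=None):
    """Return the name of the field class chosen by the precedence rules."""
    if attrs:  # |attrs| > 0: multi and uniq are implicit
        return "VectorField"
    if uniq:  # multi is implicit
        return "SetField"
    if multi:  # multi and not uniq
        return "ListField"
    return "ValueField"


print(field_kind())
print(field_kind(multi=True))
print(field_kind(multi=True, uniq=True))
print(field_kind(attrs={"tf": "numeric"}))
```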
-
__init__
(ftype)¶ Parameters: ftype (subclass of GenericType
) – the type for the field
-
export
()¶ Returns a serialisable representation of the field
-
ftype
¶
-
get_value
()¶ return the value of the field.
-
parse
(value)¶
-
-
exception
reliure.schema.
FieldValidationError
(field, value, errors)¶ Bases:
exceptions.Exception
Error in a field validation
-
__init__
(field, value, errors)¶
-
-
class
reliure.schema.
ListField
(fieldtype)¶ Bases:
reliure.schema.DocField
,list
List container for non-uniq fields
Usage example:
>>> from reliure.types import Text
>>> schema = Schema(tags=Text(multi=True, uniq=False))
>>> doc = Doc(schema, docnum='abc42')
>>> doc.tags.add('boo')
>>> doc.tags.add('foo')
>>> doc.tags.add('foo')
>>> len(doc.tags)
3
>>> doc.tags.export()
['boo', 'foo', 'foo']
-
__init__
(fieldtype)¶
-
add
(value)¶ Adds a value to the list (like append). Convenience method, to have the same signature as SetField and VectorField
-
append
(value)¶
-
export
()¶ Returns a list pre-serialisation of the field
>>> from reliure.types import Text
>>> doc = Doc(docnum='1')
>>> doc.terms = Text(multi=True)
>>> doc.terms.add('rat')
>>> doc.terms.add('chien')
>>> doc.terms.add('chat')
>>> doc.terms.add('léopart')
>>> doc.terms.export()
['rat', 'chien', 'chat', 'l\xe9opart']
-
get_value
()¶
-
parse
(value)¶
-
set
(values)¶ set new values (values have to be iterable)
-
-
class
reliure.schema.
Schema
(**fields)¶ Bases:
object
Schema definition for documents (
Doc
). Class inspired by Matt Chaput’s Whoosh.
Creating a schema:
>>> from reliure.types import Text, Numeric
>>> schema = Schema(title=Text(), score=Numeric())
>>> sorted(schema.field_names())
['score', 'title']
-
__init__
(**fields)¶ Create a schema from pairs of field name and field type
For example:
>>> from reliure.types import Text, Numeric
>>> schema = Schema(tags=Text(multi=True), score=Numeric(vtype=float, min=0., max=1.))
-
add_field
(name, field)¶ Add a named field to the schema.
Warning
the field name should not contain spaces and should not start with an underscore.
Parameters: - name (str) – name of the new field
- field (subclass of
GenericType
) – type instance for the field
-
copy
()¶ Returns a copy of the schema
-
field_names
()¶
-
has_field
(name)¶
-
iter_fields
()¶
-
remove_field
(field_name)¶
-
-
exception
reliure.schema.
SchemaError
¶ Bases:
exceptions.Exception
Error
-
class
reliure.schema.
SetField
(fieldtype)¶ Bases:
reliure.schema.DocField
,set
Document field for a set of values (i.e. the fieldtype is “multi” and “uniq”)
usage example:
>>> from reliure.types import Text
>>> schema = Schema(tags=Text(multi=True, uniq=True))
>>> doc = Doc(schema, docnum='abc42')
>>> doc.tags.add('boo')
>>> doc.tags.add('foo')
>>> len(doc.tags)
2
>>> sorted(doc.tags.export())
['boo', 'foo']
-
__init__
(fieldtype)¶
-
add
(value)¶
-
export
()¶
-
get_value
()¶
-
parse
(value)¶
-
set
(values)¶
-
-
class
reliure.schema.
ValueField
(fieldtype)¶ Bases:
reliure.schema.DocField
Stores only one value
usage example:
>>> from reliure.types import Text, Numeric
>>> schema = Schema(title=Text(), like=Numeric(default=45))
>>> doc = Doc(schema, docnum='abc42')
>>> # 'title' field
>>> doc.title = 'Un titre cool !'
>>> doc.title
'Un titre cool !'
>>> doc.get_field('title').export()
'Un titre cool !'
>>> doc.get_field('title').ftype
Text(multi=False, uniq=False, default=, attrs=None)
>>> # 'like' field
>>> doc.like
45
-
__init__
(fieldtype)¶
-
export
()¶
-
get_value
()¶
-
set
(value)¶
-
-
class
reliure.schema.
VectorAttr
(vector, attr)¶ Bases:
object
Internal class used to access an attribute of a VectorField
>>> from reliure.types import Text, Numeric
>>> doc = Doc(docnum='1')
>>> doc.terms = Text(multi=True, uniq=True, attrs={'tf': Numeric()})
>>> doc.terms.add('chat')
>>> type(doc.terms.tf)
<class 'reliure.schema.VectorAttr'>
-
__init__
(vector, attr)¶
-
export
()¶
-
values
()¶
-
-
class
reliure.schema.
VectorField
(ftype)¶ Bases:
reliure.schema.DocField
More complex document field container
Usage:
>>> from pprint import pprint
>>> from reliure.types import Text, Numeric
>>> doc = Doc(docnum='1')
>>> doc.terms = Text(multi=True, uniq=True, attrs={'tf': Numeric()})
>>> doc.terms.add('chat')
>>> doc.terms['chat'].tf = 12
>>> doc.terms['chat'].tf
12
>>> doc.terms.add('dog', tf=55)
>>> doc.terms['dog'].tf
55
One can also add an attribute after the field is created:
>>> doc.terms.add_attribute('foo', Numeric(default=42))
>>> doc.terms.foo.values()
[42, 42]
>>> doc.terms['dog'].foo = 20
>>> doc.terms.foo.values()
[42, 20]
It is also possible to delete elements from the field:
>>> pprint(doc.terms.export())
{'foo': [42, 20], 'keys': {'chat': 0, 'dog': 1}, 'tf': [12, 55]}
>>> del doc.terms['chat']
>>> pprint(doc.terms.export())
{'foo': [20], 'keys': {'dog': 0}, 'tf': [55]}
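The exported structure above (a 'keys' mapping plus one list per attribute) suggests a column-oriented layout. The class below is a sketch of such a layout, written only to illustrate the export and deletion behaviour; MiniVector and its internals are assumptions, not the VectorField implementation.

```python
class MiniVector:
    """Hypothetical stand-in: keys plus parallel attribute lists."""

    def __init__(self, **attr_defaults):
        self._defaults = dict(attr_defaults)  # attribute name -> default value
        self._keys = {}                       # key -> position in the lists
        self._attrs = {name: [] for name in attr_defaults}

    def add(self, key, **values):
        if key in self._keys:
            return  # do nothing if the key is already present
        unknown = set(values) - set(self._attrs)
        if unknown:
            raise ValueError("Invalid attribute name: %r" % sorted(unknown)[0])
        self._keys[key] = len(self._keys)
        for name, column in self._attrs.items():
            column.append(values.get(name, self._defaults[name]))

    def __delitem__(self, key):
        position = self._keys.pop(key)
        for column in self._attrs.values():
            del column[position]
        # shift the positions of the keys stored after the removed one
        for other, index in self._keys.items():
            if index > position:
                self._keys[other] = index - 1

    def export(self):
        exported = {"keys": dict(self._keys)}
        exported.update({name: list(col) for name, col in self._attrs.items()})
        return exported


terms = MiniVector(tf=1, foo=42)
terms.add("chat", tf=12)
terms.add("dog", tf=55, foo=20)
before = terms.export()
del terms["chat"]
after = terms.export()
print(before)
print(after)
```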
-
__init__
(ftype)¶
-
add
(key, **kwargs)¶ Add a key to the vector, do nothing if the key is already present
>>> doc = Doc(docnum='1')
>>> doc.terms = Text(multi=True, attrs={'tf': Numeric(default=1, min=0)})
>>> doc.terms.add('chat')
>>> doc.terms.add('dog', tf=2)
>>> doc.terms.tf.values()
[1, 2]
>>> doc.terms.add('mouse', comment="a small mouse")
Traceback (most recent call last):
...
ValueError: Invalid attribute name: 'comment'
>>> doc.terms.add('mouse', tf=-2)
Traceback (most recent call last):
ValidationError: ['Ensure this value ("-2") is greater than or equal to 0.']
-
add_attribute
(name, ftype)¶ Add a data attribute. Note that the field type will be modified !
Parameters: - name (str) – name of the new attribute
- ftype (subclass of
GenericType
) – type of the new attribute
-
attribute_names
()¶ returns the names of field’s data attributes
Returns: set of attribute names Return type: frozenset
-
clear_attributes
()¶ removes all attributes
-
export
()¶ Returns a dictionary pre-serialisation of the field
>>> from pprint import pprint
>>> from reliure.types import Text, Numeric
>>> doc = Doc(docnum='1')
>>> doc.terms = Text(multi=True, uniq=True, attrs={'tf': Numeric(default=1)})
>>> doc.terms.add('chat')
>>> doc.terms.add('rat', tf=5)
>>> doc.terms.add('chien', tf=2)
>>> pprint(doc.terms.export())
{'keys': {'chat': 0, 'chien': 2, 'rat': 1}, 'tf': [1, 5, 2]}
-
get_attr_value
(key, attr)¶ returns the value of a given attribute for a given key
>>> doc = Doc(docnum='1')
>>> doc.terms = Text(multi=True, uniq=True, attrs={'tf': Numeric()})
>>> doc.terms.add('chat', tf=55)
>>> doc.terms.get_attr_value('chat', 'tf')
55
-
get_attribute
(name)¶
-
get_value
()¶ from DocField, convenient method
-
has
(key)¶
-
keys
()¶ list of keys in the vector
-
set
(keys)¶ Set new keys. Mind this will clear all attributes and keys before adding new keys
>>> doc = Doc(docnum='1')
>>> doc.terms = Text(multi=True, attrs={'tf': Numeric(default=1)})
>>> doc.terms.add('computer', tf=12)
>>> doc.terms.tf.values()
[12]
>>> doc.terms.set(['keyboard', 'mouse'])
>>> list(doc.terms)
['keyboard', 'mouse']
>>> doc.terms.tf.values()
[1, 1]
-
set_attr_value
(key, attr, value)¶ set the value of a given attribute for a given key
-
-
class
reliure.schema.
VectorItem
(vector, key)¶ Bases:
object
Internal class used to access an item (= a value) of a VectorField
>>> from reliure.types import Text, Numeric
>>> doc = Doc(docnum='1')
>>> doc.terms = Text(multi=True, uniq=True, attrs={'tf': Numeric()})
>>> doc.terms.add('chat')
>>> type(doc.terms['chat'])
<class 'reliure.schema.VectorItem'>
-
__init__
(vector, key)¶
-
as_dict
()¶
-
attribute_names
()¶
-
reliure.types
¶
inheritance diagrams¶
Class¶
-
class
reliure.types.
Boolean
(**kwargs)¶ Bases:
reliure.types.GenericType
-
TRUE_VALUES
= set([True, 'oui', 'o', '1', 'yes', 'true'])¶
-
__init__
(**kwargs)¶
-
default_validators
= [<reliure.validators.TypeValidator object at 0x7efc62433090>]¶
-
parse
(value)¶
-
-
class
reliure.types.
Datetime
(**kwargs)¶ Bases:
reliure.types.GenericType
datetime type
-
__init__
(**kwargs)¶
-
as_dict
()¶
-
parse
(value)¶
-
-
class
reliure.types.
GenericType
(default=None, help='', multi=None, uniq=None, choices=None, attrs=None, validators=[], parse=None, serialize=None)¶ Bases:
object
Define a type.
-
__init__
(default=None, help='', multi=None, uniq=None, choices=None, attrs=None, validators=[], parse=None, serialize=None)¶ Parameters: - default – default value for the field
- help (str) – description of what the data is
- multi (bool) – field is a list or a set
- uniq (bool) – whether the values are unique, only applies if multi is True
- choices (list) – if set, the value should be one of the given choices
- attrs – field attributes, dictionary of {“name”: AbstractType()}
- validators – list of additional validators
- parse – a parsing function
- serialize – a pre-serialization function
-
as_dict
()¶ returns a dictionary view of the option
Returns: the option converted in a dict Return type: dict
-
default
¶ Default value of the type
-
default_validators
= []¶
-
parse
(value)¶ parsing from string
-
serialize
(value, **kwargs)¶ pre-serialize value
-
validate
(value)¶ Abstract method, checks if a value is correct (type). Should raise TypeError if the validation fails.
Parameters: value – the value to validate Returns: the given value (that may have been converted)
-
-
class
reliure.types.
Numeric
(vtype=<type 'int'>, min=None, max=None, **kwargs)¶ Bases:
reliure.types.GenericType
Numerical type (int or float)
-
__init__
(vtype=<type 'int'>, min=None, max=None, **kwargs)¶ Parameters: - vtype – the type of numbers that can be stored in this field,
either
int
,float
. - signed (bool) – if the value may be negative (True by default)
- min – if not None, the minimal possible value
- max – if not None, the maximal possible value
-
as_dict
()¶
-
parse
(value)¶
-
reliure.utils
¶
reliure.utils.log
¶
Helper function to setup a basic logger for a reliure app
-
class
reliure.utils.log.
ColorFormatter
(*args, **kwargs)¶ Bases:
logging.Formatter
-
BLACK
= 0¶
-
BLUE
= 4¶
-
BOLD_SEQ
= '\x1b[1m'¶
-
COLORS
= {'BLUE': 4, 'YELLOW': 3, 'GREEN': 2, 'ERROR': 1, 'DEBUG': 4, 'CYAN': 6, 'WHITE': 7, 'RED': 1, 'INFO': 7, 'WARNING': 3, 'CRITICAL': 3, 'MAGENTA': 5}¶
-
COLOR_SEQ
= '\x1b[1;%dm'¶
-
CYAN
= 6¶
-
GREEN
= 2¶
-
MAGENTA
= 5¶
-
RED
= 1¶
-
RESET_SEQ
= '\x1b[0m'¶
-
WHITE
= 7¶
-
YELLOW
= 3¶
-
__init__
(*args, **kwargs)¶
-
format
(record)¶
-
-
class
reliure.utils.log.
SpeedLogger
(each=1000, elements='documents')¶ Bases:
reliure.pipeline.Composable
Pipeline element that does nothing but log the processing speed every K elements and at the end.
If you have a processing pipe, for example:
>>> from reliure.pipeline import Composable
>>> pipeline = Composable(lambda data: (x**3 for x in data))
you can add a SpeedLogger to it:
>>> pipeline |= SpeedLogger(each=30000, elements="numbers")
And then when you run your pipeline:
>>> import logging
>>> from reliure.utils.log import get_basic_logger
>>> logger = get_basic_logger(logging.INFO)
>>> from reliure.offline import run
>>> results = run(pipeline, range(100000))
You will get logging output like this:
2014-09-26 15:19:56,139:INFO:reliure.SpeedLogger:Process 30000 numbers in 0.010 sec (2894886.12 numbers/sec)
2014-09-26 15:19:56,148:INFO:reliure.SpeedLogger:Process 30000 numbers in 0.008 sec (3579572.14 numbers/sec)
2014-09-26 15:19:56,156:INFO:reliure.SpeedLogger:Process 30000 numbers in 0.008 sec (3719343.80 numbers/sec)
2014-09-26 15:19:56,159:INFO:reliure.SpeedLogger:Process 9997 numbers in 0.003 sec (3367096.85 numbers/sec)
2014-09-26 15:19:56,159:INFO:reliure.SpeedLogger:In total: 100000 numbers proceded in 0.030 sec (3307106.53 numbers/sec)
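A pass-through element of this kind can be sketched as a generator that yields every item unchanged and logs at regular intervals. speed_logged and the logger name are hypothetical, and the message format is only loosely modelled on the output above.

```python
import logging
import time


def speed_logged(inputs, each=1000, elements="documents", logger=None):
    """Yield the items unchanged, logging the elapsed time every `each` items."""
    logger = logger or logging.getLogger("speed_sketch")
    total = 0
    start = last = time.time()
    for item in inputs:
        yield item
        total += 1
        if total % each == 0:
            now = time.time()
            logger.info("Processed %d %s in %.3f sec", each, elements, now - last)
            last = now
    logger.info("In total: %d %s processed in %.3f sec",
                total, elements, time.time() - start)


cubes = (x ** 3 for x in range(10))
results = list(speed_logged(cubes, each=4, elements="numbers"))
print(results)
```

Because the function is a generator, nothing is logged until the results are actually consumed.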
-
__call__
(inputs)¶
-
__init__
(each=1000, elements='documents')¶ Parameters: - each – number of elements between two speed logs
- elements – name of the elements in the produced log lines
-
-
reliure.utils.log.
get_app_logger_color
(appname, app_log_level=20, log_level=30, logfile=None)¶ Configure the logging for an app using reliure (it logs both the app and the reliure lib)
Parameters: - appname – the name of the application to log
- app_log_level – log level for the app
- log_level – log level for the reliure lib
- logfile – file that stores the log, a time-rotating file (by day); no file if None
-
reliure.utils.log.
get_basic_logger
(level=30, scope='reliure')¶ Return a basic logger that prints messages from the reliure lib on stdout
reliure.utils.i18n
¶
helpers for internationalisation
reliure.utils.cli
¶
Helper function to setup argparser from reliure components and engines
-
reliure.utils.cli.
argument_from_option
(parser, component, opt_name, prefix='')¶ Add an argparse argument to a parser from one option of one Optionable
>>> comp = Optionable()
>>> comp.add_option("num", Numeric(default=1, max=12, help="An example of option"))
>>> parser = argparse.ArgumentParser(prog="PROG")
>>> argument_from_option(parser, comp, "num")
>>> parser.print_help()
usage: PROG [-h] [--num NUM]

optional arguments:
  -h, --help  show this help message and exit
  --num NUM   An example of option
>>> parser.parse_args(["--num", "2"])
Namespace(num=2)
>>> parser.parse_args(["--num", "20"])
Traceback (most recent call last):
...
ValidationError: [u'Ensure this value ("20") is less than or equal to 12.']
-
reliure.utils.cli.
arguments_from_optionable
(parser, component, prefix='')¶ Add argparse arguments from all options of one Optionable
>>> # Let's build a dummy optionable component:
>>> comp = Optionable()
>>> comp.add_option("num", Numeric(default=1, max=12, help="An example of option"))
>>> comp.add_option("title", Text(help="The title of the title"))
>>> comp.add_option("ok", Boolean(help="is it ok ?", default=True))
>>> comp.add_option("cool", Boolean(help="is it cool ?", default=False))
>>>
>>> # one can then register all the options of this component to an arg parser
>>> parser = argparse.ArgumentParser(prog="PROG")
>>> arguments_from_optionable(parser, comp)
>>> parser.print_help()
usage: PROG [-h] [--num NUM] [--title TITLE] [--not-ok] [--cool]

optional arguments:
  -h, --help     show this help message and exit
  --num NUM      An example of option
  --title TITLE  The title of the title
  --not-ok       is it ok ?
  --cool         is it cool ?
The option values for a component can then be retrieved with get_config_for():
>>> args = parser.parse_args()
>>> config = get_config_for(args, comp)
>>> comp("input", **config)
"comp_result"
-
reliure.utils.cli.
get_config_for
(args, component, prefix='')¶ Returns a dictionary of option value for a given component
See
arguments_from_optionable()
documentation
-
reliure.utils.
deprecated
(new_fct_name, logger=None)¶ Decorator to notify that a function is deprecated
-
reliure.utils.
engine_schema
(engine, out_names=None, filename=None, format='pdf')¶ Build a graphviz schema of a reliure Engine. It depends on graphviz.
Parameters: - engine (
Engine
) – reliure engine to graph - out_names – list of block output to consider as engine output (all by default)
- filename – output filename, create the file if given
- format – output file format (pdf, svg, png, ...)
>>> from reliure.engine import Engine
>>> egn = Engine("preproc", "proc1", "proc2")
>>> egn.preproc.setup(in_name="input", out_name="data")
>>> egn.proc1.setup(in_name="data", out_name="gold_data")
>>> egn.proc2.setup(in_name="input", out_name="silver_data")
>>> # you can then create the schema
>>> schema = engine_schema(egn, filename='docs/img/engine_schema', format='png')
It creates the following image:
You can specify which block outputs are considered as engine outputs:
>>> schema = engine_schema(egn, ["gold_data", "silver_data"], filename='docs/img/engine_schema_simple', format='png')
Note that it is also possible to produce a PDF:
>>> schema = engine_schema(egn, ["gold_data", "silver_data"], filename='docs/img/engine_schema_simple', format='pdf')
-
reliure.utils.
parse_bool
(value)¶ Convert a string to a boolean
>>> parse_bool(None)
False
>>> parse_bool("true")
True
>>> parse_bool("TRUE")
True
>>> parse_bool("yes")
True
>>> parse_bool("1")
True
>>> parse_bool("false")
False
>>> parse_bool("sqdf")
False
>>> parse_bool(False)
False
>>> parse_bool(True)
True
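The behaviour shown in the doctest above can be reproduced with a few lines of plain Python. The function below is a hypothetical re-implementation, not the library's code: the set of strings treated as true ("true", "yes", "1", case-insensitive) is an assumption inferred only from the examples above, and the real function may accept other spellings.

```python
def parse_bool_sketch(value):
    """Hypothetical re-implementation consistent with the parse_bool doctest."""
    if isinstance(value, bool):
        # real booleans are returned unchanged
        return value
    if value is None:
        return False
    # assumption: only these spellings count as true, everything else is false
    return str(value).strip().lower() in ("true", "yes", "1")
```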
reliure.validators
¶
-
class
reliure.validators.
ChoiceValidator
(ref_value)¶ Bases:
reliure.validators.compareValidator
-
compare
(value, ref)¶
-
message
= 'Ensure this value ("%(show_value)s") is in %(ref_value)s.'¶
-
-
class
reliure.validators.
MaxValueValidator
(ref_value)¶ Bases:
reliure.validators.compareValidator
-
compare
(value, ref)¶
-
message
= 'Ensure this value ("%(show_value)s") is less than or equal to %(ref_value)s.'¶
-
-
class
reliure.validators.
MinValueValidator
(ref_value)¶ Bases:
reliure.validators.compareValidator
-
compare
(value, ref)¶
-
message
= 'Ensure this value ("%(show_value)s") is greater than or equal to %(ref_value)s.'¶
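The three validators above share one shape: a reference value, a `compare` method and a `message` template. The sketch below illustrates that pattern with hypothetical classes (`CompareValidatorSketch`, `MaxValueSketch`, `ValidationErrorSketch`); it is not reliure's implementation. In particular, it assumes that `compare` returns True when the value is acceptable, and the real library may use the opposite convention.

```python
class ValidationErrorSketch(Exception):
    """Hypothetical stand-in for the library's validation error."""


class CompareValidatorSketch(object):
    message = 'Ensure this value ("%(show_value)s") is compatible with %(ref_value)s.'

    def __init__(self, ref_value):
        self.ref_value = ref_value

    def compare(self, value, ref):
        # assumption: True means that the value is valid
        raise NotImplementedError

    def __call__(self, value):
        if not self.compare(value, self.ref_value):
            params = {"show_value": value, "ref_value": self.ref_value}
            raise ValidationErrorSketch(self.message % params)


class MaxValueSketch(CompareValidatorSketch):
    # message template copied from MaxValueValidator above
    message = 'Ensure this value ("%(show_value)s") is less than or equal to %(ref_value)s.'

    def compare(self, value, ref):
        return value <= ref


check_max = MaxValueSketch(12)
check_max(2)  # valid value: nothing happens
try:
    check_max(20)
except ValidationErrorSketch as err:
    error_text = str(err)
```

Keeping the comparison and the message in subclasses means that a new validator only has to override those two members.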
-
reliure.web
¶
Helpers to build an HTTP/JSON API from reliure engines
-
reliure.web.
app_routes
(app)¶ List the routes of an app
-
class
reliure.web.
EngineView
(engine, name=None)¶ Bases:
object
View over an
Engine
or a Block
>>> engine = Engine("count_a", "count_b")
>>>
>>> engine.count_a.setup(in_name='in')
>>> engine.count_a.set(lambda chaine: chaine.count("a"))
>>>
>>> engine.count_b.setup(in_name='in')
>>> engine.count_b.set(lambda chaine: chaine.count("b"))
>>>
>>>
>>> # we can build a view on this engine
>>> egn_view = EngineView(engine, name="count")
>>> egn_view.add_output("count_a") # note that by default block outputs are named by block's name
>>> egn_view.add_output("count_b")
>>> # one can specify a short url for this engine
>>> egn_view.play_route("/q/<in>")
>>>
>>> # this view can be added to a reliure API
>>> api = ReliureAPI("api")
>>> api.register_view(egn_view)
-
__init__
(engine, name=None)¶
-
add_input
(in_name, type_or_parse=None)¶ Declare a possible input
-
add_output
(out_name, type_or_serialize=None, **kwargs)¶ Declare an output
-
options
()¶ HTTP entry point to discover the engine options
-
play
()¶ Main http entry point: run the engine
-
play_route
(*routes)¶ Define routes for GET play.
This uses Flask route syntax, see: http://flask.pocoo.org/docs/0.10/api/#url-route-registrations
-
run
(inputs_data, options)¶ Run the engine/block according to some input data and options
It is called from
play()
Parameters: - inputs_data – dict of input data
- options – engine/block configuration dict
-
set_input_type
(type_or_parse)¶ Set a unique input type.
If you use this then you have only one input for the play.
-
set_outputs
(*outputs)¶ Set the outputs of the view
-
short_play
(**kwargs)¶ Main http entry point: run the engine
-
-
class
reliure.web.
ComponentView
(component)¶ Bases:
reliure.web.EngineView
View over a simple component (
Composable
or simple function)
-
__init__
(component)¶
-
add_input
(in_name, type_or_parse=None)¶
-
add_output
(out_name, type_or_serialize=None, **kwargs)¶
-
-
class
reliure.web.
ReliureAPI
(name='api', url_prefix=None, expose_route=True, **kwargs)¶ Bases:
flask.blueprints.Blueprint
Standard Flask JSON API view over a Reliure
Engine.
This is a Flask Blueprint (see http://flask.pocoo.org/docs/blueprints/)
Here is a simple usage example:
>>> from reliure.engine import Engine
>>> engine = Engine("process")
>>> engine.process.setup(in_name="in", out_name="out")
>>> # setup the block's component
>>> engine.process.set(lambda x: x**2)
>>>
>>> # configure a view for the engine
>>> egn_view = EngineView(engine)
>>> # configure view input/output
>>> from reliure.types import Numeric
>>> egn_view.set_input_type(Numeric())
>>> egn_view.add_output("out")
>>>
>>> ## create the API blueprint
>>> api = ReliureAPI()
>>> api.register_view(egn_view, url_prefix="egn")
>>>
>>> # here you get your blueprint
>>> # you can register it to your app with
>>> app.register_blueprint(api, url_prefix="/api")
Then you will have three routes:
- [GET] /api/: returns a JSON document that describes your API routes
- [GET] /api/egn: returns a JSON document that describes your engine
- [POST] /api/egn: runs the engine itself
To use the “POST” entry point you can do:
>>> request = {
...     "in": 5,         # this is the name of my input
...     "options": {}    # this is the api/engine configuration
... }
>>> res = requests.post(
...     SERVER_URL+"/api/egn",
...     data=json.dumps(request),
...     headers={"content-type": "application/json"}
... )
>>> data = res.json()
{
    meta: {...}
    results: {
        "out": 25
    }
}
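If the `requests` package is not available, the same POST request can be prepared with the Python 3 standard library. This is only a sketch: `SERVER_URL` is a placeholder value chosen for illustration, and the request is built but deliberately not sent, so the block runs without a live server.

```python
import json
from urllib.request import Request

SERVER_URL = "http://localhost:5000"  # assumption: address of your running app

payload = {
    "in": 5,        # name of the engine input
    "options": {},  # api/engine configuration
}

req = Request(
    SERVER_URL + "/api/egn",
    data=json.dumps(payload).encode("utf-8"),  # the request body must be bytes
    headers={"Content-Type": "application/json"},
    method="POST",
)
# To actually send it: urllib.request.urlopen(req), then json.load() on the response.
```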
-
__init__
(name='api', url_prefix=None, expose_route=True, **kwargs)¶ Build the Blueprint view over an
Engine.
Parameters: - name – the name of this API (used as URL prefix by default)
- expose_route – whether / returns all API routes (default: True)
-
register
(app, options, first_registration=False)¶
-
register_view
(view, url_prefix=None)¶ Associate an
EngineView
to this API
-
class
reliure.web.
RemoteApi
(url, **kwargs)¶ Bases:
flask.blueprints.Blueprint
Proxy to a remote
ReliureJsonAPI
-
__init__
(url, **kwargs)¶ Parameters: url – engine API URL
-
add_url_rule
(path, endpoint, *args, **kwargs)¶
-
forward
(**kwargs)¶ Remote HTTP call to the API endpoint; accepts ONLY GET and POST
-
register
(app, options, first_registration=False)¶
-
Reliure: minimal framework for data processing¶
reliure
is a minimal and basic framework to manage pipelines of data
processing components in Python.
Documentation¶
In case you are not reading it yet, the full documentation is available on Read the Docs: http://reliure.readthedocs.org
License¶
reliure
source code is available under the LGPL Version 3 license, unless otherwise indicated.
Requirements¶
reliure
works with both Python 2 and Python 3. It depends on:
All these dependencies may be installed with:
$ pip install -r requirements.txt
To develop reliure you will also need:
Dev dependencies may be installed with:
$ pip install -r requirements.dev.txt
Contribute¶
The following should create a pretty good development environment:
$ git clone https://github.com/kodexlab/reliure.git
$ cd reliure
$ virtualenv ENV
$ source ENV/bin/activate
$ pip install -r requirements.txt
$ pip install -r requirements.dev.txt
To run tests:
$ make testall
To build the doc:
$ make doc
then open: docs/_build/html/index.html
Warning
You need to have reliure accessible in the Python path to run the tests (and to build the doc). To do so, you can install reliure as a link in the local virtualenv:
$ pip install -e .
(Note: this is indicated in the pytest good practices.)
If you are developing, at the same time, another package that needs your latest reliure version, you can use:
$ pip install -e the_good_path/reliure # link reliure in local python packages