Lucidity

Filesystem templating and management.

Guide

Overview and examples of using the system in practice.

Introduction

Lucidity is a framework for templating filesystem structure.

It works using regular expressions, but hides much of the verbosity through the use of simple placeholders (such as you see in Python string formatting).

Consider the following paths:

/jobs/monty/assets/circus/model/high/circus_high_v001.abc
/jobs/monty/assets/circus/model/low/circus_low_v001.abc
/jobs/monty/assets/parrot/model/high/parrot_high_v002.abc

A regular expression to describe them might be:

'/jobs/(?P<job>[\w_]+?)/assets/(?P<asset_name>[\w_]+?)/model/(?P<lod>[\w_]+?)/(?P<asset_name>[\w_]+?)_(?P<lod>[\w_]+?)_v(?P<version>\d+?)\.(?P<filetype>\w+?)'

Meanwhile, the Lucidity pattern would be:

'/jobs/{job}/assets/{asset_name}/model/{lod}/{asset_name}_{lod}_v{version}.{filetype}'

With Lucidity you store this pattern as a template and can then use that template to generate paths from data as well as extract data from matching paths in a standard fashion.

Read the Tutorial to find out more.

Installation

Installing Lucidity is simple with pip:

$ pip install lucidity

If the Cheeseshop (a.k.a. PyPI) is down, you can also install Lucidity from one of the mirrors:

$ pip install --use-mirrors lucidity

Alternatively, you may wish to download manually from Gitlab where Lucidity is actively developed.

You can clone the public repository:

$ git clone git@gitlab.com:4degrees/lucidity.git

Or download an appropriate zipball

Once you have a copy of the source, you can embed it in your Python package, or install it into your site-packages:

$ python setup.py install

Dependencies

For testing:

Tutorial

This tutorial gives a good introduction to using Lucidity.

First make sure that you have Lucidity installed.

Patterns

Lucidity uses patterns to represent a path structure. A pattern is very much like the format you would use in a Python string format expression.

For example, a pattern to represent this filepath:

'/jobs/monty/assets/circus/model/high/circus_high_v001.abc'

Could be:

'/jobs/{job}/assets/{asset_name}/model/{lod}/{asset_name}_{lod}_v{version}.{filetype}'

Each {name} in braces is a variable that can either be extracted from a matching path, or substituted with a provided value when constructing a path. The variable is referred to as a placeholder.

Templates

A Template is a simple container for a pattern.

First, import the package:

>>> import lucidity

Now, construct a template with the pattern above:

>>> template = lucidity.Template('model', '/jobs/{job}/assets/{asset_name}/model/{lod}/{asset_name}_{lod}_v{version}.{filetype}')

Note

The template must be given a name to identify it. The name becomes useful when you have a bunch of templates to manage.

You can check the identified placeholders in a template using the Template.keys method:

>>> print template.keys()
set(['job', 'asset_name', 'lod', 'version', 'filetype'])

Parsing

With a template defined we can now parse a path and extract data from it:

>>> path = '/jobs/monty/assets/circus/model/high/circus_high_v001.abc'
>>> data = template.parse(path)
>>> print data
{
    'job': 'monty',
    'asset_name': 'circus',
    'lod': 'high',
    'version': '001',
    'filetype': 'abc'
}

If a template’s pattern does not match the path then parse() will raise a ParseError:

>>> print template.parse('/other/monty/assets')
ParseError: Input '/other/monty/assets' did not match template pattern.
Handling Duplicate Placeholders

It is perfectly acceptable for a template to contain the same placeholder multiple times, as seen in the template constructed above. When parsing, by default, the last matching value for a placeholder is used:

>>> path = '/jobs/monty/assets/circus/model/high/spaceship_high_v001.abc'
>>> data = template.parse(path)
>>> print data['asset_name']
spaceship

This is called RELAXED mode. If this behaviour is not desirable then the duplicate_placeholder_mode of any Template can be set to STRICT mode instead:

>>> path = '/jobs/monty/assets/circus/model/high/spaceship_high_v001.abc'
>>> template.duplicate_placeholder_mode = template.STRICT
>>> template.parse(path)
ParseError: Different extracted values for placeholder 'asset_name' detected. Values were 'circus' and 'spaceship'.

Note

duplicate_placeholder_mode can also be passed as an argument when constructing a template.

Anchoring

By default, a pattern is anchored at the start, requiring that the start of a path match the pattern:

>>> job_template = lucidity.Template('job', '/job/{job}')
>>> print job_template.parse('/job/monty')
{'job': 'monty'}
>>> print job_template.parse('/job/monty/extra/path')
{'job': 'monty'}
>>> print job_template.parse('/other/job/monty')
ParseError: Input '/other/job/monty' did not match template pattern.

The anchoring can be changed when constructing a template by passing an anchor keyword in:

>>> filename_template = lucidity.Template(
...     'filename',
...     '{filename}.{index}.{ext}',
...     anchor=lucidity.Template.ANCHOR_END
... )
>>> print filename_template.parse('/some/path/to/file.0001.dpx')
{'filename': 'file', 'index': '0001', 'ext': 'dpx'}

The anchor can be one of:

  • ANCHOR_START - Match pattern at the start of the string.
  • ANCHOR_END - Match pattern at the end of the string.
  • ANCHOR_BOTH - Match pattern exactly.
  • None - Match pattern once anywhere in the string.

Formatting

It is also possible to pass a dictionary of data to a template in order to produce a path:

>>> data = {
...     'job': 'monty',
...     'asset_name': 'circus',
...     'lod': 'high',
...     'version': '001',
...     'filetype': 'abc'
... }
>>> path = template.format(data)
>>> print path
/jobs/monty/assets/circus/model/high/circus_high_v001.abc

In the example above, we haven’t done more than could be achieved with standard Python string formatting. In the next sections, though, you will see the need for a dedicated format() method.

If the supplied data does not contain enough information to fill the template completely a FormatError will be raised:

>>> print template.format({})
FormatError: Could not format data {} due to missing key 'job'.

Nested Data Structures

Often the data structure you want to use will be more complex than a single level dictionary. Therefore, Lucidity also supports nested dictionaries when both parsing or formatting a path.

To indicate a nested structure, use a dotted notation in your placeholder name:

>>> template = lucidity.Template('job', '/jobs/{job.code}')
>>> print template.parse('/jobs/monty')
{'job': {'code': 'monty'}}
>>> print template.format({'job': {'code': 'monty'}})
/jobs/monty

Note

Unlike the standard Python format syntax, the dotted notation in Lucidity always refers to a nested item structure rather than attribute access.

Custom Regular Expressions

Lucidity works by constructing a regular expression from a pattern. It replaces all placeholders with a default regular expression that should suit most cases.

However, if you need to customise the regular expression you can do so either at a template level or per placeholder.

At The Template Level

To modify the default regular expression for a template, pass it is as an additional argument:

>>> template = lucidity.Template('name', 'pattern',
                                 default_placeholder_expression='[^/]+')
Per Placeholder

To alter the expression for a single placeholder, use a colon : after the placeholder name and follow with your custom expression:

>>> template = lucidity.Template('name', 'file_v{version:\d+}.ext')

Above, the version placeholder expression has been customised to only match one or more digits.

Note

If your custom expression requires the use of braces ({}) you must escape them to distinguish them from the placeholder braces. Use a preceding backslash for the escape (\{, \}).

And of course, any custom expression text is omitted when formatting data:

>>> print template.format({'version': '001'})
file_v001.ext

Managing Multiple Templates

Representing different path structures requires the use of multiple templates.

Lucidity provides a few helper functions for dealing with multiple templates.

Template Discovery

Templates can be discovered by searching a list of paths for mount points that register template instances. By default, the list of paths is retrieved from the environment variable LUCIDITY_TEMPLATE_PATH.

To search and load templates in this way:

>>> import lucidity
>>> templates = lucidity.discover_templates()

To specify a specific list of paths just pass them to the function:

>>> templates = lucidity.discover_templates(paths=['/templates'])

By default each path will be recursively searched. You can disable this behaviour by setting the recursive keyword argument:

>>> templates = lucidity.discover_templates(recursive=False)

Template Mount Points

To write a template mount point, define a Python file containing a register function. The function should return a list of instantiated Template instances:

# templates.py

from lucidity import Template

def register():
    '''Register templates.'''
    return [
        Template('job', '/jobs/{job.code}'),
        Template('shot', '/jobs/{job.code}/shots/{scene.code}_{shot.code}')
    ]

Place the file on one of the search paths for discover_templates() to have it take effect.

Operations Against Multiple Templates

Lucidity also provides two top level functions to run a parse or format operation against multiple candidate templates using the first correct result found.

Given the following templates:

>>> import lucidity
>>> templates = [
...     lucidity.Template('model', '/jobs/{job}/assets/model/{lod}'),
...     lucidity.Template('rig', '/jobs/{job}/assets/rig/{rig_type}')
... ]

To perform a parse:

>>> print lucidity.parse('/jobs/monty/assets/rig/anim', templates)
({'job': 'monty', 'rig_type': 'anim'},
 Template(name='rig', pattern='/jobs/{job}/assets/rig/{rig_type}'))

To format data:

>>> print lucidity.format({'job': 'monty', 'rig_type': 'anim'}, templates)
('/jobs/monty/assets/rig/anim',
 Template(name='rig', pattern='/jobs/{job}/assets/rig/{rig_type}'))

Note

The return value is a tuple of (result, template).

If no template could provide a result an appropriate error is raised ( ParseError or FormatError).

Using Template References

When the same pattern is used repetitively in several templates, it can be useful to extract it out into a separate template that can be referenced.

To reference another template, use its name prefixed by the at symbol, @, in a placeholder:

>>> shot_path = lucidity.Template('shot_path', '{@job_path}/shots/{shot.code}')

Template references are resolved on demand when performing operations with the template, such as calling Template.parse() or Template.format(). This is why the above didn’t produce an error even though no job_path template has been defined (or a way to lookup that template by name even). This behaviour allows discovery of templates without worrying about the order of template construction.

As soon as you attempt to perform an operation on the template that does require resolving references, errors will be raised accordingly. For example, try calling the Template.keys() method:

>>> print shot_path.keys()
ResolveError: Failed to resolve reference 'job_path' as no template resolver set.

The error indicates that we have not provided the template with a way to actually resolve template references. To do this we need to set the template_resolver attribute on the template to an object that matches the Resolver interface. Fortunately, the resolver interface is simple so a even a basic dictionary can act as a resolver:

>>> resolver = {}
>>> shot_path.template_resolver = resolver

Note

The template resolver can also be passed as an argument when instantiating a new Template.

Try getting the keys again:

>>> print shot_path.keys()
ResolveError: Failed to resolve reference 'job_path' using template resolver.

Slightly better. Now we have a resolver in place we just need to add the other template to the resolver so that it can be looked up by name:

>>> job_path = lucidity.Template('job_path', '/jobs/{job.code}')
>>> resolver[job_path.name] = job_path

Note

In a future release a dedicated template collection class will make dealing with template references even easier.

Print the keys again and it should resolve all the references and give back the full list of keys that make up the expanded template:

>>> print shot_path.keys()
set(['job.code', 'shot.code'])

There is also a method for listing the references found in a template:

>>> print shot_path.references()
set(['job_path'])

Additionally, if you would like to see the fully expanded pattern you can manually call the Template.expanded_pattern() method at any time:

>>> print shot_path.expanded_pattern()
/jobs/{job.code}/shots/{shot.code}

Anchor behaviour

A Template has an anchor setting that determines how the template pattern is matched when parsing. When a template is referenced in another template its anchor setting is ignored and only the anchor setting of the outermost template is used:

>>> template_a = lucidity.Template(
...     'a', 'path/{variable}', anchor=lucidity.Template.ANCHOR_START
... )
>>> print template_a.parse('/some/path/value')
ParseError: Path '/some/path/value' did not match template pattern.
>>> resolver = {}
>>> resolver[template_a.name] = template_a
>>> template_b = lucidity.Template(
...     'b', '{@a}', anchor=lucidity.Template.ANCHOR_END,
...     template_resolver=resolver
... )
>>> print template_b.parse('/some/path/value')
{'variable': 'value'}

Reference

API reference providing details on the actual code.

lucidity

lucidity.discover_templates(paths=None, recursive=True)[source]

Search paths for mount points and load templates from them.

paths should be a list of filesystem paths to search for mount points. If not specified will try to use value from environment variable LUCIDITY_TEMPLATE_PATH.

A mount point is a Python file that defines a ‘register’ function. The function should return a list of instantiated Template objects.

If recursive is True (the default) then all directories under a path will also be searched.

lucidity.parse(path, templates)[source]

Parse path against templates.

path should be a string to parse.

templates should be a list of Template instances in the order that they should be tried.

Return (data, template) from first successful parse.

Raise ParseError if path is not parseable by any of the supplied templates.

lucidity.format(data, templates)[source]

Format data using templates.

data should be a dictionary of data to format into a path.

templates should be a list of Template instances in the order that they should be tried.

Return (path, template) from first successful format.

Raise FormatError if data is not formattable by any of the supplied templates.

lucidity.get_template(name, templates)[source]

Retrieve a template from templates by name.

Raise NotFound if no matching template with name found in templates.

template

class lucidity.template.Template(name, pattern, anchor=1, default_placeholder_expression='[\w_.\-]+', duplicate_placeholder_mode=1, template_resolver=None)[source]

Bases: object

A template.

ANCHOR_START = 1
ANCHOR_END = 2
ANCHOR_BOTH = 3
RELAXED = 1
STRICT = 2
__init__(name, pattern, anchor=1, default_placeholder_expression='[\\w_.\\-]+', duplicate_placeholder_mode=1, template_resolver=None)[source]

Initialise with name and pattern.

anchor determines how the pattern is anchored during a parse. A value of ANCHOR_START (the default) will match the pattern against the start of a path. ANCHOR_END will match against the end of a path. To anchor at both the start and end (a full path match) use ANCHOR_BOTH. Finally, None will try to match the pattern once anywhere in the path.

duplicate_placeholder_mode determines how duplicate placeholders will be handled during parsing. RELAXED mode extracts the last matching value without checking the other values. STRICT mode ensures that all duplicate placeholders extract the same value and raises ParseError if they do not.

If template_resolver is supplied, use it to resolve any template references in the pattern during operations. It should conform to the Resolver interface. It can be changed at any time on the instance to affect future operations.

name

Return name of template.

pattern

Return template pattern.

expanded_pattern()[source]

Return pattern with all referenced templates expanded recursively.

Raise lucidity.error.ResolveError if pattern contains a reference that cannot be resolved by currently set template_resolver.

parse(path)[source]

Return dictionary of data extracted from path using this template.

Raise ParseError if path is not parsable by this template.

format(data)[source]

Return a path formatted by applying data to this template.

Raise FormatError if data does not supply enough information to fill the template fields.

keys()[source]

Return unique set of placeholders in pattern.

references()[source]

Return unique set of referenced templates in pattern.

class lucidity.template.Resolver[source]

Bases: object

Template resolver interface.

get(template_name, default=None)[source]

Return template that matches template_name.

If no template matches then return default.

error

Custom error classes.

exception lucidity.error.ParseError[source]

Bases: exceptions.Exception

Raise when a template is unable to parse a path.

exception lucidity.error.FormatError[source]

Bases: exceptions.Exception

Raise when a template is unable to format data into a path.

exception lucidity.error.NotFound[source]

Bases: exceptions.Exception

Raise when an item cannot be found.

exception lucidity.error.ResolveError[source]

Bases: exceptions.Exception

Raise when a template reference can not be resolved.

Release and migration notes

Find out what has changed between versions and see important migration notes to be aware of when switching to a new version.

Release Notes

1.5.1

20 October 2018
  • fixed

    Specify dependencies using case sensitive names to support systems that cannot resolve in a case insensitive manner (such as “Nexus Repository”).

1.5.0

6 June 2015

1.4.1

26 May 2015
  • changed

    Implemented custom formatter internally, removing dependency on Bunch and simplifying format logic.

1.4.0

25 May 2015
  • new

    Added duplicate_placeholder_mode to control template behaviour when parsing templates with duplicate placeholders, including a new STRICT mode.

1.3.1

1 April 2014

1.3.0

28 March 2014
  • changed

    Removed dependency on Regex module to simplify installation across different platforms.

1.2.0

9 March 2014
  • new

    Added Template.keys() for retrieving set of placeholders used in a template pattern.

1.1.0

8 March 2014
  • new

    Support partial matching of template when parsing with the introduction of a new anchor setting on templates.

    See also

    Anchoring

  • new

    Helper function lucidity.get_template() for retrieving a template by name from a list of templates.

  • fixed

    Special regex characters not escaped in pattern leading to incorrect parses.

  • fixed

    Template.format() fails with unhandled AttributeError when nested dictionary is missing a required key.

1.0.0

1 September 2013
  • Initial release with support for Template objects that can use a simple pattern to parse strings into data and format data into strings.

    See also

    Tutorial

Migration notes

This section will show more detailed information when relevant for switching to a new version, such as when upgrading involves backwards incompatibilities.

Glossary

LUCIDITY_TEMPLATE_PATH

Environment variable defining paths to search for template mount points. Can be multiple paths separated by the appropriate path separator for your operating system.

Indices and tables