Welcome to Flatson’s documentation!

Flatson

A tool to flatten JSON-like objects, configurable via an annotated JSON schema.

Features

  • Flattens Python dictionaries using a JSON schema
  • Supports per-field configuration via the schema

Usage:

>>> from flatson import Flatson
>>> schema = {
        "$schema": "http://json-schema.org/draft-04/schema",
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "age": {"type": "number"},
            "address": {
                "type": "object",
                "properties": {"city": {"type": "string"}, "street": {"type": "string"}}
            },
            "skills": {"type": "array", "items": {"type": "string"}}
        }
    }
>>> sample = {
            "name": "Claudio", "age": 42,
            "address": {"city": "Paris", "street": "Rue de Sevres"},
            "skills": ["hacking", "soccer"]}
>>> f = Flatson(schema)
>>> f.fieldnames
['address.city', 'address.street', 'age', 'name', 'skills']
>>> f.flatten(sample)
['Paris', 'Rue de Sevres', 42, 'Claudio', '["hacking","soccer"]']

You can get a dict with the field names order preserved:

>>> f.flatten_dict(sample)
OrderedDict([('address.city', 'Paris'), ('address.street', 'Rue de Sevres'), ('age', 42), ('name', 'Claudio'), ('skills', '["hacking","soccer"]')])
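
Because the list returned by flatten() lines up with fieldnames, the two fit naturally into a CSV header and row. A minimal sketch using only the standard library follows; the fieldnames and row are copied from the output above rather than produced by Flatson, and rows_to_csv is a hypothetical helper, not part of the library.

```python
import csv
import io

# Values copied from the example above. With Flatson installed they would
# come from f.fieldnames and f.flatten(sample).
fieldnames = ['address.city', 'address.street', 'age', 'name', 'skills']
row = ['Paris', 'Rue de Sevres', 42, 'Claudio', '["hacking","soccer"]']


def rows_to_csv(fieldnames, rows):
    """Render a header plus flattened rows as CSV text (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(fieldnames)
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = rows_to_csv(fieldnames, [row])
print(csv_text)
```

The JSON-serialized array contains commas and double quotes, so the csv module quotes that cell; reading the text back with csv.reader recovers the original string.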

You can also configure how arrays are serialized through the schema (the default is a JSON string):

>>> schema = {
        "$schema": "http://json-schema.org/draft-04/schema",
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "skills": {
                "type": "array",
                "items": {"type": "string"},
                "flatson_serialize": {"method": "join_values"},
            }
        }
    }
>>> f = Flatson(schema)
>>> f.flatten({"name": "Salazar", "skills": ["hacking", "soccer", "partying"]})
['Salazar', 'hacking,soccer,partying']
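
To make the difference between the two outputs concrete, here is a small standard-library sketch that reproduces both string forms for a list of skills. It only mirrors the outputs shown in this document; it is not Flatson's implementation, and the variable names are illustrative.

```python
import json

skills = ["hacking", "soccer"]

# Default form shown earlier: a compact JSON string.
as_json = json.dumps(skills, separators=(",", ":"))

# Form shown for "join_values": the values joined with commas.
as_joined = ",".join(skills)

print(as_json)
print(as_joined)

# The JSON form can always be parsed back into the original list.
assert json.loads(as_json) == skills

# The joined form is more readable but loses information when a value
# itself contains a comma.
tricky = ["C, C++", "Python"]
assert ",".join(tricky).split(",") != tricky
```

The joined form tends to be friendlier in spreadsheets, while the JSON form is the safer choice when the values need to be parsed back later.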

Next Steps

Read more on how to use Flatson, check out the GitHub repo, and feel free to send issues or PRs. =)

Installation

At the command line:

$ pip install flatson

Or, if you have virtualenvwrapper installed:

$ mkvirtualenv flatson
$ pip install flatson

Using Flatson

Using Flatson is simple: you just need some JSON data to flatten and its JSON schema. Flatson uses the schema to understand the structure of your object, which makes the flattening easier and more predictable.

Note

If you don’t have a JSON schema for the data you want to flatten, you can generate one with a tool such as Skinfer or http://jsonschema.net.

Walk-through with an example

Say you have the following JSON schema in a file named schemafile.json:

{
    "$schema": "http://json-schema.org/draft-04/schema",
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "number"},
        "address": {
            "type": "object",
            "properties": {"city": {"type": "string"}, "street": {"type": "string"}}
        },
        "skills": {"type": "array", "items": {"type": "string"}}
    }
}

You can instantiate the Flatson class for this schema file like this:

>>> from flatson import Flatson
>>> f = Flatson.fromschemafile('schemafile.json')
>>> f.fieldnames
['address.city', 'address.street', 'age', 'name', 'skills']

Note how Flatson has inferred the flattened field names, which you can access through the fieldnames property.
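
To give an idea of how field names can be derived from a schema, here is a minimal sketch that walks the "properties" of nested objects and builds dotted names. It is not Flatson's actual implementation: infer_fieldnames is a hypothetical function, and sorting the keys is an assumption based on the order seen in the output above.

```python
schema = {
    "$schema": "http://json-schema.org/draft-04/schema",
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "number"},
        "address": {
            "type": "object",
            "properties": {"city": {"type": "string"}, "street": {"type": "string"}},
        },
        "skills": {"type": "array", "items": {"type": "string"}},
    },
}


def infer_fieldnames(schema, prefix=""):
    """Collect dotted names for every non-object property (hypothetical helper)."""
    names = []
    # Assumption: keys are visited in sorted order at each nesting level.
    for key, subschema in sorted(schema.get("properties", {}).items()):
        path = prefix + key
        if subschema.get("type") == "object":
            # Nested objects contribute their own properties, prefixed with "key.".
            names.extend(infer_fieldnames(subschema, prefix=path + "."))
        else:
            # Strings, numbers and arrays each become a single flattened field.
            names.append(path)
    return names


print(infer_fieldnames(schema))
```

For the schema above, the sketch yields the same five names as the fieldnames property shown in the example.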

Let’s test it with some sample data:

>>> sample = {
            "name": "Claudio", "age": 42,
            "address": {"city": "Paris", "street": "Rue de Sevres"},
            "skills": ["hacking", "soccer"]}
>>> f.flatten(sample)
['Paris', 'Rue de Sevres', 42, 'Claudio', '["hacking","soccer"]']

There are a couple of things to note here:

  1. the flatten() method simply returns a list of simple objects
  2. the array is by default serialized as a JSON string

Note

Array serialization is a topic of its own. For now, it suffices to say that if you don’t like this default behavior, there are other options you can configure through the schema. You can even register your own serialization methods.

If you want a Python dict instead of a list, just use flatten_dict():

>>> f.flatten_dict(sample)
OrderedDict([('address.city', 'Paris'), ('address.street', 'Rue de Sevres'), ('age', 42), ('name', 'Claudio'), ('skills', '["hacking","soccer"]')])

Note that this returns an OrderedDict instead of a plain Python dict: this has the advantage of preserving the same field ordering as the list returned by the flatten() method.
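
The relationship between the two return values can be sketched with the standard library alone: pairing the field names with the flattened list gives a mapping in the same order. The values below are copied from the example output, and this illustrates the ordering guarantee rather than how Flatson builds the result internally.

```python
from collections import OrderedDict

# Copied from the example: f.fieldnames and f.flatten(sample).
fieldnames = ['address.city', 'address.street', 'age', 'name', 'skills']
row = ['Paris', 'Rue de Sevres', 42, 'Claudio', '["hacking","soccer"]']

# Pair each field name with the value at the same position.
flat = OrderedDict(zip(fieldnames, row))

# Keys and values come back in the same order as the two lists.
print(list(flat.keys()) == fieldnames)
print(list(flat.values()) == row)
print(flat['address.city'])
```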

Array serialization

TODO: write about array serialization here, point to design decisions, list available methods

Contributing

Contributions are welcome, and they are greatly appreciated! Every little bit helps, and credit will always be given.

You can contribute in many ways:

Types of Contributions

Report Bugs

Report bugs at https://github.com/scrapinghub/flatson/issues.

If you are reporting a bug, please include:

  • Your operating system name and version.
  • Any details about your local setup that might be helpful in troubleshooting.
  • Detailed steps to reproduce the bug.

Fix Bugs

Look through the GitHub issues for bugs. Anything tagged with “bug” is open to whoever wants to implement it.

Implement Features

Look through the GitHub issues for features. Anything tagged with “feature” is open to whoever wants to implement it.

Write Documentation

Flatson could always use more documentation, whether as part of the official Flatson docs, in docstrings, or even on the web in blog posts, articles, and such.

Submit Feedback

The best way to send feedback is to file an issue at https://github.com/scrapinghub/flatson/issues.

If you are proposing a feature:

  • Explain in detail how it would work.
  • Keep the scope as narrow as possible, to make it easier to implement.
  • Remember that this is a volunteer-driven project, and that contributions are welcome :)

Get Started!

Ready to contribute? Here’s how to set up flatson for local development.

  1. Fork the flatson repo on GitHub.

  2. Clone your fork locally:

    $ git clone git@github.com:your_name_here/flatson.git
    
  3. Install your local copy into a virtualenv. Assuming you have virtualenvwrapper installed, this is how you set up your fork for local development:

    $ mkvirtualenv flatson
    $ cd flatson/
    $ python setup.py develop
    
  4. Create a branch for local development:

    $ git checkout -b name-of-your-bugfix-or-feature
    

    Now you can make your changes locally.

  5. When you’re done making changes, check that your changes pass flake8 and the tests, including testing other Python versions with tox:

    $ flake8 flatson tests
    $ python setup.py test
    $ tox
    

    To get flake8 and tox, just pip install them into your virtualenv.

  6. Commit your changes and push your branch to GitHub:

    $ git add .
    $ git commit -m "Your detailed description of your changes."
    $ git push origin name-of-your-bugfix-or-feature
    
  7. Submit a pull request through the GitHub website.

Pull Request Guidelines

Before you submit a pull request, check that it meets these guidelines:

  1. The pull request should include tests.
  2. If the pull request adds functionality, the docs should be updated. Put your new functionality into a function with a docstring, and add the feature to the list in README.rst.
  3. The pull request should work for Python 2.6, 2.7, 3.3, and 3.4, and for PyPy. Check https://travis-ci.org/scrapinghub/flatson/pull_requests and make sure that the tests pass for all supported Python versions.

Tips

To run a subset of tests:

$ python -m unittest tests.test_flatson

Credits

Development Lead

Contributors

History

0.1.0 (2015-09-25)

  • First release on PyPI.
