Quick Start

Pymongoext is an ORM-like Pymongo extension that adds JSON Schema validation, index management and intermediate data manipulators. Pymongoext simplifies working with MongoDB while keeping a syntax nearly identical to Pymongo's.

pymongoext.Model is simply a wrapper around pymongo.Collection. As such, all of the pymongo.Collection API is exposed through pymongoext.Model. If you don’t find what you want in the pymongoext.Model API, please take a look at pymongo’s Collection documentation.

Features

  • schema validation (which uses MongoDB JSON Schema validation)
  • schema-less feature
  • nested and complex schema declaration
  • untyped field support
  • required fields validation
  • default values
  • custom validators
  • operators for validation (OneOf, AllOf, AnyOf, Not)
  • index management
  • data manipulators (transform documents before saving and after retrieval)
  • easy creation of custom data manipulators
  • object-like results instead of dict-like (i.e. foo.bar instead of foo['bar'])
  • no custom query language or API to learn (if you know how to use pymongo, you already know how to use pymongoext)

Examples

Some simple examples of what pymongoext code looks like:

from datetime import datetime
from pymongo import MongoClient, IndexModel
from pymongoext import *


class User(Model):
    @classmethod
    def db(cls):
        return MongoClient()['my_database_name']

    __schema__ = DictField(dict(
        email=StringField(required=True),
        name=StringField(required=True),
        yob=IntField(minimum=1900, maximum=2019)
    ))

    __indexes__ = [IndexModel('email', unique=True), 'name']

    class AgeManipulator(Manipulator):
        def transform_outgoing(self, doc, model):
            doc['age'] = datetime.now().year - doc['yob']
            return doc


# Create a user
>>> User.insert_one({'email': 'jane@gmail.com', 'name': 'Jane Doe', 'yob': 1990})

# Fetch one user
>>> user = User.find_one()

# Print the user's age
>>> print(user.age)

Contents

Guides

Getting Started

Before we start, make sure that a copy of MongoDB is running in an accessible location. If you haven’t installed pymongoext, simply use pip to install it like so:

$ pip install pymongoext

Every Model subclass needs to implement the pymongoext.model.Model.db() method, which returns a valid pymongo database instance. To achieve this, we advise creating a base model class that implements the db() method.

from pymongo import MongoClient
from pymongoext import Model

class BaseModel(Model):
    @classmethod
    def db(cls):
        return MongoClient()['my_database_name']

Now all concrete models can extend this class.

Defining our documents

MongoDB is schemaless, which means that no schema is enforced by the database — we may add and remove fields however we want and MongoDB won’t complain. This makes life a lot easier in many regards, especially when there is a change to the data model. However, defining schemas for our documents can help to iron out bugs involving incorrect types or missing fields, and also allow us to define utility methods on our documents in the same way that traditional ORMs do.

Users Model

Just as if we were using a relational database with an ORM, we need to define which fields a User may have (schema), and what types of data they might store.

from pymongoext import DictField, StringField, IntField

class User(BaseModel):
   __schema__ = DictField(dict(
      email=StringField(required=True),
      first_name=StringField(required=True),
      last_name=StringField(required=True),
      yob=IntField(minimum=1900, maximum=2019)
   ))
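
Because pymongoext relies on MongoDB's JSON Schema validation, it can help to see what such a validator looks like on the MongoDB side. The dictionary below is a hand-written sketch of a $jsonSchema document roughly equivalent to the User schema above. It is an assumption for illustration only; the exact document pymongoext generates may differ.

```python
# Hand-written MongoDB validator (an assumption, not actual pymongoext output).
user_validator = {
    "$jsonSchema": {
        "bsonType": "object",
        # Fields declared with required=True in the schema above.
        "required": ["email", "first_name", "last_name"],
        "properties": {
            "email": {"bsonType": "string"},
            "first_name": {"bsonType": "string"},
            "last_name": {"bsonType": "string"},
            # minimum and maximum are inclusive bounds in JSON Schema.
            "yob": {"bsonType": "int", "minimum": 1900, "maximum": 2019},
        },
    }
}


def required_fields(validator):
    """Return the sorted list of required property names of a validator."""
    return sorted(validator["$jsonSchema"]["required"])
```

With a validator of this shape in place, MongoDB itself rejects documents that miss a required field or carry a yob outside the allowed range.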

Indexes

MongoDB supports secondary indexes. With pymongoext, we define these indexes as a list on the __indexes__ attribute of our Model. Both single-field and compound indexes are supported.

from pymongo import IndexModel

class User(BaseModel):
   .....
   __indexes__ = [IndexModel('email', unique=True), 'first_name']

This example creates a unique index on email and an index on first_name.

Single Field descending index

Suppose we wanted the index on first_name to be sorted in descending order. That could be achieved as follows:

>>> ('first_name', pymongo.DESCENDING)

Alternatively, we could create a descending index by prefixing the field name with a - sign:

>>> '-first_name'

Compound indexes

A compound index is simply a list of single-field indexes. Therefore, to create a compound index on first_name and last_name sorted in opposite directions:

from pymongo import IndexModel

class User(BaseModel):
   .....
   __indexes__ = [
      IndexModel('email', unique=True),
      ['-first_name', '+last_name'] # compound index
   ]

Note the - and + signs, which specify that the indexes on first_name and last_name are sorted in descending and ascending order respectively.

Manipulators

Manipulators are useful for manipulating (adding, removing, modifying) document properties before they are persisted to MongoDB and after retrieval. A manipulator has two methods: transform_incoming and transform_outgoing.

Suppose you want to print out the person’s full name. You could do it yourself:

user = User.find_one()
print("{} {}".format(user['first_name'], user['last_name']))

But concatenating the first and last name every time can get cumbersome. And what if you want to do some extra processing on the name, like removing diacritics? A Manipulator lets you define a virtual property full_name that won’t get persisted to MongoDB.

from pymongoext.manipulators import Manipulator

class User(BaseModel):
   .....
   class FullNameManipulator(Manipulator):
      def transform_outgoing(self, doc, model):
         doc['full_name'] = "{} {}".format(doc['first_name'], doc['last_name'])
         return doc

      def transform_incoming(self, doc, model, action):
         if 'full_name' in doc:
            del doc['full_name']  # Don't persist full name
         return doc
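
The two hooks can be tried out without a database. The sketch below applies a full-name manipulator to plain dictionaries; the Manipulator base class here is a local stand-in (an assumption) so the snippet runs without pymongoext installed, and the model argument is simply passed as None.

```python
class Manipulator:
    """Local stand-in for pymongoext.manipulators.Manipulator (assumption)."""

    def transform_incoming(self, doc, model, action):
        return doc

    def transform_outgoing(self, doc, model):
        return doc


class FullNameManipulator(Manipulator):
    def transform_outgoing(self, doc, model):
        # Add the virtual property after retrieval.
        doc['full_name'] = "{} {}".format(doc['first_name'], doc['last_name'])
        return doc

    def transform_incoming(self, doc, model, action):
        # Drop the virtual property so it is never persisted.
        doc.pop('full_name', None)
        return doc


manipulator = FullNameManipulator()
stored = {'first_name': 'Jane', 'last_name': 'Doe'}

# Simulate a read followed by a write of the same document.
retrieved = manipulator.transform_outgoing(dict(stored), model=None)
saved = manipulator.transform_incoming(dict(retrieved), model=None, action='UPDATE')
```

After the round trip, retrieved carries full_name while saved is identical to the stored document again.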

Now, every document you retrieve will have a full_name property.

user = User.find_one()
print(user['full_name'])

Parametrized Manipulator

A manipulator is bound to a Model as either a manipulator class or a manipulator object. The latter is necessary when using manipulators that are initialized with parameters.

Suppose you had a manipulator that adds a dynamic value to a dynamic field:

class AddManipulator(Manipulator):
   def __init__(self, field, value):
      self.field = field
      self.value = value

   def transform_outgoing(self, doc, model):
      doc[self.field] = doc[self.field] + self.value
      return doc

Then we would bind this manipulator to the model as an object:

class User(BaseModel):
   .....
   addOneToFamilySize = AddManipulator('family_size', 1)

class Book(BaseModel):
   ......
   subtractFirstAndLastPages = AddManipulator('page_count', -2)
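
As a rough illustration of why the object form matters, the sketch below creates two instances of the same manipulator class with different parameters and applies them to plain dictionaries. It runs without pymongoext: AddManipulator is written as a plain class here (an assumption made so the snippet is self-contained), and model is passed as None.

```python
class AddManipulator:
    """Adds a fixed value to one field of every outgoing document."""

    def __init__(self, field, value):
        self.field = field
        self.value = value

    def transform_outgoing(self, doc, model):
        doc[self.field] = doc[self.field] + self.value
        return doc


# Each instance keeps its own field name and value.
add_one = AddManipulator('family_size', 1)
minus_two = AddManipulator('page_count', -2)

user_doc = add_one.transform_outgoing({'family_size': 3}, model=None)
book_doc = minus_two.transform_outgoing({'page_count': 200}, model=None)
```

Note that this version raises KeyError when the configured field is missing from a document, so a production manipulator would probably want to guard against that.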

Install

Install using Pip

pymongoext is currently available for Python >= 3. Plans to add support for Python 2.7 are underway. The recommended way to install pymongoext is using pip:

python -m pip install pymongoext

Install from Source

To install pymongoext from source, clone the repository from GitHub:

git clone https://github.com/musyoka-morris/pymongoext.git
cd pymongoext
python setup.py install

or use pip locally if you want to install all dependencies as well:

pip install .

API Reference

Model

class pymongoext.model.Model

Bases: object

The base class used for defining the structure and properties of collections of documents stored in MongoDB. You should not use the Model class directly. Instead, inherit from this class to define a document’s structure.

In pymongoext, the term “Model” refers to subclasses of the Model class. A Model is your primary tool for interacting with MongoDB collections.

Note

All concrete classes must implement the db() method, which should return a valid MongoDB database instance.

Examples

Create a Users model

from pymongo import MongoClient, IndexModel
from pymongoext import Model, DictField, StringField, IntField

class User(Model):

    @classmethod
    def db(cls):
        return MongoClient()['my_database_name']

    __schema__ = DictField(dict(
        email=StringField(required=True),
        name=StringField(required=True),
        password=StringField(required=True),
        age=IntField(minimum=0)
    ))

    __indexes__ = [IndexModel('email', unique=True), 'name']

All pymongo.collection.Collection methods and attributes can be accessed through the Model, as shown below.

Create a new user

result = User.insert_one({
    "email": "john.doe@dummy.com",
    "name": "John Doe",
    "password": "secret",
    "age": 35
})
user_id = result.inserted_id

Find a single document by id. See get() for an alternative to find_one()

user = User.find_one(user_id)

Check if a document exists

user_exists = User.exists({"_id": user_id})

__collection_name__ = None

Pymongoext by default produces a collection name by taking the under_score_case of the model. See inflection.underscore for more info.

Set this if you need a different name for your collection.
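
To get a feel for the default naming, here is a small regex-based sketch of an underscore conversion. It is an approximation written for illustration; pymongoext delegates to inflection.underscore, whose exact rules may differ in edge cases.

```python
import re


def underscore(word):
    """Convert a CamelCase class name to an under_score_case name."""
    # Split an acronym from a following capitalized word: HTTPResponse -> HTTP_Response
    word = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1_\2", word)
    # Split a lowercase letter or digit from a following capital: BlogPost -> Blog_Post
    word = re.sub(r"([a-z\d])([A-Z])", r"\1_\2", word)
    return word.replace("-", "_").lower()


names = [underscore(name) for name in ("User", "BlogPost", "HTTPResponse")]
```

Under these rules a model class named BlogPost would map to a collection named blog_post.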

__auto_update__ = True

By default, pymongoext ensures the indexes and schema defined on the model are in sync with MongoDB. It achieves this by creating and dropping indexes and pushing updates to the JSON Schema defined on the MongoDB collection. This is done when your code runs for the first time or the server is restarted.

If you want to disable this functionality and manage the updates yourself, you can set __auto_update__ to False. In that case, remember to call _update() yourself to update the schema.

__indexes__ = []

List of Indexes to create on this collection

A valid index can be either:

  1. A string optionally prefixed with a + or - sign.
  2. A (string, int) tuple.
  3. A list whose values are either as defined in 1 or 2 above (compound indexes).
  4. An instance of pymongo.IndexModel.

See the create_index and create_indexes methods of pymongo Collection for more info.
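
The first three forms can all be reduced to a list of (field, direction) pairs. The helper below is a hypothetical sketch of such a normalization, not pymongoext's actual code. It assumes that an unprefixed string means ascending order and uses the integer values 1 and -1, which are the values of pymongo.ASCENDING and pymongo.DESCENDING.

```python
ASCENDING, DESCENDING = 1, -1  # same values as pymongo.ASCENDING / pymongo.DESCENDING


def normalize_key(key):
    """Turn one single-field spec (form 1 or 2) into a (field, direction) tuple."""
    if isinstance(key, tuple):
        return key
    if key.startswith('-'):
        return (key[1:], DESCENDING)
    if key.startswith('+'):
        return (key[1:], ASCENDING)
    return (key, ASCENDING)  # assumption: no prefix means ascending


def normalize_index(spec):
    """Return a list of (field, direction) tuples for forms 1 to 3."""
    if isinstance(spec, list):
        return [normalize_key(key) for key in spec]
    return [normalize_key(spec)]


compound = normalize_index(['-first_name', '+last_name'])
```

Here compound evaluates to [('first_name', -1), ('last_name', 1)], which is the key list format that pymongo's index methods accept.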

__schema__ = None

Specifies model schema

Type: pymongoext.fields.DictField

classmethod exists(filter=None, *args, **kwargs)

Check if a document exists in the database

All arguments to find() are also valid arguments for exists(), although any limit argument will be ignored. Returns True if a matching document is found, otherwise False.

Parameters:
  • filter (optional) – a dictionary specifying the query to be performed OR any other type to be used as the value for a query for "_id".
  • *args (optional) – any additional positional arguments are the same as the arguments to find().
  • **kwargs (optional) – any additional keyword arguments are the same as the arguments to find().

classmethod get(filter=None, *args, **kwargs)

Retrieve the matching document. Raises pymongoext.exceptions.MultipleDocumentsFound if multiple documents match and pymongoext.exceptions.NoDocumentFound if none do.

All arguments to find() are also valid arguments for get(), although any limit argument will be ignored. Returns the matching document
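
The "exactly one result" contract can be sketched independently of MongoDB. In the snippet below the two exception classes are local stand-ins for the ones in pymongoext.exceptions, and get_exactly_one is a hypothetical helper operating on any iterable of documents; it is not the library's implementation.

```python
from itertools import islice


class NoDocumentFound(Exception):
    """Local stand-in for pymongoext.exceptions.NoDocumentFound."""


class MultipleDocumentsFound(Exception):
    """Local stand-in for pymongoext.exceptions.MultipleDocumentsFound."""


def get_exactly_one(matches):
    """Return the single matching document or raise if there are 0 or 2+."""
    # Two documents are enough to tell "one" apart from "many".
    first_two = list(islice(matches, 2))
    if not first_two:
        raise NoDocumentFound()
    if len(first_two) > 1:
        raise MultipleDocumentsFound()
    return first_two[0]


user = get_exactly_one([{'_id': 1, 'name': 'Jane Doe'}])
```

Compared with find_one(), which silently returns the first match or None, this contract turns both the empty and the ambiguous case into explicit errors.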

Parameters:
  • filter (optional) – a dictionary specifying the query to be performed OR any other type to be used as the value for a query for "_id".
  • *args (optional) – any additional positional arguments are the same as the arguments to find().
  • **kwargs (optional) – any additional keyword arguments are the same as the arguments to find().

classmethod db()

Get the mongo database instance associated with this collection

All concrete classes must implement this method. A sample implementation is shown below

from pymongo import MongoClient
from pymongoext import Model

class User(Model):
    @classmethod
    def db(cls):
        return MongoClient()['my_database_name']

Returns: pymongo.database.Database

classmethod name()

Returns the collection name.

See __collection_name__ for more info on how the collection name is determined

classmethod c()

Get the collection associated with this model. This method ensures that the model indexes and schema validators are up to date.

Returns: pymongo.collection.Collection

classmethod apply_incoming_manipulators(doc, action)

Apply manipulators to an incoming document before it gets stored.

Parameters:
  • doc (dict) – the document to be inserted into the database
  • action (str) – the incoming action being performed

Returns: the transformed document
Return type: dict

classmethod apply_outgoing_manipulators(doc)

Apply manipulators to an outgoing document.

Parameters:
  • doc (dict) – the document being retrieved from the database

Returns: the transformed document
Return type: dict

classmethod parse(data, with_defaults=False)

Prepare the data to be stored in the db

For example, given a simple user model

class User(Model):

    @classmethod
    def db(cls):
        return MongoClient()['my_database_name']

    __schema__ = DictField(dict(
        name=StringField(required=True),
        age=IntField(minimum=0, required=True, default=18)
    ))

>>> User.parse({'name': 'John Doe'}, with_defaults=True)
{'name': 'John Doe', 'age': 18}

Parameters:
  • data (dict) – Data to be stored
  • with_defaults (bool) – If True, None and missing values are set to the field default
Returns:

dict
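
The with_defaults behaviour described above (None and missing values are replaced by the field default) can be sketched with plain dictionaries. fill_defaults is a hypothetical helper for illustration; it takes the defaults as a simple mapping rather than reading them from field objects as pymongoext does.

```python
def fill_defaults(data, defaults):
    """Return a copy of data where None or missing values take their default."""
    result = dict(data)
    for name, default in defaults.items():
        # dict.get returns None for both a missing key and an explicit None value.
        if result.get(name) is None:
            result[name] = default
    return result


defaults = {'age': 18}
missing = fill_defaults({'name': 'John Doe'}, defaults)
explicit_none = fill_defaults({'name': 'John Doe', 'age': None}, defaults)
provided = fill_defaults({'name': 'John Doe', 'age': 30}, defaults)
```

Only the first two calls end up with age set to 18; a value that is already present is left untouched.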

classmethod manipulators()

Return a list of manipulators to be applied to incoming and outgoing documents. Manipulators are applied sequentially in an order determined by priority value of the manipulator. Manipulators with a lower priority will be applied first.
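
The ordering rule can be illustrated with a short sketch. The classes below are stand-ins written for this example (they are not imported from pymongoext), and apply_incoming is a hypothetical helper that sorts by priority before applying each manipulator in turn. Whether outgoing documents are processed in the same order is not stated here, so the sketch covers incoming documents only.

```python
class Manipulator:
    """Stand-in base class; only the priority attribute matters here."""
    priority = 5

    def transform_incoming(self, doc, model, action):
        return doc


class Trace(Manipulator):
    """Hypothetical manipulator that records the order in which it ran."""

    def __init__(self, label, priority):
        self.label = label
        self.priority = priority

    def transform_incoming(self, doc, model, action):
        doc.setdefault('trace', []).append(self.label)
        return doc


def apply_incoming(doc, manipulators, action='CREATE'):
    """Apply manipulators sequentially, lowest priority value first."""
    for manipulator in sorted(manipulators, key=lambda m: m.priority):
        doc = manipulator.transform_incoming(doc, None, action)
    return doc


result = apply_incoming({}, [Trace('parse', 7), Trace('custom', 5), Trace('id', 0)])
```

Although the list is passed in the order parse, custom, id, the recorded trace is id, custom, parse.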

A manipulator operates on a single document before it is saved to mongodb and after it is retrieved. See pymongoext.manipulators.Manipulator on how to implement your own manipulators.

By default every pymongoext model has two manipulators:

  1. IdWithoutUnderscoreManipulator with priority=0
  2. ParseInputsManipulator with priority=7

Returns: list of pymongoext.manipulators.Manipulator

class IdWithoutUnderscoreManipulator

Bases: pymongoext.manipulators.Manipulator

A document manipulator that manages a virtual id field.

priority = 0

transform_incoming(doc, model, action)

Remove id field if given and set _id to that value if missing

transform_outgoing(doc, model)

Add an id field if it is missing.
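
One plausible reading of these two descriptions is sketched below with plain functions. This is an interpretation for illustration, not the library's code: it assumes the outgoing id is copied from _id and that an explicitly provided _id takes precedence over id on the way in.

```python
def id_incoming(doc):
    """Remove the virtual id and use it as _id when _id is missing."""
    doc = dict(doc)
    if 'id' in doc:
        value = doc.pop('id')
        doc.setdefault('_id', value)  # an existing _id wins
    return doc


def id_outgoing(doc):
    """Expose _id under the virtual id key when id is missing."""
    doc = dict(doc)
    if '_id' in doc:
        doc.setdefault('id', doc['_id'])
    return doc


to_store = id_incoming({'id': 7, 'name': 'Jane Doe'})
to_return = id_outgoing({'_id': 7, 'name': 'Jane Doe'})
```

With this mapping, application code can work with id throughout while only _id is ever written to the collection.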

class ParseInputsManipulator

Bases: pymongoext.manipulators.Manipulator

Parses incoming documents to ensure data is in the valid format

priority = 7

transform_incoming(doc, model, action)

Manipulate an incoming document.

Parameters:
  • doc (dict) – the SON object to be inserted into the database
  • model (Type[pymongoext.model.Model]) – the model the object is associated with
  • action (str) – One of CREATE|REPLACE|UPDATE. Signifies the action being performed

class MunchManipulator

Bases: pymongoext.manipulators.Manipulator

Transforms documents to Munch objects. A Munch is a Python dictionary that provides attribute-style access

See https://github.com/Infinidat/munch

priority = -1

transform_incoming(doc, model, action)

Manipulate an incoming document.

Parameters:
  • doc (dict) – the SON object to be inserted into the database
  • model (Type[pymongoext.model.Model]) – the model the object is associated with
  • action (str) – One of CREATE|REPLACE|UPDATE. Signifies the action being performed

transform_outgoing(doc, model)

Manipulate an outgoing document.

Parameters:
  • doc (dict) – the SON object being retrieved from the database
  • model (Type[pymongoext.model.Model]) – the model associated with this document

Fields

class pymongoext.fields.Field(default=None, required=False, enum=None, title=None, description=None, **kwargs)

Base class for all fields

Parameters:
  • required (bool) – Specifies if a value is required for this field. defaults to False
  • enum (list) – Enumerates all possible values of the field
  • title (str) – A descriptive title string with no effect
  • description (str) – A string that describes the schema and has no effect
  • **kwargs – Additional parameters
schema()

Creates a valid JsonSchema object

Returns: dict

class pymongoext.fields.NullField

Null field

class pymongoext.fields.StringField(max_length=None, min_length=None, pattern=None, **kwargs)

String field

Parameters:
  • max_length (int) – The maximum length of the field
  • min_length (int) – The minimum length of the field
  • pattern (str) – Field must match the regular expression
  • **kwargs – any additional keyword arguments are the same as the arguments to the Field class

class pymongoext.fields.NumberField(maximum=None, minimum=None, exclusive_maximum=None, exclusive_minimum=None, multiple_of=None, **kwargs)

Numeric field

Parameters:
  • maximum (int|long) – The inclusive maximum value of the field
  • minimum (int|long) – The inclusive minimum value of the field
  • exclusive_maximum (int|long) – The exclusive maximum value of the field. values are valid if they are strictly less than (not equal to) the given value.
  • exclusive_minimum (int|long) – The exclusive minimum value of the field. values are valid if they are strictly greater than (not equal to) the given value.
  • multiple_of (int|long) – Field must be a multiple of this value
  • **kwargs – any additional keyword arguments are the same as the arguments to the Field class

class pymongoext.fields.IntField(maximum=None, minimum=None, exclusive_maximum=None, exclusive_minimum=None, multiple_of=None, **kwargs)

Integer field

class pymongoext.fields.FloatField(maximum=None, minimum=None, exclusive_maximum=None, exclusive_minimum=None, multiple_of=None, **kwargs)

Float field

class pymongoext.fields.BooleanField(default=None, required=False, enum=None, title=None, description=None, **kwargs)

Boolean field

class pymongoext.fields.DateTimeField(default=None, required=False, enum=None, title=None, description=None, **kwargs)

Datetime field

class pymongoext.fields.TimeStampField(default=None, required=False, enum=None, title=None, description=None, **kwargs)

Timestamp field

class pymongoext.fields.ObjectIDField(default=None, required=False, enum=None, title=None, description=None, **kwargs)

ObjectID field

class pymongoext.fields.ListField(field=None, max_items=None, min_items=None, unique_items=None, **kwargs)

List field

Parameters:
  • field (Field) – A field to validate each item against
  • max_items (int) – The maximum length of array
  • min_items (int) – The minimum length of array
  • unique_items (bool) – If true, each item in the array must be unique. Otherwise, no uniqueness constraint is enforced.
  • **kwargs – any additional keyword arguments are the same as the arguments to the Field class

class pymongoext.fields.DictField(props=None, max_props=None, min_props=None, additional_props=True, required_props=None, **kwargs)

Dict Field

Parameters:
  • props (dict of str: Field) – A map of known properties
  • max_props (int) – The maximum number of properties allowed
  • min_props (int) – The minimum number of properties allowed
  • additional_props (Field | bool) – If True, additional fields are allowed. If False, only properties specified in props are allowed. If an instance of Field is specified, additional fields must validate against that field.
  • required_props (list of str) – Property names that must be included
  • **kwargs – any additional keyword arguments are the same as the arguments to the Field class

schema()

Creates a valid JsonSchema object

Returns: dict

class pymongoext.fields.MapField(field, **kwargs)

class pymongoext.fields.OneOf(*fields, **kwargs)

value must match exactly one of the specified fields

Example

>>> OneOf(StringField(), IntField(minimum=10), required=True)

class pymongoext.fields.AllOf(*fields, **kwargs)

value must match all specified fields

class pymongoext.fields.AnyOf(*fields, **kwargs)

value must match at least one of the specified fields

class pymongoext.fields.Not(field, **kwargs)

Allow anything that does not match the given field

Parameters:
  • field (Field) – value must not match this field
  • **kwargs – any additional keyword arguments are the same as the arguments to the Field class
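
The difference between the four operators is easiest to see with plain predicates. The functions below are a conceptual sketch only: they model each field as a Python function returning True or False, whereas pymongoext expresses these operators as JSON Schema that MongoDB evaluates. All names in the snippet are hypothetical.

```python
def is_str(value):
    return isinstance(value, str)


def int_at_least(minimum):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return lambda v: isinstance(v, int) and not isinstance(v, bool) and v >= minimum


def one_of(*checks):
    """Valid when exactly one check passes."""
    return lambda v: sum(1 for check in checks if check(v)) == 1


def any_of(*checks):
    """Valid when at least one check passes."""
    return lambda v: any(check(v) for check in checks)


def all_of(*checks):
    """Valid when every check passes."""
    return lambda v: all(check(v) for check in checks)


def not_(check):
    """Valid when the check fails."""
    return lambda v: not check(v)


str_or_big_int = one_of(is_str, int_at_least(10))
overlapping = (int_at_least(0), int_at_least(10))
```

The value 12 satisfies both overlapping checks, so it passes any_of and all_of but fails one_of, which requires exactly one match.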

Manipulators

class pymongoext.manipulators.IncomingAction

Bases: object

Enum for Incoming action types

CREATE = 'CREATE'
REPLACE = 'REPLACE'
UPDATE = 'UPDATE'

class pymongoext.manipulators.Manipulator

Bases: object

A base document manipulator.

This manipulator just saves and restores documents without changing them.

priority = 5

Determines the order in which the manipulators will be applied. Manipulators with a lower priority will be applied first

transform_incoming(doc, model, action)

Manipulate an incoming document.

Parameters:
  • doc (dict) – the SON object to be inserted into the database
  • model (Type[pymongoext.model.Model]) – the model the object is associated with
  • action (str) – One of CREATE|REPLACE|UPDATE. Signifies the action being performed

transform_outgoing(doc, model)

Manipulate an outgoing document.

Parameters:
  • doc (dict) – the SON object being retrieved from the database
  • model (Type[pymongoext.model.Model]) – the model associated with this document
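
The action argument lets a manipulator behave differently depending on the kind of write. The sketch below shows a hypothetical CreatedAtManipulator that stamps documents only on CREATE. Both IncomingAction and Manipulator are re-declared locally as stand-ins so the snippet runs without pymongoext. Other actions are deliberately left untouched, since the shape of doc for UPDATE is not described here.

```python
from datetime import datetime, timezone


class IncomingAction:
    """Local stand-in mirroring pymongoext.manipulators.IncomingAction."""
    CREATE = 'CREATE'
    REPLACE = 'REPLACE'
    UPDATE = 'UPDATE'


class Manipulator:
    """Local stand-in base class that leaves documents unchanged."""
    priority = 5

    def transform_incoming(self, doc, model, action):
        return doc

    def transform_outgoing(self, doc, model):
        return doc


class CreatedAtManipulator(Manipulator):
    """Hypothetical manipulator: add created_at when a document is created."""

    def transform_incoming(self, doc, model, action):
        if action == IncomingAction.CREATE:
            doc.setdefault('created_at', datetime.now(timezone.utc))
        return doc


manipulator = CreatedAtManipulator()
created = manipulator.transform_incoming({'name': 'Jane'}, None, IncomingAction.CREATE)
updated = manipulator.transform_incoming({'name': 'Jane'}, None, IncomingAction.UPDATE)
```

Only created receives a created_at value; updated passes through unchanged.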
class pymongoext.manipulators.MunchManipulator

Bases: pymongoext.manipulators.Manipulator

Transforms documents to Munch objects. A Munch is a Python dictionary that provides attribute-style access

See https://github.com/Infinidat/munch
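
For readers unfamiliar with the idea, attribute-style access can be approximated with a few lines of standard-library Python. AttrDict below is a minimal stand-in written for illustration. It is not Munch's implementation, and Munch offers more than this sketch does.

```python
class AttrDict(dict):
    """A dict whose keys can also be read and written as attributes."""

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self[name] = value


user = AttrDict(name='Jane Doe', yob=1990)
user.email = 'jane@gmail.com'
```

The same object supports user.name and user['name'], which is what makes results feel object-like while remaining ordinary dictionaries.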

priority = -1

transform_incoming(doc, model, action)

Manipulate an incoming document.

Parameters:
  • doc (dict) – the SON object to be inserted into the database
  • model (Type[pymongoext.model.Model]) – the model the object is associated with
  • action (str) – One of CREATE|REPLACE|UPDATE. Signifies the action being performed

transform_outgoing(doc, model)

Manipulate an outgoing document.

Parameters:
  • doc (dict) – the SON object being retrieved from the database
  • model (Type[pymongoext.model.Model]) – the model associated with this document

class pymongoext.manipulators.IdWithoutUnderscoreManipulator

Bases: pymongoext.manipulators.Manipulator

A document manipulator that manages a virtual id field.

priority = 0

transform_incoming(doc, model, action)

Remove id field if given and set _id to that value if missing

transform_outgoing(doc, model)

Add an id field if it is missing.

class pymongoext.manipulators.ParseInputsManipulator

Bases: pymongoext.manipulators.Manipulator

Parses incoming documents to ensure data is in the valid format

priority = 7

transform_incoming(doc, model, action)

Manipulate an incoming document.

Parameters:
  • doc (dict) – the SON object to be inserted into the database
  • model (Type[pymongoext.model.Model]) – the model the object is associated with
  • action (str) – One of CREATE|REPLACE|UPDATE. Signifies the action being performed

Exceptions

exception pymongoext.exceptions.MultipleDocumentsFound

Raised by pymongoext.model.Model.get() when multiple documents matching the search criteria are found

exception pymongoext.exceptions.NoDocumentFound

Raised by pymongoext.model.Model.get() when no documents matching the search criteria are found