Quick Start¶
Pymongoext is an ORM-like Pymongo extension that adds JSON Schema validation, index management and intermediate data manipulators. Pymongoext simplifies working with MongoDB while keeping a syntax nearly identical to Pymongo’s.
pymongoext.Model is simply a wrapper around pymongo.Collection. As such, all of the pymongo.Collection API is exposed through pymongoext.Model. If you don’t find what you want in the pymongoext.Model API, please take a look at pymongo’s Collection documentation.
Features¶
- schema validation (which uses MongoDB JSON Schema validation)
- schema-less feature
- nested and complex schema declaration
- untyped field support
- required fields validation
- default values
- custom validators
- validation operators (OneOf, AllOf, AnyOf, Not)
- index management
- data manipulators (transform documents before saving and after retrieval)
- Easy to create custom data manipulators
- object-like results instead of dict-like (i.e. foo.bar instead of foo['bar'])
- No custom-query language or API to learn (If you know how to use pymongo, you already know how to use pymongoext)
Examples¶
Some simple examples of what pymongoext code looks like:
from datetime import datetime
from pymongo import MongoClient, IndexModel
from pymongoext import *

class User(Model):
    @classmethod
    def db(cls):
        return MongoClient()['my_database_name']

    __schema__ = DictField(dict(
        email=StringField(required=True),
        name=StringField(required=True),
        yob=IntField(minimum=1900, maximum=2019)
    ))

    __indexes__ = [IndexModel('email', unique=True), 'name']

    class AgeManipulator(Manipulator):
        # Adds a computed age to every document read from the database
        def transform_outgoing(self, doc, model):
            doc['age'] = datetime.now().year - doc['yob']
            return doc

# Create a user
>>> User.insert_one({'email': 'jane@gmail.com', 'name': 'Jane Doe', 'yob': 1990})
# Fetch one user
>>> user = User.find_one()
# Print the user's age
>>> print(user.age)
Contents¶
Guides¶
Getting Started¶
Before we start, make sure that a copy of MongoDB is running in an accessible location. If you haven’t installed pymongoext, simply use pip to install it like so:
$ pip install pymongoext
Every Model subclass needs to implement the pymongoext.model.Model.db() method, which returns a valid pymongo database instance. To achieve this, we advise creating a base model class that implements the db() method.
from pymongo import MongoClient
from pymongoext import Model

class BaseModel(Model):
    @classmethod
    def db(cls):
        return MongoClient()['my_database_name']
All concrete models can now extend this class.
Defining our documents¶
MongoDB is schemaless, which means that no schema is enforced by the database — we may add and remove fields however we want and MongoDB won’t complain. This makes life a lot easier in many regards, especially when there is a change to the data model. However, defining schemas for our documents can help to iron out bugs involving incorrect types or missing fields, and also allow us to define utility methods on our documents in the same way that traditional ORMs do.
Users Model¶
Just as if we were using a relational database with an ORM, we need to define which fields a User may have (schema), and what types of data they might store.
from pymongoext import DictField, StringField, IntField

class User(BaseModel):
    __schema__ = DictField(dict(
        email=StringField(required=True),
        first_name=StringField(required=True),
        last_name=StringField(required=True),
        yob=IntField(minimum=1900, maximum=2019)
    ))
Indexes¶
MongoDB supports secondary indexes. With pymongoext, we define these indexes as a list in the __indexes__ variable of our Model. Both single-field and compound indexes are supported.
from pymongo import IndexModel

class User(BaseModel):
    .....
    __indexes__ = [IndexModel('email', unique=True), 'first_name']

This example creates a unique index on email and an index on first_name.
Single Field descending index¶
Suppose we wanted the index on first_name to be sorted in descending order. That could be achieved as follows:
>>> ('first_name', pymongo.DESCENDING)
Alternatively, we could create a descending index by prefixing the field name with a - sign:
>>> '-first_name'
Compound indexes¶
A compound index is simply a list of single field indexes. Therefore, to create a compound index on first_name and last_name sorted in opposite directions:
from pymongo import IndexModel

class User(BaseModel):
    .....
    __indexes__ = [
        IndexModel('email', unique=True),
        ['-first_name', '+last_name']  # compound index
    ]

Note the - and + signs, which specify that the index on first_name and last_name should be sorted in descending and ascending order respectively.
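To make the shorthand concrete, here is a minimal sketch of how such index declarations could be normalized into the (field, direction) pairs that pymongo works with. It is an illustration only, not pymongoext's actual implementation; the helper names normalize_key and normalize_index are hypothetical, and the two constants simply mirror the values of pymongo.ASCENDING and pymongo.DESCENDING so the snippet runs without pymongo installed.

```python
# Same values as pymongo.ASCENDING and pymongo.DESCENDING
ASCENDING, DESCENDING = 1, -1


def normalize_key(key):
    """Turn 'name', '+name', '-name' or ('name', direction) into a (field, direction) pair."""
    if isinstance(key, tuple):
        field, direction = key
        return field, direction
    if key.startswith('-'):
        return key[1:], DESCENDING
    if key.startswith('+'):
        return key[1:], ASCENDING
    return key, ASCENDING  # no prefix: assume ascending


def normalize_index(index):
    """Return a list of (field, direction) pairs for a single-field or compound index."""
    keys = index if isinstance(index, list) else [index]
    return [normalize_key(key) for key in keys]


single = normalize_index('-first_name')
compound = normalize_index(['-first_name', '+last_name'])
print(single)    # [('first_name', -1)]
print(compound)  # [('first_name', -1), ('last_name', 1)]
```

A list of pairs in this shape is what pymongo's create_index accepts as its keys argument.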
Manipulators¶
Manipulators are useful for manipulating (adding, removing, modifying) document properties before they are persisted to MongoDB and after retrieval. A manipulator has two methods: transform_incoming and transform_outgoing.
Suppose you want to print out the person’s full name. You could do it yourself:
user = User.find_one()
print("{} {}".format(user['first_name'], user['last_name']))
But concatenating the first and last name every time can get cumbersome.
And what if you want to do some extra processing on the name, like removing diacritics?
A Manipulator lets you define a virtual property full_name that won’t get persisted to MongoDB.
from pymongoext.manipulators import Manipulator

class User(BaseModel):
    .....
    class FullNameManipulator(Manipulator):
        def transform_outgoing(self, doc, model):
            # Build the virtual property from the document being returned
            doc['full_name'] = "{} {}".format(doc['first_name'], doc['last_name'])
            return doc

        def transform_incoming(self, doc, model, action):
            if 'full_name' in doc:
                del doc['full_name']  # Don't persist full name
            return doc
Now, every document you retrieve will have a full_name property.
user = User.find_one()
print(user['full_name'])
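The round trip can be tried without a database. The sketch below uses a hypothetical stand-in class, FullNameSketch, that has the same two methods as a Manipulator but does not inherit from pymongoext, and it calls the methods by hand on plain dicts. How and when pymongoext invokes these methods internally is not shown here.

```python
class FullNameSketch:
    """Hypothetical stand-in exposing the same two methods as a Manipulator."""

    def transform_incoming(self, doc, model, action):
        # Drop the virtual property so it is never written to the database
        doc = dict(doc)
        doc.pop('full_name', None)
        return doc

    def transform_outgoing(self, doc, model):
        # Recompute the virtual property from the stored fields
        doc = dict(doc)
        doc['full_name'] = "{} {}".format(doc['first_name'], doc['last_name'])
        return doc


manipulator = FullNameSketch()

# What would be stored: the stale full_name is stripped
stored = manipulator.transform_incoming(
    {'first_name': 'Jane', 'last_name': 'Doe', 'full_name': 'stale value'},
    model=None, action='UPDATE')

# What a caller would get back: full_name is rebuilt
fetched = manipulator.transform_outgoing(stored, model=None)
print(stored)
print(fetched['full_name'])  # Jane Doe
```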
Parametrized Manipulator¶
A manipulator is bound to a Model as either a manipulator class or a manipulator object. The latter is necessary when using manipulators that are initialized with parameters.
Suppose you had a manipulator that adds a dynamic value to a dynamic field:
class AddManipulator(Manipulator):
    def __init__(self, field, value):
        self.field = field
        self.value = value

    def transform_outgoing(self, doc, model):
        doc[self.field] = doc[self.field] + self.value
        return doc
Then we would bind this manipulator to the model as an object:
class User(BaseModel):
    .....
    addOneToFamilySize = AddManipulator('family_size', 1)

class Book(BaseModel):
    ......
    subtractFirstAndLastPages = AddManipulator('page_count', -2)
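Because each instance carries its own field and value, the same class can serve several models. The following sketch exercises that idea on plain dicts with a hypothetical stand-in class, AddSketch, that does not depend on pymongoext. It also adds a guard for documents that lack the field, which is an assumption of this sketch and not something the manipulator above does.

```python
class AddSketch:
    """Hypothetical stand-in for a parametrized manipulator."""

    def __init__(self, field, value):
        self.field = field
        self.value = value

    def transform_outgoing(self, doc, model):
        # Only touch documents that actually contain the field
        if self.field in doc:
            doc[self.field] = doc[self.field] + self.value
        return doc


add_one = AddSketch('family_size', 1)
subtract_two = AddSketch('page_count', -2)

user_doc = add_one.transform_outgoing({'family_size': 3}, model=None)
book_doc = subtract_two.transform_outgoing({'page_count': 200}, model=None)
untouched = add_one.transform_outgoing({'name': 'Jane'}, model=None)

print(user_doc)   # {'family_size': 4}
print(book_doc)   # {'page_count': 198}
print(untouched)  # {'name': 'Jane'}
```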
Install¶
Install using Pip¶
pymongoext is currently available for Python >= 3. Plans to add support for Python 2.7 are underway.
The recommended way to install pymongoext is using pip:
python -m pip install pymongoext
API Reference¶
Model¶
class pymongoext.model.Model¶
Bases: object
The base class used for defining the structure and properties of collections of documents stored in MongoDB. You should not use the Model class directly. Instead, inherit from this class to define a document’s structure.
In pymongoext, the term “Model” refers to subclasses of the Model class. A Model is your primary tool for interacting with MongoDB collections.
Note
All concrete classes must implement the db() method, which should return a valid mongo database instance.
Examples
Create a Users model
from pymongo import MongoClient, IndexModel
from pymongoext import Model, DictField, StringField, IntField

class User(Model):
    @classmethod
    def db(cls):
        return MongoClient()['my_database_name']

    __schema__ = DictField(dict(
        email=StringField(required=True),
        name=StringField(required=True),
        password=StringField(required=True),
        age=IntField(minimum=0)
    ))

    __indexes__ = [IndexModel('email', unique=True), 'name']

All pymongo.collection.Collection methods and attributes can be accessed through the Model as shown below
Create a new user
result = User.insert_one({
    "email": "john.doe@dummy.com",
    "name": "John Doe",
    "password": "secret",
    "age": 35
})
user_id = result.inserted_id
Find a single document by id. See get() for an alternative to find_one()
user = User.find_one(user_id)
Check if a document exists
user_exists = User.exists({"_id": user_id})
__collection_name__ = None¶
By default, pymongoext produces a collection name by taking the under_score_case of the model name. See inflection.underscore for more info.
Set this if you need a different name for your collection.
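For readers who have not used inflection.underscore, the sketch below reimplements the same kind of CamelCase to under_score conversion with the standard library. It is an approximation written for illustration; pymongoext itself relies on the inflection package, and any further processing it may apply to the name is not covered here.

```python
import re


def underscore(word):
    """Convert a CamelCase name to under_score_case."""
    # Split an acronym from a following capitalized word: HTTPResponse -> HTTP_Response
    word = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1_\2", word)
    # Split a lowercase letter or digit from a following capital: BlogPost -> Blog_Post
    word = re.sub(r"([a-z\d])([A-Z])", r"\1_\2", word)
    return word.replace("-", "_").lower()


print(underscore("User"))          # user
print(underscore("BlogPost"))      # blog_post
print(underscore("HTTPResponse"))  # http_response
```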
__auto_update__ = True¶
By default, pymongoext ensures that the indexes and schema defined on the model are in sync with MongoDB. It achieves this by creating and dropping indexes and pushing updates to the JSON Schema defined on the MongoDB collection. This is done when your code runs for the first time or the server is restarted.
If you want to disable this functionality and manage the updates yourself, you can set __auto_update__ to False. But then remember to call _update() yourself to update the schema.
__indexes__ = []¶
List of indexes to create on this collection.
A valid index can be either:
1. a string, optionally prefixed with a + or - sign
2. a (string, int) tuple
3. a list whose values are either as defined in 1 or 2 above (compound indexes)
4. an instance of pymongo.IndexModel
See the create_index and create_indexes methods of pymongo Collection for more info.
__schema__ = None¶
Specifies the model schema
Type: pymongoext.fields.DictField
classmethod exists(filter=None, *args, **kwargs)¶
Check if a document exists in the database
All arguments to find() are also valid arguments for exists(), although any limit argument will be ignored. Returns True if a matching document is found, otherwise False.
Parameters:
- filter (optional) – a dictionary specifying the query to be performed OR any other type to be used as the value for a query for "_id".
- *args (optional) – any additional positional arguments are the same as the arguments to find().
- **kwargs (optional) – any additional keyword arguments are the same as the arguments to find().
classmethod get(filter=None, *args, **kwargs)¶
Retrieve the matching object, raising pymongoext.exceptions.MultipleDocumentsFound if multiple results are found and pymongoext.exceptions.NoDocumentFound if no results are found.
All arguments to find() are also valid arguments for get(), although any limit argument will be ignored. Returns the matching document.
Parameters:
- filter (optional) – a dictionary specifying the query to be performed OR any other type to be used as the value for a query for "_id".
- *args (optional) – any additional positional arguments are the same as the arguments to find().
- **kwargs (optional) – any additional keyword arguments are the same as the arguments to find().
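The contract of get() (exactly one match, otherwise an exception) can be illustrated against an in-memory list. This is a sketch of the semantics only: the two exception classes are local stand-ins named after the ones in pymongoext.exceptions, and the free function get with a predicate argument is hypothetical, not the library's signature.

```python
class NoDocumentFound(Exception):
    """Local stand-in for pymongoext.exceptions.NoDocumentFound."""


class MultipleDocumentsFound(Exception):
    """Local stand-in for pymongoext.exceptions.MultipleDocumentsFound."""


def get(documents, predicate):
    """Return the single document matching predicate, or raise."""
    matches = [doc for doc in documents if predicate(doc)]
    if not matches:
        raise NoDocumentFound("no document matches the query")
    if len(matches) > 1:
        raise MultipleDocumentsFound("more than one document matches the query")
    return matches[0]


users = [
    {'email': 'jane@example.com', 'name': 'Jane Doe'},
    {'email': 'john@example.com', 'name': 'John Doe'},
]

# Exactly one match: the document is returned
jane = get(users, lambda doc: doc['email'] == 'jane@example.com')
print(jane['name'])  # Jane Doe

# Two matches: an exception is raised instead of picking one silently
try:
    get(users, lambda doc: doc['name'].endswith('Doe'))
except MultipleDocumentsFound as error:
    print(error)
```

Compared with find_one(), which returns None when nothing matches, this style forces the caller to handle both failure cases explicitly.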
classmethod db()¶
Get the mongo database instance associated with this collection
All concrete classes must implement this method. A sample implementation is shown below
from pymongo import MongoClient
from pymongoext import Model

class User(Model):
    @classmethod
    def db(cls):
        return MongoClient()['my_database_name']

Returns: pymongo.database.Database
classmethod name()¶
Returns the collection name.
See __collection_name__ for more info on how the collection name is determined
classmethod c()¶
Get the collection associated with this model. This method ensures that the model indexes and schema validators are up to date.
Returns: pymongo.collection.Collection
classmethod apply_incoming_manipulators(doc, action)¶
Apply manipulators to an incoming document before it gets stored.
Parameters:
- doc (dict) – the document to be inserted into the database
- action (str) – the incoming action being performed
Returns: the transformed document
Return type: dict
classmethod apply_outgoing_manipulators(doc)¶
Apply manipulators to an outgoing document.
Parameters:
- doc (dict) – the document being retrieved from the database
Returns: the transformed document
Return type: dict
classmethod parse(data, with_defaults=False)¶
Prepare the data to be stored in the db
For example, given a simple user model
from pymongo import MongoClient
from pymongoext import Model, DictField, StringField, IntField

class User(Model):
    @classmethod
    def db(cls):
        return MongoClient()['my_database_name']

    __schema__ = DictField(dict(
        name=StringField(required=True),
        age=IntField(minimum=0, required=True, default=18)
    ))

User.parse({'name': 'John Doe'}, with_defaults=True)
>>> {'name': 'John Doe', 'age': 18}
Parameters:
- data (dict) – Data to be stored
- with_defaults (bool) – If True, None and missing values are set to the field default
Returns: dict
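The with_defaults rule (None and missing values are replaced by the field default) can be sketched on plain dicts. The helper apply_defaults and its defaults mapping are hypothetical and only mimic that one rule; the real parse() works from the model's __schema__ and may do more than this.

```python
def apply_defaults(data, defaults):
    """Fill in defaults for keys that are missing or explicitly None."""
    result = dict(data)
    for name, default in defaults.items():
        if result.get(name) is None:
            result[name] = default
    return result


defaults = {'age': 18}

missing = apply_defaults({'name': 'John Doe'}, defaults)
explicit_none = apply_defaults({'name': 'John Doe', 'age': None}, defaults)
provided = apply_defaults({'name': 'John Doe', 'age': 30}, defaults)

print(missing)        # {'name': 'John Doe', 'age': 18}
print(explicit_none)  # {'name': 'John Doe', 'age': 18}
print(provided)       # {'name': 'John Doe', 'age': 30}
```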
classmethod manipulators()¶
Return a list of manipulators to be applied to incoming and outgoing documents. Manipulators are applied sequentially in an order determined by the priority value of the manipulator. Manipulators with a lower priority will be applied first.
A manipulator operates on a single document before it is saved to MongoDB and after it is retrieved. See pymongoext.manipulators.Manipulator on how to implement your own manipulators.
By default every pymongoext model has two manipulators:
- IdWithoutUnderscoreManipulator with priority=0
- ParseInputsManipulator with priority=7
Returns: list of pymongoext.manipulators.Manipulator
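The priority rule can be pictured as a sort followed by a loop. The sketch below is an assumption-laden illustration, not pymongoext's code: Tagger and apply_incoming are hypothetical names, and each Tagger merely records its own name so the resulting order is visible. Priorities 0 and 7 match the two built-in manipulators listed above, and 5 is the default priority of the Manipulator base class.

```python
class Tagger:
    """Hypothetical manipulator that records the order in which it ran."""

    def __init__(self, name, priority):
        self.name = name
        self.priority = priority

    def transform_incoming(self, doc, model, action):
        doc = dict(doc)
        doc['trace'] = doc.get('trace', []) + [self.name]
        return doc


def apply_incoming(manipulators, doc, action):
    """Apply manipulators one after another, lowest priority value first."""
    for manipulator in sorted(manipulators, key=lambda m: m.priority):
        doc = manipulator.transform_incoming(doc, None, action)
    return doc


result = apply_incoming(
    [Tagger('parse_inputs', 7), Tagger('custom', 5), Tagger('id_without_underscore', 0)],
    {}, 'CREATE')
print(result['trace'])  # ['id_without_underscore', 'custom', 'parse_inputs']
```

With these values, a custom manipulator that keeps the default priority would run after the id handling and before input parsing.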
class IdWithoutUnderscoreManipulator¶
Bases: pymongoext.manipulators.Manipulator
A document manipulator that manages a virtual id field.
priority = 0¶
transform_incoming(doc, model, action)¶
Remove id field if given and set _id to that value if missing
transform_outgoing(doc, model)¶
Add an id field if it is missing.
class ParseInputsManipulator¶
Bases: pymongoext.manipulators.Manipulator
Parses incoming documents to ensure data is in the valid format
priority = 7¶
transform_incoming(doc, model, action)¶
Manipulate an incoming document.
Parameters:
- doc (dict) – the SON object to be inserted into the database
- model (Type[pymongoext.model.Model]) – the model the object is associated with
- action (str) – One of CREATE|REPLACE|UPDATE. Signifies the action being performed
class MunchManipulator¶
Bases: pymongoext.manipulators.Manipulator
Transforms documents to Munch objects. A Munch is a Python dictionary that provides attribute-style access.
See https://github.com/Infinidat/munch
priority = -1¶
transform_incoming(doc, model, action)¶
Manipulate an incoming document.
Parameters:
- doc (dict) – the SON object to be inserted into the database
- model (Type[pymongoext.model.Model]) – the model the object is associated with
- action (str) – One of CREATE|REPLACE|UPDATE. Signifies the action being performed
transform_outgoing(doc, model)¶
Manipulate an outgoing document.
Parameters:
- doc (dict) – the SON object being retrieved from the database
- model (Type[pymongoext.model.Model]) – the model associated with this document
Fields¶
class pymongoext.fields.Field(default=None, required=False, enum=None, title=None, description=None, **kwargs)¶
Base class for all fields
Parameters:
- required (bool) – Specifies if a value is required for this field. Defaults to False
- enum (list) – Enumerates all possible values of the field
- title (str) – A descriptive title string with no effect
- description (str) – A string that describes the schema and has no effect
- **kwargs – Additional parameters
schema()¶
Creates a valid JsonSchema object
Returns: dict
class pymongoext.fields.NullField¶
Null field
class pymongoext.fields.StringField(max_length=None, min_length=None, pattern=None, **kwargs)¶
String field
Parameters:
- max_length (int) – The maximum length of the field
- min_length (int) – The minimum length of the field
- pattern (str) – Field must match the regular expression
- **kwargs – any additional keyword arguments are the same as the arguments to the Field class
class pymongoext.fields.NumberField(maximum=None, minimum=None, exclusive_maximum=None, exclusive_minimum=None, multiple_of=None, **kwargs)¶
Numeric field
Parameters:
- maximum (int|long) – The inclusive maximum value of the field
- minimum (int|long) – The inclusive minimum value of the field
- exclusive_maximum (int|long) – The exclusive maximum value of the field. Values are valid if they are strictly less than (not equal to) the given value.
- exclusive_minimum (int|long) – The exclusive minimum value of the field. Values are valid if they are strictly greater than (not equal to) the given value.
- multiple_of (int|long) – Field must be a multiple of this value
- **kwargs – any additional keyword arguments are the same as the arguments to the Field class
class pymongoext.fields.IntField(maximum=None, minimum=None, exclusive_maximum=None, exclusive_minimum=None, multiple_of=None, **kwargs)¶
Integer field
class pymongoext.fields.FloatField(maximum=None, minimum=None, exclusive_maximum=None, exclusive_minimum=None, multiple_of=None, **kwargs)¶
Float field
class pymongoext.fields.BooleanField(default=None, required=False, enum=None, title=None, description=None, **kwargs)¶
Boolean field
class pymongoext.fields.DateTimeField(default=None, required=False, enum=None, title=None, description=None, **kwargs)¶
Datetime field
class pymongoext.fields.TimeStampField(default=None, required=False, enum=None, title=None, description=None, **kwargs)¶
Timestamp field
class pymongoext.fields.ObjectIDField(default=None, required=False, enum=None, title=None, description=None, **kwargs)¶
ObjectID field
class pymongoext.fields.ListField(field=None, max_items=None, min_items=None, unique_items=None, **kwargs)¶
List field
Parameters:
- field (Field) – A field to validate each item against
- max_items (int) – The maximum length of array
- min_items (int) – The minimum length of array
- unique_items (bool) – If true, each item in the array must be unique. Otherwise, no uniqueness constraint is enforced.
- **kwargs – any additional keyword arguments are the same as the arguments to the Field class
class pymongoext.fields.DictField(props=None, max_props=None, min_props=None, additional_props=True, required_props=None, **kwargs)¶
Dict Field
Parameters:
- props (dict of str: Field) – A map of known properties
- max_props (int) – The maximum number of properties allowed
- min_props (int) – The minimum number of properties allowed
- additional_props (Field | bool) – If True, additional fields are allowed. If False, only properties specified in props are allowed. If an instance of Field is specified, additional fields must validate against that field.
- required_props (list of str) – Property names that must be included
- **kwargs – any additional keyword arguments are the same as the arguments to the Field class
schema()¶
Creates a valid JsonSchema object
Returns: dict
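To give a feel for the kind of object involved, here is a hand-written MongoDB $jsonSchema validator that corresponds roughly to the User model from the guide (email, first_name, last_name, yob). The keywords used (bsonType, required, properties, minimum, maximum) are standard MongoDB JSON Schema keywords, but the exact dictionary that DictField.schema() returns is not shown in this document, so treat the structure below as an assumption rather than the library's output.

```python
import json

# Hand-written validator; not generated by pymongoext
user_validator = {
    "$jsonSchema": {
        "bsonType": "object",
        "required": ["email", "first_name", "last_name"],
        "properties": {
            "email": {"bsonType": "string"},
            "first_name": {"bsonType": "string"},
            "last_name": {"bsonType": "string"},
            "yob": {"bsonType": "int", "minimum": 1900, "maximum": 2019},
        },
    }
}

print(json.dumps(user_validator, indent=2))
```

A document of this shape is what MongoDB accepts as the validator option when a collection is created or modified.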
class pymongoext.fields.MapField(field, **kwargs)¶
class pymongoext.fields.OneOf(*fields, **kwargs)¶
Value must match exactly one of the specified fields
Example
>>> OneOf(StringField(), IntField(minimum=10), required=True)
class pymongoext.fields.AllOf(*fields, **kwargs)¶
Value must match all specified fields
class pymongoext.fields.AnyOf(*fields, **kwargs)¶
Value must match at least one of the specified fields
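The difference between the three operators is easiest to see with values that satisfy more than one alternative. The sketch below models each field as a plain predicate function; one_of, any_of, all_of and the is_* helpers are hypothetical names used for illustration, and the actual validation is performed by MongoDB against the generated JSON Schema rather than in Python.

```python
def is_string(value):
    return isinstance(value, str)


def is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(value, int) and not isinstance(value, bool)


def is_int_at_least_10(value):
    return is_int(value) and value >= 10


def one_of(value, *checks):
    """Valid only if exactly one alternative matches."""
    return sum(1 for check in checks if check(value)) == 1


def any_of(value, *checks):
    """Valid if at least one alternative matches."""
    return any(check(value) for check in checks)


def all_of(value, *checks):
    """Valid only if every alternative matches."""
    return all(check(value) for check in checks)


print(one_of("abc", is_string, is_int_at_least_10))  # True: only the string check matches
print(one_of(12, is_int, is_int_at_least_10))        # False: both checks match
print(any_of(12, is_int, is_int_at_least_10))        # True
print(all_of(5, is_int, is_int_at_least_10))         # False: 5 is below the minimum
```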
Manipulators¶
class pymongoext.manipulators.IncomingAction¶
Bases: object
Enum for Incoming action types
CREATE = 'CREATE'¶
REPLACE = 'REPLACE'¶
UPDATE = 'UPDATE'¶
class pymongoext.manipulators.Manipulator¶
Bases: object
A base document manipulator.
This manipulator just saves and restores documents without changing them.
priority = 5¶
Determines the order in which the manipulators will be applied. Manipulators with a lower priority will be applied first
transform_incoming(doc, model, action)¶
Manipulate an incoming document.
Parameters:
- doc (dict) – the SON object to be inserted into the database
- model (Type[pymongoext.model.Model]) – the model the object is associated with
- action (str) – One of CREATE|REPLACE|UPDATE. Signifies the action being performed
transform_outgoing(doc, model)¶
Manipulate an outgoing document.
Parameters:
- doc (dict) – the SON object being retrieved from the database
- model (Type[pymongoext.model.Model]) – the model associated with this document
class pymongoext.manipulators.MunchManipulator¶
Bases: pymongoext.manipulators.Manipulator
Transforms documents to Munch objects. A Munch is a Python dictionary that provides attribute-style access.
See https://github.com/Infinidat/munch
priority = -1¶
transform_incoming(doc, model, action)¶
Manipulate an incoming document.
Parameters:
- doc (dict) – the SON object to be inserted into the database
- model (Type[pymongoext.model.Model]) – the model the object is associated with
- action (str) – One of CREATE|REPLACE|UPDATE. Signifies the action being performed
transform_outgoing(doc, model)¶
Manipulate an outgoing document.
Parameters:
- doc (dict) – the SON object being retrieved from the database
- model (Type[pymongoext.model.Model]) – the model associated with this document
class pymongoext.manipulators.IdWithoutUnderscoreManipulator¶
Bases: pymongoext.manipulators.Manipulator
A document manipulator that manages a virtual id field.
priority = 0¶
transform_incoming(doc, model, action)¶
Remove id field if given and set _id to that value if missing
transform_outgoing(doc, model)¶
Add an id field if it is missing.
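Read literally, the two descriptions above amount to the following behaviour on plain dicts. This is a sketch of those two sentences only, with hypothetical function names incoming and outgoing; it does not reproduce the library's source and ignores the model and action arguments.

```python
def incoming(doc):
    """Remove id if given, and use its value for _id when _id is missing."""
    doc = dict(doc)
    if 'id' in doc:
        value = doc.pop('id')
        doc.setdefault('_id', value)
    return doc


def outgoing(doc):
    """Add an id field mirroring _id if id is missing."""
    doc = dict(doc)
    if '_id' in doc:
        doc.setdefault('id', doc['_id'])
    return doc


print(incoming({'id': 1, 'name': 'Jane'}))  # {'name': 'Jane', '_id': 1}
print(incoming({'id': 1, '_id': 2}))        # {'_id': 2}
print(outgoing({'_id': 5}))                 # {'_id': 5, 'id': 5}
```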
class pymongoext.manipulators.ParseInputsManipulator¶
Bases: pymongoext.manipulators.Manipulator
Parses incoming documents to ensure data is in the valid format
priority = 7¶
transform_incoming(doc, model, action)¶
Manipulate an incoming document.
Parameters:
- doc (dict) – the SON object to be inserted into the database
- model (Type[pymongoext.model.Model]) – the model the object is associated with
- action (str) – One of CREATE|REPLACE|UPDATE. Signifies the action being performed
Exceptions¶
exception pymongoext.exceptions.MultipleDocumentsFound¶
Raised by pymongoext.model.Model.get() when multiple documents matching the search criteria are found
exception pymongoext.exceptions.NoDocumentFound¶
Raised by pymongoext.model.Model.get() when no documents matching the search criteria are found