django-zombodb documentation

Easy Django integration with Elasticsearch through the ZomboDB Postgres extension. Thanks to ZomboDB, your Django models are synced with Elasticsearch at transaction time. Searching is also simple: you run Elasticsearch queries by calling one of the search methods on your querysets.

Installation and Configuration

Example

You can check a fully configured Django project with django-zombodb at https://github.com/vintasoftware/django-zombodb/tree/master/example

Requirements

  • Python: 3.5, 3.6, 3.7

  • Django: 2.0, 2.1

Installation

Install django-zombodb:

pip install django-zombodb

Settings

Set ZOMBODB_ELASTICSEARCH_URL in your settings.py. This is the URL of the Elasticsearch cluster used by ZomboDB.

ZOMBODB_ELASTICSEARCH_URL = 'http://localhost:9200/'
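Because this URL usually differs between local, staging, and production environments, one option is to read it from an environment variable. The sketch below assumes a variable named ZOMBODB_ELASTICSEARCH_URL is exported in the process environment; that variable name and the localhost fallback are illustrative choices, not something django-zombodb requires.

```python
# settings.py (fragment)
import os

# Read the Elasticsearch URL from the environment, falling back to a local
# cluster for development. The environment variable name is an assumption.
ZOMBODB_ELASTICSEARCH_URL = os.environ.get(
    "ZOMBODB_ELASTICSEARCH_URL",
    "http://localhost:9200/",
)
```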

Move forward to learn how to integrate your models with Elasticsearch.

Integrating with Elasticsearch

ZomboDB integrates Postgres with Elasticsearch through Postgres indexes. If you don’t know much about ZomboDB, please read its tutorial before proceeding.

Installing ZomboDB extension

Since ZomboDB is a Postgres extension, you must install and activate it. Follow the official ZomboDB installation instructions.

Activating ZomboDB extension

django-zombodb provides a Django migration operation to activate the ZomboDB extension on your database. To run it, make sure your database user is a superuser:

psql -d your_database -c "ALTER USER your_database_user SUPERUSER"

Then create an empty migration on your “main” app (usually called “core” or “common”):

python manage.py makemigrations core --empty

Add the django_zombodb.operations.ZomboDBExtension operation to the migration you’ve just created:

from django.db import migrations

import django_zombodb.operations

class Migration(migrations.Migration):

    dependencies = [
        ('restaurants', '0001_initial'),
    ]

    operations = [
        django_zombodb.operations.ZomboDBExtension(),
        ...
    ]

Alternatively, you can activate the extension manually with a command. Avoid this where possible, because you’ll need to remember to run it in production, in tests, and on the machines of all your co-workers:

psql -d your_database -c "CREATE EXTENSION zombodb"

Creating an index

Imagine you have the following model:

from django.db import models

class Restaurant(models.Model):
    name = models.TextField()
    street = models.TextField()

To integrate it with Elasticsearch, we need to add a ZomboDBIndex to it:

from django.db import models
from django_zombodb.indexes import ZomboDBIndex

class Restaurant(models.Model):
    name = models.TextField()
    street = models.TextField()

    class Meta:
        indexes = [
            ZomboDBIndex(fields=[
                'name',
                'street',
            ]),
        ]

After that, create and run the migrations:

python manage.py makemigrations
python manage.py migrate

Warning

During the migration, ZomboDBIndex reads the value of settings.ZOMBODB_ELASTICSEARCH_URL. That means if settings.ZOMBODB_ELASTICSEARCH_URL changes after the ZomboDBIndex migration has run, the internal index stored in Postgres will still point to the old URL. To change the URL of an existing ZomboDBIndex, update settings.ZOMBODB_ELASTICSEARCH_URL and also issue an ALTER INDEX index_name SET (url='http://some.new.url'); statement (preferably inside a migrations.RunSQL in a new migration).
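As a rough sketch of that URL change, the forward and reverse statements can be built once and then handed to migrations.RunSQL. The index name and URLs below are hypothetical placeholders, and the quoting helpers are illustrative utilities written for this example, not part of django-zombodb.

```python
def quote_ident(name):
    """Quote a Postgres identifier, doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value):
    """Quote a Postgres string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def alter_index_url_sql(index_name, url):
    """Build the ALTER INDEX statement that repoints a ZomboDB index."""
    return "ALTER INDEX {} SET (url={});".format(
        quote_ident(index_name), quote_literal(url)
    )


# Hypothetical values: look up the real index name in your database.
INDEX_NAME = "restaurants_restaurant_zombodb_idx"
OLD_URL = "http://localhost:9200/"
NEW_URL = "http://elasticsearch.example.com:9200/"

forward_sql = alter_index_url_sql(INDEX_NAME, NEW_URL)
reverse_sql = alter_index_url_sql(INDEX_NAME, OLD_URL)
print(forward_sql)

# In a new migration, these strings could be used as:
#   migrations.RunSQL(sql=forward_sql, reverse_sql=reverse_sql)
```

Supplying reverse_sql keeps the migration reversible, so rolling back restores the previous URL.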

Now the Restaurant model will support Elasticsearch queries for both name and street fields. But to perform those searches, we need it to use the custom queryset SearchQuerySet:

from django.db import models
from django_zombodb.indexes import ZomboDBIndex
from django_zombodb.querysets import SearchQuerySet

class Restaurant(models.Model):
    name = models.TextField()
    street = models.TextField()

    objects = models.Manager.from_queryset(SearchQuerySet)()

    class Meta:
        indexes = [
            ZomboDBIndex(fields=[
                'name',
                'street',
            ]),
        ]

Note

If you already have a custom queryset on your model, make it inherit from SearchQuerySetMixin.

Field mapping

From the Elasticsearch documentation:

“Mapping is the process of defining how a document, and the fields it contains, are stored and indexed. For instance, use mappings to define:

  • which string fields should be treated as full text fields.

  • which fields contain numbers, dates, or geolocations.

  • whether the values of all fields in the document should be indexed into the catch-all _all field.

  • the format of date values.

  • custom rules to control the mapping for dynamically added fields.”

If you don’t specify a mapping for your ZomboDBIndex, django-zombodb uses ZomboDB’s default mappings, which are based on the Postgres type of your model fields.

To customize the mapping, pass a field_mapping parameter to your ZomboDBIndex, as shown below:

from django.db import models
from django_zombodb.indexes import ZomboDBIndex
from django_zombodb.querysets import SearchQuerySet

class Restaurant(models.Model):
    name = models.TextField()
    street = models.TextField()

    objects = models.Manager.from_queryset(SearchQuerySet)()

    class Meta:
        indexes = [
            ZomboDBIndex(
                fields=[
                    'name',
                    'street',
                ],
                field_mapping={
                    'name': {"type": "text",
                             "copy_to": "zdb_all",
                             "analyzer": "fulltext_with_shingles",
                             "search_analyzer": "fulltext_with_shingles_search"},
                    'street': {"type": "text",
                               "copy_to": "zdb_all",
                               "analyzer": "brazilian"},
                }
            )
        ]

Note

You will probably want "copy_to": "zdb_all" on your textual fields to match ZomboDB’s default behavior. From the ZomboDB docs: “zdb_all is ZomboDB’s version of Elasticsearch’s “_all” field, except zdb_all is enabled for all versions of Elasticsearch. It is also configured as the default search field for every ZomboDB index”. For more information, read the Elasticsearch docs’ take on the “_all” field.
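Since a forgotten copy_to silently removes a field from the default search field, a small check over the mapping dict can catch the omission before the migration is created. The helper below is a hypothetical utility written for this example (it is not part of django-zombodb) and only inspects the plain dict passed as field_mapping.

```python
def text_fields_missing_copy_to(field_mapping, target="zdb_all"):
    """Return sorted names of text fields whose mapping does not copy to target."""
    missing = []
    for name, mapping in field_mapping.items():
        if mapping.get("type") != "text":
            continue  # only textual fields are expected to feed zdb_all
        copy_to = mapping.get("copy_to", [])
        if isinstance(copy_to, str):
            copy_to = [copy_to]  # Elasticsearch accepts a string or a list
        if target not in copy_to:
            missing.append(name)
    return sorted(missing)


field_mapping = {
    "name": {"type": "text", "copy_to": "zdb_all"},
    "street": {"type": "text", "analyzer": "brazilian"},  # copy_to forgotten
}
print(text_fields_missing_copy_to(field_mapping))  # ['street']
```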

Move forward to learn how to perform Elasticsearch queries through your model.

Searching

On models with ZomboDBIndex, use methods from SearchQuerySet/SearchQuerySetMixin, such as the query_string_search method used in the examples below, to perform Elasticsearch queries.

Validation

If you’re receiving queries from the end user, particularly query string queries, you should call the search methods with validate=True. This performs Elasticsearch-side validation through the Validate API. When validation fails, InvalidElasticsearchQuery is raised.

from django.contrib import messages
from django_zombodb.exceptions import InvalidElasticsearchQuery

# Inside a view, where `request` is the current HttpRequest:
queryset = Restaurant.objects.all()
try:
    queryset = queryset.query_string_search("AND steak*", validate=True)
except InvalidElasticsearchQuery:
    messages.error(request, "Invalid search query. Not filtering by search.")

Sorting by score

By default, the queryset returned by the search methods is unordered. To get results ordered by Elasticsearch’s score, pass sort=True.

Restaurant.objects.query_string_search("brasil~ AND steak*", sort=True)

Alternatively, if you want to combine with your own order_by, you can use the method annotate_score():

Restaurant.objects.query_string_search(
    "brazil* AND steak*"
).annotate_score(
    attr='zombodb_score'
).order_by('-zombodb_score', 'name', 'pk')

Limiting

It’s good practice to set a hard limit on the number of search results. For most search use cases, you shouldn’t need more than a certain number of results, either because users will only consume some of the high-scoring results, or because documents with lower scores aren’t relevant to your process. To limit the results, use the limit parameter on search methods:

Restaurant.objects.query_string_search("brasil~ AND steak*", limit=1000)
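When the limit comes from user input, it helps to clamp it before passing it on. The following is a minimal sketch; clamp_limit, DEFAULT_LIMIT, and MAX_LIMIT are hypothetical names chosen for this example, not part of django-zombodb.

```python
DEFAULT_LIMIT = 100   # used when the input is missing or invalid
MAX_LIMIT = 1000      # hard ceiling, regardless of what the user asks for


def clamp_limit(raw_value, default=DEFAULT_LIMIT, maximum=MAX_LIMIT):
    """Convert untrusted input into a limit between 1 and maximum."""
    try:
        value = int(raw_value)
    except (TypeError, ValueError):
        return default
    if value < 1:
        return default
    return min(value, maximum)


print(clamp_limit("50"))      # 50
print(clamp_limit("999999"))  # 1000
print(clamp_limit("abc"))     # 100
print(clamp_limit(None))      # 100

# In a view, the result could then be passed to a search method, e.g.:
#   Restaurant.objects.query_string_search(
#       q, limit=clamp_limit(request.GET.get("limit")))
```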

Lazy and Chainable

The search methods are like the traditional filter method: they return a regular Django QuerySet that supports all operations, and that’s lazy and chainable. Therefore, you can do things like:

Restaurant.objects.filter(
    name__startswith='Pizza'
).query_string_search(
    'name:Hut'
).filter(
    street__contains='Road'
)

Warning

It’s fine to call filter/exclude/etc. before and after a search. Where possible, the best option is to express everything in a single Elasticsearch query. However, calling search methods multiple times on the same queryset is slow. Please avoid this:

Restaurant.objects.query_string_search(
    'name:Pizza'
).query_string_search(
    'name:Hut'
)

While that may work as expected, it’s extremely inefficient. Instead, use compound queries like “bool”, which are much faster. Note that “bool” queries can be confusing to write. Check tutorials about them, like this one.
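To illustrate, the two chained searches above can be expressed as one Elasticsearch “bool” query whose must clauses all have to match. The sketch below only builds the query body as a plain dict using standard Elasticsearch query DSL; bool_must is a hypothetical helper, and how the dict is handed to a django-zombodb search method is not covered in this section, so that step is left out.

```python
import json


def bool_must(*clauses):
    """Combine clauses so that every one of them must match."""
    return {"bool": {"must": list(clauses)}}


# One compound query instead of two chained search calls.
query = bool_must(
    {"match": {"name": "Pizza"}},
    {"match": {"name": "Hut"}},
)
print(json.dumps(query, indent=2))
```

With query_string_search, a similar effect can likely be had with a single query string such as 'name:Pizza AND name:Hut', which combines the field and AND syntax already used in the examples above.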

Missing features

Currently django-zombodb doesn’t support ZomboDB’s offset and sort functions that work on the Elasticsearch side. Regular SQL LIMIT/OFFSET/ORDER BY works fine, so traditional QuerySet operations work, but they aren’t as performant as doing the same on the Elasticsearch side.
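In practice this means pagination goes through ordinary QuerySet slicing, which Django translates into SQL LIMIT/OFFSET. The sketch below computes the slice for a page; page_slice is a hypothetical helper, and a plain list stands in for the queryset so the example runs without a database.

```python
def page_slice(page, per_page=20):
    """Return the slice covering a 1-based page number."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    start = (page - 1) * per_page
    return slice(start, start + per_page)


results = list(range(95))  # stand-in for a search queryset

print(results[page_slice(1)][:3])    # [0, 1, 2]
print(len(results[page_slice(5)]))   # 15

# With a real queryset the same slice object could be applied, e.g.:
#   Restaurant.objects.query_string_search("steak*")[page_slice(2)]
```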

django_zombodb package

Submodules

django_zombodb.admin_mixins module

class django_zombodb.admin_mixins.ZomboDBAdminMixin

Bases: object

get_list_display(request)
get_ordering(request)
get_queryset(request)
get_search_fields(request)

get_search_fields is unnecessary if ZomboDBAdminMixin is used. But since search_form.html uses it, a placeholder tuple is returned.

get_search_results(request, queryset, search_term)
max_search_results = None

django_zombodb.apps module

class django_zombodb.apps.DjangoZomboDBConfig(app_name, app_module)

Bases: django.apps.config.AppConfig

name = 'django_zombodb'

django_zombodb.base_indexes module

class django_zombodb.base_indexes.PostgresIndex(*, fields=(), name=None, db_tablespace=None, opclasses=(), condition=None)

Bases: django.db.models.indexes.Index

create_sql(model, schema_editor, using='')
get_with_params()
max_name_length

django_zombodb.exceptions module

exception django_zombodb.exceptions.InvalidElasticsearchQuery

Bases: Exception

django_zombodb.helpers module

django_zombodb.helpers.get_zombodb_index_from_model(model)
django_zombodb.helpers.validate_query_dict(model, query)
django_zombodb.helpers.validate_query_string(model, query)

django_zombodb.indexes module

class django_zombodb.indexes.ZomboDBIndex(*, shards=None, replicas=None, alias=None, refresh_interval=None, type_name=None, bulk_concurrency=None, batch_size=None, compression_level=None, llapi=None, field_mapping=None, **kwargs)

Bases: django.contrib.postgres.indexes.PostgresIndex

create_sql(model, schema_editor, using='')
deconstruct()
get_with_params()
remove_sql(model, schema_editor)
suffix = 'zombodb'
class django_zombodb.indexes.ZomboDBIndexCreateStatementAdapter(statement, model, schema_editor, fields, field_mapping, row_type)

Bases: object

references_column(*args, **kwargs)
references_table(*args, **kwargs)
rename_column_references(*args, **kwargs)
rename_table_references(*args, **kwargs)
template = 'CREATE INDEX %(name)s ON %(table)s USING zombodb ((ROW(%(columns)s)::%(row_type)s)) %(extra)s'
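To make the template concrete, the sketch below fills it with Python %-formatting. All parameter values (index name, table, row type, and the WITH clause) are hypothetical placeholders chosen for illustration; the values django-zombodb actually generates may differ.

```python
# The template string as documented above.
TEMPLATE = (
    'CREATE INDEX %(name)s ON %(table)s USING zombodb '
    '((ROW(%(columns)s)::%(row_type)s)) %(extra)s'
)

# Hypothetical values for the Restaurant model used earlier.
params = {
    "name": '"restaurant_zombodb_idx"',
    "table": '"restaurants_restaurant"',
    "columns": '"name", "street"',
    "row_type": '"restaurant_zombodb_idx_row_type"',
    "extra": "WITH (url='http://localhost:9200/')",
}

sql = TEMPLATE % params
print(sql)
```

The indexed columns are wrapped in a ROW(...) expression cast to a row type, which is how the template presents a set of model fields to ZomboDB as a single indexed expression.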
class django_zombodb.indexes.ZomboDBIndexRemoveStatementAdapter(statement, row_type)

Bases: object

references_column(*args, **kwargs)
references_table(*args, **kwargs)
rename_column_references(*args, **kwargs)
rename_table_references(*args, **kwargs)

django_zombodb.operations module

class django_zombodb.operations.ZomboDBExtension

Bases: django.contrib.postgres.operations.CreateExtension

django_zombodb.querysets module

class django_zombodb.querysets.SearchQuerySet(model=None, query=None, using=None, hints=None)

Bases: django_zombodb.querysets.SearchQuerySetMixin, django.db.models.query.QuerySet

class django_zombodb.querysets.SearchQuerySetMixin

Bases: object

annotate_score(attr='zombodb_score')
order_by_score(score_attr='zombodb_score')

django_zombodb.serializers module

Module contents

Change Log

0.3.0 (2019-07-18)

  • Support for custom Elasticsearch mappings through field_mapping parameter on ZomboDBIndex.

  • Support for the limit parameter on search methods.

0.2.1 (2019-06-13)

  • Dropped support for Python 3.4.

  • Added missing imports to docs.

0.2.0 (2019-03-01)

  • Removed the url parameter from ZomboDBIndex. This simplifies support for multiple deployment environments (local, staging, production), because the Elasticsearch URL is no longer copied into migration code (see Issue #17).

0.1.0 (2019-02-01)

  • First release on PyPI.

Contributing

Contributions are welcome, and they are greatly appreciated! Every little bit helps, and credit will always be given.

You can contribute in many ways:

Types of Contributions

Report Bugs

Report bugs at https://github.com/vintasoftware/django-zombodb/issues. Please fill in the fields of the issue template.

Fix Bugs

Look through the GitHub issues for bugs. Anything tagged with “bug” is open to whoever wants to fix it.

Implement Features

Look through the GitHub issues for features. Anything tagged with “feature” is open to whoever wants to implement it.

Write Documentation

django-zombodb could always use more documentation, whether as part of the official django-zombodb docs, in docstrings, or even on the web in blog posts, articles, and such.

Submit Feedback

The best way to send feedback is to file an issue at https://github.com/vintasoftware/django-zombodb/issues.

If you are proposing a feature:

  • Explain in detail how it would work.

  • Keep the scope as narrow as possible, to make it easier to implement.

  • Remember that this is a volunteer-driven project, and that contributions are welcome :)

Get Started!

Ready to contribute? Here’s how to set up django-zombodb for local development.

  1. Fork the django-zombodb repo on GitHub.

  2. Clone your fork locally:

    $ git clone git@github.com:your_name_here/django-zombodb.git
    
  3. Install your local copy into a virtualenv. Assuming you have virtualenvwrapper installed, this is how you set up your fork for local development:

    $ mkvirtualenv django-zombodb
    $ cd django-zombodb/
    $ pip install -e .
    $ make install_requirements
    
  4. Create a branch for local development:

    $ git checkout -b name-of-your-bugfix-or-feature
    

    Now you can make your changes locally.

  5. When you’re done making changes, check that your changes pass the linters and the tests, including testing other Python versions with tox:

    $ make lint
    $ make test
    $ make test-all
    
  6. Commit your changes and push your branch to GitHub:

    $ git add .
    $ git commit -m "Your detailed description of your changes."
    $ git push origin name-of-your-bugfix-or-feature
    
  7. Submit a pull request through the GitHub website.

Pull Request Guidelines

Before you submit a pull request, check that it meets these guidelines:

  1. The pull request should include tests.

  2. If the pull request adds functionality, the docs should be updated.

  3. The pull request should pass CI. Check https://travis-ci.org/vintasoftware/django-zombodb/pull_requests and make sure that the tests pass for all supported Python versions.

Tips

To run a subset of tests:

$ python runtests.py tests.test_apps

Credits

Development Lead

Contributors

None yet. Why not be the first?