Welcome to the Django-Cache-Magic Documentation¶
Introduction¶
CacheMagic addresses the most common scenarios for caching and cache invalidation: temporal caching, instance caching, related objects caching, and thundering herd protection.
Temporal Caching¶
This is like memoizing an expensive operation: typically one whose result doesn’t need to be perfectly fresh and that aggregates many objects, which makes explicit invalidation difficult or impossible:
@cached
def my_expensive_call(arg1, arg2):
    return do_other_things(arg1) * arg2
Instance Caching¶
This is the practice of caching individual model instances. CacheMagic provides a CacheController that you can attach to models to cause automatic caching and invalidations.
class Model(django.db.models.Model):
    cache = cachemagic.CacheController()
    field = django.db.models.TextField()

Model.objects.get(pk=27)  # hits the database
Model.cache.get(27)  # tries the cache first
Thundering Herd Protection¶
Caching offers little protection when a busy cache key expires and thousands of requests try to recompute the value at the same time. CacheMagic provides a cache backend for Redis that prevents this problem by designating only one client to recompute the value while the others simply read the existing cache value.
CACHES['default'] = {
    'BACKEND': 'cachemagic.cache.RedisHerdCache',
    'LOCATION': ':'.join([REDIS_HOST, str(REDIS_PORT), '0']),
    'OPTIONS': {
        'PASSWORD': REDIS_PASSWORD,
    },
}
Examples¶
The example models define a person with a name and a book with a title and an author.
class Person(models.Model):
    name = models.CharField(max_length=64)
    cache = RelatedCacheController()

    def __unicode__(self):
        return self.name

class Book(models.Model):
    title = models.CharField(max_length=64)
    author = models.ForeignKey(Person)
Temporal Caching¶
cached is a simple decorator that can be applied to any function to cache its results. By default it uses the arguments as the key, but both the key and the timeout can be customized:
from django.db import models
from cachemagic.decorators import cached

@cached
def do_expensive_operation(thing, other):
    # MyModel stands for any model with a field named `a`
    return [other(item) for item in MyModel.objects.filter(a=thing)]
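To make the behaviour described above concrete, here is a minimal, self-contained sketch of how a decorator of this kind could work. It is not CacheMagic's implementation: the name `sketch_cached`, the in-process dictionary store, and the `clock` parameter are assumptions made purely for illustration.

```python
import functools
import time


def sketch_cached(key=None, timeout=60, clock=time.time):
    """Hypothetical stand-in for a `cached`-style decorator.

    key:     optional callable building the cache key from the call arguments;
             by default the arguments themselves are the key.
    timeout: number of seconds a stored result stays valid.
    """
    def decorator(func):
        store = {}  # cache key -> (expiry time, result)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if key is not None:
                cache_key = key(*args, **kwargs)
            else:
                cache_key = (args, tuple(sorted(kwargs.items())))
            entry = store.get(cache_key)
            now = clock()
            if entry is not None and entry[0] > now:
                return entry[1]  # still fresh: skip the expensive call
            result = func(*args, **kwargs)
            store[cache_key] = (now + timeout, result)
            return result

        return wrapper
    return decorator


calls = []


@sketch_cached(timeout=180)
def slow_square(n):
    calls.append(n)  # record every real computation
    return n * n


first = slow_square(4)
second = slow_square(4)  # served from the store, no recomputation
```

After both calls, `calls` holds a single entry, because the second call never reaches the function body. A real backend would keep the entries in the configured Django cache rather than in a per-function dictionary.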
Using in model methods¶
In most cases you should use an object’s primary key as the cache key instead of serializing the entire object:
from django.db import models
from cachemagic.decorators import cached

class Model(models.Model):
    field1 = models.IntegerField()
    field2 = models.TextField()

    # the key function receives the same arguments as the method
    @cached(key=lambda self: self.pk, timeout=180)
    def get_my_related_things(self):
        return [other(item) for item in self.related_things.select_related()]
Instance Caching¶
The CacheController works by listening for the post_save and post_delete signals that the model it is attached to will emit when you alter an instance. This allows it to automatically keep cached instances up to date!
from django.db import models
from cachemagic.controller import CacheController

class Model(models.Model):
    field1 = models.IntegerField()
    field2 = models.TextField()
    cache = CacheController()
Reading From Cache¶
You can use the cache controller like a very simple manager: currently only the .get(pk) operation is supported. This will try to get and return the model instance from cache. In the event that the key does not have a cache entry, the value is read from the database using the model’s default manager. The result is placed into the cache before being returned to the caller.
obj = Model.cache.get(pk=933)
Just like objects.get(), cache.get() may raise a Model.DoesNotExist exception. A DoesNotExist marker is placed in the cache when an instance is deleted or when an attempt is made to fetch a non-existent row. This prevents subsequent requests against the cache from hitting the database or returning stale data.
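The read-through lookup and the marker for missing rows can be sketched in plain Python. This is only an illustration of the idea, not the controller's code: the `MISSING` marker, the dictionaries standing in for the cache and the database, and the helper names are all hypothetical.

```python
class DoesNotExist(Exception):
    """Stand-in for Model.DoesNotExist."""


MISSING = object()           # hypothetical marker stored for absent rows
database = {933: "row 933"}  # pretend table: primary key -> instance
cache = {}
db_reads = []                # primary keys that actually reached the "database"


def cache_get(pk):
    """Read-through lookup that also remembers missing rows."""
    if pk in cache:
        value = cache[pk]
    else:
        db_reads.append(pk)
        value = database.get(pk, MISSING)
        cache[pk] = value  # store the instance or the marker
    if value is MISSING:
        raise DoesNotExist(pk)
    return value


def lookup(pk):
    """Return the instance, or None when the row does not exist."""
    try:
        return cache_get(pk)
    except DoesNotExist:
        return None


found = lookup(933)
missing_twice = [lookup(5), lookup(5)]  # second miss is answered by the marker
```

Here `db_reads` ends up containing each primary key only once: the second lookup of the missing row is answered from the cached marker instead of querying again.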
Instance Cache Keys¶
The default CacheController creates keys based on your model’s app, name and primary key, separated by colons: app_label:model_name:primary_key. This should present you with a unique key for each object.
Note
This can be problematic if your model uses a primary key that can contain whitespace and you are using memcached as your cache backend. One possible solution is to provide a key generation function that hashes the key (see example below). You can also use a cache backend like Django NewCache that automatically hashes the key.
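As a rough illustration of the note above, the following sketch builds a key in the documented layout and checks it against memcached's key rules (at most 250 bytes, no whitespace or control characters). The function names and the `library`/`person` labels are hypothetical, and the key builder only mimics the documented format rather than calling CacheMagic.

```python
import hashlib


def default_style_key(app_label, model_name, pk):
    """Build a key in the app_label:model_name:primary_key layout."""
    return ":".join([app_label, model_name, str(pk)])


def is_memcached_safe(key):
    """memcached rejects keys over 250 bytes or with whitespace/control characters."""
    encoded = key.encode("utf-8")
    return len(encoded) <= 250 and all(byte > 32 and byte != 127 for byte in encoded)


def hashed_key(key):
    """Replace the key with its SHA-256 hex digest (64 safe characters)."""
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


plain = default_style_key("library", "person", 27)
risky = default_style_key("library", "person", "J. R. R. Tolkien")
```

The integer key passes the check, the key built from a name with spaces does not, and `hashed_key(risky)` passes again at the cost of no longer being readable.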
Overriding Cache Key Generation¶
You can subclass CacheController and override the make_key function to customize your cache keys.
- CacheController.make_key(self, pk)
- Called to generate all cache keys for this controller. You can access the model class that this controller is attached to through self.model.
Examples¶
import hashlib
from cachemagic.controller import CacheController

class HashCacheController(CacheController):
    """ Hashes the cache key. This creates keys that are difficult to type
    by hand, but can avoid problems related to key content and length.
    """
    def make_key(self, pk):
        key = super(HashCacheController, self).make_key(pk)
        # sha256() needs bytes; on Python 3, pass key.encode('utf-8')
        return hashlib.sha256(key).hexdigest()

class ModelVersionCacheController(CacheController):
    """ Versions each cache key with the model's CACHE_VERSION attribute.
    Updating the model's version when altering its schema will
    effectively invalidate all cached instances.
    """
    def make_key(self, pk):
        model_version = getattr(self.model, 'CACHE_VERSION', 0)
        base_key = super(ModelVersionCacheController, self).make_key(pk)
        # the version may be an integer, so convert it before joining
        return ':'.join([base_key, str(model_version)])
Cache Timeouts¶
The default cache timeout is one hour. To change it, pass a number of seconds as the timeout parameter of the CacheController constructor:
cache = CacheController(timeout=(60 * 60 * 24 * 7)) # timeout in one week
Overriding the default timeout¶
If you find yourself frequently overriding the default timeout, you can subclass the CacheController and set a DEFAULT_TIMEOUT attribute:
class LongCacheController(CacheController):
    # timeouts longer than 30 days are treated as absolute timestamps by
    # memcached; that makes 30 days the largest naive value we can use.
    DEFAULT_TIMEOUT = 60 * 60 * 24 * 30
Multicache¶
Django 1.3 and later let you define multiple cache backends. If you want to tie the instance cache for a model to a backend other than ‘default’, you can pass the name of the backend you want to use into the controller constructor as the keyword argument backend.
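A configuration along these lines might look as follows. This is a sketch under assumptions: the alias `instances` and both LOCATION values are made up for the example, and Django's local-memory backend is used only so that the fragment needs no external service.

```python
# settings.py (sketch): two named backends. The alias 'instances' and both
# LOCATION values are hypothetical.
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.locmem.LocMemCache',
        'LOCATION': 'default-cache',
    },
    'instances': {
        'BACKEND': 'django.core.cache.backends.locmem.LocMemCache',
        'LOCATION': 'instance-cache',
    },
}

# models.py (sketch): tie a model's instance cache to the 'instances' alias.
#
#     class Model(models.Model):
#         cache = CacheController(backend='instances')
```

Keeping instance caching on its own alias lets you size, flush, or relocate that cache without touching whatever else uses ‘default’.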
Caveats¶
CacheMagic relies on the post_save and post_delete signals to keep your cache up to date. Performing operations that alter the database state without sending these signals will result in your cache becoming out of sync with your database.
Note
Do not use queryset.update() with models that have a CacheController attached! Your cache will not be updated.
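The reason is that queryset.update() writes rows directly and does not send post_save. The toy simulation below shows the effect without Django; the dictionaries and the `save`, `bulk_update`, and `on_post_save` helpers are hypothetical stand-ins for the table, the cache, and the signal handler.

```python
rows = {}   # pretend table: primary key -> field value
cache = {}  # pretend instance cache kept in sync by a signal handler


def on_post_save(pk):
    """Handler a cache controller would register: refresh the cached copy."""
    cache[pk] = rows[pk]


def save(pk, value):
    """Like instance.save(): writes the row, then fires post_save."""
    rows[pk] = value
    on_post_save(pk)


def bulk_update(value):
    """Like queryset.update(): writes rows directly and fires no signals."""
    for pk in rows:
        rows[pk] = value


save(1, "draft")
bulk_update("published")  # the row changes, the cached copy does not
stale = cache[1]
save(1, rows[1])          # saving the instance fires post_save again
fresh = cache[1]
```

After the bulk update the cache still holds the old value; it only catches up once the instance is saved individually. In a real project, one option is to loop over the queryset and call save() on each instance when the model has a CacheController attached.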
Thundering Herd Protection¶
When a cache key expires, there is a window before the new value is recomputed during which clients find no cached result. Every one of them falls back to the database, which can put a huge load on it. For background, see: http://en.wikipedia.org/wiki/Thundering_herd_problem
To eliminate the problem, CacheMagic provides a Redis backend that stores extra metadata about expiry time. This allows one client to realize the key will expire soon and tell the others to continue using the old value until it is recomputed.
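The idea can be sketched with an in-process toy cache. This is an assumption-laden illustration, not the RedisHerdCache code: the class name `HerdCacheSketch`, the `herd_grace` parameter, and the injected `clock` are hypothetical, and a real backend would also keep a longer hard expiry on the stored entry.

```python
import time


class HerdCacheSketch:
    """Toy in-process cache that keeps a soft expiry next to each value."""

    def __init__(self, herd_grace=60, clock=time.time):
        self.herd_grace = herd_grace  # seconds a recomputing client is given
        self.clock = clock
        self.store = {}  # key -> (value, soft expiry time)

    def set(self, key, value, timeout):
        self.store[key] = (value, self.clock() + timeout)

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None
        value, soft_expiry = entry
        now = self.clock()
        if now >= soft_expiry:
            # First client past the soft expiry: push the expiry forward so
            # later clients keep the old value, and report a miss so that
            # this client alone recomputes.
            self.store[key] = (value, now + self.herd_grace)
            return None
        return value


now = [1000.0]  # fake clock so the example is deterministic
cache = HerdCacheSketch(herd_grace=60, clock=lambda: now[0])
cache.set("report", "old", timeout=300)
now[0] += 301                       # the soft expiry has passed
first_client = cache.get("report")  # miss: this client recomputes
other_client = cache.get("report")  # still served the old value
cache.set("report", "new", timeout=300)
after_recompute = cache.get("report")
```

Only the first reader after the soft expiry sees a miss; everyone else keeps getting the old value until the recomputed one is stored.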
Usage¶
To set up the protection, simply use the RedisHerdCache backend provided. Here is an example configuration:
CACHES['default'] = {
    'BACKEND': 'cachemagic.cache.RedisHerdCache',
    'LOCATION': ':'.join([REDIS_HOST, str(REDIS_PORT), '0']),
    'OPTIONS': {
        'PASSWORD': REDIS_PASSWORD,
    },
    'VERSION': 0,
}