betamax¶
Betamax is a VCR imitation for requests. This will make mocking out requests much easier. It is tested on Travis CI.
Put in a more humorous way: “Betamax records your HTTP interactions so the NSA does not have to.”
Example Use¶
from betamax import Betamax
from requests import Session
from unittest import TestCase

with Betamax.configure() as config:
    config.cassette_library_dir = 'tests/fixtures/cassettes'


class TestGitHubAPI(TestCase):
    def setUp(self):
        self.session = Session()
        self.session.headers.update(...)

    # Set the cassette in a line other than the context declaration
    def test_user(self):
        with Betamax(self.session) as vcr:
            vcr.use_cassette('user')
            resp = self.session.get('https://api.github.com/user',
                                    auth=('user', 'pass'))
            assert resp.json()['login'] is not None

    # Set the cassette in line with the context declaration
    def test_repo(self):
        with Betamax(self.session).use_cassette('repo'):
            resp = self.session.get(
                'https://api.github.com/repos/sigmavirus24/github3.py'
            )
            assert resp.json()['owner'] != {}
What does it even do?¶
If you are unfamiliar with VCR, you might need a better explanation of what Betamax does.
Betamax intercepts every request you make and attempts to find a matching request that has already been intercepted and recorded. Two things can then happen:
- If there is a matching request, it will return the response that is associated with it.
- If there is not a matching request and it is allowed to record new responses, it will make the request, record the response and return the response.
Recorded requests and corresponding responses - also known as interactions - are stored in files called cassettes. (An example cassette can be seen in the examples section of the documentation.) The directory you store your cassettes in is called your library, or your cassette library.
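The two outcomes described above can be pictured with a toy model. This is only a sketch of the idea, not Betamax's actual implementation: the ToyRecorder class, its attributes, and the fake_send function are hypothetical names made up for illustration, and matching is reduced to comparing the method and URI.

```python
class ToyRecorder:
    """Toy stand-in for a recorder; NOT how Betamax is implemented."""

    def __init__(self, send, allow_recording=True):
        self.send = send                    # callable that performs the "real" request
        self.allow_recording = allow_recording
        self.interactions = {}              # plays the role of a cassette

    def request(self, method, uri):
        key = (method.upper(), uri)
        if key in self.interactions:
            # A matching request was already recorded: replay its response.
            return self.interactions[key]
        if not self.allow_recording:
            raise LookupError('no recorded interaction for %s %s' % key)
        # No match and recording is allowed: make the request and record it.
        response = self.send(method, uri)
        self.interactions[key] = response
        return response


calls = []


def fake_send(method, uri):
    # Hypothetical network call; it only counts how often it is invoked.
    calls.append((method, uri))
    return {'status': 200, 'url': uri}


recorder = ToyRecorder(fake_send)
first = recorder.request('GET', 'https://httpbin.org/get')
second = recorder.request('GET', 'https://httpbin.org/get')
```

In this sketch the second call is served from the in-memory "cassette", so fake_send runs only once.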
VCR Cassette Compatibility¶
Betamax can use any VCR-recorded cassette as of this point in time. The only caveat is that python-requests returns a URL on each response. VCR does not store that in a cassette now but we will. Any VCR-recorded cassette used to playback a response will unfortunately not have a URL attribute on responses that are returned. This is a minor annoyance but not something that can be fixed.
Contents of Betamax’s Documentation¶
Getting Started¶
The first step is to make sure Betamax is right for you. Let’s start by answering the following questions:
Are you using Requests?
If you’re not using Requests, Betamax is not for you. You should check out VCRpy.
Are you using Sessions or are you using the functional API (e.g., requests.get)?
If you’re using the functional API, and aren’t willing to use Sessions, Betamax is not yet for you.
So if you’re using Requests and you’re using Sessions, you’re in the right place.
Betamax officially supports py.test and unittest but it should integrate well with nose as well.
Installation¶
$ pip install betamax
Configuration¶
When starting with Betamax, you need to tell it where to store the cassettes that it creates. There are two ways to do this:
If you’re using Betamax or use_cassette you can pass the cassette_library_dir option. For example,
import betamax
import requests

session = requests.Session()
recorder = betamax.Betamax(session, cassette_library_dir='cassettes')
with recorder.use_cassette('introduction'):
    # ...
You can do it once, globally, for your test suite.
import betamax

with betamax.Betamax.configure() as config:
    config.cassette_library_dir = 'cassettes'
Note
If you don’t set a cassette directory, Betamax won’t save cassettes to disk.
There are other configuration options that can be provided, but this is the only one that is required.
Recording Your First Cassette¶
Let’s make a file named our_first_recorded_session.py. Let’s add the following to our file:
import betamax
import requests

CASSETTE_LIBRARY_DIR = 'examples/cassettes/'


def main():
    session = requests.Session()
    recorder = betamax.Betamax(
        session, cassette_library_dir=CASSETTE_LIBRARY_DIR
    )

    with recorder.use_cassette('our-first-recorded-session'):
        session.get('https://httpbin.org/get')


if __name__ == '__main__':
    main()
If we then run our script, we’ll see that a new file is created in our specified cassette directory. It should look something like:
{"http_interactions": [{"request": {"body": {"string": "", "encoding": "utf-8"}, "headers": {"Connection": ["keep-alive"], "Accept-Encoding": ["gzip, deflate"], "Accept": ["*/*"], "User-Agent": ["python-requests/2.7.0 CPython/2.7.9 Darwin/14.1.0"]}, "method": "GET", "uri": "https://httpbin.org/get"}, "response": {"body": {"string": "{\n \"args\": {}, \n \"headers\": {\n \"Accept\": \"*/*\", \n \"Accept-Encoding\": \"gzip, deflate\", \n \"Host\": \"httpbin.org\", \n \"User-Agent\": \"python-requests/2.7.0 CPython/2.7.9 Darwin/14.1.0\"\n }, \n \"origin\": \"127.0.0.1\", \n \"url\": \"https://httpbin.org/get\"\n}\n", "encoding": null}, "headers": {"content-length": ["265"], "server": ["nginx"], "connection": ["keep-alive"], "access-control-allow-credentials": ["true"], "date": ["Fri, 19 Jun 2015 04:10:33 GMT"], "access-control-allow-origin": ["*"], "content-type": ["application/json"]}, "status": {"message": "OK", "code": 200}, "url": "https://httpbin.org/get"}, "recorded_at": "2015-06-19T04:10:33"}], "recorded_with": "betamax/0.4.1"}
Now, each subsequent time that we run that script, we will use the recorded interaction instead of talking to the internet over and over again.
Note
There is no need to write any other code to replay your cassettes. Each time you run that session with the cassette in place, Betamax does all the heavy lifting for you.
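Because a cassette is plain JSON, it can be inspected with the standard library alone. The following is a minimal sketch that assumes the cassette layout shown above (http_interactions, request, response, status); summarize_cassette and SAMPLE are hypothetical names, and SAMPLE is an abridged stand-in for a real cassette file.

```python
import json


def summarize_cassette(text):
    """Return (method, uri, status code) for each recorded interaction."""
    cassette = json.loads(text)
    return [
        (
            interaction['request']['method'],
            interaction['request']['uri'],
            interaction['response']['status']['code'],
        )
        for interaction in cassette['http_interactions']
    ]


# Abridged cassette using the same keys as the recording shown above.
SAMPLE = json.dumps({
    'http_interactions': [{
        'request': {'method': 'GET', 'uri': 'https://httpbin.org/get'},
        'response': {'status': {'message': 'OK', 'code': 200}},
        'recorded_at': '2015-06-19T04:10:33',
    }],
    'recorded_with': 'betamax/0.4.1',
})

summary = summarize_cassette(SAMPLE)
```

To summarize a real recording, read the contents of the cassette file and pass the text to the same function.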
Recording More Complex Cassettes¶
Most times we cannot isolate our tests to a single request at a time, so we’ll have cassettes that contain multiple requests. Betamax can handle these with ease. Let’s take a look at an example.
import betamax
from betamax_serializers import pretty_json
import requests

CASSETTE_LIBRARY_DIR = 'examples/cassettes/'


def main():
    session = requests.Session()
    betamax.Betamax.register_serializer(pretty_json.PrettyJSONSerializer)
    recorder = betamax.Betamax(
        session, cassette_library_dir=CASSETTE_LIBRARY_DIR
    )

    with recorder.use_cassette('more-complicated-cassettes',
                               serialize_with='prettyjson'):
        session.get('https://httpbin.org/get')
        session.post('https://httpbin.org/post',
                     params={'id': '20'},
                     json={'some-attribute': 'some-value'})
        session.get('https://httpbin.org/get', params={'id': '20'})


if __name__ == '__main__':
    main()
Before we run this example, we have to install a new package: betamax-serializers, e.g., pip install betamax-serializers.
If we now run our new example, we’ll see a new file appear in our examples/cassettes/ directory named more-complicated-cassettes.json. This cassette will be much larger as a result of making 3 requests and receiving 3 responses. You’ll also notice that we imported betamax_serializers.pretty_json and called register_serializer() with PrettyJSONSerializer. Then we added a keyword argument to our invocation of use_cassette(), serialize_with='prettyjson'. PrettyJSONSerializer is a class provided by the betamax-serializers package on PyPI that can serialize and deserialize cassette data into JSON while allowing it to be easily human readable and pretty. Let’s see the results:
{
"http_interactions": [
{
"recorded_at": "2015-06-21T19:22:54",
"request": {
"body": {
"encoding": "utf-8",
"string": ""
},
"headers": {
"Accept": [
"*/*"
],
"Accept-Encoding": [
"gzip, deflate"
],
"Connection": [
"keep-alive"
],
"User-Agent": [
"python-requests/2.7.0 CPython/2.7.9 Darwin/14.1.0"
]
},
"method": "GET",
"uri": "https://httpbin.org/get"
},
"response": {
"body": {
"encoding": null,
"string": "{\n \"args\": {}, \n \"headers\": {\n \"Accept\": \"*/*\", \n \"Accept-Encoding\": \"gzip, deflate\", \n \"Host\": \"httpbin.org\", \n \"User-Agent\": \"python-requests/2.7.0 CPython/2.7.9 Darwin/14.1.0\"\n }, \n \"origin\": \"127.0.0.1\", \n \"url\": \"https://httpbin.org/get\"\n}\n"
},
"headers": {
"access-control-allow-credentials": [
"true"
],
"access-control-allow-origin": [
"*"
],
"connection": [
"keep-alive"
],
"content-length": [
"265"
],
"content-type": [
"application/json"
],
"date": [
"Sun, 21 Jun 2015 19:22:54 GMT"
],
"server": [
"nginx"
]
},
"status": {
"code": 200,
"message": "OK"
},
"url": "https://httpbin.org/get"
}
},
{
"recorded_at": "2015-06-21T19:22:54",
"request": {
"body": {
"encoding": "utf-8",
"string": "{\"some-attribute\": \"some-value\"}"
},
"headers": {
"Accept": [
"*/*"
],
"Accept-Encoding": [
"gzip, deflate"
],
"Connection": [
"keep-alive"
],
"Content-Length": [
"32"
],
"Content-Type": [
"application/json"
],
"User-Agent": [
"python-requests/2.7.0 CPython/2.7.9 Darwin/14.1.0"
]
},
"method": "POST",
"uri": "https://httpbin.org/post?id=20"
},
"response": {
"body": {
"encoding": null,
"string": "{\n \"args\": {\n \"id\": \"20\"\n }, \n \"data\": \"{\\\"some-attribute\\\": \\\"some-value\\\"}\", \n \"files\": {}, \n \"form\": {}, \n \"headers\": {\n \"Accept\": \"*/*\", \n \"Accept-Encoding\": \"gzip, deflate\", \n \"Content-Length\": \"32\", \n \"Content-Type\": \"application/json\", \n \"Host\": \"httpbin.org\", \n \"User-Agent\": \"python-requests/2.7.0 CPython/2.7.9 Darwin/14.1.0\"\n }, \n \"json\": {\n \"some-attribute\": \"some-value\"\n }, \n \"origin\": \"127.0.0.1\", \n \"url\": \"https://httpbin.org/post?id=20\"\n}\n"
},
"headers": {
"access-control-allow-credentials": [
"true"
],
"access-control-allow-origin": [
"*"
],
"connection": [
"keep-alive"
],
"content-length": [
"495"
],
"content-type": [
"application/json"
],
"date": [
"Sun, 21 Jun 2015 19:22:54 GMT"
],
"server": [
"nginx"
]
},
"status": {
"code": 200,
"message": "OK"
},
"url": "https://httpbin.org/post?id=20"
}
},
{
"recorded_at": "2015-06-21T19:22:54",
"request": {
"body": {
"encoding": "utf-8",
"string": ""
},
"headers": {
"Accept": [
"*/*"
],
"Accept-Encoding": [
"gzip, deflate"
],
"Connection": [
"keep-alive"
],
"User-Agent": [
"python-requests/2.7.0 CPython/2.7.9 Darwin/14.1.0"
]
},
"method": "GET",
"uri": "https://httpbin.org/get?id=20"
},
"response": {
"body": {
"encoding": null,
"string": "{\n \"args\": {\n \"id\": \"20\"\n }, \n \"headers\": {\n \"Accept\": \"*/*\", \n \"Accept-Encoding\": \"gzip, deflate\", \n \"Host\": \"httpbin.org\", \n \"User-Agent\": \"python-requests/2.7.0 CPython/2.7.9 Darwin/14.1.0\"\n }, \n \"origin\": \"127.0.0.1\", \n \"url\": \"https://httpbin.org/get?id=20\"\n}\n"
},
"headers": {
"access-control-allow-credentials": [
"true"
],
"access-control-allow-origin": [
"*"
],
"connection": [
"keep-alive"
],
"content-length": [
"289"
],
"content-type": [
"application/json"
],
"date": [
"Sun, 21 Jun 2015 19:22:54 GMT"
],
"server": [
"nginx"
]
},
"status": {
"code": 200,
"message": "OK"
},
"url": "https://httpbin.org/get?id=20"
}
}
],
"recorded_with": "betamax/0.4.2"
}
This makes the cassette easy to read and helps us recognize that requests and responses are paired together. We’ll explore cassettes more a bit later.
Long Term Usage Patterns¶
Now that we’ve covered the basics in Getting Started, let’s look at some patterns and problems we might encounter when using Betamax over a period of months instead of minutes.
Adding New Requests to a Cassette¶
Let’s reuse an example. Specifically let’s reuse our examples/more_complicated_cassettes.py example.
import betamax
from betamax_serializers import pretty_json
import requests

CASSETTE_LIBRARY_DIR = 'examples/cassettes/'


def main():
    session = requests.Session()
    betamax.Betamax.register_serializer(pretty_json.PrettyJSONSerializer)
    recorder = betamax.Betamax(
        session, cassette_library_dir=CASSETTE_LIBRARY_DIR
    )

    with recorder.use_cassette('more-complicated-cassettes',
                               serialize_with='prettyjson'):
        session.get('https://httpbin.org/get')
        session.post('https://httpbin.org/post',
                     params={'id': '20'},
                     json={'some-attribute': 'some-value'})
        session.get('https://httpbin.org/get', params={'id': '20'})


if __name__ == '__main__':
    main()
Let’s add a new POST request in there:
session.post('https://httpbin.org/post',
             params={'id': '20'},
             json={'some-other-attribute': 'some-other-value'})
If we run this cassette now, we should expect to see that there was an exception because Betamax couldn’t find a matching request for it. We expect this because the post requests have two completely different bodies, right? Right. The problem you’ll find is that by default Betamax only matches on the URI and the Method. So Betamax will find a matching request/response pair for ("POST", "https://httpbin.org/post?id=20") and reuse it. So now we need to update how we use Betamax so it will match using the body as well:
import betamax
from betamax_serializers import pretty_json
import requests

CASSETTE_LIBRARY_DIR = 'examples/cassettes/'


def main():
    session = requests.Session()
    betamax.Betamax.register_serializer(pretty_json.PrettyJSONSerializer)
    recorder = betamax.Betamax(
        session, cassette_library_dir=CASSETTE_LIBRARY_DIR
    )

    matchers = ['method', 'uri', 'body']
    with recorder.use_cassette('more-complicated-cassettes',
                               serialize_with='prettyjson',
                               match_requests_on=matchers):
        session.get('https://httpbin.org/get')
        session.post('https://httpbin.org/post',
                     params={'id': '20'},
                     json={'some-attribute': 'some-value'})
        session.get('https://httpbin.org/get', params={'id': '20'})
        session.post('https://httpbin.org/post',
                     params={'id': '20'},
                     json={'some-other-attribute': 'some-other-value'})


if __name__ == '__main__':
    main()
Now when we run that we should see something like this:
Traceback (most recent call last):
File "examples/more_complicated_cassettes_2.py", line 30, in <module>
main()
File "examples/more_complicated_cassettes_2.py", line 26, in main
json={'some-other-attribute': 'some-other-value'})
File ".../lib/python2.7/site-packages/requests/sessions.py", line 508, in post
return self.request('POST', url, data=data, json=json, **kwargs)
File ".../lib/python2.7/site-packages/requests/sessions.py", line 465, in request
resp = self.send(prep, **send_kwargs)
File ".../lib/python2.7/site-packages/requests/sessions.py", line 573, in send
r = adapter.send(request, **kwargs)
File ".../lib/python2.7/site-packages/betamax/adapter.py", line 91, in send
self.cassette))
betamax.exceptions.BetamaxError: A request was made that could not be handled.
A request was made to https://httpbin.org/post?id=20 that could not be found in more-complicated-cassettes.
The settings on the cassette are:
- record_mode: once
- match_options ['method', 'uri', 'body'].
This is what we do expect to see. So, how do we fix it?
We have a few options to fix it.
Option 1: Re-recording the Cassette¶
One of the easiest ways to fix this situation is to simply remove the cassette that was recorded and run the script again. This will recreate the cassette and subsequent runs will work just fine.
To be clear, with this option we’re suggesting that you run:
$ rm examples/cassettes/{{ cassette-name }}
This is the favorable option if you don’t foresee yourself needing to add new interactions often.
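If you would rather do this from Python (for instance in a small maintenance script), a sketch along these lines may help. It assumes cassettes are stored as name plus a .json extension inside the library directory, as in the examples above; remove_cassette is a hypothetical helper, not part of Betamax.

```python
import os
import tempfile


def remove_cassette(library_dir, name, extension='.json'):
    """Delete a cassette file so the next run records it again.

    Returns True if a file was removed and False if there was nothing to remove.
    """
    path = os.path.join(library_dir, name + extension)
    if os.path.exists(path):
        os.remove(path)
        return True
    return False


# Demonstration against a throwaway directory rather than a real library.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, 'demo.json'), 'w') as handle:
        handle.write('{}')
    removed_first = remove_cassette(demo_dir, 'demo')
    removed_again = remove_cassette(demo_dir, 'demo')
```

The second call finds no file and simply reports that nothing was removed.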
Option 2: Changing the Record Mode¶
A different way would be to update the recording mode used by Betamax. We would update the line in our file that currently reads:
with recorder.use_cassette('more-complicated-cassettes',
                           serialize_with='prettyjson',
                           match_requests_on=matchers):
to add one more parameter to the call to use_cassette().
We want to use the record parameter to tell Betamax to use either the new_episodes or all modes. Which you choose depends on your use case. new_episodes will only record new request/response interactions that Betamax sees. all will just re-record every interaction every time. In our example, we’ll use new_episodes so our code now looks like:
with recorder.use_cassette('more-complicated-cassettes',
                           serialize_with='prettyjson',
                           match_requests_on=matchers,
                           record='new_episodes'):
Known Issues¶
Tests Periodically Slow Down¶
Description:
Requests checks if it should use or bypass proxies using the standard library function proxy_bypass. This has been known to cause slowdowns when using Requests and can cause your recorded requests to slow down as well.
Betamax presently has no way to prevent this from being called as it operates at a lower level in Requests than is necessary.
Workarounds:
Mock the gethostbyname function from the socket library to force a localhost setting, e.g.,
import socket
socket.gethostbyname = lambda x: '127.0.0.1'
Set trust_env to False on the session used with Betamax. This will prevent Requests from checking for proxies and whether it needs to bypass them.
Related bugs:
Configuring Betamax¶
By now you’ve seen examples where we pass a great deal of keyword arguments to use_cassette(). You have also seen that we used betamax.Betamax.configure(). In this section, we’ll go into a deep description of the different approaches and why you might pick one over the other.
Global Configuration¶
Admittedly, I am not too proud of my decision to borrow this design from VCR, but I did and I use it and it isn’t entirely terrible. (Note: I do hope to come up with an elegant way to redesign it for v1.0.0 but that’s a long way off.)
The best way to configure Betamax globally is by using betamax.Betamax.configure(). This returns a betamax.configure.Configuration instance. This instance can be used as a context manager in order to make the usage look more like VCR’s way of configuring the library. For example, in VCR, you might do
VCR.configure do |config|
  config.cassette_library_dir = 'examples/cassettes'
  config.default_cassette_options[:record] = :none
  # ...
end
Whereas with Betamax you might do
from betamax import Betamax

with Betamax.configure() as config:
    config.cassette_library_dir = 'examples/cassettes'
    config.default_cassette_options['record_mode'] = 'none'
Alternatively, since the object returned is really just an object and does not do anything special as a context manager, you could just as easily do
from betamax import Betamax
config = Betamax.configure()
config.cassette_library_dir = 'examples/cassettes'
config.default_cassette_options['record_mode'] = 'none'
We’ll now move on to specific use-cases when configuring Betamax. We’ll exclude the portion of each example where we create a Configuration instance.
Setting the Directory in which Betamax Should Store Cassette Files¶
Each and every time we use Betamax we need to tell it where to store (and discover) cassette files. By default we do this by setting the cassette_library_dir attribute on our config object, e.g.,
config.cassette_library_dir = 'tests/integration/cassettes'
Note that these paths are relative to what Python thinks is the current working directory. Wherever you run your tests from, write the path to be relative to that directory.
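One way to avoid depending on the working directory is to build an absolute path from the location of the test file itself. This is a sketch rather than something Betamax requires: cassette_dir_for is a hypothetical helper, and it relies on the assumption that an absolute path is accepted wherever a relative one is.

```python
import os


def cassette_dir_for(test_file, dirname='cassettes'):
    """Return an absolute cassette directory that sits next to test_file."""
    return os.path.join(os.path.dirname(os.path.abspath(test_file)), dirname)


# Hypothetical test module location, used only to show the resulting shape.
example_test_file = os.path.join(os.sep, 'repo', 'tests', 'test_api.py')
example_dir = cassette_dir_for(example_test_file)
```

Inside a real test module you would typically pass __file__, e.g. config.cassette_library_dir = cassette_dir_for(__file__).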
Setting Default Cassette Options¶
Cassettes have default options used by Betamax if none are set. For example,
- The default record mode is once.
- The default matchers used are method and uri.
- Cassettes do not preserve the exact body bytes by default.
These can all be configured as you please. For example, if you want to change the default matchers and preserve exact body bytes, you would do
config.default_cassette_options['match_requests_on'] = [
    'method',
    'uri',
    'headers',
]
config.preserve_exact_body_bytes = True
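To see why the choice of matchers matters, here is a toy model of request matching. It is a deliberate simplification and an assumption on my part: Betamax's real matchers do more than compare strings for equality, and the matches function and the two request dictionaries are hypothetical.

```python
def matches(recorded, incoming, match_on):
    """Toy matcher: every named field must be equal on both requests."""
    return all(recorded[field] == incoming[field] for field in match_on)


recorded_post = {
    'method': 'POST',
    'uri': 'https://httpbin.org/post?id=20',
    'body': '{"some-attribute": "some-value"}',
}
new_post = {
    'method': 'POST',
    'uri': 'https://httpbin.org/post?id=20',
    'body': '{"some-other-attribute": "some-other-value"}',
}

# With only method and URI, the two requests look identical.
loose = matches(recorded_post, new_post, ['method', 'uri'])
# Adding the body tells them apart.
strict = matches(recorded_post, new_post, ['method', 'uri', 'body'])
```

With the default matchers the new POST would be treated as already recorded. Adding the body is what makes it a distinct interaction.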
Filtering Sensitive Data¶
It’s unlikely that you’ll want to record an interaction that will not require authentication. For this we can define placeholders in our cassettes. Let’s use a very real example.
Let’s say that you want to get your user data from GitHub using Requests. You might have code that looks like this:
def me(username, password, session):
    r = session.get('https://api.github.com/user', auth=(username, password))
    r.raise_for_status()
    return r.json()
You would test this something like:
import os

import betamax
import requests

from my_module import me

session = requests.Session()
recorder = betamax.Betamax(session)
username = os.environ.get('USERNAME', 'testuser')
password = os.environ.get('PASSWORD', 'testpassword')

with recorder.use_cassette('test-me'):
    json = me(username, password, session)
    # assertions about the JSON returned
The problem is that now your username and password will be recorded in the cassette which you don’t then want to push to your version control. How can we prevent that from happening?
import base64

username = os.environ.get('USERNAME', 'testuser')
password = os.environ.get('PASSWORD', 'testpassword')
config.define_cassette_placeholder(
    '<GITHUB-AUTH>',
    # b64encode returns bytes on Python 3, so decode to get text to replace
    base64.b64encode(
        '{0}:{1}'.format(username, password).encode('utf-8')
    ).decode('utf-8')
)
Note
Obviously you can refactor this a bit so you can pull those environment variables out in only one place, but I’d rather be clear than not here.
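It may help to see what that placeholder value corresponds to. With HTTP Basic authentication, the Authorization header carries the word Basic followed by the base64 encoding of username:password, and that encoded text is what gets swapped for the placeholder. The sketch below uses only the standard library. basic_auth_token and apply_placeholder are hypothetical helpers that mimic the substitution and are not Betamax functions.

```python
import base64


def basic_auth_token(username, password):
    """Return the base64 text that follows 'Basic ' in the header."""
    raw = '{0}:{1}'.format(username, password).encode('utf-8')
    return base64.b64encode(raw).decode('ascii')


def apply_placeholder(serialized, placeholder, secret):
    """Mimic the substitution: swap the secret text for the placeholder."""
    return serialized.replace(secret, placeholder)


token = basic_auth_token('testuser', 'testpassword')
header_value = 'Basic ' + token
scrubbed = apply_placeholder(header_value, '<GITHUB-AUTH>', token)
```

Here scrubbed ends up as 'Basic <GITHUB-AUTH>', which is the kind of value you would expect to find in the saved cassette in place of the real credentials.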
The first time you run the test script you would invoke your tests like so:
$ USERNAME='my-real-username' PASSWORD='supersecretep@55w0rd' \
python test_script.py
Future runs of the script could simply be run without those environment variables, e.g.,
$ python test_script.py
This means that you can run these tests on a service like Travis-CI without providing credentials.
In the event that you can not anticipate what you will need to filter out, version 0.7.0 of Betamax adds before_record and before_playback hooks. These two hooks both will pass the Interaction and Cassette to the function provided. An example callback would look like:
def hook(interaction, cassette):
    pass
You would then register this callback:
# Either
config.before_record(callback=hook)
# Or
config.before_playback(callback=hook)
You can register callables for both hooks. If you wish to ignore an interaction and prevent it from being recorded or replayed, you can call its ignore() method. You also have full access to all of the methods and attributes on an instance of an Interaction. This will allow you to inspect the response produced by the interaction and then modify it. Let’s say, for example, that you are talking to an API that grants authorization tokens on a specific request. In this example, you might authenticate initially using a username and password and then use a token after authenticating. You want, however, for the token to be kept secret. In that case you might configure Betamax to replace the username and password, e.g.,
config.define_cassette_placeholder('<USERNAME>', username)
config.define_cassette_placeholder('<PASSWORD>', password)
And you would also write a function that, prior to recording, finds the token, saves it, and obscures it from the recorded version of the cassette:
from betamax.cassette import cassette


def sanitize_token(interaction, current_cassette):
    # Exit early if the request did not return 200 OK because that's the
    # only time we want to look for Authorization-Token headers
    if interaction.data['response']['status']['code'] != 200:
        return

    headers = interaction.data['response']['headers']
    token = headers.get('Authorization-Token')
    # If there was no token header in the response, exit
    if token is None:
        return
    # Recorded header values are lists (as in the cassettes shown earlier),
    # so unwrap the first value before using it as the text to replace.
    if isinstance(token, list):
        if not token:
            return
        token = token[0]

    # Otherwise, create a new placeholder so that when cassette is saved,
    # Betamax will replace the token with our placeholder.
    current_cassette.placeholders.append(
        cassette.Placeholder(placeholder='<AUTH_TOKEN>', replace=token)
    )
This will dynamically create a placeholder for that cassette only. Once we have our hook, we need merely register it like so:
config.before_record(callback=sanitize_token)
And we no longer need to worry about leaking sensitive data.
In addition to the before_record and before_playback hooks, version 0.9.0 of Betamax adds after_start() and before_stop() hooks. These two hooks both will pass the current Cassette to the callback function provided. Register these hooks like so:
def hook(cassette):
    if cassette.is_recording():
        print("This cassette is recording!")

# Either
config.after_start(callback=hook)
# Or
config.before_stop(callback=hook)
These hooks are useful for performing configuration actions external to Betamax at the time Betamax is invoked, such as setting up correct authentication to an API so that the recording will not encounter any errors.
Setting default serializer¶
If you want to use a specific serializer for every cassette, you can set serialize_with as a default cassette option. For example, if you wanted to use the prettyjson serializer for every cassette you would do:
config.default_cassette_options['serialize_with'] = 'prettyjson'
Per-Use Configuration¶
Each time you create a Betamax instance or use use_cassette(), you can pass some of the options from above.
Setting the Directory in which Betamax Should Store Cassette Files¶
When using per-use configuration of Betamax, you can specify the cassette directory when you instantiate a Betamax object:
session = requests.Session()
recorder = betamax.Betamax(session,
                           cassette_library_dir='tests/cassettes/')
Setting Default Cassette Options¶
You can also set default cassette options when instantiating a Betamax object:
session = requests.Session()
recorder = betamax.Betamax(session, default_cassette_options={
    'record_mode': 'once',
    'match_requests_on': ['method', 'uri', 'headers'],
    'preserve_exact_body_bytes': True
})
You can also set the above when calling use_cassette():
session = requests.Session()
recorder = betamax.Betamax(session)
with recorder.use_cassette('cassette-name',
                           preserve_exact_body_bytes=True,
                           match_requests_on=['method', 'uri', 'headers'],
                           record='once'):
    session.get('https://httpbin.org/get')
Filtering Sensitive Data¶
Filtering sensitive data on a per-usage basis is the only difficult (or perhaps, less convenient) case. Cassette placeholders are part of the default cassette options, so we’ll set this value similarly to how we set the other default cassette options. The catch is that placeholders have a specific structure: they are stored as a list of dictionaries. Let’s use our example above and convert it.
import base64

username = os.environ.get('USERNAME', 'testuser')
password = os.environ.get('PASSWORD', 'testpassword')
session = requests.Session()
recorder = betamax.Betamax(session, default_cassette_options={
    'placeholders': [{
        'placeholder': '<GITHUB-AUTH>',
        # b64encode returns bytes on Python 3, so decode to get text
        'replace': base64.b64encode(
            '{0}:{1}'.format(username, password).encode('utf-8')
        ).decode('utf-8'),
    }]
})
Note that what we passed as our first argument is assigned to the 'placeholder' key while the value we’re replacing is assigned to the 'replace' key.
This isn’t the typical way that people filter sensitive data because they tend to want to do it globally.
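If you do need several per-use placeholders, writing the list of dictionaries by hand gets repetitive. A small helper can build the structure described above from a plain mapping. This is only a sketch: build_placeholders is a hypothetical name, and the secrets shown are dummy values.

```python
def build_placeholders(secrets):
    """Turn {placeholder: secret} into the list-of-dicts structure."""
    return [
        {'placeholder': placeholder, 'replace': secret}
        for placeholder, secret in sorted(secrets.items())
    ]


placeholders = build_placeholders({
    '<USERNAME>': 'testuser',
    '<PASSWORD>': 'testpassword',
})
```

The result could then be passed as the 'placeholders' entry of default_cassette_options, in the same way as the hand-written list above.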
Mixing and Matching¶
It’s not uncommon to mix and match configuration methodologies. I do this in github3.py. I use global configuration to filter sensitive data and set defaults based on the environment the tests are running in. On Travis-CI, the record mode is set to 'none'. I also set how we match requests and when we preserve exact body bytes on a per-use basis.
Record Modes¶
Betamax, like VCR, has four modes that it can use to record cassettes:
'all'
'new_episodes'
'none'
'once'
You can only ever use one record mode. Below are explanations and examples of each record mode. The explanations are blatantly taken from VCR’s own Record Modes documentation.
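As a compact preview of the four modes described below, their behaviour can be summarised as a small decision function. This is a toy model written from the descriptions in this section, not code taken from Betamax. The decide function, its parameters, and the 'replay' / 'record' / 'error' labels are hypothetical.

```python
RECORD_MODES = ('all', 'new_episodes', 'none', 'once')


def decide(record_mode, cassette_exists, has_match):
    """Return 'replay', 'record' or 'error' for one incoming request."""
    if record_mode not in RECORD_MODES:
        raise ValueError('unknown record mode: %r' % (record_mode,))
    if record_mode == 'all':
        # Never replays; every interaction is recorded again.
        return 'record'
    if has_match:
        # The remaining modes replay previously recorded interactions.
        return 'replay'
    if record_mode == 'new_episodes':
        return 'record'
    if record_mode == 'once' and not cassette_exists:
        # 'once' records only while no cassette file exists yet.
        return 'record'
    # 'none', or 'once' with an existing cassette: new requests are errors.
    return 'error'


outcome_once_first_run = decide('once', cassette_exists=False, has_match=False)
outcome_once_new_request = decide('once', cassette_exists=True, has_match=False)
```

The two sample outcomes capture the behaviour of 'once': it records on the first run, and rejects unexpected requests afterwards.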
All¶
The 'all' record mode will:
- Record new interactions.
- Never replay previously recorded interactions.
This can be temporarily used to force Betamax to re-record a cassette (i.e., to ensure the responses are not out of date) or can be used when you simply want to log all HTTP requests.
Given our file, examples/record_modes/all/example.py,
import betamax
import requests

CASSETTE_LIBRARY_DIR = 'examples/record_modes/all/'


def main():
    session = requests.Session()
    recorder = betamax.Betamax(
        session, cassette_library_dir=CASSETTE_LIBRARY_DIR
    )

    with recorder.use_cassette('all-example', record='all'):
        session.get('https://httpbin.org/get')
        session.post('https://httpbin.org/post',
                     params={'id': '20'},
                     json={'some-attribute': 'some-value'})
        session.get('https://httpbin.org/get', params={'id': '20'})


if __name__ == '__main__':
    main()
Every time we run it, our cassette (examples/record_modes/all/all-example.json) will be updated with new values.
New Episodes¶
The 'new_episodes' record mode will:
- Record new interactions.
- Replay previously recorded interactions.
It is similar to the 'once' record mode, but will always record new interactions, even if you have an existing recorded one that is similar (but not identical, based on the match_requests_on option).
Given our file, examples/record_modes/new_episodes/example_original.py, with which we have already recorded examples/record_modes/new_episodes/new-episodes-example.json:
import betamax
import requests

CASSETTE_LIBRARY_DIR = 'examples/record_modes/new_episodes/'


def main():
    session = requests.Session()
    recorder = betamax.Betamax(
        session, cassette_library_dir=CASSETTE_LIBRARY_DIR
    )

    with recorder.use_cassette('new-episodes-example', record='new_episodes'):
        session.get('https://httpbin.org/get')
        session.post('https://httpbin.org/post',
                     params={'id': '20'},
                     json={'some-attribute': 'some-value'})
        session.get('https://httpbin.org/get', params={'id': '20'})


if __name__ == '__main__':
    main()
If we then run examples/record_modes/new_episodes/example_updated.py
import betamax
import requests

CASSETTE_LIBRARY_DIR = 'examples/record_modes/new_episodes/'


def main():
    session = requests.Session()
    recorder = betamax.Betamax(
        session, cassette_library_dir=CASSETTE_LIBRARY_DIR
    )

    with recorder.use_cassette('new-episodes-example', record='new_episodes'):
        session.get('https://httpbin.org/get')
        session.post('https://httpbin.org/post',
                     params={'id': '20'},
                     json={'some-attribute': 'some-value'})
        session.get('https://httpbin.org/get', params={'id': '20'})
        session.get('https://httpbin.org/get', params={'id': '40'})


if __name__ == '__main__':
    main()
The new request at the end of the file will be added to the cassette without updating the other interactions that were already recorded.
None¶
The 'none' record mode will:
- Replay previously recorded interactions.
- Cause an error to be raised for any new requests.
This is useful when your code makes potentially dangerous HTTP requests. The 'none' record mode guarantees that no new HTTP requests will be made.
Given our file, examples/record_modes/none/example_original.py, with a cassette that already has interactions recorded in examples/record_modes/none/none-example.json:
import betamax
import requests

CASSETTE_LIBRARY_DIR = 'examples/record_modes/none/'


def main():
    session = requests.Session()
    recorder = betamax.Betamax(
        session, cassette_library_dir=CASSETTE_LIBRARY_DIR
    )

    with recorder.use_cassette('none-example', record='none'):
        session.get('https://httpbin.org/get')
        session.post('https://httpbin.org/post',
                     params={'id': '20'},
                     json={'some-attribute': 'some-value'})
        session.get('https://httpbin.org/get', params={'id': '20'})


if __name__ == '__main__':
    main()
If we then run examples/record_modes/none/example_updated.py
import betamax
import requests

CASSETTE_LIBRARY_DIR = 'examples/record_modes/none/'


def main():
    session = requests.Session()
    recorder = betamax.Betamax(
        session, cassette_library_dir=CASSETTE_LIBRARY_DIR
    )

    with recorder.use_cassette('none-example', record='none'):
        session.get('https://httpbin.org/get')
        session.post('https://httpbin.org/post',
                     params={'id': '20'},
                     json={'some-attribute': 'some-value'})
        session.get('https://httpbin.org/get', params={'id': '20'})
        session.get('https://httpbin.org/get', params={'id': '40'})


if __name__ == '__main__':
    main()
We’ll see an exception indicating that new interactions were prevented:
Traceback (most recent call last):
File "examples/record_modes/none/example_updated.py", line 23, in <module>
main()
File "examples/record_modes/none/example_updated.py", line 19, in main
session.get('https://httpbin.org/get', params={'id': '40'})
File "/usr/local/lib/python2.7/site-packages/requests/sessions.py", line 477, in get
return self.request('GET', url, **kwargs)
File "/usr/local/lib/python2.7/site-packages/requests/sessions.py", line 465, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python2.7/site-packages/requests/sessions.py", line 573, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python2.7/site-packages/betamax/adapter.py", line 91, in send
self.cassette))
betamax.exceptions.BetamaxError: A request was made that could not be handled.
A request was made to https://httpbin.org/get?id=40 that could not be found in none-example.
The settings on the cassette are:
- record_mode: none
- match_options ['method', 'uri'].
Once¶
The 'once' record mode will:
- Replay previously recorded interactions.
- Record new interactions if there is no cassette file.
- Cause an error to be raised for new requests if there is a cassette file.
It is similar to the 'new_episodes' record mode, but will prevent new, unexpected requests from being made (e.g., because the request URI changed).
'once' is the default record mode, used when you do not set one.
If we have a file, examples/record_modes/once/example_original.py,
import betamax
import requests
CASSETTE_LIBRARY_DIR = 'examples/record_modes/once/'
def main():
    session = requests.Session()
    recorder = betamax.Betamax(
        session, cassette_library_dir=CASSETTE_LIBRARY_DIR
    )
    with recorder.use_cassette('once-example', record='once'):
        session.get('https://httpbin.org/get')
        session.post('https://httpbin.org/post',
                     params={'id': '20'},
                     json={'some-attribute': 'some-value'})
        session.get('https://httpbin.org/get', params={'id': '20'})
if __name__ == '__main__':
    main()
And we run it, we’ll see a cassette named examples/record_modes/once/once-example.json has been created.
If we then run examples/record_modes/once/example_updated.py,
import betamax
import requests
CASSETTE_LIBRARY_DIR = 'examples/record_modes/once/'
def main():
    session = requests.Session()
    recorder = betamax.Betamax(
        session, cassette_library_dir=CASSETTE_LIBRARY_DIR
    )
    with recorder.use_cassette('once-example', record='once'):
        session.get('https://httpbin.org/get')
        session.post('https://httpbin.org/post',
                     params={'id': '20'},
                     json={'some-attribute': 'some-value'})
        session.get('https://httpbin.org/get', params={'id': '20'})
        session.get('https://httpbin.org/get', params={'id': '40'})
if __name__ == '__main__':
    main()
We’ll see an exception similar to the one we see when using the 'none' record mode:
Traceback (most recent call last):
File "examples/record_modes/once/example_updated.py", line 23, in <module>
main()
File "examples/record_modes/once/example_updated.py", line 19, in main
session.get('https://httpbin.org/get', params={'id': '40'})
File "/usr/local/lib/python2.7/site-packages/requests/sessions.py", line 477, in get
return self.request('GET', url, **kwargs)
File "/usr/local/lib/python2.7/site-packages/requests/sessions.py", line 465, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python2.7/site-packages/requests/sessions.py", line 573, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python2.7/site-packages/betamax/adapter.py", line 91, in send
self.cassette))
betamax.exceptions.BetamaxError: A request was made that could not be handled.
A request was made to https://httpbin.org/get?id=40 that could not be found in once-example.
The settings on the cassette are:
- record_mode: once
- match_options ['method', 'uri'].
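The rules for the two record modes shown so far can be summarized as a small decision function. This is only a sketch of the behaviour as described above, not Betamax's actual implementation; the function name `decide` and its arguments are hypothetical.

```python
def decide(record_mode, cassette_exists, has_match):
    """Return 'replay', 'record', or 'error' for one outgoing request.

    Hypothetical helper that mirrors the documented rules for the
    'none' and 'once' record modes only.
    """
    if has_match:
        # Both modes replay previously recorded interactions.
        return 'replay'
    if record_mode == 'none':
        # 'none' never allows a new HTTP request to be made.
        return 'error'
    if record_mode == 'once':
        # 'once' records only while no cassette file exists yet.
        return 'error' if cassette_exists else 'record'
    raise ValueError('mode not covered by this sketch: %r' % record_mode)
```

In both tracebacks above, the request for `?id=40` corresponds to the case where a cassette file exists and no recorded interaction matches.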
Third-Party Packages¶
Betamax was created to be a very close imitation of VCR. As such, it has the default set of request matchers and a subset of the supported cassette serializers for VCR.
As part of my own usage of Betamax, and supporting other people’s usage of Betamax, I’ve created (and maintain) two third party packages that provide extra request matchers and cassette serializers.
For simplicity, those modules will be documented here instead of on their own documentation sites.
Request Matchers¶
There are three third-party request matchers provided by the betamax-matchers package:
- URLEncodedBodyMatcher, 'form-urlencoded-body'
- JSONBodyMatcher, 'json-body'
- MultipartFormDataBodyMatcher, 'multipart-form-data-body'
In order to use any of these we have to register them with Betamax. Below we will register all three but you do not need to do that if you only need to use one:
import betamax
from betamax_matchers import form_urlencoded
from betamax_matchers import json_body
from betamax_matchers import multipart
betamax.Betamax.register_request_matcher(
    form_urlencoded.URLEncodedBodyMatcher
)
betamax.Betamax.register_request_matcher(
    json_body.JSONBodyMatcher
)
betamax.Betamax.register_request_matcher(
    multipart.MultipartFormDataBodyMatcher
)
All of these classes inherit from betamax.BaseMatcher, which means that each needs a name that will be used when specifying what matchers to use with Betamax. I have noted those next to the class name for each matcher above. Let’s use the JSON body matcher in an example though:
import betamax
from betamax_matchers import json_body
# This example requires at least requests 2.5.0
import requests
betamax.Betamax.register_request_matcher(
    json_body.JSONBodyMatcher
)
def main():
    session = requests.Session()
    recorder = betamax.Betamax(session, cassette_library_dir='.')
    url = 'https://httpbin.org/post'
    json_data = {'key': 'value',
                 'other-key': 'other-value',
                 'yet-another-key': 'yet-another-value'}
    matchers = ['method', 'uri', 'json-body']
    with recorder.use_cassette('json-body-example', match_requests_on=matchers):
        r = session.post(url, json=json_data)
if __name__ == '__main__':
    main()
If we ran that request without that matcher and with hash seed randomization enabled, then we would occasionally receive exceptions that a request could not be matched. That is because dictionaries are not inherently ordered, so the body string of the request can change and be any of the following:
{"key": "value", "other-key": "other-value", "yet-another-key":
"yet-another-value"}
{"key": "value", "yet-another-key": "yet-another-value", "other-key":
"other-value"}
{"other-key": "other-value", "yet-another-key": "yet-another-value",
"key": "value"}
{"yet-another-key": "yet-another-value", "key": "value", "other-key":
"other-value"}
{"yet-another-key": "yet-another-value", "other-key": "other-value",
"key": "value"}
{"other-key": "other-value", "key": "value", "yet-another-key":
"yet-another-value"}
But using the 'json-body' matcher, the matcher will parse the request and compare Python dictionaries instead of Python strings. That will completely bypass the issues introduced by hash randomization. I use this matcher extensively in github3.py’s tests.
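The difference between comparing body strings and comparing parsed bodies can be seen with the standard library alone. This is a minimal sketch of the idea, not the matcher's actual code; `same_json_body` is a hypothetical helper name.

```python
import json

# Two serializations of the same data that differ only in key order.
body_a = '{"key": "value", "other-key": "other-value"}'
body_b = '{"other-key": "other-value", "key": "value"}'


def same_json_body(recorded, current):
    """Compare two request bodies as parsed JSON rather than as text."""
    return json.loads(recorded) == json.loads(current)


# Comparing the raw strings is sensitive to key order.
strings_equal = body_a == body_b
# Comparing the parsed dictionaries is not.
parsed_equal = same_json_body(body_a, body_b)
```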
Cassette Serializers¶
By default, Betamax only comes with the JSON serializer. betamax-serializers provides extra serializer classes that users have contributed.
For example, as we’ve seen elsewhere in our documentation, the default JSON serializer does not create beautiful or easy-to-read cassettes. As a substitute for that, we have the PrettyJSONSerializer, which does that for you.
from betamax import Betamax
from betamax_serializers import pretty_json
import requests
Betamax.register_serializer(pretty_json.PrettyJSONSerializer)
session = requests.Session()
recorder = Betamax(session)
with recorder.use_cassette('testpretty', serialize_with='prettyjson'):
    session.request(method=method, url=url, ...)
This will give us a pretty-printed cassette like:
{
"http_interactions": [
{
"recorded_at": "2015-06-21T19:22:54",
"request": {
"body": {
"encoding": "utf-8",
"string": ""
},
"headers": {
"Accept": [
"*/*"
],
"Accept-Encoding": [
"gzip, deflate"
],
"Connection": [
"keep-alive"
],
"User-Agent": [
"python-requests/2.7.0 CPython/2.7.9 Darwin/14.1.0"
]
},
"method": "GET",
"uri": "https://httpbin.org/get"
},
"response": {
"body": {
"encoding": null,
"string": "{\n \"args\": {}, \n \"headers\": {\n \"Accept\": \"*/*\", \n \"Accept-Encoding\": \"gzip, deflate\", \n \"Host\": \"httpbin.org\", \n \"User-Agent\": \"python-requests/2.7.0 CPython/2.7.9 Darwin/14.1.0\"\n }, \n \"origin\": \"127.0.0.1\", \n \"url\": \"https://httpbin.org/get\"\n}\n"
},
"headers": {
"access-control-allow-credentials": [
"true"
],
"access-control-allow-origin": [
"*"
],
"connection": [
"keep-alive"
],
"content-length": [
"265"
],
"content-type": [
"application/json"
],
"date": [
"Sun, 21 Jun 2015 19:22:54 GMT"
],
"server": [
"nginx"
]
},
"status": {
"code": 200,
"message": "OK"
},
"url": "https://httpbin.org/get"
}
},
{
"recorded_at": "2015-06-21T19:22:54",
"request": {
"body": {
"encoding": "utf-8",
"string": "{\"some-attribute\": \"some-value\"}"
},
"headers": {
"Accept": [
"*/*"
],
"Accept-Encoding": [
"gzip, deflate"
],
"Connection": [
"keep-alive"
],
"Content-Length": [
"32"
],
"Content-Type": [
"application/json"
],
"User-Agent": [
"python-requests/2.7.0 CPython/2.7.9 Darwin/14.1.0"
]
},
"method": "POST",
"uri": "https://httpbin.org/post?id=20"
},
"response": {
"body": {
"encoding": null,
"string": "{\n \"args\": {\n \"id\": \"20\"\n }, \n \"data\": \"{\\\"some-attribute\\\": \\\"some-value\\\"}\", \n \"files\": {}, \n \"form\": {}, \n \"headers\": {\n \"Accept\": \"*/*\", \n \"Accept-Encoding\": \"gzip, deflate\", \n \"Content-Length\": \"32\", \n \"Content-Type\": \"application/json\", \n \"Host\": \"httpbin.org\", \n \"User-Agent\": \"python-requests/2.7.0 CPython/2.7.9 Darwin/14.1.0\"\n }, \n \"json\": {\n \"some-attribute\": \"some-value\"\n }, \n \"origin\": \"127.0.0.1\", \n \"url\": \"https://httpbin.org/post?id=20\"\n}\n"
},
"headers": {
"access-control-allow-credentials": [
"true"
],
"access-control-allow-origin": [
"*"
],
"connection": [
"keep-alive"
],
"content-length": [
"495"
],
"content-type": [
"application/json"
],
"date": [
"Sun, 21 Jun 2015 19:22:54 GMT"
],
"server": [
"nginx"
]
},
"status": {
"code": 200,
"message": "OK"
},
"url": "https://httpbin.org/post?id=20"
}
},
{
"recorded_at": "2015-06-21T19:22:54",
"request": {
"body": {
"encoding": "utf-8",
"string": ""
},
"headers": {
"Accept": [
"*/*"
],
"Accept-Encoding": [
"gzip, deflate"
],
"Connection": [
"keep-alive"
],
"User-Agent": [
"python-requests/2.7.0 CPython/2.7.9 Darwin/14.1.0"
]
},
"method": "GET",
"uri": "https://httpbin.org/get?id=20"
},
"response": {
"body": {
"encoding": null,
"string": "{\n \"args\": {\n \"id\": \"20\"\n }, \n \"headers\": {\n \"Accept\": \"*/*\", \n \"Accept-Encoding\": \"gzip, deflate\", \n \"Host\": \"httpbin.org\", \n \"User-Agent\": \"python-requests/2.7.0 CPython/2.7.9 Darwin/14.1.0\"\n }, \n \"origin\": \"127.0.0.1\", \n \"url\": \"https://httpbin.org/get?id=20\"\n}\n"
},
"headers": {
"access-control-allow-credentials": [
"true"
],
"access-control-allow-origin": [
"*"
],
"connection": [
"keep-alive"
],
"content-length": [
"289"
],
"content-type": [
"application/json"
],
"date": [
"Sun, 21 Jun 2015 19:22:54 GMT"
],
"server": [
"nginx"
]
},
"status": {
"code": 200,
"message": "OK"
},
"url": "https://httpbin.org/get?id=20"
}
}
],
"recorded_with": "betamax/0.4.2"
}
Usage Patterns¶
Below are suggested patterns for using Betamax efficiently.
Configuring Betamax in py.test’s conftest.py¶
Betamax and github3.py (the project which instigated the creation of Betamax) both utilize py.test and its feature of configuring how the tests run with conftest.py [1]. One pattern that I have found useful is to include this in your conftest.py file:
import betamax
with betamax.Betamax.configure() as config:
    config.cassette_library_dir = 'tests/cassettes/'
This configures your cassette directory for all of your tests. If you do not check your cassettes into your version control system, then you can also add:
import os
if not os.path.exists('tests/cassettes'):
    os.makedirs('tests/cassettes')
An Example from github3.py¶
You can configure other aspects of Betamax via the conftest.py file. For example, in github3.py, I do the following:
import os
import betamax
record_mode = 'none' if os.environ.get('TRAVIS_GH3') else 'once'
with betamax.Betamax.configure() as config:
    config.cassette_library_dir = 'tests/cassettes/'
    config.default_cassette_options['record_mode'] = record_mode
    config.define_cassette_placeholder(
        '<AUTH_TOKEN>',
        os.environ.get('GH_AUTH', 'x' * 20)
    )
In essence, if the tests are being run on Travis CI, then we want to make sure not to try to record new cassettes or interactions. We also want to ensure we’re authenticated when possible, but that the recorded token is replaced by the <AUTH_TOKEN> placeholder in the cassettes, and that the placeholder is swapped back out when they’re replayed.
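To make the placeholder idea concrete, here is a minimal sketch of the substitution in both directions using plain string replacement. It illustrates the concept only and is not how Betamax implements placeholders; the `scrub` and `restore` helpers are hypothetical.

```python
def scrub(text, placeholder, secret):
    """Swap the real value for the placeholder before saving."""
    return text.replace(secret, placeholder)


def restore(text, placeholder, secret):
    """Swap the placeholder back for a usable value on playback."""
    return text.replace(placeholder, secret)


secret = 'x' * 20                       # stand-in for a real token
header = 'token ' + secret              # what the request would carry
stored = scrub(header, '<AUTH_TOKEN>', secret)
replayed = restore(stored, '<AUTH_TOKEN>', secret)
```

With this scheme the text that would be written to disk never contains the real token, while the replayed value is identical to the original header.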
Using Human Readable JSON Cassettes¶
Using the PrettyJSONSerializer provided by the betamax_serializers package provides human readable JSON cassettes. Cassettes output in this way make it easy to compare modifications to cassettes to ensure only expected changes are introduced.
While you can use the serialize_with option when creating each individual cassette, it is simpler to provide this setting globally. The following example demonstrates how to configure Betamax to use the PrettyJSONSerializer for all newly created cassettes:
from betamax_serializers import pretty_json
betamax.Betamax.register_serializer(pretty_json.PrettyJSONSerializer)
# ...
config.default_cassette_options['serialize_with'] = 'prettyjson'
Updating Existing Betamax Cassettes to be Human Readable¶
If you already have a library of cassettes when applying the previous configuration update, then you will probably want to also update all your existing cassettes into the new human readable format. The following script will help you transform your existing cassettes:
import os
import glob
import json
import sys
try:
    cassette_dir = sys.argv[1]
    cassettes = glob.glob(os.path.join(cassette_dir, '*.json'))
except IndexError:
    # No directory was given on the command line
    print('Usage: {0} CASSETTE_DIRECTORY'.format(sys.argv[0]))
    sys.exit(1)
for cassette_path in cassettes:
    with open(cassette_path, 'r') as fp:
        data = json.load(fp)
    with open(cassette_path, 'w') as fp:
        json.dump(data, fp, sort_keys=True, indent=2,
                  separators=(',', ': '))
print('Updated {0} cassette{1}.'.format(
    len(cassettes), '' if len(cassettes) == 1 else 's'))
Copy and save the above script as fix_cassettes.py and then run it like:
python fix_cassettes.py PATH_TO_CASSETTE_DIRECTORY
If you’re not already using a version control system (e.g., git, svn) then it is recommended you make a backup of your cassettes first in the event something goes wrong.
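If you want some reassurance that the rewrite only changes formatting, you can compare the parsed data before and after. The sketch below applies the same `json.dump` arguments as the script to a temporary file; the helper name `rewrite_pretty` and the tiny sample cassette are made up for illustration.

```python
import json
import os
import tempfile


def rewrite_pretty(path):
    """Rewrite one JSON file in place, pretty-printed; return its data."""
    with open(path, 'r') as fp:
        data = json.load(fp)
    with open(path, 'w') as fp:
        json.dump(data, fp, sort_keys=True, indent=2,
                  separators=(',', ': '))
    return data


# A made-up, minimal cassette written in compact form.
compact = {'recorded_with': 'betamax', 'http_interactions': []}
fd, path = tempfile.mkstemp(suffix='.json')
with os.fdopen(fd, 'w') as fp:
    json.dump(compact, fp)

before = rewrite_pretty(path)
with open(path, 'r') as fp:
    text = fp.read()
after = json.loads(text)
os.remove(path)
```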
[1] | http://pytest.org/latest/plugins.html |
Integrating Betamax with Test Frameworks¶
It’s nice to have a way to integrate libraries you use for testing into your testing frameworks. Having considered this, the authors of and contributors to Betamax have included integrations in the package. Betamax comes with integrations for py.test and unittest. (If you need an integration for another framework, please suggest it and send a patch!)
PyTest Integration¶
New in version 0.5.0.
Changed in version 0.6.0.
When you install Betamax, it now installs two py.test fixtures by default. To use them in your tests you need only follow the instructions in pytest’s documentation. To use the betamax_session fixture for an entire class of tests you would do:
# tests/test_http_integration.py
import pytest
@pytest.mark.usefixtures('betamax_session')
class TestMyHttpClient:
    def test_get(self, betamax_session):
        betamax_session.get('https://httpbin.org/get')
This will generate a cassette name for you, e.g., tests.test_http_integration.TestMyHttpClient.test_get. After running this test you would have a cassette file stored in your cassette library directory named tests.test_http_integration.TestMyHttpClient.test_get.json. To use this fixture at the module level, you need only do
# tests/test_http_integration.py
import pytest
# pytest applies marks assigned to the module-level name `pytestmark`
pytestmark = pytest.mark.usefixtures('betamax_session')
class TestMyHttpClient:
    def test_get(self, betamax_session):
        betamax_session.get('https://httpbin.org/get')
class TestMyOtherHttpClient:
    def test_post(self, betamax_session):
        betamax_session.post('https://httpbin.org/post')
If you need to customize the recorder object, however, you can instead use the betamax_recorder fixture:
# tests/test_http_integration.py
import pytest
# pytest applies marks assigned to the module-level name `pytestmark`
pytestmark = pytest.mark.usefixtures('betamax_recorder')
class TestMyHttpClient:
    def test_post(self, betamax_recorder):
        betamax_recorder.current_cassette.match_options.add('json-body')
        session = betamax_recorder.session
        session.post('https://httpbin.org/post', json={'foo': 'bar'})
Unittest Integration¶
New in version 0.5.0.
When writing tests with unittest, a common pattern is to either import unittest.TestCase or subclass that and use that subclass in your tests. When integrating Betamax with your unittest test suite, you should do the following:
from betamax.fixtures import unittest
class IntegrationTestCase(unittest.BetamaxTestCase):
    # Add the rest of the helper methods you want for your
    # integration tests
    pass
class SpecificTestCase(IntegrationTestCase):
    def test_something(self):
        # Test something
        pass
The unittest integration provides the following attributes on the test case instance:
- session: the instance of BetamaxTestCase.SESSION_CLASS created for that test.
- recorder: the instance of betamax.Betamax created.
The integration also generates a cassette name from the test case class name and test method. So the cassette generated for the above example would be named SpecificTestCase.test_something. To override that behaviour, you need to override the generate_cassette_name() method in your subclass.
The default path for saving cassettes is ./vcr/cassettes. To override the path, use the following code at the top of the file:
import betamax
with betamax.Betamax.configure() as config:
    config.cassette_library_dir = 'your/path/here'
If you are subclassing requests.Session in your application, then it follows that you will want to use that in your tests. To facilitate this, you can set the SESSION_CLASS attribute. To give a fuller example, let’s say you’re changing the default cassette name and you’re providing your own session class, your code might look like:
from betamax.fixtures import unittest
from myapi import session
class IntegrationTestCase(unittest.BetamaxTestCase):
    # Add the rest of the helper methods you want for your
    # integration tests
    SESSION_CLASS = session.MyApiSession
    def generate_cassette_name(self):
        classname = self.__class__.__name__
        method = self._testMethodName
        return 'integration_{0}_{1}'.format(classname, method)
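To see what names these two schemes produce without installing anything, the naming logic can be tried against a plain standard-library test case. This is a sketch under the assumption that the default name is simply the class name and test method joined by a dot, as described above; both helper functions are hypothetical stand-ins for `generate_cassette_name()`.

```python
import unittest


class SpecificTestCase(unittest.TestCase):
    def test_something(self):
        pass


def default_cassette_name(test_case):
    """Class name and test method joined by a dot."""
    return '{0}.{1}'.format(type(test_case).__name__,
                            test_case._testMethodName)


def integration_cassette_name(test_case):
    """The customized scheme from the example above."""
    return 'integration_{0}_{1}'.format(type(test_case).__name__,
                                        test_case._testMethodName)


case = SpecificTestCase('test_something')
default_name = default_cassette_name(case)
custom_name = integration_cassette_name(case)
```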
API¶
class betamax.Betamax(session, cassette_library_dir=None, default_cassette_options={})¶
This object contains the main API of the request-vcr library.
This object is entirely a context manager so all you have to do is:
s = requests.Session()
with Betamax(s) as vcr:
    vcr.use_cassette('example')
    r = s.get('https://httpbin.org/get')
Or more concisely, you can do:
s = requests.Session()
with Betamax(s).use_cassette('example') as vcr:
    r = s.get('https://httpbin.org/get')
This object allows for the user to specify the cassette library directory and default cassette options.
s = requests.Session()
with Betamax(s, cassette_library_dir='tests/cassettes') as vcr:
    vcr.use_cassette('example')
    r = s.get('https://httpbin.org/get')
with Betamax(s, default_cassette_options={
    're_record_interval': 1000
}) as vcr:
    vcr.use_cassette('example')
    r = s.get('https://httpbin.org/get')
betamax_adapter = None¶
Create a new adapter to replace the existing ones
static configure()¶
Help to configure the library as a whole.
with Betamax.configure() as config:
    config.cassette_library_dir = 'tests/cassettes/'
    config.default_cassette_options['match_requests_on'] = [
        'method', 'uri', 'headers'
    ]
http_adapters = None¶
Store the session’s original adapters.
static register_request_matcher(matcher_class)¶
Register a new request matcher.
Parameters: matcher_class – (required), this must sub-class BaseMatcher
static register_serializer(serializer_class)¶
Register a new serializer.
Parameters: serializer_class – (required), this must sub-class BaseSerializer
session = None¶
Store the requests.Session object being wrapped.
start()¶
Start recording or replaying interactions.
stop()¶
Stop recording or replaying interactions.
use_cassette(cassette_name, **kwargs)¶
Tell Betamax which cassette you wish to use for the context.
Parameters:
- cassette_name (str) – relative name, without the serialization format, of the cassette you wish Betamax would use
- serialize_with (str) – the format you want Betamax to serialize the cassette with
- serialize (str) – DEPRECATED the format you want Betamax to serialize the request and response data to and from
betamax.decorator.use_cassette(cassette_name, cassette_library_dir=None, default_cassette_options={}, **use_cassette_kwargs)¶
Provide a Betamax-wrapped Session for convenience.
New in version 0.5.0.
This decorator can be used to get a plain Session that has been wrapped in Betamax. For example,
from betamax.decorator import use_cassette
@use_cassette('example-decorator', cassette_library_dir='.')
def test_get(session):
    # do things with session
    pass
Parameters:
- cassette_name (str) – Name of the cassette file in which interactions will be stored.
- cassette_library_dir (str) – Directory in which cassette files will be stored.
- default_cassette_options (dict) – Dictionary of default cassette options to set for the cassette used when recording these interactions.
- **use_cassette_kwargs – Keyword arguments passed to use_cassette()
class betamax.configure.Configuration¶
This object acts as a proxy to configure different parts of Betamax.
You should only ever encounter this object when configuring the library as a whole. For example:
with Betamax.configure() as config:
    config.cassette_library_dir = 'tests/cassettes/'
    config.default_cassette_options['record_mode'] = 'once'
    config.default_cassette_options['match_requests_on'] = ['uri']
    config.define_cassette_placeholder('<URI>', 'http://httpbin.org')
    config.preserve_exact_body_bytes = True
after_start(callback=None)¶
Register a function to call after Betamax is started.
Example usage:
def on_betamax_start(cassette):
    if cassette.is_recording():
        print("Setting up authentication...")
with Betamax.configure() as config:
    config.after_start(callback=on_betamax_start)
Parameters: callback (callable) – The function which accepts a cassette and might mutate it before returning.
before_playback(tag=None, callback=None)¶
Register a function to call before playing back an interaction.
Example usage:
def before_playback(interaction, cassette):
    pass
with Betamax.configure() as config:
    config.before_playback(callback=before_playback)
Parameters:
- tag (str) – Limits the interactions passed to the function based on the interaction’s tag (currently unsupported).
- callback (callable) – The function which either accepts just an interaction or an interaction and a cassette and mutates the interaction before returning.
before_record(tag=None, callback=None)¶
Register a function to call before recording an interaction.
Example usage:
def before_record(interaction, cassette):
    pass
with Betamax.configure() as config:
    config.before_record(callback=before_record)
Parameters:
- tag (str) – Limits the interactions passed to the function based on the interaction’s tag (currently unsupported).
- callback (callable) – The function which either accepts just an interaction or an interaction and a cassette and mutates the interaction before returning.
before_stop(callback=None)¶
Register a function to call before Betamax stops.
Example usage:
def on_betamax_stop(cassette):
    if not cassette.is_recording():
        print("Playback completed.")
with Betamax.configure() as config:
    config.before_stop(callback=on_betamax_stop)
Parameters: callback (callable) – The function which accepts a cassette and might mutate it before returning.
cassette_library_dir¶
Retrieve and set the directory to store the cassettes in.
default_cassette_options¶
Retrieve and set the default cassette options.
The options include:
- match_requests_on
- placeholders
- re_record_interval
- record_mode
- preserve_exact_body_bytes
Other options will be ignored.
define_cassette_placeholder(placeholder, replace)¶
Define a placeholder value for some text.
This also will replace the placeholder text with the text you wish it to use when replaying interactions from cassettes.
Parameters:
-
A set of fixtures to integrate Betamax with py.test.
betamax.fixtures.pytest.betamax_session(betamax_recorder)¶
Generate a session that has Betamax already installed.
See betamax_recorder fixture.
Parameters: betamax_recorder – A recorder fixture with a configured request session.
Returns: An instantiated requests Session wrapped by Betamax.
Minimal unittest.TestCase subclass adding Betamax integration.
class betamax.fixtures.unittest.BetamaxTestCase(methodName='runTest')¶
Betamax integration for unittest.
New in version 0.5.0.
CASSETTE_LIBRARY_DIR = None¶
Custom path to save cassette.
SESSION_CLASS¶
alias of requests.sessions.Session
generate_cassette_name()¶
Generates a cassette name for the current test.
The default format is “%(classname)s.%(testMethodName)s”
To change the default cassette format, override this method in a subclass.
Returns: Cassette name for the current test.
Return type: str
setUp()¶
Betamax-ified setUp fixture.
This will call the superclass’ setUp method first and then it will create a new requests.Session and wrap that in a Betamax object to record it. At the end of setUp, it will start recording.
tearDown()¶
Betamax-ified tearDown fixture.
This will call the superclass’ tearDown method first and then it will stop recording interactions.
When using Betamax with unittest, you can use the traditional style of Betamax covered thoroughly in the documentation, or you can use your fixture methods, unittest.TestCase.setUp() and unittest.TestCase.tearDown(), to wrap entire tests in Betamax.
Here’s how you might use it:
from betamax.fixtures import unittest
from myapi import SessionManager
class TestMyApi(unittest.BetamaxTestCase):
    def setUp(self):
        # Call BetamaxTestCase's setUp first to get a session
        super(TestMyApi, self).setUp()
        self.manager = SessionManager(self.session)
    def test_all_users(self):
        """Retrieve all users from the API."""
        for user in self.manager:
            # Make assertions or something
            pass
Alternatively, if you are subclassing a requests.Session to provide extra functionality, you can do something like this:
from betamax.fixtures import unittest
from myapi import Session, SessionManager
class TestMyApi(unittest.BetamaxTestCase):
    SESSION_CLASS = Session
    # See above ...
Examples¶
Basic Usage¶
Let example.json be a file in a directory called cassettes with the content:
{
"http_interactions": [
{
"request": {
"body": {
"string": "",
"encoding": "utf-8"
},
"headers": {
"User-Agent": ["python-requests/v1.2.3"]
},
"method": "GET",
"uri": "https://httpbin.org/get"
},
"response": {
"body": {
"string": "example body",
"encoding": "utf-8"
},
"headers": {},
"status": {
"code": 200,
"message": "OK"
},
"url": "https://httpbin.org/get"
}
}
],
"recorded_with": "betamax"
}
The following snippet will not raise any exceptions:
from betamax import Betamax
from requests import Session
s = Session()
with Betamax(s, cassette_library_dir='cassettes') as betamax:
    betamax.use_cassette('example', record='none')
    r = s.get("https://httpbin.org/get")
On the other hand, this will raise an exception:
from betamax import Betamax
from requests import Session
s = Session()
with Betamax(s, cassette_library_dir='cassettes') as betamax:
    betamax.use_cassette('example', record='none')
    r = s.post("https://httpbin.org/post",
               data={"key": "value"})
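The reason the GET succeeds and the POST fails can be sketched with a few lines of standard-library code that look up a request in a cassette by method and URI. This is an illustration only, assuming matching on method and URI; the `find_match` function here is a simplified stand-in written for this example, not Betamax's implementation.

```python
import json

# A trimmed-down copy of the example cassette shown above.
CASSETTE = json.loads('''
{
  "http_interactions": [
    {
      "request": {"method": "GET", "uri": "https://httpbin.org/get"},
      "response": {"status": {"code": 200, "message": "OK"}}
    }
  ],
  "recorded_with": "betamax"
}
''')


def find_match(cassette, method, uri):
    """Return the first interaction whose method and URI both match."""
    for interaction in cassette['http_interactions']:
        request = interaction['request']
        if request['method'] == method and request['uri'] == uri:
            return interaction
    return None


get_match = find_match(CASSETTE, 'GET', 'https://httpbin.org/get')
post_match = find_match(CASSETTE, 'POST', 'https://httpbin.org/post')
```

With the 'none' record mode, a request without a match (like the POST here) cannot be recorded, so an exception is the only possible outcome.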
Finally, we can also use a decorator in order to simplify things:
import unittest
from betamax.decorator import use_cassette
class TestExample(unittest.TestCase):
    @use_cassette('example', cassette_library_dir='cassettes')
    def test_example(self, session):
        session.get('https://httpbin.org/get')
# Or if you're using something like py.test
@use_cassette('example', cassette_library_dir='cassettes')
def test_example_pytest(session):
    session.get('https://httpbin.org/get')
Opinions at Work¶
If you use requests’s default Accept-Encoding header, servers that support gzip content encoding will return responses that Betamax cannot serialize in a human-readable format. In this event, the cassette will look like this:
{
"http_interactions": [
{
"request": {
"body": {
"base64_string": "",
"encoding": "utf-8"
},
"headers": {
"User-Agent": ["python-requests/v1.2.3"]
},
"method": "GET",
"uri": "https://httpbin.org/get"
},
"response": {
"body": {
"base64_string": "Zm9vIGJhcgo=",
"encoding": "utf-8"
},
"headers": {
"Content-Encoding": ["gzip"]
},
"status": {
"code": 200,
"message": "OK"
},
"url": "https://httpbin.org/get"
}
}
],
"recorded_with": "betamax"
}
Forcing bytes to be preserved¶
You may want to force betamax to preserve the exact bytes in the body of a response (or request) instead of relying on the opinions held by the library. In this case you have two ways of telling betamax to do this.
The first is on a per-cassette basis, like so:
from betamax import Betamax
import requests
session = requests.Session()
with Betamax.configure() as config:
    config.cassette_library_dir = '.'
with Betamax(session).use_cassette('some_cassette',
                                   preserve_exact_body_bytes=True):
    r = session.get('http://example.com')
On the other hand, you may want to preserve exact body bytes for all cassettes. In this case, you can do:
from betamax import Betamax
import requests
session = requests.Session()
with Betamax.configure() as config:
    config.cassette_library_dir = '.'
    config.preserve_exact_body_bytes = True
with Betamax(session).use_cassette('some_cassette'):
    r = session.get('http://example.com')
What is a cassette?¶
A cassette is a set of recorded interactions serialized to a specific format. Currently the only supported format is JSON. A cassette has a list (or array) of interactions and information about the library that recorded it. This means that the cassette’s structure (using JSON) is
{
"http_interactions": [
// ...
],
"recorded_with": "betamax"
}
Each interaction is the object representing the request and response as well as the date it was recorded. The structure of an interaction is
{
"request": {
// ...
},
"response": {
// ...
},
"recorded_at": "2013-09-28T01:25:38"
}
Each request has the body, method, uri, and an object representing the headers. A serialized request looks like:
{
"body": {
"string": "...",
"encoding": "utf-8"
},
"method": "GET",
"uri": "http://example.com",
"headers": {
// ...
}
}
A serialized response has the status_code, url, and objects representing the headers and the body. A serialized response looks like:
{
"body": {
"encoding": "utf-8",
"string": "..."
},
"url": "http://example.com",
"status": {
"code": 200,
"message": "OK"
},
"headers": {
// ...
}
}
If you put everything together, you get:
{
  "http_interactions": [
    {
      "request": {
        "body": {
          "string": "...",
          "encoding": "utf-8"
        },
        "method": "GET",
        "uri": "http://example.com",
        "headers": {
          // ...
        }
      },
      "response": {
        "body": {
          "encoding": "utf-8",
          "string": "..."
        },
        "url": "http://example.com",
        "status": {
          "code": 200,
          "message": "OK"
        },
        "headers": {
          // ...
        }
      },
      "recorded_at": "2013-09-28T01:25:38"
    }
  ],
  "recorded_with": "betamax"
}
If you were to pretty-print a cassette, this is vaguely what you would see. Keep in mind that since Python does not keep dictionaries ordered, the items may not be in the same order as this example.
Note
Pro-tip: You can pretty-print a cassette like so:
python -m json.tool cassette.json
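Because a cassette is plain JSON with the structure described above, it can also be inspected with a few lines of Python. The following sketch lists each interaction's method, URI, status code, and recording time; the `summarize` helper and the inline sample cassette are made up for illustration.

```python
import json
from datetime import datetime

# A made-up cassette following the structure described above.
cassette_text = '''
{
  "http_interactions": [
    {
      "request": {
        "body": {"string": "", "encoding": "utf-8"},
        "method": "GET",
        "uri": "http://example.com",
        "headers": {}
      },
      "response": {
        "body": {"encoding": "utf-8", "string": "..."},
        "url": "http://example.com",
        "status": {"code": 200, "message": "OK"},
        "headers": {}
      },
      "recorded_at": "2013-09-28T01:25:38"
    }
  ],
  "recorded_with": "betamax"
}
'''


def summarize(cassette):
    """Return (method, uri, status code, recorded_at) per interaction."""
    rows = []
    for interaction in cassette['http_interactions']:
        recorded_at = datetime.strptime(interaction['recorded_at'],
                                        '%Y-%m-%dT%H:%M:%S')
        rows.append((interaction['request']['method'],
                     interaction['request']['uri'],
                     interaction['response']['status']['code'],
                     recorded_at))
    return rows


rows = summarize(json.loads(cassette_text))
```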
What is a cassette library?¶
When configuring Betamax, you can choose your own cassette library directory. This is the directory, relative to the current working directory, in which you want to store your cassettes.
For example, let’s say that you set your cassette library to be tests/cassettes/. In that case, when you record a cassette, it will be saved there. To continue the example, let’s say you use the following code:
from requests import Session
from betamax import Betamax
s = Session()
with Betamax(s, cassette_library_dir='tests/cassettes').use_cassette('example'):
    r = s.get('https://httpbin.org/get')
You would then have the following directory structure:
.
`-- tests
    `-- cassettes
        `-- example.json
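The relationship between the library directory, the cassette name, and the file on disk can be sketched as a simple path join. This is an assumption-laden illustration of the layout shown above, not Betamax's own path logic; `cassette_path` is a hypothetical helper, and `posixpath` is used only so the result looks the same on every platform.

```python
import posixpath


def cassette_path(library_dir, cassette_name, extension='json'):
    """Join the library directory with '<name>.<extension>'."""
    filename = '{0}.{1}'.format(cassette_name, extension)
    return posixpath.join(library_dir, filename)


path = cassette_path('tests/cassettes', 'example')
```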
Implementation Details¶
Everything here is an implementation detail and subject to volatile change. I would not rely on anything here for any mission critical code.
Gzip Content-Encoding¶
By default, requests sets an Accept-Encoding header value that includes gzip (specifically, unless overridden, requests always sends Accept-Encoding: gzip, deflate, compress). When a server supports this and responds with a response that has the Content-Encoding header set to gzip, urllib3 automatically decompresses the body for requests. This can only be prevented in the case where the stream parameter is set to True. Since Betamax refuses to alter the headers on the response object in any way, we force stream to be True so we can capture the compressed data before it is decompressed. We then properly repopulate the response object so you perceive no difference in the interaction.
To preserve the response exactly as is, we then must base64 encode the body of the response before saving it to the file object. In other words, whenever a server responds with a compressed body, you will not have a human readable response body. There is, at the present moment, no way to configure this so that this does not happen and, because of the way that Betamax works, you cannot remove the Content-Encoding header to prevent this from happening.
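To see why a compressed body has to be base64 encoded before it can be written to a text cassette, here is a small standard-library sketch. It is not Betamax's implementation: the helper names encode_body and decode_body are hypothetical, and gzip.compress merely stands in for a gzip-enabled server.

```python
import base64
import gzip


def encode_body(compressed_bytes):
    """Base64-encode raw (still compressed) bytes so they fit in a text file."""
    return base64.b64encode(compressed_bytes).decode('ascii')


def decode_body(stored_text):
    """Recover the exact compressed bytes from the stored text."""
    return base64.b64decode(stored_text.encode('ascii'))


original = b'{"login": "example"}'
on_the_wire = gzip.compress(original)      # stand-in for a gzip response body
stored = encode_body(on_the_wire)          # text that is safe to store in JSON
assert decode_body(stored) == on_the_wire  # the bytes survive unchanged
assert gzip.decompress(decode_body(stored)) == original
```

The stored text is plain ASCII but not human readable, which is the trade-off described above.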
Class Details¶
class betamax.cassette.Cassette(cassette_name, serialization_format, **kwargs)¶
    cassette_name = None¶
        Short name of the cassette
    earliest_recorded_date¶
        The earliest date of all of the interactions in this cassette.
    find_match(request)¶
        Find a matching interaction based on the matchers and request.
        This uses all of the matchers selected via configuration or use_cassette and passes in the request currently in progress.
        Parameters: request – requests.PreparedRequest
        Returns: Interaction
    is_empty()¶
        Determine if the cassette was empty when loaded.
    is_recording()¶
        Return whether the cassette is recording.
class betamax.cassette.Interaction(interaction, response=None)¶
    The Interaction object represents the entirety of a single interaction.
    The interaction includes the date it was recorded, its JSON representation, and the requests.Response object complete with its request attribute.
    This object also handles the filtering of sensitive data.
    No methods or attributes on this object are considered public or part of the public API. As such they are entirely considered implementation details and subject to change. Using or relying on them is not wise or advised.
    as_response()¶
        Return the Interaction as a Response object.
    deserialize()¶
        Turn a serialized interaction into a Response.
    ignore()¶
        Ignore this interaction.
        This is only to be used from a before_record or a before_playback callback.
    match(matchers)¶
        Return whether this interaction is a match.
    replace(text_to_replace, placeholder)¶
        Replace sensitive data in this interaction.
    replace_all(replacements, serializing)¶
        Easy way to accept all placeholders registered.
Matchers¶
You can specify how you would like Betamax to match requests you are making with the recorded requests. You have the following options for default (built-in) matchers:
Matcher | Behaviour |
---|---|
body | This matches by checking the equality of the request bodies. |
headers | This matches by checking the equality of all of the request headers. |
host | This matches based on the host of the URI |
method | This matches based on the method, e.g., GET , POST , etc. |
path | This matches on the path of the URI |
query | This matches on the query part of the URI |
uri | This matches on the entirety of the URI |
Default Matchers¶
By default, Betamax matches on uri and method.
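As a rough illustration of how a set of matchers decides whether a recorded request can be reused, consider the sketch below. It rests on assumptions: both requests are plain dictionaries here (Betamax itself passes a requests.PreparedRequest for the new request), the names MATCHERS and is_match are hypothetical, and it assumes a recorded request is used only when every selected matcher agrees.

```python
from urllib.parse import urlparse


def method_matches(request, recorded):
    return request['method'] == recorded['method']


def host_matches(request, recorded):
    return urlparse(request['uri']).netloc == urlparse(recorded['uri']).netloc


def path_matches(request, recorded):
    return urlparse(request['uri']).path == urlparse(recorded['uri']).path


# Hypothetical registry mapping matcher names to comparison functions.
MATCHERS = {
    'method': method_matches,
    'host': host_matches,
    'path': path_matches,
}


def is_match(request, recorded, match_on):
    """Return True only if every selected matcher agrees (an assumption)."""
    return all(MATCHERS[name](request, recorded) for name in match_on)


recorded = {'method': 'GET', 'uri': 'http://example.com/a?x=1'}
incoming = {'method': 'GET', 'uri': 'http://example.com/a?x=2'}
# The query string differs, but none of the selected matchers look at it.
assert is_match(incoming, recorded, ['method', 'host', 'path'])
```

Choosing fewer or looser matchers makes playback more forgiving; choosing stricter ones makes a stale cassette fail sooner.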
Specifying Matchers¶
You can specify the matchers to be used in the entire library by configuring Betamax like so:
import betamax
with betamax.Betamax.configure() as config:
    config.default_cassette_options['match_requests_on'].extend([
        'headers', 'body'
    ])
Instead of configuring global state, though, you can set it per cassette. For example:
import betamax
import requests
session = requests.Session()
recorder = betamax.Betamax(session)
match_on = ['uri', 'method', 'headers', 'body']
with recorder.use_cassette('example', match_requests_on=match_on):
    # ...
Making Your Own Matcher¶
So long as you are matching requests, you can define your own way of matching.
Each request matcher has to inherit from betamax.BaseMatcher and implement match.
class betamax.BaseMatcher¶
    Base class that ensures sub-classes that implement custom matchers can be registered and have the only method that is required.
    Usage:
        from betamax import Betamax, BaseMatcher

        class MyMatcher(BaseMatcher):
            name = 'my'

            def match(self, request, recorded_request):
                # My fancy matching algorithm
                return True  # placeholder result

        Betamax.register_request_matcher(MyMatcher)
    The last line is absolutely necessary.
    The match method will be given a requests.PreparedRequest object and a dictionary. The dictionary always has the following keys:
    - uri
    - method
    - body
    - headers
    match(request, recorded_request)¶
        A method that must be implemented by the user.
        Parameters:
        - request (PreparedRequest) – A requests PreparedRequest object
        - recorded_request (dict) – A dictionary containing the serialized request in the cassette
        Returns bool: True if they match else False
    on_init()¶
        Method to implement if you wish something to happen in __init__.
        The return value is not checked and this is called at the end of __init__. It is meant to provide the matcher author a way to perform things during initialization of the instance that would otherwise require them to override BaseMatcher.__init__.
Some examples of matchers are in the source reproduced here:
# -*- coding: utf-8 -*-
from .base import BaseMatcher


class HeadersMatcher(BaseMatcher):
    # Matches based on the headers of the request
    name = 'headers'

    def match(self, request, recorded_request):
        return dict(request.headers) == self.flatten_headers(recorded_request)

    def flatten_headers(self, request):
        from betamax.util import from_list
        headers = request['headers'].items()
        return dict((k, from_list(v)) for (k, v) in headers)

# -*- coding: utf-8 -*-
from .base import BaseMatcher
from requests.compat import urlparse


class HostMatcher(BaseMatcher):
    # Matches based on the host of the request
    name = 'host'

    def match(self, request, recorded_request):
        request_host = urlparse(request.url).netloc
        recorded_host = urlparse(recorded_request['uri']).netloc
        return request_host == recorded_host

# -*- coding: utf-8 -*-
from .base import BaseMatcher


class MethodMatcher(BaseMatcher):
    # Matches based on the method of the request
    name = 'method'

    def match(self, request, recorded_request):
        return request.method == recorded_request['method']

# -*- coding: utf-8 -*-
from .base import BaseMatcher
from requests.compat import urlparse


class PathMatcher(BaseMatcher):
    # Matches based on the path of the request
    name = 'path'

    def match(self, request, recorded_request):
        request_path = urlparse(request.url).path
        recorded_path = urlparse(recorded_request['uri']).path
        return request_path == recorded_path
# -*- coding: utf-8 -*-
from .base import BaseMatcher
from .query import QueryMatcher
from requests.compat import urlparse


class URIMatcher(BaseMatcher):
    # Matches based on the uri of the request
    name = 'uri'

    def on_init(self):
        # Get something we can use to match query strings with
        self.query_matcher = QueryMatcher().match

    def match(self, request, recorded_request):
        queries_match = self.query_matcher(request, recorded_request)
        request_url, recorded_url = request.url, recorded_request['uri']
        return self.all_equal(request_url, recorded_url) and queries_match

    def parse(self, uri):
        parsed = urlparse(uri)
        return {
            'scheme': parsed.scheme,
            'netloc': parsed.netloc,
            'path': parsed.path,
            'fragment': parsed.fragment
        }

    def all_equal(self, new_uri, recorded_uri):
        new_parsed = self.parse(new_uri)
        recorded_parsed = self.parse(recorded_uri)
        return (new_parsed == recorded_parsed)
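URIMatcher delegates the query-string comparison to QueryMatcher, whose source is not reproduced in this section. The following is only a sketch of one way to compare query strings without caring about parameter order, using the standard library; the function name queries_match is hypothetical and this is not Betamax's actual implementation.

```python
from urllib.parse import parse_qs, urlparse


def queries_match(request_uri, recorded_uri):
    """Compare the query parts of two URIs, ignoring parameter order."""
    # parse_qs returns a dict of lists, so 'a=1&b=2' and 'b=2&a=1' compare equal
    request_query = parse_qs(urlparse(request_uri).query)
    recorded_query = parse_qs(urlparse(recorded_uri).query)
    return request_query == recorded_query


assert queries_match('http://example.com/?a=1&b=2', 'http://example.com/?b=2&a=1')
assert not queries_match('http://example.com/?a=1', 'http://example.com/?a=2')
```

Comparing parsed queries instead of raw strings avoids spurious mismatches when a client emits the same parameters in a different order.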
When you have finished writing your own matcher, you can instruct Betamax to use it like so:
import betamax


class MyMatcher(betamax.BaseMatcher):
    name = 'my'

    def match(self, request, recorded_request):
        return True


betamax.Betamax.register_request_matcher(MyMatcher)
To use it, you simply use the name you set like you use the name of the default matchers, e.g.:
with Betamax(s).use_cassette('example', match_requests_on=['uri', 'my']):
    # ...
on_init¶
As you can see in the code for URIMatcher, we use on_init to initialize an attribute on the URIMatcher instance. This method serves to provide the matcher author with a different way of initializing the object outside of the match method. This also means that the author does not have to override the base class’ __init__ method.
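The hook pattern described here can be sketched in a few lines. SketchBase below is a hypothetical stand-in for a base class such as BaseMatcher, written only to show the call order; it is not Betamax's code.

```python
class SketchBase(object):
    """Hypothetical stand-in for a base class that offers an on_init hook."""

    def __init__(self):
        # ... any base-class setup would happen first ...
        self.on_init()  # called last; the return value is ignored

    def on_init(self):
        """Hook for subclasses; does nothing by default."""


class CountingMatcher(SketchBase):
    def on_init(self):
        # Set up per-instance state without overriding __init__
        self.calls = 0

    def match(self, request, recorded_request):
        self.calls += 1
        return True


matcher = CountingMatcher()
assert matcher.calls == 0
assert matcher.match(None, {}) is True
assert matcher.calls == 1
```

Because the subclass never touches __init__, it cannot forget to call the parent constructor.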
Serializers¶
You can tell Betamax how you would like it to serialize the cassettes when saving them to a file. By default Betamax will serialize your cassettes to JSON. The only default serializer is the JSON serializer, but writing your own is very easy.
Creating Your Own Serializer¶
Betamax handles the structuring of the cassette and writing to a file; your serializer simply takes a dictionary and returns a string.
Every Serializer has to inherit from betamax.BaseSerializer and implement three methods:
- betamax.BaseSerializer.generate_cassette_name, which is a static method. This will take the directory the user (you) wants to store the cassettes in and the name of the cassette and generate the file name.
- betamax.BaseSerializer.serialize() is a method that takes the dictionary and returns the dictionary serialized as a string.
- betamax.BaseSerializer.deserialize() is a method that takes a string and returns the data serialized in it as a dictionary.
New in version 0.9.0: Allow Serializers to indicate their format is a binary format via stored_as_binary.
Additionally, if your Serializer is utilizing a binary format, you will want to set the stored_as_binary attribute to True on your class.
class betamax.BaseSerializer¶
    Base Serializer class that provides an interface for other serializers.
    Usage:
        from betamax import Betamax, BaseSerializer

        class MySerializer(BaseSerializer):
            name = 'my'

            @staticmethod
            def generate_cassette_name(cassette_library_dir, cassette_name):
                # Generate a string that will give the relative path of a
                # cassette
                pass

            def serialize(self, cassette_data):
                # Take a dictionary and convert it to whatever
                pass

            def deserialize(self, cassette_data):
                # Uses a cassette file to return a dictionary with the
                # cassette information
                pass

        Betamax.register_serializer(MySerializer)
    The last line is absolutely necessary.
    deserialize(cassette_data)¶
        A method that must be implemented by the Serializer author.
        The return value is extremely important. If it is not empty, the dictionary returned must have the following structure:
            {
                'http_interactions': [{
                    # Interaction
                }, {
                    # Interaction
                }],
                'recorded_with': 'name of recorder'
            }
        Parameters: cassette_data (str) – The data serialized as a string which needs to be deserialized.
        Returns: dictionary
    on_init()¶
        Method to implement if you wish something to happen in __init__.
        The return value is not checked and this is called at the end of __init__. It is meant to provide the serializer author a way to perform things during initialization of the instance that would otherwise require them to override BaseSerializer.__init__.
Here’s the default (JSON) serializer as an example:
from .base import BaseSerializer

import json
import os


class JSONSerializer(BaseSerializer):
    # Serializes and deserializes a cassette to JSON
    name = 'json'
    stored_as_binary = False

    @staticmethod
    def generate_cassette_name(cassette_library_dir, cassette_name):
        return os.path.join(cassette_library_dir,
                            '{0}.{1}'.format(cassette_name, 'json'))

    def serialize(self, cassette_data):
        return json.dumps(cassette_data)

    def deserialize(self, cassette_data):
        try:
            deserialized_data = json.loads(cassette_data)
        except ValueError:
            deserialized_data = {}

        return deserialized_data
This is incredibly simple. In generate_cassette_name, we take advantage of the os.path module to properly join the directory name and the file name. Betamax uses this method to find an existing cassette or create a new one.
Next we have the betamax.serializers.JSONSerializer.serialize() method, which takes the cassette dictionary and turns it into a string for us. Here we are just leveraging the json module and its ability to dump any valid dictionary to a string.
Finally, there is the betamax.serializers.JSONSerializer.deserialize() method, which takes a string and turns it into the dictionary that Betamax needs to function.
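As a variation on the same three methods, here is a sketch of a serializer that writes indented JSON with sorted keys. The class name PrettyJSONSketch and the serializer name 'prettyjson' are hypothetical. To keep the sketch runnable without Betamax installed, it subclasses object; in real use it would inherit from betamax.BaseSerializer and be registered with Betamax.register_serializer, as described above.

```python
import json
import os


class PrettyJSONSketch(object):
    """Hypothetical serializer sketch; subclass BaseSerializer in real use."""
    name = 'prettyjson'
    stored_as_binary = False

    @staticmethod
    def generate_cassette_name(cassette_library_dir, cassette_name):
        # Same naming scheme as the JSON serializer shown above
        return os.path.join(cassette_library_dir,
                            '{0}.{1}'.format(cassette_name, 'json'))

    def serialize(self, cassette_data):
        # Indented output with sorted keys is easier to read and to diff
        return json.dumps(cassette_data, sort_keys=True, indent=2,
                          separators=(',', ': '))

    def deserialize(self, cassette_data):
        try:
            return json.loads(cassette_data)
        except ValueError:
            # An empty or unreadable cassette is treated as having no data
            return {}


serializer = PrettyJSONSketch()
cassette = {'http_interactions': [], 'recorded_with': 'betamax'}
assert serializer.deserialize(serializer.serialize(cassette)) == cassette
```

The round trip at the end is the property that matters: whatever serialize produces, deserialize must turn back into the same dictionary.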