
HoverPy¶
HoverPy speeds up and simplifies Python development and testing that involves downstream HTTP / HTTPS services. It does so by using a high-performance Go caching proxy to capture, simulate, modify and synthesize network traffic.
from hoverpy import capture, simulate
import requests

@capture("requests.db")
def captured_get():
    print(requests.get("http://time.jsontest.com").json())

@simulate("requests.db")
def simulated_get():
    print(requests.get("http://time.jsontest.com").json())

captured_get()
simulated_get()
This grants several benefits:
- Increased development speed
- Increased test speed
- Ability to work offline
- A deterministic test environment
- Ability to modify traffic
- Ability to synthesize traffic
- Ability to simulate network latency
from hoverpy import simulate, modify
import requests

# the delay host must match the host that is actually requested
@simulate("requests.db", delays=[("time.jsontest.com", 1000)])
def simulated_latency():
    print(requests.get("http://time.jsontest.com").json())

@modify(middleware="python middleware.py")
def modified_request():
    print(requests.get("http://time.jsontest.com").json())

simulated_latency()
modified_request()
If/when the downstream service you are testing against changes, then you can simply delete your db file, and capture the test results again. Or you could have versioned db files for services that use versioning.
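One way to keep versioned db files is to derive the file name from the service name and its API version. The sketch below assumes exactly that convention; `db_for` is a hypothetical helper and not part of HoverPy.

```python
import os


def db_for(service, api_version, directory="."):
    """Return a capture-file path that is unique per service and API version.

    Hypothetical naming convention: <service>_v<version>.db
    """
    filename = "%s_v%s.db" % (service, api_version)
    return os.path.join(directory, filename)


# Each API version gets its own recording, so old captures stay usable.
print(db_for("time", 1))
print(db_for("time", 2, directory="fixtures"))
```

The returned path can then be passed wherever a db file name is expected, for example as the first argument of the capture and simulate decorators shown above.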
HoverPy uses Hoverfly, a Service Virtualisation server written in Go. For this reason it is rock solid in terms of speed and reliability.
Library Support¶
HoverPy works great with the following HTTP clients out of the box:
Since HoverPy can act as a proxy or a reverse proxy, it can easily be made to work with any networking library or framework.
License¶
HoverPy uses Apache License V2. See LICENSE.txt for more details.
Installation¶
Cloning¶
$ git clone https://github.com/SpectoLabs/hoverpy.git
$ cd hoverpy
$ virtualenv .venv
$ source .venv/bin/activate
$ python setup.py install
This installs hoverpy and its requirements in your .venv folder. Make sure to pull often, and re-run python setup.py install when you do.
Testing¶
Please make sure everything is working before proceeding to the next steps.
$ python setup.py test
You should get a series of OKs.
Output:
...
testModify (hoverpy.tests.modify.testModify.TestModify) ... ok
testTemplate (hoverpy.tests.templates.testTemplates.TestTemplates) ... ok
testCapture (hoverpy.tests.testVirtualisation.TestVirt) ... ok
testPlayback (hoverpy.tests.testVirtualisation.TestVirt) ... ok
Running the examples¶
$ ls examples/*
basic delays modify readthedocs tornado unittesting urllib2eg urllib3eg
Please note we’ll cover the examples in the usage page. But for the truly impatient, you can try running the most basic example, just to make sure everything’s working at this point.
$ python examples/basic/basic.py
Hoverfly binary¶
Please note that when you install HoverPy, the Hoverfly binaries get downloaded and installed in your home directory, in
${home}/.hoverfly/bin/dist_vX.X.X/${OS}_${ARCH}/hoverfly
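To locate the binary from Python, the path template above can be assembled with os.path. This is only a sketch: `hoverfly_binary_path` is a hypothetical helper, and the values used for the OS and architecture segments ("linux", "amd64") are assumptions about how those directories are named. The version is left as the same X.X.X placeholder used above.

```python
import os


def hoverfly_binary_path(version, os_name, arch, home=None):
    """Build the expected Hoverfly location from the documented template.

    Hypothetical helper; os_name and arch spellings are assumptions.
    """
    home = home or os.path.expanduser("~")
    return os.path.join(
        home, ".hoverfly", "bin",
        "dist_v%s" % version,
        "%s_%s" % (os_name, arch),
        "hoverfly",
    )


path = hoverfly_binary_path("X.X.X", "linux", "amd64", home="/home/me")
print(path)
# os.path.exists(path) tells you whether the download step has run.
```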
Introduction¶
Building and testing interdependent applications is difficult. Maybe you’re building a mobile application that needs to talk to a legacy API. Or a microservice that relies on two other services that are still in development. The problem is the same: how do you develop and test against external dependencies which you cannot control?
You could use mocking libraries as substitutes for external dependencies. But mocks are intrusive, and do not test all the way to the architectural boundary of your application. This means the client code for your external dependency is substituted and not tested.
Stub services are better, but they often involve too much configuration or may not be transparent to your application. Then there is the problem of managing test data. Often, to write proper tests, you need fine-grained control over the data in your mocks or stubs. Managing test data across large projects with multiple teams introduces bottlenecks that impact delivery times.
Integration testing “over the wire” is problematic too. When stubs or mocks are substituted for real services (in a continuous integration environment for example) new variables are introduced. Network latency and random outages can cause integration tests to fail unexpectedly.
Service Virtualisation is software that records the interactions between you and the big, unpredictable world.

A very sturdy software solution for Service Virtualisation is Mirage, which is used extensively in the airline industry. Its successor, Hoverfly, has taken all the lessons learned in the years of use of Mirage. Both Mirage and Hoverfly are open source software, developed at specto.io.
HoverPy is the thin layer between Python and Hoverfly. Hoverfly is a lightweight and extremely fast proxy written in Go, and does the heavy lifting for HoverPy. So a more accurate picture might be:

Feature overview¶
- “Capture” traffic between a client and a server application
- Use captured traffic to simulate the server application
- Export captured service data as a JSON file
- Import service data JSON files
- Simulate latency by specifying delays which can be applied to individual URLs based on regex patterns, or based on HTTP method
- Flexible request matching using templates
- Supports “middleware” (which can be written in any language) to manipulate data in requests or responses, or to simulate unexpected behaviour such as malformed responses or random errors
- Supports local or remote middleware execution (for example on AWS Lambda)
- Uses BoltDB to persist data in a binary file on disk - so no additional database is required
- REST API
- Run as a transparent proxy or as a webserver
- High performance with minimal overhead
- JUnit rule “wrapper” is available as a Maven dependency
- Supports HTTPS and can generate certificates if required
- Authentication (combination of Basic Auth and JWT)
- Admin UI to change state and view basic metrics
Use cases¶
Hoverfly is designed to cater for two high-level use cases.
Capturing real HTTP(S) traffic between an application and an external service for re-use in testing or development.¶
If the external service you want to simulate already exists, you can put Hoverfly in between your client application and the external service. Hoverfly can then capture every request from the client application and every matching response from the external service (capture mode).
These request/response pairs are persisted in Hoverfly, and can be exported to a service data JSON file. The service data file can be stored elsewhere (a Git repository, for example), modified as required, then imported back into Hoverfly (or into another Hoverfly instance).
Hoverfly can then act as a “surrogate” for the external service, returning a matched response for every request it receives (simulate mode). This is useful if you want to create a portable, self-contained version of an external service to develop and test against.
This could allow you to get around the problem of rate-limiting (which can be frustrating when working with a public API). You can write Hoverfly extensions to manipulate the data in pre-recorded responses, or to simulate network latency.
You could work while offline, or you could speed up your workflow by replacing a slow dependency with a fast Hoverfly “surrogate”.
Creating simulated services for use in testing or development.¶
In some cases, the external service you want to simulate might not exist yet. You can create service simulations by writing service data JSON files. This is in line with the principle of design by contract development.
Service data files can be created by each developer, then stored in a Git repository. Other developers can then import the service data directly from the repository URL, providing them with a Hoverfly “surrogate” to work with.
Instead of writing a service data file, you could write a “middleware” script for Hoverfly that generates a response “on the fly”, based on the request it receives (synthesize mode).
More information on this use case is available here:
- Synthetic service example
- Easy API simulation with the Hoverfly JUnit rule
Proceed to the “Modes and middleware” section to understand how Hoverfly is used in these contexts.
Modes and middleware¶
Hoverfly modes¶
Hoverfly has four modes. Detailed guides on how to use these modes are available in the Usage section.
Capture mode¶

In this mode, Hoverfly acts as a proxy between the client application and the external service. It transparently intercepts and stores outgoing requests from the client and matching incoming responses from the external service. This is how you capture real traffic for use in development or testing.
Simulate mode¶

In this mode, Hoverfly uses either previously captured traffic, or imported service data files to mimic the external service. This is useful if you are developing or testing an application that needs to talk to an external service that you don’t have reliable access to. You can use the Hoverfly “surrogate” instead of the real service.
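The relationship between capture and simulate can be pictured with a toy, in-memory recorder. This is a conceptual sketch only: `ToyRecorder`, `fake_service` and the example URL are invented for illustration, and real Hoverfly works as a proxy with its own matching rules and on-disk storage.

```python
class ToyRecorder:
    """In-memory stand-in for the capture/simulate idea (not Hoverfly's code)."""

    def __init__(self):
        self.pairs = {}  # (method, url) -> stored response

    def capture(self, method, url, fetch):
        """Call the real service through `fetch` and remember its response."""
        response = fetch(method, url)
        self.pairs[(method, url)] = response
        return response

    def simulate(self, method, url):
        """Answer from stored pairs only; the real service is never called."""
        return self.pairs[(method, url)]


calls = []


def fake_service(method, url):
    """Pretend external service that counts how often it is hit."""
    calls.append(url)
    return {"status": 200, "body": "response #%d" % len(calls)}


recorder = ToyRecorder()
first = recorder.capture("GET", "http://example.test/time", fake_service)
second = recorder.simulate("GET", "http://example.test/time")
print(first == second, len(calls))  # True 1
```

The simulated answer is identical to the captured one, and the service was contacted only once.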
Synthesize mode¶

In this mode, Hoverfly doesn’t use any stored request/response pairs. Instead, it generates responses to incoming requests on the fly and returns them to the client. This mode is dependent on middleware (see below) to generate the responses.
This is useful if you can’t (or don’t want to) capture real traffic, or if you don’t want to write service data files.
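A synthesizing middleware could look roughly like the sketch below. It assumes the JSON payload has the same shape as in the middleware.py example from the Usage section ("request" with "destination" and "headers", "response" with "status", "body" and "headers"), and that the script receives one JSON document on stdin and writes one to stdout. The `synthesize` and `run` names are hypothetical, and whether synthesize mode delivers exactly these keys is an assumption.

```python
import io
import json


def synthesize(doc):
    """Build a response from the request alone (no stored pairs involved)."""
    destination = doc.get("request", {}).get("destination", "unknown")
    body = json.dumps({"synthesized_for": destination})
    doc["response"] = {
        "status": 200,
        "body": body,
        "headers": {
            "Content-Type": ["application/json"],
            "Content-Length": [str(len(body))],
        },
    }
    return doc


def run(stdin, stdout):
    """Read one JSON payload from stdin-like, write the result to stdout-like."""
    doc = json.loads(stdin.readline())
    stdout.write(json.dumps(synthesize(doc)))


# Exercise the script logic with in-memory streams instead of a real proxy.
request = {"request": {"destination": "time.jsontest.com", "headers": {}}}
incoming = io.StringIO(json.dumps(request) + "\n")
outgoing = io.StringIO()
run(incoming, outgoing)
print(outgoing.getvalue())
```

Passing streams into `run` keeps the logic testable; a real middleware script would call it with sys.stdin and sys.stdout.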
Modify mode¶

In this mode, Hoverfly passes requests through from the client to the server, and passes the responses back. However, it also executes middleware on the requests and responses. This is useful for all kinds of things such as manipulating the data in requests and/or responses on the fly.
Middleware¶
Middleware can be written in any language, as long as that language is supported by the Hoverfly host. For example, you could write middleware in Go, Python or JavaScript (if you have Go, Python or NodeJS installed on the Hoverfly host, respectively).
Middleware is applied to the requests and/or the responses depending on the mode:
- Capture Mode: middleware affects only outgoing requests
- Simulate Mode: middleware affects only responses (cache contents remain untouched)
- Synthesize Mode: middleware creates responses
- Modify Mode: middleware affects requests and responses
Middleware can be used to do many useful things, such as simulating network latency or failure, rate limits or controlling data in requests and responses.
A detailed guide on how to use middleware is available in the Usage section.
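As an illustration of failure simulation, the sketch below turns a configurable fraction of responses into HTTP 503 errors. It assumes the payload shape used by the middleware.py example in the Usage section; `inject_failures` is a hypothetical function name, and the random generator is passed in so the behaviour can be made deterministic in tests.

```python
import json
import random


def inject_failures(doc, failure_rate, rng):
    """Replace the response with a 503 for roughly `failure_rate` of calls."""
    if "response" in doc and rng.random() < failure_rate:
        body = json.dumps({"error": "simulated outage"})
        doc["response"]["status"] = 503
        doc["response"]["body"] = body
        headers = doc["response"].setdefault("headers", {})
        headers["Content-Length"] = [str(len(body))]
    return doc


def sample():
    """Hand-written payload for the demo, not captured traffic."""
    return {"request": {"destination": "time.jsontest.com"},
            "response": {"status": 200, "body": "{}", "headers": {}}}


rng = random.Random(42)  # seeded so repeated runs behave the same
print(inject_failures(sample(), 1.0, rng)["response"]["status"])  # 503
print(inject_failures(sample(), 0.0, rng)["response"]["status"])  # 200
```

A rate of 1.0 always fails and 0.0 never does; values in between give intermittent errors, which is useful for checking retry logic.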
Usage¶
I don’t know about you, but for me the best way of getting into things is by trying them out. In the articles below I take you through simple, then increasingly complex, examples of testing heaven using HoverPy.
basic¶
This is by far the simplest example on how to get started with HoverPy.
from hoverpy import capture, simulate
import requests

@capture("requests.db")
def captured_get():
    print(requests.get("http://time.jsontest.com").json())

@simulate("requests.db")
def simulated_get():
    print(requests.get("http://time.jsontest.com").json())

captured_get()
simulated_get()
$ python examples/basic/basic.py
You should see the time printed twice. Notice that the time is the same on each request: simulated_get was served from data that was captured while calling captured_get.
You may have noticed this created a requests.db file inside your current directory. This is a BoltDB database, holding our requests and their responses.
readthedocs¶
This is a slightly more advanced example, where we query readthedocs.io for articles.
from hoverpy import capture
import requests
import time

@capture("readthedocs.db", recordMode="once")
def getLinks(limit):
    start = time.time()
    sites = requests.get(
        "http://readthedocs.org/api/v1/project/?limit=%d&offset=0&format=json" % int(limit))
    objects = sites.json()['objects']

    for link in ["http://readthedocs.org" + x['resource_uri'] for x in objects]:
        response = requests.get(link)
        print("url: %s, status code: %s" % (link, response.status_code))

    print("Time taken: %f" % (time.time() - start))

getLinks(50)
$ python examples/readthedocs/readthedocs.py
The first time this command is invoked, it prints Time taken: 7.658194. The second time we run it, it prints Time taken: 0.093647. Please note this uses recordMode="once", a concept inherited from Ruby’s VCR and Python’s VCR.py.
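In VCR-style tools, "once" means: record if no recording exists yet, otherwise replay. The timings above are consistent with that reading, but the sketch below is an assumption about the decision rule rather than HoverPy's actual code; `choose_mode` is a hypothetical helper.

```python
import os
import tempfile


def choose_mode(db_path):
    """'once' semantics: capture if no recording exists yet, else simulate."""
    return "simulate" if os.path.exists(db_path) else "capture"


with tempfile.TemporaryDirectory() as tmp:
    db = os.path.join(tmp, "readthedocs.db")
    print(choose_mode(db))   # capture: nothing has been recorded yet
    open(db, "wb").close()   # stand-in for the file a first run would create
    print(choose_mode(db))   # simulate: a recording now exists
```

Under this rule, deleting the db file is all it takes to force a fresh capture on the next run.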
unittesting¶
Using simulations for unit testing simply involves decorating your tests with the @capture decorator.
import unittest
import requests
from hoverpy import capture

class TestTime(unittest.TestCase):

    @capture("test_time.db", recordMode="once")
    def test_time(self):
        time = requests.get("http://time.jsontest.com")
        # assertIn checks that the "time" key is present in the JSON body;
        # assertTrue("time", ...) would always pass
        self.assertIn("time", time.json())

if __name__ == '__main__':
    unittest.main()
latency¶
Simulating service latency during the development phase of a service is good practice, as it forces developers to write code that behaves gracefully and resiliently in the event of unexpected latency during network I/O.
To add latency to services, simply add the FQDN, delay in milliseconds, and optional HTTP method in an array of tuples for the delays parameter.
# latency.py
from hoverpy import capture
import requests

delays = [("time.jsontest.com", 3000, "GET"), ("echo.jsontest.com", 1000)]

@capture("delays.db", recordMode="once", delays=delays)
def simulate_network_latency():
    print(requests.get("http://time.jsontest.com").text)
    print(requests.get("http://echo.jsontest.com/a/b").text)

simulate_network_latency()
$ python examples/latency/latency.py
The latency is added when querying these services. If the HTTP method is omitted in the tuples, then the delay applies to all methods.
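The rule "no method means all methods" can be illustrated with a small lookup function. The real matching is done inside Hoverfly; this sketch only mirrors the tuple format shown above. `delay_for` is a hypothetical helper, and the first-match-wins order is an assumption.

```python
import re


def delay_for(delays, host, method):
    """Return the delay in ms of the first matching rule, or 0 if none match."""
    for rule in delays:
        pattern, delay_ms = rule[0], rule[1]
        # A two-element tuple has no method, so it applies to every method.
        rule_method = rule[2] if len(rule) > 2 else None
        if re.search(pattern, host) and rule_method in (None, method):
            return delay_ms
    return 0


delays = [("time.jsontest.com", 3000, "GET"), ("echo.jsontest.com", 1000)]
print(delay_for(delays, "time.jsontest.com", "GET"))   # 3000
print(delay_for(delays, "time.jsontest.com", "POST"))  # 0
print(delay_for(delays, "echo.jsontest.com", "POST"))  # 1000
```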
modify¶
Modifying requests and responses via middleware is simply a matter of using the modify function decorator.
import requests
from hoverpy import modify

@modify(middleware="python examples/modify/middleware.py")
def get_modified_time():
    print(requests.get("http://time.ioloop.io?format=json").json())

get_modified_time()
$ python examples/modify/modify.py
Output:
{u'date': u'2017-02-17', u'epoch': 101010, u'time': u'21:22:38'}
As you can see, the epoch has been successfully modified by the middleware script.
Middleware¶
Middleware is required, and can be written in any language that is supported in your development environment.
# middleware.py
import sys
import logging
import json

logging.basicConfig(filename='middleware.log', level=logging.DEBUG)

# Hoverfly passes the request/response pair as one JSON document on stdin
data = sys.stdin.readlines()
payload = data[0]
doc = json.loads(payload)

logging.debug(json.dumps(doc, indent=4, separators=(',', ': ')))

if "request" in doc:
    doc["request"]["headers"]["Accept-Encoding"] = ["identity"]

if "response" in doc and doc["response"]["status"] == 200:
    if doc["request"]["destination"] == "time.ioloop.io":
        body = json.loads(doc["response"]["body"])
        body["epoch"] = 101010
        doc["response"]["body"] = json.dumps(body)
        doc["response"]["headers"]["Content-Length"] = [str(len(json.dumps(body)))]

# the modified document is handed back on stdout
print(json.dumps(doc))
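Because the middleware is just a JSON-in, JSON-out transformation, its logic can be checked without running a proxy at all. The sketch below moves the epoch rewrite into a function and feeds it a hand-written payload; `rewrite_epoch` and the sample document are illustrative and are not captured traffic.

```python
import json


def rewrite_epoch(doc, epoch=101010):
    """Same transformation as middleware.py, wrapped in a testable function."""
    if "response" in doc and doc["response"]["status"] == 200:
        if doc["request"]["destination"] == "time.ioloop.io":
            body = json.loads(doc["response"]["body"])
            body["epoch"] = epoch
            new_body = json.dumps(body)
            doc["response"]["body"] = new_body
            doc["response"]["headers"]["Content-Length"] = [str(len(new_body))]
    return doc


# Hand-written sample payload in the shape the script above expects.
sample = {
    "request": {"destination": "time.ioloop.io", "headers": {}},
    "response": {
        "status": 200,
        "body": json.dumps({"date": "2017-02-17", "epoch": 1487366558}),
        "headers": {},
    },
}

result = rewrite_epoch(sample)
print(json.loads(result["response"]["body"])["epoch"])  # 101010
```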
soap¶
In this example we’ll take a look at using hoverpy when working with SOAP. To run this example, simply execute:
examples/soap/soap.py --capture
which runs the program in capture mode, then:
examples/soap/soap.py
Which simply runs our program in simulate mode.
This program gets our IP address from http://jsontest.com, then uses it to do some geolocation using a WSDL SOAP web service. In my case, I’m getting this:
{
'ResolveIPResult':{
'City':u'London',
'HasDaylightSavings':False,
'CountryCode':u'GB',
'AreaCode':u'0',
'Country':u'United Kingdom',
'StateProvince':u'H9',
'Longitude':-0.09550476,
'TimeZone':None,
'Latitude':51.5092,
'Organization':None,
'Certainty':90,
'RegionName':None
}
}
Which is what the ip2geo service thinks is the location of the SpectoLabs office!
from hoverpy import HoverPy
import pysimplesoap
import requests
Above, we bring in our usual suspect libraries: the HoverPy class, pysimplesoap, which is a straightforward SOAP client, and the requests library.
from argparse import ArgumentParser
parser = ArgumentParser(description="Perform proxy testing/URL list creation")
parser.add_argument("--capture", help="capture the data", action="store_true")
args = parser.parse_args()
We use argparse so we can run our app in --capture mode first.
with HoverPy(capture=args.capture):
We then construct HoverPy either in capture, or simulate mode, depending on the flag provided.
    ipAddress = requests.get("http://ip.jsontest.com/myip").json()["ip"]
We then make an HTTP GET request to http://ip.jsontest.com for our IP address. This is very similar to our basic example.
    pysimplesoap.transport.set_http_wrapper("urllib2")
We now tell pysimplesoap to use urllib2. This is because urllib2 happens to play well with proxies.
    client = pysimplesoap.client.SoapClient(
        wsdl='http://ws.cdyne.com/ip2geo/ip2geo.asmx?WSDL'
    )
We then build our SOAP client, pointing to the ip2geo WSDL schema description URL.
    print(client.ResolveIP(ipAddress=ipAddress, licenseKey="0"))
We finally invoke the ResolveIP method on our SOAP client. To summarise, in this example we built a program that gets our IP address from one external service, builds a SOAP client from a WSDL schema description, and finally queries the SOAP service for our location using that IP address.
If you really want to prove to yourself that Hoverfly is indeed playing back the requests, you can run the script in simulate mode without an internet connection. Timing our script also shows us we’re now running approximately 10x faster.
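A simple way to measure such a speed-up yourself is to wrap the call in a timer. The helper below is a generic sketch; `timed` is a hypothetical name, and the workload in the demo is a stand-in for the function that performs the HTTP and SOAP calls.

```python
import time


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Stand-in workload; replace it with the function that talks to the service.
result, seconds = timed(sum, range(1000))
print(result)                         # 499500
print("Time taken: %f" % seconds)
```

Running the timed function once in capture mode and once in simulate mode gives the two figures to compare.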
modify soap¶
In this example we’ll take a look at using hoverpy in conjunction with middleware to modify SOAP data. This example builds upon the previous SOAP example, so I strongly suggest you do that one first.
examples/soap/soapModify.py¶
with HoverPy(modify=True, middleware="python examples/soap/modify_payload.py"):
Above, the only real difference with examples/soap/soap.py is that we’re loading up HoverPy with middleware enabled.
    print(client.ResolveIP(ipAddress=ipAddress, licenseKey="0"))
When running this script with python examples/soap/soapModify.py you should notice your city is ‘New York’. That’s the middleware modifying the result of our SOAP operation.
The XML from ip2geo¶
Before jumping into the middleware, let’s see what we’ll be modifying.
<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <soap:Body>
    <ResolveIPResponse xmlns="http://ws.cdyne.com/">
      <ResolveIPResult>
        <City>New York</City>
        <StateProvince>H9</StateProvince>
        <Country>United Kingdom</Country>
        <Organization />
        <Latitude>51.5092</Latitude>
        <Longitude>-0.09550476</Longitude>
        <AreaCode>0</AreaCode>
        <TimeZone />
        <HasDaylightSavings>false</HasDaylightSavings>
        <Certainty>90</Certainty>
        <RegionName />
        <CountryCode>GB</CountryCode>
      </ResolveIPResult>
    </ResolveIPResponse>
  </soap:Body>
</soap:Envelope>
This is the XML that gets sent back to us after calling the ResolveIP method, as defined in http://ws.cdyne.com/ip2geo/ip2geo.asmx?WSDL. We are interested in modifying the City node.
examples/soap/modify_payload.py¶
And here are the important parts of our payload modification script.
from lxml import objectify
from lxml import etree
Above we make sure we are importing the lxml classes that will help us modify the data.
if "response" in payload_dict and "body" in payload_dict["response"]:
    body = payload_dict["response"]["body"]
    try:
Let’s make sure we only operate when we have a response, and it has a body.
        root = objectify.fromstring(str(body))
        ns = "{http://ws.cdyne.com/}"
        logging.debug("transforming")
        ipe = ns + "ResolveIPResponse"
        ipt = ns + "ResolveIPResult"
        root.Body[ipe][ipt].City = "New York"
We parse our xml and turn it into an object. Remember that our program gets our IP address, then tries to geo-locate us based on our IP. The intent of our middleware is to override the city no matter what.
        objectify.deannotate(root.Body[ipe][ipt].City)
        etree.cleanup_namespaces(root.Body[ipe][ipt].City)
        payload_dict["response"]["body"] = etree.tostring(root)
We finally remove annotations and namespaces that got added to the City element by the objectify library, and serialise the modified body back into the response. And we are done.
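If lxml is not available, the same City override can be sketched with the standard library's xml.etree.ElementTree instead. This is a swapped-in alternative and not what modify_payload.py does; `override_city` and the trimmed sample envelope are illustrative. Note that ElementTree rewrites namespace prefixes on output (for example ns0, ns1), which is semantically equivalent XML but not byte-identical.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
CDYNE_NS = "http://ws.cdyne.com/"


def override_city(xml_text, city):
    """Return the SOAP envelope with the City element set to `city`."""
    root = ET.fromstring(xml_text)
    path = "{%s}Body/{%s}ResolveIPResponse/{%s}ResolveIPResult/{%s}City" % (
        SOAP_NS, CDYNE_NS, CDYNE_NS, CDYNE_NS)
    node = root.find(path)
    if node is not None:  # leave unrelated payloads untouched
        node.text = city
    return ET.tostring(root, encoding="unicode")


# Trimmed-down version of the ip2geo response shown earlier.
sample = (
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    '<soap:Body><ResolveIPResponse xmlns="http://ws.cdyne.com/">'
    '<ResolveIPResult><City>London</City><CountryCode>GB</CountryCode>'
    '</ResolveIPResult></ResolveIPResponse></soap:Body></soap:Envelope>'
)

print(override_city(sample, "New York"))
```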
Tornado¶
HoverPy can be used to virtualise asynchronous requests made from Tornado’s AsyncHTTPClient.
Capturing traffic¶
from tornado import gen, web, ioloop, httpclient

class MainHandler(web.RequestHandler):

    @gen.coroutine
    def get(self):
        http_client = httpclient.AsyncHTTPClient()
        time = yield http_client.fetch("http://time.ioloop.io?format=json")
        self.write(time.body)

from hoverpy import capture, lib

@lib.tornado(proxyPort=8500, proxyHost="localhost")
@capture(dbpath="tornado.db")
def start():
    app = web.Application([("/", MainHandler)])
    app.listen(8080)
    ioloop.IOLoop.current().start()

start()
Making a request to our server now captures the requests.
curl http://localhost:8080
In fact you may notice your directory now contains a tornado.db file.
Simulating traffic¶
We can now switch our server to simulate mode:
from tornado import gen, web, ioloop, httpclient

class MainHandler(web.RequestHandler):

    @gen.coroutine
    def get(self):
        http_client = httpclient.AsyncHTTPClient()
        time = yield http_client.fetch("http://time.ioloop.io?format=json")
        self.write(time.body)

from hoverpy import simulate, lib

@lib.tornado
@simulate(dbpath="tornado.db")
def start():
    app = web.Application([("/", MainHandler)])
    app.listen(8080)
    ioloop.IOLoop.current().start()

start()
Which means we are no longer hitting the real downstream dependency.
Modifying traffic¶
HoverPy can also be used to modify your requests, to introduce failures, or to build tolerant readers.
from tornado import gen, web, ioloop, httpclient

class MainHandler(web.RequestHandler):

    @gen.coroutine
    def get(self):
        http_client = httpclient.AsyncHTTPClient()
        time = yield http_client.fetch("http://time.ioloop.io?format=json")
        self.write(time.body)

from hoverpy import modify, lib

@lib.tornado
@modify(middleware="python examples/tornado/middleware.py")
def start():
    app = web.Application([("/", MainHandler)])
    app.listen(8080)
    ioloop.IOLoop.current().start()

start()
This is our middleware:
# middleware.py
import sys
import logging
import json

logging.basicConfig(filename='middleware.log', level=logging.DEBUG)

data = sys.stdin.readlines()
payload = data[0]
doc = json.loads(payload)

logging.debug(json.dumps(doc, indent=4, separators=(',', ': ')))

if "request" in doc:
    doc["request"]["headers"]["Accept-Encoding"] = ["identity"]

if "response" in doc and doc["response"]["status"] == 200:
    if doc["request"]["destination"] == "time.ioloop.io":
        body = json.loads(doc["response"]["body"])
        body["epoch"] = 101010
        doc["response"]["body"] = json.dumps(body)
        doc["response"]["headers"]["Content-Length"] = [str(len(json.dumps(body)))]

print(json.dumps(doc))
Twisted¶
from twisted.python.log import err
from twisted.web.client import ProxyAgent
from twisted.internet import reactor

def display(response):
    # print() with parentheses works on both Python 2 and Python 3
    print("Received response")
    print(response)

from hoverpy import simulate, lib

@simulate("twisted.db")
def main():
    endpoint = lib.twisted.TCP4ClientEndpoint()
    agent = ProxyAgent(endpoint)
    # Twisted expects the method and URI as bytes
    d = agent.request(b"GET", b"http://echo.ioloop.io/a/b?format=json")
    d.addCallbacks(display, err)
    d.addCallback(lambda ignored: reactor.stop())
    reactor.run()

main()