Rematch

Rematch, a simple binary diffing framework that just works.

Note

At least, we hope it will be. Rematch is still a work in progress and is not fully functional at the moment. We’re currently working on bringing up basic functionality. Check us out again soon or watch for updates!

Rematch is intended to be used by reverse engineers to reveal and identify similar functions that were previously reverse engineered, and then to migrate documentation and annotations to the current IDB. Rematch does that by locally collecting data about the functions in your IDB and uploading it to a web service (which you’re supposed to set up as well). Upon request, the service matches your functions against all (or part) of the existing database of previously uploaded functions and provides the matches.

A secondary goal of rematch (which is not currently pursued) is to allow synchronization between multiple reverse engineers working on the same file.

Installation

The rematch project is composed of two parts: a server and an IDA plugin client.

While installing the plugin is extremely easy, installing the server tends to be a little more difficult. Luckily, it’s only done once per organisation.

Installing Rematch Server

Installing a rematch server is only required once for a group of rematch users. Once an admin user is created, additional users can be managed through the admin console.

Warning

Since permissions are not currently enforced, it is advised that confidential data be kept only on servers accessible exclusively to those with permission to access said data. See the Privacy section for more details.

Tip

Windows-based server installations are possible but not recommended. Some packages (scikit-learn, numpy, scipy) are required by the server but are more difficult to install on Windows. Windows Subsystem for Linux may ease the installation process. Using Anaconda for python package management may also be helpful.

Building and running Rematch server docker container

We provide a docker container with Rematch server installed and configured with nginx, mysql, rabbitmq and celery micro components. This makes server deployment a lot easier; however, a docker installation and roughly 600 MB of free space are required.

Warning

The docker setup uses the file ./server/.env to set passwords. We highly recommend changing those before building your docker image.
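One way to pick replacement passwords is to generate random values with Python's standard library. This is a minimal sketch, not part of Rematch: the helper names are made up for illustration, and the key names in the example are hypothetical placeholders rather than the actual keys in ./server/.env, so check that file for the names it really uses.

```python
import secrets


def generate_password(length=32):
    """Return a random URL-safe password with exactly `length` characters."""
    # token_urlsafe(n) produces more than n characters for n bytes of
    # randomness, so there is always enough output to slice from.
    return secrets.token_urlsafe(length)[:length]


def render_env(keys, length=32):
    """Build KEY=value lines, with one fresh password per key."""
    return "\n".join(f"{key}={generate_password(length)}" for key in keys)


if __name__ == "__main__":
    # Hypothetical key names, for illustration only.
    print(render_env(["EXAMPLE_DB_PASSWORD", "EXAMPLE_BROKER_PASSWORD"]))
```

The printed lines can then be copied over the matching entries in the .env file by hand.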

If you wish to build your own docker image from source, the docker-compose command can be used to build and fire up a docker, when run inside the root directory of the rematch repository:

$ service docker start ;
$ docker-compose -f ./server/docker-compose.yml build ;
$ docker-compose -f ./server/docker-compose.yml up -d ;

Then execute the following command to set up the rematch server administrator account:

$ docker-compose -f ./server/docker-compose.yml exec web ./server/manage.py createsuperuser

Finally, point your browser to http://SERVER_IP:8000/admin/ to manage the service and add more users.

Installing Rematch IDA Plugin

Installing IDA plugins is done by placing the plugin source inside IDA’s plugins directory (location is based on operating system). To make plugin installation as simple as possible, the rematch plugin has no dependencies.

Once installed, the plugin automatically updates itself (as long as it’s configured to), so installing the plugin is a one-time process.

Installing plugin using pip

If pip is installed for IDA’s version of python, using it is the simplest installation method.

Note

By default, pip is not installed for Windows installations of IDA, but is more commonly found in Mac and Linux installations.

To install using IDA’s pip, simply run the following pip command:

$ pip install rematch-idaplugin

Warning

Make sure you’re installing the plugin using a version of pip inside IDA’s copy of python.

If pip is not installed for IDA’s version of python, it is still possible to install the plugin with another copy of pip using pip’s --target flag. To do this, run the following pip command line with any instance of pip:

$ pip install rematch-idaplugin --target="<Path to IDA's plugins directory>"

Warning

Using the pip --target flag with a pip version installed by Homebrew does not work because of a known issue with Homebrew. Homebrew OSX users will have to use a different installation method.

Note

IDA’s plugins directory is located inside IDA’s installation directory. For example if IDA is installed at:

C:\Program Files (x86)\IDA 6.9

Then the plugins directory will be:

C:\Program Files (x86)\IDA 6.9\plugins

and the executed command line should be:

$ pip install rematch-idaplugin --target="C:\Program Files (x86)\IDA 6.9\plugins"
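The path construction described in this note can be sketched in Python. The helper names below are hypothetical; only the rematch-idaplugin package name and pip's --target flag come from the text above.

```python
import ntpath
import os.path


def plugins_dir(ida_dir, path_module=os.path):
    """Return the plugins directory inside an IDA installation directory."""
    return path_module.join(ida_dir, "plugins")


def pip_install_args(ida_dir, path_module=os.path):
    """Build the argument list for installing the plugin with --target."""
    return ["pip", "install", "rematch-idaplugin",
            "--target", plugins_dir(ida_dir, path_module)]


if __name__ == "__main__":
    # ntpath applies Windows path rules on any operating system.
    print(pip_install_args(r"C:\Program Files (x86)\IDA 6.9", ntpath))
```

Building the command as a list (for example, to hand to subprocess.run) avoids quoting problems caused by the spaces in the installation path.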

Installing plugin manually

If you don’t have pip, or prefer not to use it, you can still install the plugin manually by extracting the contents of the idaplugin directory, found in the repository’s root, into IDA’s plugins directory.

Download the package from PyPI or GitHub and extract the contents of its idaplugin directory into IDA’s plugins directory, so that the file rematch_plugin.py from the idaplugin directory ends up in the plugins sub-directory of IDA’s installation directory.
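For those who prefer to script the manual installation, the copy step might look like the following sketch. It assumes an already downloaded and unpacked copy of the repository and that rematch_plugin.py sits directly inside the idaplugin directory, as the text suggests; the function name is hypothetical.

```python
import shutil
from pathlib import Path


def install_plugin(repo_root, ida_plugins_dir):
    """Copy everything inside <repo_root>/idaplugin into IDA's plugins directory."""
    source = Path(repo_root) / "idaplugin"
    target = Path(ida_plugins_dir)
    if not source.is_dir():
        raise FileNotFoundError(f"no idaplugin directory under {repo_root}")
    # dirs_exist_ok (Python 3.8+) merges into the existing plugins directory
    # instead of failing because the directory is already there.
    shutil.copytree(source, target, dirs_exist_ok=True)
    # Path where the plugin entry file is expected after the copy.
    return target / "rematch_plugin.py"
```

Checking that the returned path exists is a quick way to confirm the files landed at the right directory level.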

Usage

On this page we’ll go over basic rematch usage and functionality. We’ll start with the server (which is currently limited in its direct usability) and move on to the IDA plugin, where users will spend most of their time.

Server

The rematch server is built on top of the Django framework, which has its own built-in administration panel. The administration panel makes it trivial to manage database objects, granting admin users full control over the server. While Rematch doesn’t have its own web interface, it’s common to use the admin panel to manage users and perform other tasks not currently available through the rematch project.

Django’s admin panel can be used for fine-grained control over most database objects (such as Vectors and Annotations) through the rematch server, but its main functionality is managing users, projects and files.

The admin panel is available at https://SERVER_URL/admin/. Once logged in, you’ll see the lists of database objects divided into categories. Selecting the “Users” object will show you a list of all registered users and a set of filters to filter by. You can edit, delete and create users, who will then be able to log in using the IDA plugin. Similarly, you can manage Projects, Files and any other object stored on the server.

IDA Plugin

The IDA plugin is the interface to the rematch server. Most of the interaction a user has with rematch will actually be through the clients, dispatching work and uploading data to the server.

Toolbar and Menu

All rematch IDA plugin functionality is exposed through a toolbar and a menu. Both have exactly the same functions and can be used interchangeably.

[Images: rematch toolbar; rematch menu]

Login

Before uploading a file, starting a matching task or creating a project, a user must log in. If you do not have a user account on a rematch server, you’ll need to contact the nearest rematch server admin, or set up your own rematch server.

Clicking the “Login” command in the rematch toolbar or menu opens the following dialog box:

[Image: login dialog]

In the dialog, specify the server address, your username and its password. You can optionally mark the “remember this password” checkbox to store the password for future logins. To log in, click the “Login” button and the login dialog will close.

Warning

Passwords are stored in plaintext, so make sure to only mark this checkbox on machines you trust. Rematch also supports token-based login, and after a successful login a token will be stored automatically (unless disabled from the configuration dialog). For that reason, it is not required to store login details in most circumstances.

Upon a successful login you’ll be able to create projects, add files, request matches, etc.

Projects, Files and Binding

Todo

Fill up the rest of idaplugin functionality

As discussed in detail on the Architecture page, Files can be created under Projects.

Matching and Data Upload

Match Results and Filter Scripts

Architecture

The rematch solution is divided into two main parts: a client and a server. The server is in charge of most of the heavy lifting, matching, and data storage. The client is responsible for collecting Annotations and Vectors, applying annotations after matches are displayed to the user, and the overall user interface.

Clients are designed to be replaceable; however, we only have an IDA client at the moment.

Data Model

Project Layout

A File object is a database representation of a single binary instance being reverse engineered. When working on two distinct versions of the same application, each of those versions should have a different File object for its executable binary. If the application also has a single DLL, you should have 4 File objects (an executable and a DLL for each of the two versions). A good rule of thumb is that each File object should have its own IDA IDB file.

In rematch, files are grouped together into Projects. The purpose of projects is completely left to the user. For example, a project could hold all versions of a single executable or all executables of a single application.

The obvious purpose of dividing files into projects is logical separation and ease of use, but another notable advantage is matching granularity. When starting a Match Task, the user can choose to match the current file against a single other file, all files in its project, all files in another project, or the entire database. For example, a user could create one project for all files of a specific application version and another project for a different version. Then, to only get matches from the previous version, a single remote project match is performed. Alternatively, a single project can hold multiple versions of a single library (or all libraries with similar functionality), for example to request matches between a malware executable and all SSL libraries.
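The four matching scopes described above can be illustrated with a small in-memory model. This is only a sketch of the idea: the class and method names are hypothetical and do not reflect the server's actual database models, and excluding the source file from its own candidates is an assumption made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Scope(Enum):
    SINGLE_FILE = "single file"
    OWN_PROJECT = "own project"
    OTHER_PROJECT = "another project"
    ENTIRE_DATABASE = "entire database"


@dataclass
class File:
    name: str
    project: str


@dataclass
class Database:
    files: list = field(default_factory=list)

    def candidates(self, source, scope, target=None):
        """Return the files that `source` would be matched against."""
        if scope is Scope.SINGLE_FILE:
            pool = [f for f in self.files if f.name == target]
        elif scope is Scope.OWN_PROJECT:
            pool = [f for f in self.files if f.project == source.project]
        elif scope is Scope.OTHER_PROJECT:
            pool = [f for f in self.files if f.project == target]
        else:
            pool = list(self.files)
        # Assumption for this sketch: a file is not matched against itself.
        return [f for f in pool if f is not source]


if __name__ == "__main__":
    db = Database([File("app.exe v1", "v1"), File("lib.dll v1", "v1"),
                   File("app.exe v2", "v2"), File("lib.dll v2", "v2")])
    current = db.files[2]
    # Only get candidates from the previous version's project.
    print(db.candidates(current, Scope.OTHER_PROJECT, target="v1"))
```

With one project per application version, choosing the OTHER_PROJECT scope restricts candidates to the previous version, which mirrors the remote project match mentioned above.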

File Binding

Binding a file associates an IDB with a specific File object on the remote server. This lets rematch automatically identify the database object describing the current IDB. Binding is how matches are made and linked to a specific IDB, and how uploaded Vectors and Annotations are cached.

File bindings are embedded inside IDB files, which means multiple copies of the same IDB file will share the same File database object.
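The consequence of embedding the binding in the IDB can be shown with a toy model. The Idb class below is a stand-in invented for this sketch and says nothing about how the plugin actually stores the binding inside an IDB.

```python
import copy
from dataclasses import dataclass
from typing import Optional


@dataclass
class Idb:
    """Stand-in for an IDB file; `bound_file_id` travels with the file."""
    path: str
    bound_file_id: Optional[int] = None


def bind(idb, file_id):
    """Associate an IDB with a remote File object by storing its id in the IDB."""
    idb.bound_file_id = file_id


def copy_idb(idb, new_path):
    """Copying an IDB copies the embedded binding along with everything else."""
    duplicate = copy.copy(idb)
    duplicate.path = new_path
    return duplicate


if __name__ == "__main__":
    original = Idb("sample.idb")
    bind(original, 42)
    duplicate = copy_idb(original, "sample_copy.idb")
    # Both copies now point at the same remote File object.
    print(original.bound_file_id == duplicate.bound_file_id)
```

Because the identifier lives in the file itself, any copy made after binding refers to the same remote File object until it is bound again.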

File Version

Todo

Document the file version concept

Matching Process

Matches are made using three entity types defined throughout the rematch project:

Vectors

Todo

explain what vectors are

Todo

document existing vectors

Matcher / Matching Engines

Todo

explain what matchers are

Todo

document existing engines

Strategies

Strategies control the way multiple Matchers are used together, which Instances are matched against which and other similar logical decisions that may have significant implications on the overall outcome of the matching process.

For example, one could wish to match all instances against all other instances, in an “All VS All” kind of way. This is the “All” Strategy. However, when comparing big databases, one may point out that matching 5-byte and 1000-byte long functions to each other is redundant, as those are highly unlikely to match. Therefore, “Binning” functions and only matching functions within the same bin might speed up the matching process without causing a decrease in match accuracy, as it may eliminate a lot of unnecessary comparisons. This is called the “Binning” Strategy.
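The difference between the two strategies can be sketched in a few lines of Python. This is an illustration only and not the server's implementation: the function names are hypothetical, and binning by powers of two of the instance size is an assumed bin definition chosen for simplicity.

```python
from collections import defaultdict
from itertools import product


def all_strategy(sources, targets):
    """Pair every source instance with every target instance."""
    return list(product(sources, targets))


def size_bin(size):
    """Bin index by power of two: sizes 1, 2-3, 4-7, 8-15, ... share a bin."""
    return max(size, 1).bit_length()


def binning_strategy(sources, targets, size_of):
    """Pair instances only when their sizes fall in the same bin."""
    bins = defaultdict(list)
    for target in targets:
        bins[size_bin(size_of(target))].append(target)
    return [(source, target)
            for source in sources
            for target in bins.get(size_bin(size_of(source)), [])]


if __name__ == "__main__":
    sources = [("a", 5), ("b", 1000)]
    targets = [("x", 6), ("y", 900), ("z", 40)]
    print(len(all_strategy(sources, targets)))
    print(binning_strategy(sources, targets, lambda inst: inst[1]))
```

With hard bin edges, two functions of nearly equal size can still land in neighbouring bins, so a practical binning scheme might use overlapping bins or also compare adjacent ones.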

Frequently Asked Questions

Glossary

Sorted alphabetically

Annotation
A piece of information describing an Instance in more detail, often created by the user while reverse engineering as part of the reverse engineering process. Annotations help the reverse engineer and therefore there’s an advantage in applying annotations to matched Instances.
Engine

Todo

TODO

File

Todo

TODO

Instance

When used throughout these docs, an Instance generally means a matchable object inside a binary file, or its representation in any rematch component.

The following are currently considered Instances:

  1. A function defined within a binary executable.
  2. A function imported into a binary file from another binary.
  3. A stream of initialised data or structure.
  4. A stream of uninitialised data or structure.
Project

Todo

TODO

Vector
Raw data used to describe an Instance in a way that facilitates and enables matching. Those are also occasionally called features in data-science and machine learning circles.
Matcher
Matchers implement the logic of matching Instances together using their Vectors.
Match Task

Todo

TODO

Strategy
Strategies control the way multiple Matchers are used together, which Instances are matched against which and other similar logical decisions that may have significant implications on the overall outcome of the matching process.
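To make the Vector and Matcher entries above more concrete, here is a minimal sketch of one possible pairing: a hash of an Instance's raw bytes as the Vector, and a Matcher that pairs Instances whose Vectors are identical. Both are assumptions made for illustration and are not a description of the vectors or matchers Rematch actually ships; all names are hypothetical.

```python
import hashlib
from collections import defaultdict


def hash_vector(raw_bytes):
    """A Vector: a fixed-size digest describing an Instance's raw bytes."""
    return hashlib.sha256(raw_bytes).hexdigest()


def exact_matcher(source_vectors, target_vectors):
    """A Matcher: pair Instances whose Vectors are identical.

    Both arguments map an Instance name to its Vector.
    """
    by_vector = defaultdict(list)
    for name, vector in target_vectors.items():
        by_vector[vector].append(name)
    return [(source, target)
            for source, vector in source_vectors.items()
            for target in by_vector.get(vector, [])]


if __name__ == "__main__":
    local = {"sub_401000": hash_vector(b"\x55\x8b\xec"),
             "sub_402000": hash_vector(b"\xc3")}
    remote = {"documented_func": hash_vector(b"\x55\x8b\xec"),
              "other_func": hash_vector(b"\x90")}
    print(exact_matcher(local, remote))
```

Indexing the target Vectors once keeps the lookup for each source Instance cheap, which matters when matching against a large database.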

Goal of Rematch

The goal of Rematch is to act as a maintained, extendable, open source tool for advanced assembly function-level binary comparison and matching. Rematch will be a completely open source and free (as in speech) community-driven tool. We support bottom-up organizational methods and desire Rematch to be heavily influenced by its users (both in decision making and development).

We’ve noticed that although there are more than a handful of existing binary matching tools, there’s no one tool that provides all of the following:

  1. Open source and community driven.
  2. Supports advanced matching algorithms (ML included ™).
  3. Fully integrated into IDA.
  4. Allows managing multiple projects in a single location.
  5. Enables out of the box one vs. many matches.
  6. Actively maintained.

Contribute and Get Support

Here are a few useful links to help you get in contact, find help and participate:

In particular, we’re having trouble with implementing a stylish web interface for the rematch server, enriching this documentation and the testing framework. Engine suggestions (and obviously PRs) are extremely welcome, as well as feature requests. We tend to be permissive about approving PRs, but you can always raise an issue or discussion about a feature before implementing it.

License

The project is licensed under the GNU-GPL v3 license.