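Cuckoo Sandbox Chinese Documentation

Cuckoo Sandbox is open source software for automatically analyzing suspicious files. It uses custom components to monitor the behavior of malicious processes while they run in an isolated environment.

This documentation explains how to configure, use, and customize Cuckoo.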

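Translator's note

Contact: xyyvsxh@gmail.com

The purpose of this translation is to gain a deeper understanding of Cuckoo's design philosophy and architecture, so that Cuckoo can be used more sensibly.

Not everything was translated; the translation focuses on the content that helps in understanding Cuckoo. Apologies for the gaps.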

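First time using Cuckoo?

Recent versions of Cuckoo bring many significant usability changes. To get the most out of Cuckoo, we recommend reading the following chapters before you start:

Have questions?

If you run into problems while using Cuckoo, read the FAQ chapter first; it may already contain the answer.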

FAQ

Here you can find answers for various Frequently Asked Questions:

General Questions

Can I analyze URLs with Cuckoo?

New in version 0.5: Native support for URL analysis was added to Cuckoo.

Changed in version 2.0-rc1: Cuckoo will not only start the browser (i.e., Internet Explorer) but will also attempt to actively instrument it in order to extract interesting results such as executed JavaScript, iframe URLs, etc. See also our 2.0-rc1 blogpost.

Additional details on URL submissions are documented at Submit an Analysis, but it boils down to:

$ cuckoo submit --url http://www.example.com

Can I use Volatility with Cuckoo?

New in version 0.5: Cuckoo introduces support for optional full memory dumps, which are created at the end of the analysis process. You can use these memory dumps to perform additional memory forensic analysis with Volatility.

Please also consider that we don’t particularly encourage this: since Cuckoo employs some rootkit-like technologies to perform its operations, the results of a forensic analysis would be polluted by the sandbox’s components.

What do I need to use Cuckoo with VMware ESXi?

To run with VMware vSphere Hypervisor (or ESXi), Cuckoo relies on libvirt or pyVmomi (the Python SDK for the VMware vSphere API). The VMware APIs are used to take control of the virtual machines, but they are only fully available in the licensed version. In the free edition of VMware vSphere these APIs are read-only, so you will be unable to use it with Cuckoo. For the minimum license needed, please have a look at the VMware website.

Troubleshooting

Cuckoo stops working after an upgrade

You probably upgraded it the wrong way. Overwriting the files of an existing installation is not good practice, given Cuckoo’s complexity and quick evolution.

Please follow the upgrade steps described in Upgrading from a previous release.

Cuckoo stumbles and produces some error I don’t understand

Cuckoo is a mature but constantly evolving project, so you may encounter problems while running it. Before you rush into sending emails to everyone, make sure you read what follows.

Cuckoo is not meant to be a point-and-click tool: it’s designed to be a highly customizable and configurable solution for somewhat experienced users and malware analysts.

It requires you to have a decent understanding of your operating systems, Python, and the concepts behind virtualization and sandboxing. We try to make it as easy to use as possible, but you have to keep in mind that it’s not a technology meant to be accessible to just anyone.

That being said, if a problem occurs you have to make sure that you did everything you could before asking for time and effort from our developers and users. We just can’t help everyone: we have limited time, and it has to be dedicated to development and to fixing actual bugs.

  • We have extensive documentation, read it carefully. You can’t just skip parts of it.
  • We have a Discussion page where you can find discussion platforms on which we’re frequently helping our users.
  • We have a lot of users producing content on the Internet; search for it.
  • Spend some of your own time trying to fix the issue before asking for ours; you might even get to learn and understand Cuckoo better.

Long story short: use the existing resources, put some effort into it and don’t abuse people.

If you still can’t figure out your problem, you can ask for help on our online communities (see Final Remarks). When you ask for help, make sure to:

  • Use a clear and explicit title for your emails: “I have a problem”, “Help me” or “Cuckoo error” are NOT good titles.
  • Explain in detail what you’re experiencing. Try to reproduce your issue several times and write down all the steps needed to trigger it.
  • Use no-paste services and link your logs, configuration files and details on your setup.
  • If possible, provide a copy of the analysis that generated the problem.

Check and restore current snapshot with KVM

If something goes wrong with a virtual machine, it’s best practice to check the status of its current snapshot. You can do that with the following:

$ virsh snapshot-current "<Name of VM>"

If you get a long XML document as output, your current snapshot is configured and you can skip the rest of this section. If instead you get an error like the following, your current snapshot is broken:

$ virsh snapshot-current "<Name of VM>"
error: domain '<Name of VM>' has no current snapshot

To fix this and set a current snapshot, first list all of the machine’s snapshots:

$ virsh snapshot-list "<Name of VM>"
 Name                 Creation Time             State
 ------------------------------------------------------------
 1339506531           2012-06-12 15:08:51 +0200 running

Choose one snapshot name and set it as current:

$ virsh snapshot-current "<Name of VM>" --snapshotname 1339506531
Snapshot 1339506531 set as current

Now the virtual machine state is fixed.
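When several virtual machines need the same check, the snapshot names can be pulled out of the listing programmatically. The following is only a minimal sketch: `parse_snapshot_list` is a hypothetical helper, not part of Cuckoo or libvirt, and it assumes the table layout shown above and snapshot names without spaces.

```python
def parse_snapshot_list(output):
    """Return the snapshot names found in `virsh snapshot-list` text output."""
    names = []
    for line in output.splitlines():
        stripped = line.strip()
        # Skip blank lines, the column header, and the dashed separator.
        if not stripped or stripped.startswith("Name") or set(stripped) == {"-"}:
            continue
        # The first column holds the snapshot name.
        names.append(stripped.split()[0])
    return names


sample = """\
 Name                 Creation Time             State
 ------------------------------------------------------------
 1339506531           2012-06-12 15:08:51 +0200 running
"""
print(parse_snapshot_list(sample))
```

Any of the returned names can then be passed to the snapshot-current command shown above.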

Check and restore current snapshot with VirtualBox

If something goes wrong with a virtual machine, it’s best practice to check the virtual machine status and the current snapshot. First of all, check the virtual machine status with the following:

$ VBoxManage showvminfo "<Name of VM>" | grep State
State:           powered off (since 2012-06-27T22:03:57.000000000)

If the state is “powered off” you can go ahead with the next check. If the state is “aborted” or something else, you first have to restore it to “powered off”:

$ VBoxManage controlvm "<Name of VM>" poweroff

Check the state of the current snapshot with the following:

$ VBoxManage snapshot "<Name of VM>" list --details
Name: s1 (UUID: 90828a77-72f4-4a5e-b9d3-bb1fdd4cef5f)
Name: s2 (UUID: 97838e37-9ca4-4194-a041-5e9a40d6c205) *

If a snapshot is marked with a star “*”, your snapshot is ready. You still have to restore the current snapshot:

$ VBoxManage snapshot "<Name of VM>" restorecurrent
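The two checks above can be scripted by parsing the VBoxManage output. This is a sketch under the assumption that the output looks like the samples shown in this section; `parse_vm_state` and `current_snapshot` are hypothetical helper names, not VirtualBox or Cuckoo APIs.

```python
import re


def parse_vm_state(showvminfo_output):
    """Extract the state text (e.g. 'powered off') from `VBoxManage showvminfo`."""
    for line in showvminfo_output.splitlines():
        if line.startswith("State:"):
            value = line.split(":", 1)[1].strip()
            # Drop the trailing "(since ...)" timestamp if present.
            return re.sub(r"\s*\(since .*\)$", "", value)
    return None


def current_snapshot(snapshot_list_output):
    """Return the name of the snapshot marked with '*', or None."""
    for line in snapshot_list_output.splitlines():
        match = re.match(r"\s*Name: (.+) \(UUID: [0-9a-f-]+\) \*$", line.rstrip())
        if match:
            return match.group(1)
    return None


state_sample = "State:           powered off (since 2012-06-27T22:03:57.000000000)"
snapshot_sample = (
    "Name: s1 (UUID: 90828a77-72f4-4a5e-b9d3-bb1fdd4cef5f)\n"
    "Name: s2 (UUID: 97838e37-9ca4-4194-a041-5e9a40d6c205) *\n"
)
print(parse_vm_state(state_sample), current_snapshot(snapshot_sample))
```

A state other than "powered off", or a missing starred snapshot, indicates that the corresponding fix from this section is needed.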

Unable to bind result server error

At Cuckoo startup if you get an error message like this one:

2014-01-07 18:42:12,686 [root] CRITICAL: CuckooCriticalError: Unable to bind result server on 192.168.56.1:2042: [Errno 99] Cannot assign requested address

It means that Cuckoo is unable to start the result server on the IP address written in cuckoo.conf (or in machinery.conf if you are using the resultserver_ip option inside). This usually happens when you start Cuckoo without bringing up the virtual interface associated with the result server IP address. How to bring it up manually differs from one virtualization product to another; if you are unsure how, a good trick is to manually start and stop an analysis virtual machine, which brings the virtual networking up.
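Whether the address is usable can be checked before starting Cuckoo by attempting the same kind of bind. The snippet below is a rough sketch, not Cuckoo's own code: `can_bind` is a hypothetical helper, and the address and port are taken from the example error message above.

```python
import socket


def can_bind(ip, port):
    """Return True if a TCP socket can be bound to ip:port on this host."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((ip, port))
        return True
    except OSError as exc:
        # Errno 99 (EADDRNOTAVAIL) means no interface currently owns this IP.
        print("cannot bind %s:%d: %s" % (ip, port, exc))
        return False
    finally:
        sock.close()


if __name__ == "__main__":
    # Values from the example log line; adjust to your own configuration.
    print(can_bind("192.168.56.1", 2042))
```

Note that the check also fails when the port is already in use, for example by a running Cuckoo instance, so read the printed error to tell the two cases apart.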

In the case of VirtualBox the hostonly interface vboxnet0 can be created as follows:

# If the hostonly interface vboxnet0 does not exist already.
$ VBoxManage hostonlyif create

# Configure vboxnet0.
$ VBoxManage hostonlyif ipconfig vboxnet0 --ip 192.168.56.1 --netmask 255.255.255.0

Error during template rendering

Changed in version 2.0-rc1.

In our 2.0-rc1 release a bug was introduced that produces the error shown in the screenshot below. In order to resolve this issue in your local setup, please open the web/analysis/urls.py file and modify the 21st line by adding an underscore as follows:

-        "/(?P<ip>[\d\.]+)?/(?P<host>[a-zA-Z0-9-\.]+)?"
+        "/(?P<ip>[\d\.]+)?/(?P<host>[ a-zA-Z0-9-_\.]+)?"

The official fixes for this issue can be found in the following commits.
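The effect of the one-character change can be seen by testing both character classes against a hostname that contains an underscore. This is an illustrative sketch only: the two patterns are the `host` groups copied verbatim from the diff above (including the leading space in the patched class), and `win_7.local` is a made-up hostname.

```python
import re

# The "host" group before and after the patch, copied from the diff.
OLD_HOST = r"(?P<host>[a-zA-Z0-9-\.]+)"
NEW_HOST = r"(?P<host>[ a-zA-Z0-9-_\.]+)"


def host_matches(pattern, hostname):
    """Return True if the whole hostname is accepted by the pattern."""
    return re.fullmatch(pattern, hostname) is not None


hostname = "win_7.local"  # hypothetical hostname containing an underscore
print(host_matches(OLD_HOST, hostname))
print(host_matches(NEW_HOST, hostname))
```

The old class rejects the underscore, so the URL cannot be reversed for such a host, which is consistent with the rendering error described above.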

_images/error_template_rendering.png

501 Unsupported Method (‘GET’)

Changed in version 2.0-rc1.

Since 2.0-rc1 Cuckoo supports both the legacy Cuckoo Agent as well as a new, REST API-based, Cuckoo Agent for communication between the Guest and the Host machine. The new Cuckoo Agent is an improved Agent in the sense that it also allows usage outside of Cuckoo. As an example, it is used extensively by VMCloak in order to automatically create, configure, and cloak Virtual Machines.

Now in order to determine whether the Cuckoo Host is talking to the legacy or new Cuckoo Agent, it does an HTTP GET request to the root path (/). The legacy Cuckoo Agent, which is based on xmlrpc, doesn’t handle that specific route and therefore returns an error, 501 Unsupported method.

Having said that, the message is not actually an error, it is simply Cuckoo trying to determine to which version of the Cuckoo Agent it is talking.
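The probing logic described above can be approximated as follows. This is a simplified sketch and not Cuckoo's actual implementation: `classify_status` and `detect_agent` are hypothetical names, and treating a 200 response as "new Agent" is an assumption, since the text only states what the legacy Agent returns.

```python
import http.client


def classify_status(status):
    """Map the HTTP status of GET / to an Agent type."""
    if status == 501:
        # The legacy xmlrpc-based Agent has no handler for GET.
        return "legacy"
    if status == 200:
        # Assumption: the REST API-based Agent answers the root path normally.
        return "new"
    return "unknown"


def detect_agent(host, port, timeout=5):
    """Send GET / to the Agent and classify it by the response status."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", "/")
        return classify_status(conn.getresponse().status)
    finally:
        conn.close()
```

A 501 in the legacy Agent's output is therefore the expected side effect of this probe rather than a failure.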

Note

It should be noted that even though there is a new Cuckoo Agent available, backwards compatibility for the legacy Cuckoo Agent is still available and working properly.

_images/unsupported_method.png

Permission denied for tcpdump

Changed in version 2.0.0.

With the new Cuckoo structure in-place all storage is now, by default, located in ~/.cuckoo, including the PCAP file, which will be stored at ~/.cuckoo/storage/analyses/task_id/dump.pcap. On Ubuntu with AppArmor enabled (default configuration) tcpdump doesn’t have write permission to dot-directories in $HOME, causing the permission denied message and preventing Cuckoo from capturing PCAP files.

One workaround is to install the AppArmor utilities and disable the tcpdump AppArmor profile altogether (more appropriate solutions are welcome, of course):

$ sudo apt-get install apparmor-utils
$ sudo aa-disable /usr/sbin/tcpdump

DistributionNotFound / No distribution matching the version..

Changed in version 2.0.0.

Installing Cuckoo through the Python package brings its own set of problems, namely outdated Python package management software. This FAQ entry targets the following issue:

$ cuckoo
Traceback (most recent call last):
  File "/usr/local/bin/cuckoo", line 5, in <module>
    from pkg_resources import load_entry_point
  File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 2749, in <module>
    working_set = WorkingSet._build_master()
  File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 446, in _build_master
    return cls._build_from_requirements(__requires__)
  File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 459, in _build_from_requirements
    dists = ws.resolve(reqs, Environment())
  File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 628, in resolve
    raise DistributionNotFound(req)
pkg_resources.DistributionNotFound: tlslite-ng==0.6.0a3

Those issues - and related ones - are caused by outdated Python package management software. Fortunately their fix is fairly trivial and therefore the following command should do the trick:

$ pip install -U pip setuptools

IOError: [Errno 24] Too many open files

You may run into this issue when analyzing samples that drop a lot of files, so many that the Processing Utility cannot allocate any new file descriptors.

The easiest workaround for this issue is to bump the soft and hard file descriptor limit for the current user. This may be done as documented in the following blogpost.

If you are using Supervisor, set minfds in supervisord.conf.

Remember that you have to log in to a new shell session (i.e., usually log out first) for the changes to take effect.
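To see which limits actually apply to a process, they can be inspected from Python on Unix-like systems. The sketch below is an illustration, not part of Cuckoo: `raise_nofile_soft_limit` is a hypothetical helper and 4096 is an arbitrary example value. It only raises the soft limit of the current process up to the existing hard limit; it does not replace the per-user configuration discussed above.

```python
import resource


def raise_nofile_soft_limit(wanted=4096):
    """Raise this process's soft open-files limit towards `wanted`.

    The soft limit is never pushed beyond the hard limit. Returns the
    (soft, hard) pair in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard == resource.RLIM_INFINITY:
        target = wanted
    else:
        target = min(wanted, hard)
    # Only act when the soft limit is finite and below the target.
    if soft != resource.RLIM_INFINITY and soft < target:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    print(raise_nofile_soft_limit())
```

If the reported hard limit is itself too low, it has to be raised through the system configuration first.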

pkg_resources.ContextualVersionConflict

When installing or upgrading the Cuckoo Package, some users have run into an error much like the following:

pkg_resources.ContextualVersionConflict: (HTTPReplay 0.1.5
(/usr/local/lib/python2.7/dist-packages),
Requirement.parse('HTTPReplay==0.1.17'), set(['Cuckoo']))

Now this is quite odd, as generally speaking we’ve specifically requested pip to install all dependencies with their exact version (and in fact, if you look at pip freeze you’ll see the correct version), but it does happen sometimes that older versions of various libraries are still around.

The easiest way to resolve this issue is by uninstalling all versions of said dependency and reinstalling Cuckoo. In the case presented above, with HTTPReplay, this may look as follows:

$ sudo pip uninstall httpreplay
Uninstalling HTTPReplay-0.1.17:
/usr/local/bin/httpreplay
/usr/local/bin/pcap2mitm
/usr/local/lib/python2.7/dist-packages/HTTPReplay-0.1.17-py2.7.egg-info
...
Proceed (y/n)? y
Successfully uninstalled HTTPReplay-0.1.17

$ sudo pip uninstall httpreplay
Uninstalling HTTPReplay-0.1.5:
/usr/local/lib/python2.7/dist-packages/HTTPReplay-0.1.5-py2.7.egg-info
Proceed (y/n)? y
Successfully uninstalled HTTPReplay-0.1.5

$ sudo pip uninstall httpreplay
Cannot uninstall requirement httpreplay, not installed

Reinstalling Cuckoo afterwards is then a matter of invoking pip install -U cuckoo or similar.
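Leftover copies of a dependency can be spotted by grouping the installed distributions by name. The following is a small sketch with a hypothetical helper, `find_conflicting_versions`; it assumes you have already collected (name, version) pairs yourself, and the sample data mirrors the HTTPReplay case above.

```python
from collections import defaultdict


def find_conflicting_versions(installed):
    """Return {name: [versions]} for packages present in more than one version."""
    versions = defaultdict(set)
    for name, version in installed:
        # Package names are case-insensitive, so normalize before grouping.
        versions[name.lower()].add(version)
    return {
        name: sorted(found)
        for name, found in versions.items()
        if len(found) > 1
    }


# Sample data modeled on the error message above.
installed = [
    ("HTTPReplay", "0.1.5"),
    ("HTTPReplay", "0.1.17"),
    ("Cuckoo", "2.0.0"),
]
print(find_conflicting_versions(installed))
```

Every package reported here is a candidate for the repeated uninstall shown above.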

ValueError: incomplete format key

This issue may appear at runtime after tinkering with settings in $CWD/conf, as input is passed to the configuration parser at runtime unescaped. Double-check your configuration files with an eye out for potentially troublesome character combinations such as %(.
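The error itself comes from percent-style string interpolation, and candidate lines can be found with a simple scan. This is a heuristic sketch, not Cuckoo's configuration parser: `find_unescaped_percent` is a hypothetical helper that flags any `%` that is neither doubled (`%%`) nor part of a complete `%(name)s` style reference.

```python
import re

# A '%' is harmless when doubled or when it starts a complete "%(name)s" reference.
_SAFE_PERCENT = re.compile(r"%%|%\([^)]*\)[sdif]")


def find_unescaped_percent(text):
    """Return 1-based numbers of lines that still contain a suspicious '%'."""
    suspicious = []
    for number, line in enumerate(text.splitlines(), start=1):
        if "%" in _SAFE_PERCENT.sub("", line):
            suspicious.append(number)
    return suspicious


# Percent formatting raises the same error for an unterminated "%(".
try:
    "password = abc%(def" % {}
except ValueError as exc:
    print(exc)

sample = "[section]\nratio = 100%%\npath = %(home)s/x\npassword = abc%(def\n"
print(find_unescaped_percent(sample))
```

Flagged lines can usually be fixed by doubling the percent sign.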

Troubleshooting VM network configuration

In case the network configuration of your Virtual Machine isn’t working as expected, Cuckoo will show a message asking you to resolve the issue, as it cannot use the VM for analyses as-is. There are numerous possible reasons why the network configuration and/or your setup are incorrect, so please read our documentation once more. Most often, however, the cause is one of the following:

  • The IP address of the VM has been configured incorrectly. Please verify that the VM has a static IP address, that it matches the one in the Cuckoo configuration, and that the configured network interface exists and is up. Also, in case of VirtualBox, did you configure the network interface to be a Host-Only interface?
  • Check that there are no firewalls in-place that hinder the communication between your Host and Guest and double check that the Host and Guest can ping each other as well as connect to each other.

If connections from the Cuckoo Host to the Guest work, but connections in the other direction do not, some additional problems may be at hand:

  • Is the network configuration equivalent on the host and in the VM? If not, e.g., if the VM sees different IP ranges, then you’ll have to configure the resultserver_ip and resultserver_port, for which we have separate documentation.
  • If you’ve modified the Cuckoo Analyzer (located at $CWD/analyzer) this error message may indicate that a syntax error or other exception was introduced, preventing the Analyzer from being properly started, and thus not being able to perform the analysis as expected.
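Beyond ping, a plain TCP connection attempt shows whether one side can actually reach a service on the other. The snippet below is a generic sketch: `can_connect` is a hypothetical helper, and the guest address 192.168.56.101 and port 8000 are assumptions for illustration that you should replace with the values from your own configuration.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable networks, and timeouts.
        return False


if __name__ == "__main__":
    # Hypothetical guest address and Agent port; adjust to your setup.
    print(can_connect("192.168.56.101", 8000))
```

Running the same check inside the Guest against the result server address and port tests the opposite direction.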

If you’ve triple-checked the above and are still experiencing issues, then please contact us through one of the various communication channels.

Cuckoo says there’s a version 2.1.0?

If you see the message Outdated! Cuckoo Sandbox version 2.1.0 is available now. and you’ve come to this FAQ entry then you’re entirely correct. There is indeed no version 2.1.0, yet (!). However, due to the logic implemented in the version checker of our 2.0-RC1 and 2.0-RC2 releases, the only way to inform our users about our latest releases is by having a “new” major version release (i.e., 2.1.0 or later). We’ve decided that it’s better to sling a little bit of confusion regarding a non-existing version than not mentioning any new versions to our users altogether. So please bear with us and install the latest version :-)

No handlers could be found for logger X in UWSGI log

If you see this message, it means Cuckoo is throwing an error before its loggers are initialized. This might happen if database migration or CWD updates are required.

Start the development web server to see the error:

$ cuckoo web

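In addition, you can ask the Cuckoo developers or other Cuckoo users for help: Join the discussion.

Contents

Introduction

This chapter introduces some basic malware analysis concepts, explains what Cuckoo is, and shows why it is well suited to malware analysis.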

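Sandboxing

As defined by Wikipedia, “in computer security, a sandbox is a security mechanism for separating running programs. It is often used to execute untested code, or untrusted programs from unverified third parties”.

This concept applies to malware analysis sandboxing too: the goal is to run an unknown and untrusted program inside an isolated environment and capture everything it does.

Malware sandboxing is a form of dynamic analysis: instead of statically analyzing the file, the program is executed and monitored in real time.

This approach has pros and cons, but it is a valuable technique for obtaining details about malware, such as its network behavior. Combining static and dynamic analysis therefore gives a deeper understanding of a malware sample.

Simply put, Cuckoo is a tool that lets you perform sandboxed malware analysis.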

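Using a Sandbox

Before you start installing, configuring, and using Cuckoo, take some time to think about what you want to achieve with it and how.

Some questions you should ask yourself:

  • What kind of files do I want to analyze?
  • What volume of information can I handle?
  • Which platform do I need to run my analyses on?
  • What kind of information do I want about the file?

The isolated environment (for example a virtual machine) is the most critical and important part of a malware sandbox: it should be planned carefully.

Before choosing a virtualization product, consider the following:

  • Which operating system, language, and patch level to use.
  • Which software to install and in which versions (particularly important when analyzing exploits).

Given the unpredictable nature of malware, whether an analysis succeeds can depend on many factors, and it will not succeed every time.

For example, consider leaving traces of normal usage in the virtual machine, such as browser history, cookies, documents, and images. Malware may be designed to operate on such files; if they are absent, that behavior cannot be observed.

Virtualized operating systems are easy to detect, and malware that detects a virtual machine may refuse to run, so hide the traces of virtualization as far as possible. Plenty of material on virtualization detection techniques and countermeasures is available online.

Once you have planned this, you can start deploying. You can always change things later, but thorough preparation up front means fewer problems down the road.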

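What is Cuckoo?

Cuckoo is an open source automated malware analysis system. It is typically used to run malware in an isolated environment and collect information about it for analysis.

It can produce the following types of results:

  • Traces of the function calls made by the malware.
  • File operations during execution, including files created, deleted, and downloaded.
  • Memory dumps of the malware processes.
  • Network traffic captures in PCAP format.
  • Screenshots taken while the malware runs.
  • Full memory dumps of the virtual machine.

History of Cuckoo

Cuckoo Sandbox started as a Google Summer of Code project in 2010 within The Honeynet Project. It was designed and developed by Claudio “nex” Guarnieri, who is still the project's leader and core developer.

After initial work during the summer of 2010, the first public beta release of Cuckoo was published on 5 February 2011.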

In March 2011, Cuckoo was selected again as a supported project during Google Summer of Code 2011 with The Honeynet Project, during which Dario Fernandes joined the project and extended its functionality.

On November 2nd 2011 Cuckoo 0.2 was released to the public as the first real stable release. In late November 2011 Alessandro “jekil” Tanasi joined the team, expanding Cuckoo’s processing and reporting functionality.

In December 2011 Cuckoo v0.3 was released, quickly reaching release 0.3.2 in early February.

In late January 2012 we opened Malwr.com, a free and public running Cuckoo Sandbox instance provided with a full fledged interface through which people can submit files to be analysed and get results back.

In March 2012 Cuckoo Sandbox wins the first round of the Magnificent7 program organized by Rapid7.

During the Summer of 2012 Jurriaan “skier” Bremer joined the development team, refactoring the Windows analysis component and significantly improving the quality of the analysis.

On 24th July 2012, Cuckoo Sandbox 0.4 is released.

On 20th December 2012, Cuckoo Sandbox 0.5 “To The End Of The World” is released.

On 15th April 2013 we released Cuckoo Sandbox 0.6, shortly after having launched the second version of Malwr.com.

On 1st August 2013 Claudio “nex” Guarnieri, Jurriaan “skier” Bremer and Mark “rep” Schloesser presented Mo’ Malware Mo’ Problems - Cuckoo Sandbox to the rescue at Black Hat Las Vegas.

On 9th January 2014, Cuckoo Sandbox 1.0 is released.

In March 2014 the Cuckoo Foundation was born as a non-profit organization dedicated to the growth of Cuckoo Sandbox and the surrounding projects and initiatives.

On 7th April 2014, Cuckoo Sandbox 1.1 is released.

On the 7th of October 2014, Cuckoo Sandbox 1.1.1 is released after a Critical Vulnerability had been disclosed by Robert Michel.

On the 4th of March 2015, Cuckoo Sandbox 1.2 has been released featuring a wide array of improvements regarding the usability of Cuckoo.

During summer 2015 Cuckoo Sandbox started the development of Mac OS X malware analysis as a Google Summer of Code project within The Honeynet Project. Dmitry Rodionov qualified for the project and developed a working analyzer for Mac OS X.

On the 21st of February 2016 version 2.0 Release Candidate 1 is released. This version ships with almost two years of combined effort into making Cuckoo Sandbox a better project for daily usage.

Use Cases

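Thanks to its modular design, Cuckoo can be used both as a standalone application and as a component integrated into larger systems.

It can be used to analyze:

  • Generic Windows executables
  • DLL files
  • PDF documents
  • Microsoft Office documents
  • URLs and HTML files
  • PHP scripts
  • CPL files
  • Visual Basic (VB) scripts
  • ZIP files
  • Java JAR files
  • Python scripts
  • Most other file types

Thanks to its modularity and scripting capabilities, Cuckoo places no limit on the systems you can build with it.

For more information, see the Customization chapter.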

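Architecture

Cuckoo Sandbox consists of a central management component that handles sample execution and analysis. Each analysis runs in an isolated virtual or physical machine.

Cuckoo is made up of a Host machine (the management component) and a number of sandboxes (physical or virtual machines). The management component on the Host manages the entire analysis of a sample, while the sample itself is executed inside a sandbox.

The following picture illustrates Cuckoo's main architecture:

_images/architecture-main.png

Obtaining Cuckoo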

Deprecated since version 2.0-rc2: Although Cuckoo can still be downloaded from the website, we discourage doing so, given that installing it through pip is the preferred way to get Cuckoo. Please refer to Installing Cuckoo.

Cuckoo can be downloaded from the official website, where the stable and packaged releases are distributed, or can be cloned from our official git repository.

Warning

While it is more up to date, including new features and bugfixes, the version available in the git repository should be considered under development. Its stability is therefore not guaranteed and it most likely lacks updated documentation.

License

The Cuckoo Sandbox license is shipped with Cuckoo and contained in the “LICENSE” file inside the “docs” folder.

Disclaimer

Cuckoo is distributed as is, in the hope that it will be useful, but without any warranty, not even the implied warranty of merchantability or fitness for a particular purpose.

Whatever you do with this tool is uniquely your own responsibility.

Cuckoo Foundation

The Cuckoo Foundation is a non-profit organization incorporated as a Stichting in the Netherlands. It is mainly dedicated to supporting the development and growth of Cuckoo Sandbox, an open source malware analysis system, and the surrounding projects and initiatives.

The Foundation operates to secure financial and infrastructure support to our software projects and coordinates the development and contributions from the community.

Community guidelines

Cuckoo Sandbox is an open source project and we appreciate any form of contribution. These guidelines are meant to help you and us to answer questions, solve issues, and merge code as soon as we can. So, it is great that you are reading these guidelines! We will try to keep this as short as possible.

Introduction

These guidelines contain information on

  • What to include when creating issues for
    • Reporting bugs/errors/unexpected behavior
    • Feature suggestions/requests
  • Contributing code/documentation

We obviously want to fix, help with, and merge issues and contributions as fast as possible. To do this, we will likely ask some questions/post comments on your issue or pull request. We ask that you keep an eye on your issue/PR and try to answer questions we ask. Realise that it may take a while before we fix your issue or answer your question.

If after 60 days there is no progress in an issue or PR because of missing information, we may consider closing the issue. You are, of course, always welcome to re-open it in case additional information can be provided!

Creating issues

Issues are useful for many things: reporting bugs, errors, and unexpected behavior, asking questions, making suggestions or feature requests, and so on. When creating any of these, it is very useful for us and for you if you include the information listed here.

Reporting bugs, errors, and unexpected behavior

Did you notice a bug, an error, or behavior you did not expect, and do you want to report it to us? That is great, thanks in advance! Before you report it, please see our FAQ, where common issues and their solutions are already covered. You may also find a solution by searching existing issues.

You can also contact us using any of the methods mentioned at cuckoosandbox.org/discussion.

Now, if you do create an issue, it is very useful if you include the following information, where it applies:

  • Use a descriptive issue title
  • Try to reproduce your issue
    • How can we reproduce it?
  • What was the intended goal of your usage of Cuckoo Sandbox?
    • Submitting a task, waiting for a result, adding a module etc.
  • Any information on your environment?
    • Your Cuckoo Sandbox version
    • The operating system the Cuckoo host is running on
    • Parts of the configuration related to the error
    • If you customized code, can you tell us what was customized?
  • What happened?
    • Try to explain what happened in detail - this makes it possible for us to reproduce, confirm, and fix the issue.
    • For errors etc, please include the log with this error. Preferably with a link to an online paste service.
    • If you can, include a hash of the file being analyzed by Cuckoo.
  • What did you try to do so far?
    • If you tried to do anything to fix it, please include what you have tried so far.

Feature requests/suggestions

You have thought of or would like to see a new feature in Cuckoo Sandbox. Maybe you have a suggestion to change something? Great! We would love to hear about it.

When creating a feature request/suggestion, include the following if it applies:

  • A descriptive issue title
  • What is your suggestion?
    • What do you want to change/add?
  • What is the goal of this change/addition?
  • Do you have suggestions for the implementation?
    • For example: using a specific library/package

Asking questions

Have a question about Cuckoo Sandbox? Maybe it has already been asked. Please see our FAQ and documentation first.

Did not find your answer? Feel free to contact us using any of the methods mentioned here, or by creating an issue.

Code and documentation contributions

You want to contribute by writing code or documentation? That is great, all help is appreciated! It is very easy to get started:

  1. Fork our repository
  2. Take a look at our development documentation for guidelines and tips
  3. Make the changes that you want to contribute
  4. Create a pull request

Testing

It is very important for us to keep Cuckoo Sandbox operational. This is why we only merge a contribution after we know it was tested and does not break anything. To unit test Cuckoo, we use Pytest. All existing tests for Cuckoo are located in the tests/ folder.

It would be appreciated if you did add a test to your contribution. This way, the correct operation of your contribution can be tested in the future.

Pull requests

When creating a pull request, please include the following:

  • What did you create/change?
  • What is the goal of this addition/change?
  • Did you test your addition/change?

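Installation

This chapter explains how to install Cuckoo. A Linux system (Debian, Ubuntu, or similar) is recommended.

Note

[Translator's note] Debian 8 was used for testing. Most software is installed on Debian in much the same way as on Ubuntu.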

Although the recommended setup is GNU/Linux (Debian or Ubuntu preferably), Cuckoo has proved to work smoothly on Mac OS X and Microsoft Windows 7 as host as well. The recommended and tested setup for guests are Windows XP and 64-bit Windows 7 for Windows analysis, Mac OS X Yosemite for Mac OS X analysis, and Debian for Linux Analysis, although Cuckoo should work with other releases of guest Operating Systems as well.

Note

This documentation refers to Host as the underlying operating systems on which you are running Cuckoo (generally being a GNU/Linux distribution) and to Guest as the Windows virtual machine used to run the isolated analysis.

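Preparing the Host

We recommend a GNU/Linux system; this documentation uses the latest Ubuntu LTS release as its example.

Requirements

Before installing and configuring Cuckoo, you need to install some required software packages and libraries.

Note

[Translator's note] When installing packages with Apt on Debian, you can drop the leading sudo from the commands (assuming you are working as root).

Installing Python libraries (on Ubuntu/Debian-based distributions)

Cuckoo's management component is written entirely in Python, so an appropriate version of Python must be installed. At this point only Python 2.7 is fully supported.

Older versions of Python and Python 3 (which may be supported in the future) are currently not supported.

The following packages, installed through Apt, are required: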

$ sudo apt-get install python python-pip python-dev libffi-dev libssl-dev
$ sudo apt-get install python-virtualenv python-setuptools
$ sudo apt-get install libjpeg-dev zlib1g-dev swig

To use the Django-based web interface, MongoDB is required:

$ sudo apt-get install mongodb

To use PostgreSQL as the database (recommended), PostgreSQL has to be installed as well:

$ sudo apt-get install postgresql libpq-dev

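Yara and Pydeep are optional plugins. If you decide to install them, follow the installation steps on their websites.

If you want to use KVM, install the KVM-related dependencies: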

$ sudo apt-get install qemu-kvm libvirt-bin ubuntu-vm-builder bridge-utils python-libvirt

If you want to use XenServer you’ll have to install the XenAPI Python package:

$ sudo pip install XenAPI

To use the mitm auxiliary module (SSL/TLS man-in-the-middle interception), you need to install mitmproxy. See the installation instructions on its website.

Installing Python libraries (on Mac OS X)

This is mostly the same as the installation on Ubuntu/Debian, except that we’ll be using the brew package manager. Install all the required dependencies as follows (this list is WIP):

$ brew install libmagic cairo pango openssl

In addition to that you’ll also want to expose the openssl header files in the standard GCC/Clang include directory, so that yara-python may compile successfully. This can be done as follows:

$ cd /usr/local/include
$ ln -s ../opt/openssl/include/openssl .

Installing Python libraries (on Windows 7)

To be documented.

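Virtualization Software

Cuckoo Sandbox supports most virtualization software, and support for additional virtualization products is easy to add.

This documentation uses VirtualBox as its example. The choice of virtualization software does not affect the subsequent analysis, but once you have chosen one you should configure it according to the corresponding documentation and FAQ.

Note

[Translator's note] KVM was used for testing.

Assuming you decide to go for VirtualBox, you can get the proper package for your distribution at the official download page. The following commands install the latest version of VirtualBox on your Ubuntu LTS machine. Note that Cuckoo supports VirtualBox 4.3, 5.0, and 5.1: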

$ echo deb http://download.virtualbox.org/virtualbox/debian xenial contrib | sudo tee -a /etc/apt/sources.list.d/virtualbox.list
$ wget -q https://www.virtualbox.org/download/oracle_vbox_2016.asc -O- | sudo apt-key add -
$ sudo apt-get update
$ sudo apt-get install virtualbox-5.1

For more information on VirtualBox, please refer to the official documentation.

Installing tcpdump

tcpdump is used to capture all network traffic generated by the malware during execution.

Install it with:

$ sudo apt-get install tcpdump apparmor-utils
$ sudo aa-disable /usr/sbin/tcpdump

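Disabling the AppArmor profile (the aa-disable command) is only needed when creating the PCAP file fails with a permission error; see Permission denied for tcpdump.

On Linux platforms where AppArmor is disabled, such as Debian, installing tcpdump is enough: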

$ sudo apt-get install tcpdump

tcpdump requires root privileges. If you do not want to run Cuckoo as root, set the following Linux capabilities on the binary:

$ sudo setcap cap_net_raw,cap_net_admin=eip /usr/sbin/tcpdump

You can verify that the configuration is correct with:

$ getcap /usr/sbin/tcpdump
/usr/sbin/tcpdump = cap_net_admin,cap_net_raw+eip
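The verification step can also be automated by parsing the getcap output. The sketch below uses a hypothetical helper, `has_capture_caps`. It accepts both the `path = caps+eip` form shown above and the `path caps=eip` form printed by newer libcap releases, and it only checks the capability names, not the `eip` flags.

```python
def has_capture_caps(getcap_output):
    """Return True if the tcpdump line lists cap_net_raw and cap_net_admin."""
    required = {"cap_net_raw", "cap_net_admin"}
    for line in getcap_output.splitlines():
        if "tcpdump" not in line:
            continue
        # The last whitespace-separated field holds "names+flags" or "names=flags".
        caps_field = line.split()[-1]
        names = caps_field.replace("+", "=").split("=")[0]
        return required.issubset(names.split(","))
    return False


sample = "/usr/sbin/tcpdump = cap_net_admin,cap_net_raw+eip"
print(has_capture_caps(sample))
```

An empty getcap output, which is what an unconfigured binary produces, yields False.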

If the setcap command is not available, install the following package:

$ sudo apt-get install libcap2-bin

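Or (not recommended):

$ sudo chmod +s /usr/sbin/tcpdump

Keep in mind that the setcap approach is not entirely safe either and could open the door to privilege escalation; we recommend installing Cuckoo in a dedicated environment.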

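Installing Volatility

Volatility is an optional tool for analyzing memory dumps. Used together with Cuckoo, it enables deeper and more comprehensive analysis and helps detect malware that uses rootkit techniques to escape the sandbox's monitoring.

To work properly, Cuckoo requires Volatility 2.3 or later; the latest release, 2.5, is recommended. It can be downloaded from the official repository.

See the installation instructions in the official Volatility documentation.

Installing M2Crypto

The M2Crypto library currently requires SWIG. On Ubuntu/Debian-like systems it can be installed as follows: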

$ sudo apt-get install swig

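Once SWIG is installed, install M2Crypto with:

$ sudo pip install m2crypto==0.24.0

Installing guacd

guacd is an optional service that provides the translation layer for RDP, VNC, and SSH. It is used by the remote control functionality in the Cuckoo web interface.

Without it, remote control does not work. Version 0.9.9 or later is required, and we recommend installing the latest version. Install it with: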

$ sudo apt install libguac-client-rdp0 libguac-client-vnc0 libguac-client-ssh0 guacd

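If you only need RDP support, you can skip the libguac-client-vnc0 and libguac-client-ssh0 packages.

If you are on an older Linux distribution and still want the latest guacd, you have to build it from source. The steps are listed below without further explanation: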

$ sudo apt -y install libcairo2-dev libjpeg-turbo8-dev libpng-dev libossp-uuid-dev libfreerdp-dev
$ mkdir /tmp/guac-build && cd /tmp/guac-build
$ wget https://www.apache.org/dist/guacamole/0.9.14/source/guacamole-server-0.9.14.tar.gz
$ tar xvf guacamole-server-0.9.14.tar.gz && cd guacamole-server-0.9.14
$ ./configure --with-init-dir=/etc/init.d
$ make && sudo make install && cd ..
$ sudo ldconfig
$ sudo /etc/init.d/guacd start

When installing from source, make sure you don’t have another version of any of the libguac- libraries installed from your package manager or you might experience issues due to incompatibilities which can crash guacd.

Note that the VirtualBox Extension Pack must also be installed to take advantage of the Cuckoo Control functionality exposed by Guacamole.

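Installing Cuckoo

Creating a user

Cuckoo can run under an existing user, or you can create a dedicated user for it. Either way, make sure the virtual machines and Cuckoo run under the same user.

Create a new user: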

$ sudo adduser cuckoo

If you’re using VirtualBox, make sure the new user belongs to the “vboxusers” group (or the group you used to run VirtualBox):

$ sudo usermod -a -G vboxusers cuckoo

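If you are using KVM, add the user to the libvirtd group:

$ sudo usermod -a -G libvirtd cuckoo

Raising file limits

As described in the FAQ entry IOError: [Errno 24] Too many open files, the operating system's limit on open files can cause report generation to fail. Consider raising the limit before starting Cuckoo.

Installing Cuckoo

Installing the latest version of Cuckoo is straightforward. We recommend upgrading pip and setuptools first and then installing Cuckoo with pip (this avoids problems such as DistributionNotFound / No distribution matching the version..).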

Warning

Missing dependencies cause all kinds of problems. Read the Requirements chapter carefully before installing.

$ sudo pip install -U pip setuptools
$ sudo pip install -U cuckoo

Installing Cuckoo globally works, but we strongly recommend installing it in a virtualenv:

$ virtualenv venv
$ . venv/bin/activate
(venv)$ pip install -U pip setuptools
(venv)$ pip install -U cuckoo

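Why we recommend using a virtualenv:

  • Cuckoo's dependencies are not always the latest versions and may conflict with versions already installed on the system.
  • Installing other software on the system may break Cuckoo's dependencies.
  • With a virtualenv, non-root users can install the required packages too.
  • Simply put, using a virtualenv is best practice.

Please refer to Cuckoo Working Directory and Cuckoo Working Directory Usage to learn more about the Cuckoo Working Directory and how to operate it.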

Install Cuckoo from file

By downloading a hard copy of the Cuckoo Package and installing it offline, one may set up Cuckoo using a cached copy and/or have a backup copy of current Cuckoo versions in the future. We also feature the option to download such a tarball on our website.

Obtaining the tarball of Cuckoo and all of its dependencies manually may be done as follows:

$ pip download cuckoo

You will end up with a file Cuckoo-2.0.0.tar.gz (or a higher number, depending on the latest released stable version) as well as all of its dependencies (e.g., alembic-0.8.8.tar.gz).

Installing that exact version of Cuckoo may be done as you’re familiar with from installing it using pip directly, except now using the filename of the tarball:

$ pip install Cuckoo-2.0.0.tar.gz

On systems where no internet connection is available, the $ pip download cuckoo command may be used to fetch all of the required dependencies and as such one should be able to - in theory - install Cuckoo completely offline using those files, i.e., by executing something like the following:

$ pip install *.tar.gz
Build/Install Cuckoo from source

By cloning Cuckoo Sandbox from our official repository, you can install it from source. After cloning, follow the steps mentioned in Development with the Python Package to start the installation.

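Cuckoo Working Directory

New in version 2.0.0.

Newer versions introduce the concept of the Cuckoo Working Directory (CWD), which stores all configuration files, generated data, and analysis results. It includes, but is not limited to, the following:

  • Configuration files
  • Cuckoo signatures
  • Cuckoo analyzer
  • Cuckoo agent
  • Yara rules
  • Cuckoo analysis data storage
  • Other files..

The Cuckoo Working Directory has a number of advantages over the previous approach.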

Note

This document merely shows the installation part of the CWD. For its actual usage, please refer to the Cuckoo Working Directory Usage document.

Configuration

If you have ever updated your Cuckoo setup to a later version, you have run into the issue where you had to make a backup of your configuration, update your Cuckoo instance, and either restore your configuration or re-apply it completely.

With the introduction of the CWD we have gotten rid of this update nightmare.

The CWD is created automatically the first time Cuckoo is run, with output like the following:

$ cuckoo -d

        _       _                   _             _              _            _
        /\ \     /\_\               /\ \           /\_\           /\ \         /\ \
        /  \ \   / / /         _    /  \ \         / / /  _       /  \ \       /  \ \
        / /\ \ \  \ \ \__      /\_\ / /\ \ \       / / /  /\_\    / /\ \ \     / /\ \ \
    / / /\ \ \  \ \___\    / / // / /\ \ \     / / /__/ / /   / / /\ \ \   / / /\ \ \
    / / /  \ \_\  \__  /   / / // / /  \ \_\   / /\_____/ /   / / /  \ \_\ / / /  \ \_\
    / / /    \/_/  / / /   / / // / /    \/_/  / /\_______/   / / /   / / // / /   / / /
    / / /          / / /   / / // / /          / / /\ \ \     / / /   / / // / /   / / /
/ / /________  / / /___/ / // / /________  / / /  \ \ \   / / /___/ / // / /___/ / /
/ / /_________\/ / /____\/ // / /_________\/ / /    \ \ \ / / /____\/ // / /____\/ /
\/____________/\/_________/ \/____________/\/_/      \_\_\\/_________/ \/_________/

Cuckoo Sandbox 2.0.0
www.cuckoosandbox.org
Copyright (c) 2010-2017

=======================================================================
    Welcome to Cuckoo Sandbox, this appears to be your first run!
    We will now set you up with our default configuration.
    You will be able to modify the configuration to your likings
    by exploring the /home/cuckoo/.cuckoo directory.

    Among other configurable things of most interest is the
    new location for your Cuckoo configuration:
            /home/cuckoo/.cuckoo/conf
=======================================================================

Cuckoo has finished setting up the default configuration.
Please modify the default settings where required and
start Cuckoo again (by running `cuckoo` or `cuckoo -d`).

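The output shows the exact location of the CWD. By default it is ~/.cuckoo in the current user's home directory, and the configuration files live in $CWD/conf.

Because the configuration now lives in the CWD, separate from the Cuckoo engine, future upgrades are easier to maintain: the two can be upgraded independently.

CWD path

By default the CWD is ~/.cuckoo. The path is determined as follows, from highest to lowest priority:

  • Through the --cwd command-line option (e.g., --cwd ~/.cuckoo).
  • Through the CUCKOO environment variable (e.g., export CUCKOO=~/.cuckoo).
  • Through the CUCKOO_CWD environment variable.
  • If the current directory is itself a CWD (e.g., after cd ~/.cuckoo the current directory is used as the CWD).
  • The default, ~/.cuckoo.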

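Because the CWD is configurable, it is in principle possible to run several Cuckoo instances in parallel, for example one for Windows analysis and one for Android analysis.

Some example commands for changing the CWD path follow.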

# Places the CWD in /opt/cuckoo. Note that Cuckoo will normally create the
# CWD itself, but in order to create a directory in /opt root capabilities
# are usually required.
$ sudo mkdir /opt/cuckoo
$ sudo chown cuckoo:cuckoo /opt/cuckoo
$ cuckoo --cwd /opt/cuckoo

# You could place this line in your .bashrc, for example.
$ export CUCKOO=/opt/cuckoo
$ cuckoo

Experimenting with multiple Cuckoo setups is now as simple as creating multiple CWD’s and configuring them accordingly.
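The priority order for locating the CWD can be sketched in a few lines of Python. This is an illustration of the documented order only, not Cuckoo's actual implementation: the function resolve_cwd and its parameters are hypothetical, and the current_dir_is_cwd flag stands in for however Cuckoo recognizes that the current directory is a CWD.

```python
import os


def resolve_cwd(cli_cwd=None, environ=None, current_dir_is_cwd=False,
                current_dir="."):
    """Pick the CWD following the documented priority order."""
    environ = os.environ if environ is None else environ
    if cli_cwd:                        # 1. --cwd command-line option
        return os.path.expanduser(cli_cwd)
    if environ.get("CUCKOO"):          # 2. CUCKOO environment variable
        return os.path.expanduser(environ["CUCKOO"])
    if environ.get("CUCKOO_CWD"):      # 3. CUCKOO_CWD environment variable
        return os.path.expanduser(environ["CUCKOO_CWD"])
    if current_dir_is_cwd:             # 4. the current directory is a CWD
        return os.path.abspath(current_dir)
    return os.path.expanduser("~/.cuckoo")   # 5. default location


# The command-line option wins over the environment variable.
print(resolve_cwd("/opt/cuckoo", {"CUCKOO": "/tmp/other"}))
```

Passing the environment as a plain dictionary keeps the sketch easy to try out without touching the real shell environment.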

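Configuration

Cuckoo relies on a few core configuration files:

  • cuckoo.conf: general behavior and analysis options.
  • auxiliary.conf: enabling and configuring auxiliary modules.
  • <machinery>.conf: options for your virtualization software (use the file that matches the machinery module you selected, e.g., kvm.conf for KVM).
  • memory.conf: Volatility configuration.
  • processing.conf: enabling and configuring processing modules.
  • reporting.conf: enabling or disabling report modules.

To get Cuckoo working you have to edit at least two files: cuckoo.conf and <machinery>.conf.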

cuckoo.conf

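The file is located at $CWD/conf/cuckoo.conf. Note that $CWD refers to the Cuckoo Working Directory; see Cuckoo Working Directory for details. cuckoo.conf contains the generic options, so make sure you understand an option before changing it.

The options are commented in detail in the file itself. The following deserve special mention:

  • machinery in [cuckoo]:
    Selects the machinery module (virtualization software) to use (e.g., virtualbox or vmware).
  • ip and port in [resultserver]:
    The IP address and port that the Cuckoo result server listens on. Make sure the virtual machines can reach this address and port over their network; otherwise the analysis may produce no results.
  • connection in [database]:
    Defines the database connection URL. Any Database URL format supported by SQLAlchemy can be used.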
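As an illustration, the three options can be read back with Python's standard configparser module. This is only a sketch: the sample values below (virtualbox, 192.168.56.1, 2042, and the SQLite URL) are illustrative assumptions, and read_core_settings is a hypothetical helper, not a Cuckoo API.

```python
import configparser

# Illustrative fragment; a real cuckoo.conf contains many more options.
SAMPLE_CUCKOO_CONF = """
[cuckoo]
machinery = virtualbox

[resultserver]
ip = 192.168.56.1
port = 2042

[database]
connection = sqlite:///cuckoo.db
"""


def read_core_settings(text):
    """Extract the machinery, result server address, and database URL."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        "machinery": parser.get("cuckoo", "machinery"),
        "resultserver": (parser.get("resultserver", "ip"),
                         parser.getint("resultserver", "port")),
        "database": parser.get("database", "connection"),
    }


settings = read_core_settings(SAMPLE_CUCKOO_CONF)
print(settings)
```

To inspect a real installation, read the text from $CWD/conf/cuckoo.conf instead of the inline sample.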

Warning

Check your interface for resultserver IP! Some virtualization software (for example Virtualbox) don’t bring up the virtual networking interfaces until a virtual machine is started. Cuckoo needs to have the interface where you bind the resultserver up before the start, so please check your network setup. If you are not sure about how to get the interface up, a good trick is to manually start and stop an analysis virtual machine, this will bring virtual networking up. If you are using NAT/PAT in your network, you can set up the resultserver IP to 0.0.0.0 to listen on all interfaces, then use the specific options resultserver_ip and resultserver_port in <machinery>.conf to specify the address and port as every machine sees them. Note that if you set resultserver IP to 0.0.0.0 in cuckoo.conf you have to set resultserver_ip for all your virtual machines.

auxiliary.conf

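Auxiliary modules run concurrently with the malware. Their options can be changed in this configuration file.

The following is the content of $CWD/conf/auxiliary.conf.

[Translator's note] The file contents are left untranslated; the options are fairly self-explanatory.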
[sniffer]
# Enable or disable the use of an external sniffer (tcpdump) [yes/no].
enabled = yes

# Specify the path to your local installation of tcpdump. Make sure this
# path is correct.
tcpdump = /usr/sbin/tcpdump

# We used to define the network interface to capture on in auxiliary.conf, but
# this has been moved to the "interface" field of each Virtual Machinery
# configuration.

# Specify a Berkeley packet filter to pass to tcpdump.
# Note: packer filtering is not possible when using "nictrace" functionality
# from VirtualBox (for example dumping inter-VM traffic).
bpf = 

[mitm]
# Enable man in the middle proxying (mitmdump) [yes/no].
enabled = no

# Specify the path to your local installation of mitmdump. Make sure this
# path is correct.
mitmdump = /usr/local/bin/mitmdump

# Listen port base. Each virtual machine will use its own port to be
# able to make a good distinction between the various running analyses.
# Generally port 50000 should be fine, in this case port 50001, 50002, etc
# will also be used - again, one port per analyses.
port_base = 50000

# Script file to interact with the network traffic. Please refer to the
# documentation of mitmproxy/mitmdump to get an understand of their internal
# workings. (https://mitmproxy.org/doc/scripting/inlinescripts.html)
script = stuff/mitm.py

# Path to the certificate to be used by mitmdump. This file will be
# automatically generated for you if you run mitmdump once. It's just that
# you have to copy it from ~/.mitmproxy/mitmproxy-ca-cert.p12 to somewhere
# in the analyzer/windows/ directory. Recommended is to write the certificate
# to analyzer/windows/bin/cert.p12, in that case the following option should
# be set to bin/cert.p12.
certificate = bin/cert.p12

[services]
# Provide extra services accessible through the network of the analysis VM
# provided in separate, standalone, Virtual Machines [yes/no].
enabled = no

# Comma-separated list with each Virtual Machine containing said service(s).
services = honeyd

# Time in seconds required to boot these virtual machines. E.g., some services
# will only get online after a minute because initialization takes a while.
timeout = 0

[reboot]
# This auxiliary module should be enabled for reboot analysis support.
enabled = yes
<machinery>.conf

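Machinery modules define how Cuckoo interacts with the virtualization software of your choice.

Every machinery module has its own configuration file; for KVM, for example, it is kvm.conf.

Cuckoo uses VirtualBox by default, which is configured in $CWD/conf/virtualbox.conf.

The configuration files of the different machinery modules look much alike, with minor differences. For example, XenServer is operated through an API, so its file also needs a URL and credentials.

The options are commented in detail in the configuration files.

The following is the content of $CWD/conf/kvm.conf.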

[kvm]
# Specify a comma-separated list of available machines to be used. For each
# specified ID you have to define a dedicated section containing the details
# on the respective machine. (E.g. cuckoo1,cuckoo2,cuckoo3)
machines = cuckoo1

# Specify the name of the default network interface that will be used
# when dumping network traffic with tcpdump.
# Example (virbr0 is the interface name):
interface = virbr0


[cuckoo1]
# Specify the label name of the current machine as specified in your
# libvirt configuration.
label = cuckoo1

# Specify the operating system platform used by current machine
# [windows/darwin/linux].
platform = windows

# Specify the IP address of the current virtual machine. Make sure that the
# IP address is valid and that the host machine is able to reach it. If not,
# the analysis will fail. You may want to configure your network settings in
# /etc/libvirt/<hypervisor>/networks/
ip = 192.168.122.101

# (Optional) Specify the snapshot name to use. If you do not specify a snapshot
# name, the KVM MachineManager will use the current snapshot.
# Example (Snapshot1 is the snapshot name):
snapshot = 

# (Optional) Specify the name of the network interface that should be used
# when dumping network traffic from this machine with tcpdump.
# Example (virbr0 is the interface name):
interface = 

# (Optional) Specify the IP of the Result Server, as your virtual machine sees it.
# The Result Server will always bind to the address and port specified in cuckoo.conf,
# however you could set up your virtual network to use NAT/PAT, so you can specify here
# the IP address for the Result Server as your machine sees it. If you don't specify an
# address here, the machine will use the default value from cuckoo.conf.
# NOTE: if you set this option you have to set result server IP to 0.0.0.0 in cuckoo.conf.
# Example:
resultserver_ip = 

# (Optional) Specify the port for the Result Server, as your virtual machine sees it.
# The Result Server will always bind to the address and port specified in cuckoo.conf,
# however you could set up your virtual network to use NAT/PAT, so you can specify here
# the port for the Result Server as your machine sees it. If you don't specify a port
# here, the machine will use the default value from cuckoo.conf.
# Example:
resultserver_port = 

# (Optional) Set your own tags. These are comma separated and help to identify
# specific VMs. You can run samples on VMs with tag you require.
tags = 

# (Optional) Specify the OS profile to be used by volatility for this
# virtual machine. This will override the guest_profile variable in
# memory.conf which solves the problem of having multiple types of VMs
# and properly determining which profile to use.
osprofile =
memory.conf

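Volatility offers a large number of plugins for memory analysis, some of which are very slow. The $CWD/conf/memory.conf configuration file lets you choose which plugins to run. Running memory analysis requires two switches to be turned on:

  • Enable Volatility processing (the [memory] section) in $CWD/conf/processing.conf
  • Enable memory_dump in $CWD/conf/cuckoo.conf

The basic section of $CWD/conf/memory.conf lets you configure whether the memory dump is deleted once it has been processed, which can save a lot of disk space. The section looks as follows: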

# Basic settings
[basic]
# Profile to avoid wasting time identifying it
guest_profile = WinXPSP2x86
# Delete memory dump after volatility processing.
delete_memdump = no

Below that, every plugin has its own configuration section:

# Scans for hidden/injected code and dlls
# http://code.google.com/p/volatility/wiki/CommandReference#malfind
[malfind]
enabled = on
filter = on

# Lists hooked api in user mode and kernel space
# Expect it to be very slow when enabled
# http://code.google.com/p/volatility/wiki/CommandReference#apihooks
[apihooks]
enabled = off
filter = on

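The whitelist filter can be turned on or off for each plugin individually. The pid_generic option in [mask] configures the whitelist of process IDs; processes on this list are left out of the memory analysis results: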

# Masks. Data that should not be logged
# Just get this information from your plain VM Snapshot (without running malware)
# This will filter out unwanted information in the logs
[mask]
# pid_generic: a list of process ids that already existed on the machine before the malware was started.
pid_generic = 4, 680, 752, 776, 828, 840, 1000, 1052, 1168, 1364, 1428, 1476, 1808, 452, 580, 652, 248, 1992, 1696, 1260, 1656, 1156
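To make the effect of pid_generic concrete, the sketch below parses such a comma-separated value and drops the masked process IDs from a list of observed ones. It only illustrates the idea of a PID whitelist; parse_pid_list and filter_pids are hypothetical helpers, not functions from Cuckoo or Volatility.

```python
def parse_pid_list(value):
    """Turn a comma-separated option value such as "4, 680, 752" into a set."""
    return {int(item) for item in value.split(",") if item.strip()}


def filter_pids(observed, masked):
    """Keep only the process IDs that are not on the whitelist."""
    return [pid for pid in observed if pid not in masked]


masked = parse_pid_list("4, 680, 752, 776")
# PIDs 4 and 680 already existed in the clean snapshot and are dropped.
print(filter_pids([4, 1234, 680, 2048], masked))
```

The whitelist itself is meant to be collected from the clean snapshot, before any malware has run, as the comments in the configuration file point out.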
processing.conf

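This configuration file enables, disables, and configures the processing modules. Processing modules belong to the cuckoo.processing module and analyze the raw data collected during an analysis.

Each processing module has its own section in $CWD/conf/processing.conf.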

# Enable or disable the available processing modules [yes/no].
# If you add a custom processing module to your Cuckoo setup, you have to add
# a dedicated entry in this file, or it won't be executed.
# You can also add additional options under the section of your module and
# they will be available in your Python class.

[analysisinfo]
enabled = yes

[apkinfo]
enabled = no
# Decompiling dex files with androguard in a heavy operation. For large dex
# files it can really take quite a while - it is recommended to limit to a
# certain filesize.
decompilation_threshold = 5000000

[baseline]
enabled = no

[behavior]
enabled = yes

[buffer]
enabled = yes

[debug]
enabled = yes

[droidmon]
enabled = no

[dropped]
enabled = yes

[dumptls]
enabled = yes

[extracted]
enabled = yes

[googleplay]
enabled = no
android_id = 
google_login = 
google_password = 

[memory]
# Create a memory dump of the entire Virtual Machine. This memory dump will
# then be analyzed using Volatility to locate interesting events that can be
# extracted from memory.
enabled = no

[misp]
enabled = no
url = 
apikey = 

# Maximum amount of IOCs to look up (hard limit).
maxioc = 100

[network]
enabled = yes

# Allow domain whitelisting
whitelist_dns = no

# Allow DNS responses from your configured DNS server for whitelisting to
# deactivate when responses come from some other DNS
# Can be also multiple like : 8.8.8.8,8.8.4.4
allowed_dns = 

[procmemory]
# Enables the creation of process memory dumps for each analyzed process right
# before they terminate themselves or right before the analysis finishes.
enabled = yes
# It is possible to load these process memory dumps in IDA Pro through the
# generation of IDA Python-based script files. Although currently symbols and
# such are not properly recovered, it is still nice to get a quick look at
# specific memory addresses of a process.
idapro = no
# Extract executable images from this process memory dump. This allows us to
# relatively easily extract injected executables.
extract_img = yes
# Also extract DLL files from the process memory dump.
extract_dll = no
# Delete process memory dumps after analysis to save disk space.
dump_delete = no

[procmon]
# Enable procmon processing. This only takes place when the "procmon=1" option
# is set for an analysis.
enabled = yes

[screenshots]
enabled = yes
# Set to the actual tesseract path (i.e., /usr/bin/tesseract or similar)
# rather than "no" to enable OCR analysis of screenshots.
# Note: doing OCR on the screenshots is a rather slow process.
tesseract = no

[snort]
enabled = no
# Following are various configurable settings. When in use of a recent 2.9.x.y
# version of Snort there is no need to change any of the following settings as
# they represent the defaults.
#
snort = /usr/local/bin/snort
conf = /etc/snort/snort.conf

[static]
enabled = yes
# On bigger PDF files PeePDF may take a substantial amount of time to perform
# static analysis of PDF files, with times of over an hour per file estimated
# in production. This option will by default limit the maximum processing time
# to one minute, but this may be adjusted accordingly. Note that if the timeout
# is hit, no static analysis results through PeePDF will be available.
pdf_timeout = 60

[strings]
enabled = yes

[suricata]
enabled = no

# Following are various configurable settings. When in use of a recent version
# of Suricata there is no need to change any of the following settings as they
# represent the defaults.
suricata = /usr/bin/suricata
conf = /etc/suricata/suricata.yaml
eve_log =  eve.json
files_log = files-json.log
files_dir = files

# By specifying the following line our processing module can use the socket
# mode in Suricata. This is quite the performance improvement as instead of
# having to load all the Suricata rules for each time the processing module is
# ran (i.e., for every task), the rules are only loaded once and then we talk
# to its API. This does require running Suricata as follows or similar;
# "suricata --unix-socket -D".
# (Please find more information in utils/suricata.sh for now).
# socket = /var/run/suricata/cuckoo.socket
socket = 

[targetinfo]
enabled = yes

[virustotal]
enabled = no
# How much time we can wait to establish VirusTotal connection and get the
# report.
timeout = 60
# Enable this option if you want to submit files to VirusTotal not yet available
# in their database.
# NOTE: if you are dealing with sensitive stuff, enabling this option you could
# leak some files to VirusTotal.
scan = no
# Add your VirusTotal API key here. The default API key, kindly provided
# by the VirusTotal team, should enable you with a sufficient throughput
# and while being shared with all our users, it shouldn't affect your use.
key = a0283a2c3d55728300d064874239b5346fb991317e8449fe43c902879d758088

[irma]
enabled = no
# IRMA @ github : https://github.com/quarkslab/irma
# How much time we can wait to establish IRMA connection and get the report.
timeout = 60
# Enable this option if you want to submit files to IRMA not yet available.
scan = no
# Force scan of submitted files
force = no
# URL to your IRMA installation
# For example : https://your.irma.host
url =

If you have your own VirusTotal key, you can replace the default key with it.
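Since every processing module is switched through the enabled option of its own section, a quick overview of what is turned on can be obtained with configparser. The sketch below works on a shortened, illustrative fragment of the file; enabled_modules is a hypothetical helper, not part of Cuckoo.

```python
import configparser

# Shortened fragment of processing.conf, for illustration only.
SAMPLE_PROCESSING_CONF = """
[analysisinfo]
enabled = yes

[apkinfo]
enabled = no
decompilation_threshold = 5000000

[behavior]
enabled = yes

[memory]
enabled = no
"""


def enabled_modules(text):
    """List the sections whose enabled option is switched on."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return [name for name in parser.sections()
            if parser.getboolean(name, "enabled", fallback=False)]


print(enabled_modules(SAMPLE_PROCESSING_CONF))
```

configparser's getboolean accepts yes/no as well as on/off, so the same helper also works for files that use either spelling.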

reporting.conf

$CWD/conf/reporting.conf configures report generation.

Its content is as follows.

# Enable or disable the available reporting modules [on/off].
# If you add a custom reporting module to your Cuckoo setup, you have to add
# a dedicated entry in this file, or it won't be executed.
# You can also add additional options under the section of your module and
# they will be available in your Python class.

[feedback]
# Automatically report errors that occurred during an analysis. Requires the
# Cuckoo Feedback settings in cuckoo.conf to have been filled out properly.
enabled = no

[jsondump]
enabled = yes
indent = 4
calls = yes

[singlefile]
# Enable creation of report.html and/or report.pdf?
enabled = no
# Enable creation of report.html?
html = no
# Enable creation of report.pdf?
pdf = no

[misp]
enabled = no
url = 
apikey = 

# The various modes describe which information should be submitted to MISP,
# separated by whitespace. Available modes: maldoc ipaddr hashes url.
mode = maldoc ipaddr hashes url

[mongodb]
enabled = no
host = 127.0.0.1
port = 27017
db = cuckoo
store_memdump = yes
paginate = 100
# MongoDB authentication (optional).
username = 
password = 

[elasticsearch]
enabled = no
# Comma-separated list of ElasticSearch hosts. Format is IP:PORT, if port is
# missing the default port is used.
# Example: hosts = 127.0.0.1:9200, 192.168.1.1:80
hosts = 127.0.0.1
# Increase default timeout from 10 seconds, required when indexing larger
# analysis documents.
timeout = 300
# Set to yes if we want to be able to search every API call instead of just
# through the behavioral summary.
calls = no
# Index of this Cuckoo instance. If multiple Cuckoo instances connect to the
# same ElasticSearch host then this index (in Moloch called "instance") should
# be unique for each Cuckoo instance.
index = cuckoo

# Logging time pattern.  This sets how elasticsearch creates indexes
# by default it is yearly in most instances this will be sufficient
# valid options: yearly, monthly, daily
index_time_pattern = yearly

# Cuckoo node name in Elasticsearch to identify reporting host. Can be useful
# for automation and while referring back to correct Cuckoo host.
cuckoo_node = 

[moloch]
enabled = no
# If the Moloch web interface is hosted on a different IP address than the
# Cuckoo Web Interface then you'll want to override the IP address here.
host = 
# If you wish to run Moloch in http (insecure) versus https (secure) mode,
# set insecure to yes.
insecure = no

# Following are various configurable settings. When in use of a recent version
# of Moloch there is no need to change any of the following settings as they
# represent the defaults.
moloch_capture = /data/moloch/bin/moloch-capture
conf = /data/moloch/etc/config.ini
instance = cuckoo

[notification]
# Notification module to inform external systems that analysis is finished.
# You should consider keeping this as very last reporting module.
enabled = no

# External service URL where info will be POSTed.
# example : https://my.example.host/some/destination/url
url = 

# Cuckoo host identifier - can be hostname.
# for example : my.cuckoo.host
identifier = 

[mattermost]
enabled = no

# Mattermost webhook URL.
# example : https://my.mattermost.host/hooks/yourveryrandomkey
url = 

# Cuckoo host URL to make analysis ID clickable.
# example : https://my.cuckoo.host/
myurl = 

# Username to show when posting message
username = cuckoo

# What kind of data to show apart from default.
# Show virustotal hits.
show_virustotal = no

# Show matched cuckoo signatures.
show_signatures = no

# Show collected URL-s by signature "network_http".
show_urls = no

# Hide filename and create hash of it
hash_filename = no
# Hide URL and create hash of it
hash_url = no

Enable or disable the generation of each report by setting the enabled option of its section to yes or no.

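Per-Analysis Network Routing

Since Cuckoo 2.0-rc1, each analysis can have its own network routing. In other words, when analyzing three files, the first can be denied network access, the second can reach the network through a VPN, and the third can be routed through Tor.

Besides per-analysis routing, the global routing setup that was more commonly used in the past is still available.

Our examples use VirtualBox.

Global routing

Before diving into the more featureful and more complex per-analysis routing, we first look at the older global routing approach. It is based on iptables rules that are set once and then left in place.

In the following setup we assume that the interface assigned to our VirtualBox VM is vboxnet0, that the VM has IP address 192.168.56.101 in a /24 subnet, and that the outgoing interface is eth0. The iptables rules below allow the VM to reach the Cuckoo host as well as the Internet.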

$ sudo iptables -t nat -A POSTROUTING -o eth0 -s 192.168.56.0/24 -j MASQUERADE

# Default drop.
$ sudo iptables -P FORWARD DROP

# Existing connections.
$ sudo iptables -A FORWARD -m state --state RELATED,ESTABLISHED -j ACCEPT

# Accept connections from vboxnet to the whole internet.
$ sudo iptables -A FORWARD -s 192.168.56.0/24 -j ACCEPT

# Internal traffic.
$ sudo iptables -A FORWARD -s 192.168.56.0/24 -d 192.168.56.0/24 -j ACCEPT

# Log stuff that reaches this point (could be noisy).
$ sudo iptables -A FORWARD -j LOG

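That is most of the setup. One kernel setting remains: IP forwarding must be enabled. The two commands below are equivalent ways of doing so, and the setting only lasts until the next reboot; to keep it, run the command automatically at boot: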

$ echo 1 | sudo tee -a /proc/sys/net/ipv4/ip_forward
$ sudo sysctl -w net.ipv4.ip_forward=1

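The iptables rules are also temporary and are lost on reboot. To keep them, install iptables-persistent or use a startup script.

Some newer Linux distributions use a different naming scheme for network interfaces, so pay particular attention to interface names when configuring.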
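The rules above all key on the 192.168.56.0/24 source network, so a VM whose address falls outside that range would silently not be routed. The sketch below checks membership with Python's standard ipaddress module; in_vm_subnet is a hypothetical helper and the addresses are the ones assumed in this section.

```python
import ipaddress

# Subnet used by the iptables rules in this section.
VM_SUBNET = ipaddress.ip_network("192.168.56.0/24")


def in_vm_subnet(address):
    """Return True if the address is covered by the rules for the VM subnet."""
    return ipaddress.ip_address(address) in VM_SUBNET


print(in_vm_subnet("192.168.56.101"))   # the example VM
print(in_vm_subnet("192.168.57.10"))    # a machine on another network
```

If your host-only network uses a different range, adjust both the subnet in the iptables commands and the value checked here.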

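Per-analysis routing configuration

Having looked at the older routing approach, we now turn to the more fine-grained, dynamic network routing component.

As mentioned in the introduction of this chapter, version 2.0-rc1 introduced the Cuckoo Rooter, which adds more routing options.

The available routing options are listed below.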

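Routing option      Description
None Routing        No routing. This is the only option that does not require the Rooter to be running; the default routing applies.
Drop Routing        Completely drops all non-Cuckoo traffic, including traffic within the VM subnet.
Internet Routing    Full Internet access.
InetSim Routing     Routes all traffic to simulated network services on the host.
Tor Routing         Routes all traffic through Tor.
VPN Routing         Routes all traffic through one of possibly several preconfigured VPNs.

Using per-analysis network routing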

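Now that we know the basics, it is time to put them into practice. Once Cuckoo is configured and the Cuckoo Rooter is running, choosing a routing option for an analysis is simple.

See Cuckoo Rooter Usage for details on the Cuckoo Rooter.

Configuring iproute2

Because of how the Linux kernel handles TCP/IP source routing, every network interface that we use must be registered with iproute2.

Below we use Internet Routing as the example.

Assume the outgoing interface is eth0.

To configure iproute2, open the /etc/iproute2/rt_tables file, which looks roughly as follows: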

#
# reserved values
#
255     local
254     main
253     default
0       unspec
#
# local
#

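Pick a number that is not yet used in the file and append a new line consisting of that number followed by the interface name. For example: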

#
# reserved values
#
255     local
254     main
253     default
0       unspec
#
# local
#

400     eth0

If you use multiple interfaces, each of them needs its own entry in this file.
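The "pick an unused number and append a line" step can be sketched as follows. The code works on the text of the file rather than on /etc/iproute2/rt_tables itself, and both helper names (parse_rt_tables, add_interface) as well as the starting number 400 are assumptions taken from the example above, not part of iproute2 or Cuckoo.

```python
def parse_rt_tables(text):
    """Map table numbers to names, ignoring comments and blank lines."""
    tables = {}
    for line in text.splitlines():
        parts = line.split("#", 1)[0].split()
        if len(parts) >= 2:
            tables[int(parts[0])] = parts[1]
    return tables


def add_interface(text, interface, start=400):
    """Append "<number> <interface>" using the first unused number from start."""
    tables = parse_rt_tables(text)
    if interface in tables.values():
        return text                      # already registered, nothing to do
    table_id = start
    while table_id in tables:
        table_id += 1
    return text.rstrip("\n") + "\n%d     %s\n" % (table_id, interface)


SAMPLE_RT_TABLES = """#
# reserved values
#
255     local
254     main
253     default
0       unspec
#
# local
#
"""

print(add_interface(SAMPLE_RT_TABLES, "eth0"))
```

Calling add_interface again with the same interface returns the text unchanged, so the step is safe to repeat for each interface you register.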

None Routing

Nothing is done; the Global routing setup applies.

Drop Routing

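Drop routing is much like the default None Routing (when no global iptables rules are configured), except that it actively denies the VM any access to the Internet.

With drop routing only Cuckoo-internal traffic is allowed; any outbound DNS request or TCP/IP connection is blocked.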

Internet Routing

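This routing mode gives the VM full Internet access. For exactly that reason it also lets malware connect out through your uplink, which is why we call it the dirty line.

Note

The outgoing interface of the dirty line must be registered as described in Configuring iproute2.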

InetSim Routing

InetSim is open source software that provides simulated network services. To use InetSim Routing you need to deploy InetSim and configure Cuckoo so that it can use it.

The configuration in $CWD/conf/routing.conf:

[inetsim]
enabled = yes
server = 192.168.56.1

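To get started with InetSim quickly, download the latest REMnux release, which ships with a recent version of InetSim.

Note

[Translator's note] REMnux is a virtual machine that bundles many analysis tools.

Tor Routing

Note

[Translator's note] The Tor network is basically unusable in China, so this section was not translated.

Note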

Although we highly discourage the use of Tor for malware analysis - the maintainers of Tor exit nodes already have a hard enough time keeping up their servers - it is in fact a well-supported feature.

First of all Tor will have to be installed. Please find instructions on installing the latest stable version of Tor here.

We’ll then have to modify the Tor configuration file (not talking about Cuckoo’s configuration for Tor yet!) In order to do so, we will have to provide Tor with the listening address and port for TCP/IP connections and UDP requests. For a default VirtualBox setup, where the host machine has IP address 192.168.56.1, the following lines will have to be configured in the /etc/tor/torrc file:

TransPort 192.168.56.1:9040
DNSPort 192.168.56.1:5353

Don’t forget to restart Tor (/etc/init.d/tor restart). That leaves us with the Tor configuration for Cuckoo, which may be found in the $CWD/conf/routing.conf file. The configuration is pretty self-explanatory so we’ll leave filling it out as an exercise to the reader (in fact, toggling the enabled field goes a long way):

[tor]
enabled = yes
dnsport = 5353
proxyport = 9040

Note that the port numbers in the /etc/tor/torrc and $CWD/conf/routing.conf files must match in order for the two to interact correctly.
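Because a mismatch between the two files is easy to overlook, the comparison can be automated. The sketch below parses the two fragments shown in this section and checks that the ports agree; torrc_ports and tor_ports_match are hypothetical helpers, and the parsing only covers the simple "TransPort address:port" and "DNSPort address:port" forms used here.

```python
import configparser


def torrc_ports(text):
    """Extract the TransPort and DNSPort port numbers from torrc text."""
    ports = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("TransPort", "DNSPort"):
            # Accept both "9040" and "192.168.56.1:9040".
            ports[parts[0]] = int(parts[1].rsplit(":", 1)[-1])
    return ports


def tor_ports_match(torrc_text, routing_text):
    """Return True if torrc and the [tor] section of routing.conf agree."""
    parser = configparser.ConfigParser()
    parser.read_string(routing_text)
    ports = torrc_ports(torrc_text)
    return (ports.get("TransPort") == parser.getint("tor", "proxyport")
            and ports.get("DNSPort") == parser.getint("tor", "dnsport"))


TORRC = "TransPort 192.168.56.1:9040\nDNSPort 192.168.56.1:5353\n"
ROUTING = "[tor]\nenabled = yes\ndnsport = 5353\nproxyport = 9040\n"
print(tor_ports_match(TORRC, ROUTING))
```

For a real setup, read the text from /etc/tor/torrc and $CWD/conf/routing.conf instead of the inline samples.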

VPN Routing

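VPN routing lets analyses go out through multiple VPN endpoints. By configuring VPNs located in different countries, we can observe whether malware behaves differently depending on the country or IP address it appears to run from.

VPNs are configured much like virtual machines, in the $CWD/conf/routing.conf file.

An example configuration: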

[vpn]
# Are VPNs enabled?
enabled = yes

# Comma-separated list of the available VPNs.
vpns = vpn0

[vpn0]
# Name of this VPN. The name is represented by the filepath to the
# configuration file, e.g., cuckoo would represent /etc/openvpn/cuckoo.conf
# Note that you can't assign the names "none" and "internet" as those would
# conflict with the routing section in cuckoo.conf.
name = vpn0

# The description of this VPN which will be displayed in the web interface.
# Can be used to for example describe the country where this VPN ends up.
description = Spain, Europe

# The tun device hardcoded for this VPN. Each VPN *must* be configured to use
# a hardcoded/persistent tun device by explicitly adding the line "dev tunX"
# to its configuration (e.g., /etc/openvpn/vpn1.conf) where X in tunX is a
# unique number between 0 and your lucky number of choice.
interface = tun0

# Routing table name/id for this VPN. If table name is used it *must* be
# added to /etc/iproute2/rt_tables as "<id> <name>" line (e.g., "201 tun0").
# ID and name must be unique across the system (refer /etc/iproute2/rt_tables
# for existing names and IDs).
rt_table = tun0

Note

Every VPN interface must be registered as described in Configuring iproute2.

Configuration (Android Analysis)

Deprecated since version 2.0-rc2: Android Analysis may not work as expected due to the changes to becoming a Cuckoo Package. Proper Android integration will be picked up as a Cuckoo update in the future.

Note

[Translator's note] Not supported for the time being; to be revisited later.

To get Cuckoo running Android analysis you should download the Android SDK and extract it in a folder Cuckoo can access. You should also configure avd.conf with the settings of your setup.

avd.conf

The main file for Android environment settings is $CWD/conf/avd.conf, it contains all the generic configuration used to launch the Android emulator and run the analysis.

The file is largely commented and self-explanatory, but some important options are as follows:

  • emulator_path:
    The path to the Android emulator (it is located inside Android SDK).
  • adb_path:
    The path to the Android Debug Bridge utility (it is located inside Android SDK).
  • avd_path:
    The path where the AVD images are located.

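Preparing the Guest

At this point you should have configured Cuckoo and designed and defined the virtual machines to be used for analysis.

Now follow the sections below to install and set up the virtual machine. We use Windows as the example; for a Linux guest, see Installing the Linux host.

Creating the virtual machine

Once the virtualization software is installed, you can create the virtual machine.

Using and configuring the virtualization software is outside the scope of this document; refer to the official documentation of the software you chose.

Note

You can find some hints and considerations on how to design and create your virtualized environment in the Sandboxing chapter.

Note

We recommend a 64-bit Windows 7 or a Windows XP virtual machine. On Windows 7, User Access Control (UAC) must be disabled.

Changed in version 2.0-rc2: We used to suggest Windows XP as a guest VM but nowadays a 64-bit Windows 7 machine yields much better results.

Note

KVM users: be sure to choose a disk image format that supports snapshots. See Saving the Virtual Machine for more information.

Cuckoo does not require any specific hardware configuration for the virtual machines it uses; choose whatever best fits your needs.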

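Requirements

For the Windows virtual machine to work properly with Cuckoo, some required software and libraries must be installed.

Installing Python

Python is strictly required for the Cuckoo guest component (the analyzer) to work.

It can be downloaded and installed from the official website; Python 2.7 is required.

The Cuckoo guest component can also make use of some additional Python libraries. These are not strictly required, but without them some features of the analysis component will not work.

Additional software

At this point everything Cuckoo needs to work properly is installed.

Depending on the types of files you want to analyze, you will also need to install the matching software, such as browsers, PDF readers, and office suites. Remember to disable update checks and automatic updates in these applications.

Whether to install such additional software depends entirely on your needs. See the Sandboxing chapter for more information.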

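Network Configuration

Now configure the network of the virtual machine.

Windows settings

Before configuring the underlying networking, you may want to adjust a few settings inside the Windows guest.

Most importantly, turn off Windows Firewall and Automatic Updates. Both can affect the behavior of the malware and therefore Cuckoo's analysis of that behavior.

Both can be turned off from the Windows Control Panel, as shown in the figure:

[Figure: Windows security settings (_images/windows_security.png)]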
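Virtual networking

Now decide how the virtual machine will access the Internet or the local network.

In older releases, Cuckoo exchanged data between the host and the guest through shared folders. Since version 0.4 the two communicate over the network using the XMLRPC protocol.

The guest therefore needs a static IP address. After configuring it, use ping to verify that the guest and the host can reach each other. Do not use DHCP: if the address changes between runs, communication breaks.

All of this depends on your requirements and on the features of the virtualization software you chose.

Warning

Virtual networking errors! Virtual network configuration is a critical part of the setup: most problems people run into with Cuckoo are related to networking. After finishing the configuration, test it with tools such as ping and telnet.

We recommend a Host-Only virtual network. See Per-Analysis Network Routing for more information.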
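The telnet-style check can also be scripted. The sketch below attempts a TCP connection from the host and reports whether it succeeds; can_connect is a hypothetical helper, and the guest address and port you pass to it (for example the static IP of the VM and the port the agent listens on) depend on your own setup.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

For example, can_connect("192.168.56.101", 8000) would test the example guest address used earlier in this document; the port 8000 is an assumption about where the agent listens and should be replaced with the port used in your environment.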

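Installing the Agent

Since version 0.4, Cuckoo uses a cross-platform agent that runs on Windows, Android, Linux, and Mac OS X.

Analyses only work when the Cuckoo agent is installed and running in the guest.

Installing and starting the agent is very simple.

The agent.py file is located in the $CWD/agent/ directory. Copy it into the virtual machine and start the script. The agent starts a small API server that the host uses to communicate with the guest.

On Windows, renaming the script from agent.py to agent.pyw makes it run without showing a console window.

Copying the script to the Startup folder makes it start automatically at boot.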

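Saving the Virtual Machine

Before taking the snapshot, make sure that Windows has fully booted and that the agent is running.

When the virtual machine is ready, take a snapshot to save its prepared state. The procedure differs slightly for each virtualization software.

The following sections describe how to take a snapshot with several of them.

VirtualBox

With VirtualBox you can take the snapshot from the graphical interface or from the command line: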

$ VBoxManage snapshot "<Name of VM>" take "<Name of snapshot>" --pause

After the snapshot has been created, power off the virtual machine and restore the snapshot with the following commands:

$ VBoxManage controlvm "<Name of VM>" poweroff
$ VBoxManage snapshot "<Name of VM>" restorecurrent
KVM

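If you decide to use KVM, first choose a disk image format that supports snapshots. We recommend QCOW2, which makes taking snapshots easy.

The most convenient way to manage KVM virtual machines is through libvirt, which provides the virsh command-line tool and the virt-manager graphical interface.

If the virtual machine was created with a disk that is not in QCOW2 format, the disk can be converted, for example: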

$ cd /your/disk/image/path
$ qemu-img convert -O qcow2 your_disk.raw your_disk.qcow2

Then edit the virtual machine definition:

$ virsh edit "<Name of VM>"

Find the disk section:

<disk type='file' device='disk'>
    <driver name='qemu' type='raw'/>
    <source file='/your/disk/image/path/your_disk.raw'/>
    <target dev='hda' bus='ide'/>
    <address type='drive' controller='0' bus='0' unit='0'/>
</disk>

Change the driver type and the disk file name to match the converted image:

<disk type='file' device='disk'>
    <driver name='qemu' type='qcow2'/>
    <source file='/your/disk/image/path/your_disk.qcow2'/>
    <target dev='hda' bus='ide'/>
    <address type='drive' controller='0' bus='0' unit='0'/>
</disk>

Boot the virtual machine again to check that it still works.
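The same two attribute changes can be expressed programmatically, which is handy when several definitions have to be updated. The sketch below uses Python's standard xml.etree.ElementTree on the disk fragment shown above; switch_disk_to_qcow2 is a hypothetical helper, and the result would still have to be applied through virsh edit (or another libvirt mechanism) by hand.

```python
import xml.etree.ElementTree as ET

DISK_XML = """<disk type='file' device='disk'>
    <driver name='qemu' type='raw'/>
    <source file='/your/disk/image/path/your_disk.raw'/>
    <target dev='hda' bus='ide'/>
    <address type='drive' controller='0' bus='0' unit='0'/>
</disk>"""


def switch_disk_to_qcow2(disk_xml, new_path):
    """Set the driver type to qcow2 and point the source at the new image."""
    disk = ET.fromstring(disk_xml)
    disk.find("driver").set("type", "qcow2")
    disk.find("source").set("file", new_path)
    return ET.tostring(disk, encoding="unicode")


print(switch_disk_to_qcow2(DISK_XML, "/your/disk/image/path/your_disk.qcow2"))
```

Only the driver type and the source file change; the target and address elements are passed through untouched.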

Create the snapshot with the following command:

$ virsh snapshot-create "<Name of VM>"

Having multiple snapshots can lead to the following error:

ERROR: No snapshot found for virtual machine VM-Name

You can list and delete a virtual machine's snapshots with the following commands:

$ virsh snapshot-list "VM-Name"
$ virsh snapshot-delete "VM-Name" 1234567890
VMware Workstation

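With VMware Workstation you can likewise create the snapshot from the graphical interface or from the command line:

$ vmrun snapshot "/your/disk/image/path/vmware_image_name.vmx" your_snapshot_name

your_snapshot_name is the name of the snapshot. Once it has been created, shut down the virtual machine, either from the graphical interface or from the command line:

$ vmrun stop "/your/disk/image/path/vmware_image_name.vmx" hard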
XenServer

If you decided to adopt XenServer, the XenServer machinery supports starting virtual machines from either disk or a memory snapshot. Creating and reverting memory snapshots require that the Xen guest tools be installed in the virtual machine. The recommended method of booting XenServer virtual machines is through memory snapshots because they can greatly reduce the boot time of virtual machines during analysis. If, however, the option of installing the guest tools is not available, the virtual machine can be configured to have its disks reset on boot. Resetting the disk ensures that malware samples cannot permanently modify the virtual machine.

Memory Snapshots

The Xen guest tools can be installed from the XenCenter application that ships with XenServer. Once installed, restart the virtual machine and ensure that the Cuckoo agent is running.

Snapshots can be taken through the XenCenter application and the command line interface on the control domain (Dom0). When creating the snapshot from XenCenter, ensure that the “Snapshot disk and memory” option is checked. Once created, right-click on the snapshot and note the snapshot UUID.

To snapshot from the command line interface, run the following command:

$ xe vm-checkpoint vm="vm_uuid_or_name" new-name-label="Snapshot Name/Description"

The snapshot UUID is printed to the screen once the command completes.

Regardless of how the snapshot was created, save the UUID in the virtual machine’s configuration section. Once the snapshot has been created, you can shut down the virtual machine.

Booting from Disk

If you can’t install the Xen guest tools or if you don’t need to use memory snapshots, you will need to ensure that the virtual machine’s disks are reset on boot and that the Cuckoo agent is set to run at boot time.

Running the agent at boot time can be configured in Windows by adding a startup item for the agent.

The following commands must be run while the virtual machine is powered off.

To set the virtual machine’s disks to reset on boot, you’ll first need to list all the attached disks for the virtual machine. To list all attached disks, run the following command:

$ xe vm-disk-list vm="vm_name_or_uuid"

Ignoring all CD-ROM and read-only disks, run the following command for each remaining disk to change its behavior to reset on boot:

$ xe vdi-param-set uuid="vdi_uuid" on-boot=reset

After the disk is set to reset on boot, no permanent changes can be made to the virtual machine’s disk. Modifications that occur while a virtual machine is running will not persist past shutdown.

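Cloning the Virtual Machine

If you plan to run several virtual machines at the same time, you will want to clone them. Cloning gives you multiple ready-to-use virtual machines.

A cloned virtual machine carries the same configuration as the original. If something does not work, reconfigure the clone by following the earlier sections (Network Configuration, Installing the Agent and Saving the Virtual Machine), although this is normally not necessary.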

Installing the Linux host

Install dependencies on host:

$ sudo apt-get install uml-utilities bridge-utils

Preconfigure the network tap interfaces on the host; this is required to avoid having to start Cuckoo as root:

Get list of virtual machines to configure interface per vm from conf/qemu.conf

Example:
    machines = ubuntu_x32, ubuntu_x64, ubuntu_arm, ubuntu_mips, ubuntu_mipsel

You should preconfigure network interface for all of them, they all should have tap prefix:

$ sudo tunctl -b -u cuckoo -t tap_ubuntu_x32
$ sudo ip link set tap_ubuntu_x32 master br0
$ sudo ip link set dev tap_ubuntu_x32 up
$ sudo ip link set dev br0 up

$ sudo tunctl -b -u cuckoo -t tap_ubuntu_x64
$ sudo ip link set tap_ubuntu_x64 master br0
$ sudo ip link set dev tap_ubuntu_x64 up
$ sudo ip link set dev br0 up
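
Because the same commands repeat for every machine, they can be generated from the machines value in conf/qemu.conf. The sketch below only builds the command lists and does not run them; the function name tap_commands is hypothetical, and the user cuckoo and bridge br0 are taken from the example above. Unlike the listing above, it brings the bridge up once at the end instead of once per machine.

```python
def tap_commands(machines_value, user="cuckoo", bridge="br0"):
    """Build the tunctl/ip commands for each machine in a comma-separated list."""
    names = [m.strip() for m in machines_value.split(",") if m.strip()]
    commands = []
    for name in names:
        tap = "tap_" + name  # every interface carries the tap prefix
        commands.append(["sudo", "tunctl", "-b", "-u", user, "-t", tap])
        commands.append(["sudo", "ip", "link", "set", tap, "master", bridge])
        commands.append(["sudo", "ip", "link", "set", "dev", tap, "up"])
    # Bringing the bridge up is needed only once, after all taps are attached.
    commands.append(["sudo", "ip", "link", "set", "dev", bridge, "up"])
    return commands
```

Each returned list can be passed to subprocess.check_call, or joined with spaces and printed for review before running it by hand.
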
The following instructions apply only to x32/x64 Ubuntu 17.04 Linux guests.

Note: if you run Cuckoo as a user other than cuckoo, replace cuckoo after -u with your own user name.

Add agent to autorun, the easier way is to add it to crontab:

$ sudo crontab -e
@reboot python path_to_agent.py

Install dependencies inside of the virtual machine:

$ sudo apt-get install systemtap gcc patch linux-headers-$(uname -r)

Install kernel debugging symbols:

$ sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys C8CAB6595FDFF622

$ codename=$(lsb_release -cs)
$ sudo tee /etc/apt/sources.list.d/ddebs.list << EOF
  deb http://ddebs.ubuntu.com/ ${codename}          main restricted universe multiverse
  #deb http://ddebs.ubuntu.com/ ${codename}-security main restricted universe multiverse
  deb http://ddebs.ubuntu.com/ ${codename}-updates  main restricted universe multiverse
  deb http://ddebs.ubuntu.com/ ${codename}-proposed main restricted universe multiverse
EOF

$ sudo apt-get update
$ sudo apt-get install linux-image-$(uname -r)-dbgsym

Patch SystemTap tapset (this will change in the future):

$ wget https://raw.githubusercontent.com/cuckoosandbox/cuckoo/master/stuff/systemtap/expand_execve_envp.patch
$ wget https://raw.githubusercontent.com/cuckoosandbox/cuckoo/master/stuff/systemtap/escape_delimiters.patch
$ sudo patch /usr/share/systemtap/tapset/linux/sysc_execve.stp < expand_execve_envp.patch
$ sudo patch /usr/share/systemtap/tapset/uconversions.stp < escape_delimiters.patch

Compile Kernel extension:

$ wget https://raw.githubusercontent.com/cuckoosandbox/cuckoo/master/stuff/systemtap/strace.stp
$ sudo stap -p4 -r $(uname -r) strace.stp -m stap_ -v

Once the compilation finishes you should see the file stap_.ko in the same folder. You will now be able to test the STAP kernel extension as follows.

Test Kernel extension:

$ sudo staprun -v ./stap_.ko

The output should look something like this:

staprun:insert_module:x Module stap_ inserted from file path_to_stap_.ko

The stap_.ko file should be placed in /root/.cuckoo:

$ sudo mkdir /root/.cuckoo
$ sudo mv stap_.ko /root/.cuckoo/

Disable firewall inside of the vm, if exists:

$ sudo ufw disable

Disable NTP inside of the vm:

$ sudo timedatectl set-ntp off

Optional: remove preinstalled software and configurations:

$ sudo apt-get purge update-notifier update-manager update-manager-core ubuntu-release-upgrader-core
$ sudo apt-get purge whoopsie ntpdate cups-daemon avahi-autoipd avahi-daemon avahi-utils
$ sudo apt-get purge account-plugin-salut libnss-mdns telepathy-salut

Preparing the Guest (Physical Machine)

Warning

This chapter only applies for a Physical Machine setup! For normal Cuckoo usage please ignore it.

At this point you should have configured the Cuckoo host component and you should have designed and defined the number and the names of the physical machines you are going to use for malware execution.

Now it’s time to create such machines and to configure them properly.

Creation of the Physical Machine

Once you have properly installed your imaging software, you can proceed on creating all the physical machines you need.

Using and configuring your imaging software is out of the scope of this guide, so please refer to the official documentation.

Note

You can find some hints and considerations on how to design and create your virtualized environment in the Sandboxing chapter.

Note

For analysis purposes you are recommended to use Windows XP Service Pack 3, but Cuckoo Sandbox has also proved to work with Windows 7 with User Account Control disabled.

When creating the physical machine, Cuckoo doesn’t require any specific configuration. You can choose the options that best fit your needs.

Requirements

In order to make Cuckoo run properly in your physical Windows system, you will have to install some required software and libraries.

Install Python

Python is a strict requirement for the Cuckoo guest component (analyzer) in order to run properly.

You can download the proper Windows installer from the official website. Also in this case Python 2.7 is preferred.

Some Python libraries are optional and provide some additional features to Cuckoo guest component. They include:

  • Python Pillow: it’s used for taking screenshots of the Windows desktop during the analysis.

They are not strictly required by Cuckoo to work properly, but you are encouraged to install them if you want to have access to all available features. Make sure to download and install the proper packages according to your Python version.

NOTE: Physical machinery is currently not supported by the new cuckoo agent. Please use the old cuckoo agent for physical machinery in the meantime.

Additional Software

At this point you should have installed everything needed by Cuckoo to run properly.

Depending on what kind of files you want to analyze and what kind of sandboxed Windows environment you want to run the malware samples in, you might want to install additional software such as browsers, PDF readers, office suites etc. Remember to disable the “auto update” or “check for updates” feature of any additional software.

This is completely up to you and to what your needs are. You can get some hints by reading the Sandboxing chapter.

Additional Host Requirements

The physical machine manager uses RPC requests to reboot physical machines. The net command is required for this to be accomplished, and is available from the samba-common-bin package.

On Debian/Ubuntu you can install it with:

$ sudo apt-get install samba-common-bin

In order for the physical machine manager to work, you must have a way for physical machines to be returned to a clean state. In development/testing Fog was used as a platform to handle re-imaging the physical machines. However, any re-imaging platform can be used (Clonezilla, Deepfreeze, etc) to accomplish this.

Cuckoo Configuration Requirements

Since we are using physical machines to perform our analysis, we must account for the reboot/rebuild time of our physical machines in our Cuckoo configuration. Specifically, we must modify the vm_state timeout as specified in conf/cuckoo.conf:

vm_state = 60

By default, this value is set to 60 (seconds). We need to update it so that it reflects the amount of time required to reboot and rebuild the physical guest. In testing, 10 minutes (i.e., vm_state = 600) has proven sufficient for a Windows 7 setup with a 1 Gbit connection. However, it is recommended that you measure the time it takes to reboot/rebuild the physical machine in your environment before setting this value.
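
If you maintain several hosts, the timeout can be adjusted with a small script instead of by hand. This is a minimal sketch using the standard library configparser module; the function name set_vm_state is hypothetical, and the section name timeouts is an assumption, so check which section holds vm_state in your own conf/cuckoo.conf.

```python
import configparser
import io


def set_vm_state(conf_text, seconds, section="timeouts"):
    """Return the configuration text with vm_state set to the given seconds.

    The section name is an assumption; pass the one used in your cuckoo.conf.
    """
    # Interpolation is disabled so literal % characters are left untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    if not parser.has_section(section):
        parser.add_section(section)
    parser.set(section, "vm_state", str(seconds))
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()
```

Note that configparser does not preserve comments when writing, so keep a copy of the original file if its comments matter to you.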

Network Configuration

Now it’s time to setup the network for your physical machine.

Windows Settings

Before configuring the underlying networking of the sandbox, you might want to tweak some settings inside Windows itself.

One of the most important things to do is disabling Windows Firewall and the Automatic Updates. The reason behind this is that they can affect the behavior of the malware under normal circumstances and that they can pollute the network analysis performed by Cuckoo, by dropping connections or including irrelevant requests.

You can do so from Windows’ Control Panel as shown in the picture:

_images/windows_security1.png

Using a physical machine manager requires a few more configuration options than the virtual machine managers in order to run properly. In addition to the steps laid out in the regular Preparing the Guest section, some settings need to be changed for physical machines to work properly.

  • Enable auto-logon (Allows for the agent to start upon reboot)
  • Enable Remote RPC (Allows for Cuckoo to reboot the sandbox using RPC)
  • Turn off paging (Optional)
  • Disable Screen Saver (Optional)

In Windows 7 the following commands can be entered into an Administrative command prompt to enable auto-logon and Remote RPC.

reg add "hklm\software\Microsoft\Windows NT\CurrentVersion\WinLogon" /v DefaultUserName /d <USERNAME> /t REG_SZ /f
reg add "hklm\software\Microsoft\Windows NT\CurrentVersion\WinLogon" /v DefaultPassword /d <PASSWORD> /t REG_SZ /f
reg add "hklm\software\Microsoft\Windows NT\CurrentVersion\WinLogon" /v AutoAdminLogon /d 1 /t REG_SZ /f
reg add "hklm\system\CurrentControlSet\Control\Terminal Server" /v AllowRemoteRPC /d 0x01 /t REG_DWORD /f
reg add "HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System" /v LocalAccountTokenFilterPolicy /d 0x01 /t REG_DWORD /f
Networking

Now you need to decide how to make your physical machine able to access Internet or your local network.

While in previous releases Cuckoo used shared folders to exchange data between the Host and Guests, from release 0.4 it adopts a custom agent that works over the network using a simple XMLRPC protocol.

In order to make it work properly you’ll have to configure your machine’s network so that the Host and the Guest can communicate. Testing the network access by pinging a guest is a good practice, to make sure the virtual network was set up correctly. Use only static IP addresses for your guest, as today Cuckoo doesn’t support DHCP and using it will break your setup.

This stage is very much up to your own requirements and to the characteristics of your virtualization software.

For physical machines, make sure when setting the IP address of the guest to also set the Gateway and DNS server to be the IP address of the Cuckoo server on the physical network. For example, if your Cuckoo server has the IP address of 192.168.1.1, then you would set the Gateway and DNS server in Windows Settings to be 192.168.1.1 as well.

_images/windows_network.png
Installing the Agent

Installing the Agent on a physical machine is the same as installing it in a virtual machine, so please refer to Installing the Agent.

Saving the Guest

Now you should be ready to save the physical machine to a clean state. In order for the physical machine manager to work, you must have a way for physical machines to be returned to a clean state.

Before doing this make sure you rebooted it softly and that it’s currently running, with Cuckoo’s agent running and with Windows fully booted.

Now you can proceed saving the machine. The way to do it obviously depends on the imaging software you decided to use.

In development/testing Fog (http://www.fogproject.org/) was used as a platform to handle re-imaging the physical machines. However, any re-imaging platform can be used (Clonezilla, Deepfreeze, etc.) to accomplish this.

If you follow all the steps below properly, your physical machine should be ready to be used by Cuckoo.

Fog

After installing Fog, you will need to create an image and add an image and a host to the Fog server.

To add an image to the fog server, open the Image Management window (http://<your_fog_server>/fog/management/index.php?node=images) and click “Create New Image.” Provide the proper inputs for your OS configuration and click “Add”

_images/fog_image_management.png

Next you will need to add the host you plan to re-image to Fog. To add a host, open a web browser and navigate to the Host Management page of Fog (http://<your_fog_server>/fog/management/index.php?node=host). Click “Create New Host.” Provide the proper inputs for your host configuration. Be sure to select the image you created above from the “Host Image” option, when finished click the “Add” button.

_images/fog_host_management.png

At this point you should be ready to take an image from the guest machine. In order to take an image you will need to navigate to the Task Management page and list all hosts (http://<your_fog_server>/fog/management/index.php?node=tasks&sub=listhosts). From here you should be able to click the Capture icon, which should instantly add a task to the queue to take an image. Now you should reboot your Cuckoo guest image and it should PXE boot into Fog and capture the base image from the cuckoo guest.

Now that you have created and captured an image in FOG, Cuckoo will use this image to rebuild the guest machine after each analysis task. If you have provided Cuckoo with valid FOG credentials and enabled Remote RPC (as shown in the Network Configuration section), Cuckoo will automatically schedule the Deploy Task in FOG and will also reboot the guest machine for you.

Setup using VMWare (Bonus!)

Traditionally Cuckoo requires some sort of virtualization software to be running (e.g. VMware, VirtualBox, etc.). The physical machine manager will also work with virtual machines, so long as they are configured to revert to a snapshot on shutdown/reboot and are running the agent.py script. A use case for this functionality would be to run the Cuckoo server and the guest sandboxes each in their own virtual machine on a single host, allowing for development/testing of Cuckoo without requiring a dedicated Linux host.

Upgrading from a previous release

New in version 2.0.0: Automatically upgrade from an older Cuckoo setup into a new one by importing the old setup.

This document describes the process of importing an older Cuckoo setup in order to upgrade your Cuckoo to the latest and greatest version. This importing process is possible for Cuckoo 0.6 and upwards. Naturally it doesn’t re-apply any custom code changes that you applied to your old setup, but it does migrate your database, configuration, and analyses to the new version (in a best-effort manner).

Now, in order to upgrade your setup, you’ll simply have to perform the following steps:

  1. Come up with a Cuckoo Working Directory for the new setup (although the default one should work just fine, assuming it doesn’t exist yet).
  2. Optionally create a backup of your data (Cuckoo will also offer to do this for you before doing the actual setup import).
  3. Run the cuckoo import command.
The cuckoo import command

The cuckoo import App performs a number of steps in order to import an older setup. Previously we had manual steps for performing a database migration, these have been integrated in the import process as well.

The usage of cuckoo import is as follows:

$ cuckoo import --help
Usage: cuckoo import [OPTIONS] PATH

  Imports an older Cuckoo setup into a new CWD. The old setup should be
  identified by PATH and the new CWD may be specified with the --cwd
  parameter, e.g., "cuckoo --cwd /tmp/cwd import old-cuckoo".

Options:
  --copy     Copy all existing analyses to the new CWD (default)
  --move     Move all existing analyses to the new CWD
  --symlink  Symlink all existing analyses to the new CWD
  --help     Show this message and exit.

As per the limited usage documentation of this command, there is an input and an output directory and a couple of different modes. The rest is done by cuckoo import according to best-practice manners.

The three different modes are best described as follows. Keep in mind that these modes only inform the importing process on what to do with the existing analyses and, in the case of sqlite3 usage, the database file. These modes do not apply to any other used databases or data not mentioned in this document.

  • copy: Copies all the analyses from the old setup to the new CWD. In this mode the old storage/ folder will be copied to $CWD/storage/. The copy mode is useful if you want to maintain a backup of the old setup and its analyses, allowing one to restore it with the appropriate SQL backup. Note that this mode will double the size of your existing analyses directory as it does a full copy.
  • move: Moves all the analyses from the old setup to the new CWD. In this mode the old storage/ folder is moved to $CWD/storage/. After the import process you won’t have a backup of your old data anymore, but you will be able to reference it in the new CWD / setup.
  • symlink: Creates a symbolic link from each analysis in the old setup, i.e., storage/analyses/XYZ, to the new CWD, i.e., $CWD/storage/XYZ. This method is the most desired (as you’ll be able to access the existing analyses in both the old setup as well as the new CWD), but doesn’t work on Windows.

The default mode is copy due to its feature of remaining available on both the old setup as well as the new CWD as well as being cross-platform (i.e., symlink mode isn’t supported on Windows). After reading this documentation one may opt to go for symlink or move mode on non-Windows systems and move mode on Windows systems, though.

Following are the steps taken by Cuckoo when performing an import:

  • The user has to accept a non-binding EULA-like agreement that (just kidding) attempts to inform him or her regarding the implications of importing an older setup.
  • The version of the old Cuckoo setup is identified.
  • It is ensured that the new CWD does not already exist.
  • The old Cuckoo Configuration is read, migrated, and then validated to be fit for usage with the new Cuckoo version, i.e., you can configure a Cuckoo 0.6 setup and migrate it all the way to the latest version and it will simply work.
  • The new CWD is created and it is configured with the migrated configuration.
  • The user is prompted to optionally create a SQL database backup. On Linux-based systems this should work out of the box (and you’ll get a hard error otherwise), but due to issues with $PATH this may require manually fixing up the command on Windows & Mac OS X systems.
  • After the ability to create a SQL database backup, the database schema is migrated to the latest version in-place, i.e., you will not be able to use your old Cuckoo setup with this database anymore (hence the backup).
  • Any and all existing analyses are imported to the new CWD using the mode as specified, or if it has not been specified, the default copy method.

You are now the happy owner of an up-to-date Cuckoo setup. Please inform us of any feedback that you may have through one of the various communication channels that we’ve put in-place.

Warning

One should not clean the old Cuckoo setup after the import. By attempting to do so you may lose the existing analyses (if symlink mode is used) and the SQL, MongoDB, and ElasticSearch databases.

Usage

This chapter explains how to use Cuckoo.

Starting Cuckoo

To start Cuckoo, use the command:

$ cuckoo

You will see log output similar to this:

  eeee e   e eeee e   e  eeeee eeeee
  8  8 8   8 8  8 8   8  8  88 8  88
  8e   8e  8 8e   8eee8e 8   8 8   8
  88   88  8 88   88   8 8   8 8   8
  88e8 88ee8 88e8 88   8 8eee8 8eee8

 Cuckoo Sandbox 2.0.0
 www.cuckoosandbox.org
 Copyright (c) 2010-2017

 Checking for updates...
 Good! You have the latest version available.

2017-03-31 17:08:53,527 [cuckoo.core.scheduler] INFO: Using "virtualbox" as machine manager
2017-03-31 17:08:53,935 [cuckoo.core.scheduler] INFO: Loaded 1 machine/s
2017-03-31 17:08:53,964 [cuckoo.core.scheduler] INFO: Waiting for analysis tasks.

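At startup Cuckoo contacts api.cuckoosandbox.org to check for updates. You can disable this check with the version_check option in the configuration file.

Once started, Cuckoo waits for submissions to analyze.

The cuckoo command accepts several command-line options, which you can list with the --help option: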

$ cuckoo --help
Usage: cuckoo [OPTIONS] COMMAND [ARGS]...

Invokes the Cuckoo daemon or one of its subcommands.

To be able to use different Cuckoo configurations on the same
machine with the same Cuckoo installation, we use the so-called
Cuckoo Working Directory (aka "CWD"). A default CWD is
available, but may be overridden through the following options -
listed in order of precedence.

* Command-line option (--cwd)
* Environment option ("CUCKOO")
* Environment option ("CUCKOO_CWD")
* Current directory (if the ".cwd" file exists)
* Default value ("~/.cuckoo")

Options:
  -d, --debug             Enable verbose logging
  -q, --quiet             Only log warnings and critical messages
  -m, --maxcount INTEGER  Maximum number of analyses to process
  --user TEXT             Drop privileges to this user
  --cwd TEXT              Cuckoo Working Directory
  --help                  Show this message and exit.

Commands:
  api          Operate the Cuckoo REST API.
  clean        Clean the CWD and associated databases.
  community    Fetch supplies from the Cuckoo Community.
  distributed  Distributed Cuckoo helper utilities.
  dnsserve     Custom DNS server.
  import       Imports an older Cuckoo setup into a new CWD.
  init         Initializes Cuckoo and its configuration.
  machine      Dynamically add/remove machines.
  migrate      Perform database migrations.
  process      Process raw task data into reports.
  rooter       Instantiates the Cuckoo Rooter.
  submit       Submit one or more files or URLs to Cuckoo.
  web          Operate the Cuckoo Web Interface.

The --debug and --quiet options control Cuckoo's logging level.
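
The order of precedence listed in the help text above can be illustrated as follows. This is a sketch of the documented lookup order, not Cuckoo's actual implementation; the function name resolve_cwd and its parameters are hypothetical.

```python
import os


def resolve_cwd(cli_cwd=None, environ=None, current_dir="."):
    """Pick the CWD following the precedence shown in `cuckoo --help`."""
    environ = os.environ if environ is None else environ
    # 1. Command-line option (--cwd).
    if cli_cwd:
        return cli_cwd
    # 2. and 3. Environment options, CUCKOO before CUCKOO_CWD.
    for name in ("CUCKOO", "CUCKOO_CWD"):
        if environ.get(name):
            return environ[name]
    # 4. Current directory, if it contains a ".cwd" file.
    if os.path.exists(os.path.join(current_dir, ".cwd")):
        return current_dir
    # 5. Default value.
    return os.path.expanduser("~/.cuckoo")
```

For example, resolve_cwd("/tmp/cwd", {"CUCKOO": "/opt/cuckoo"}) returns "/tmp/cwd", because the command-line option takes priority over the environment.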

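Cuckoo in the background

Starting Cuckoo by hand a few times is fine at first, but once you have many machines to manage, running Cuckoo automatically becomes worthwhile.

Fortunately, Cuckoo provides a supervisord configuration file, supervisord.conf, in the CWD.

Run supervisord with the path to that configuration file: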

$ supervisord -c $CWD/supervisord.conf

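Note

[Translator's note] supervisord works like a watchdog: if the Cuckoo process is no longer running, it is started again automatically.

Note that by default supervisord starts four instances of the Processing Utility, so the process_results option in $CWD/conf/cuckoo.conf must be disabled.

Once configured, the Cuckoo processes can be managed through supervisorctl, for example: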

# Stop the Cuckoo daemon and the processing utilities.
$ supervisorctl stop cuckoo:

# Start the Cuckoo daemon and the processing utilities.
$ supervisorctl start cuckoo:

Note the trailing colon (:) after cuckoo in these commands: it refers to the whole cuckoo process group, which contains the daemon and the processing utilities.

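Cuckoo Working Directory Usage

Note

Before reading this chapter, you may want to read Installing Cuckoo and Cuckoo Working Directory first.

Before we start, a word on what the introduction of the CWD has improved. Put simply, it directly improves our quality of life :)

Improvements:

  • As described in Installing Cuckoo, installing and upgrading Cuckoo now takes a single command: pip install -U cuckoo.
  • Because Cuckoo is now published as an official Python package, we have tighter control over upgrades, and an upgrade affects existing user data as little as possible.
  • Because upgrading is easier, releases will be more frequent; bug fixes, for example, will ship sooner.
  • Cuckoo's configuration files are no longer kept in the Git repository. Users upgrading from an older Cuckoo release need to carry their settings over to the new configuration files manually.
  • The new Cuckoo keeps all configuration files together in the CWD.
  • A single Cuckoo installation can run multiple instances, each pointing to a different CWD.
  • The new Cuckoo consolidates the earlier separate scripts into the cuckoo command, which runs them as Cuckoo Apps selected by subcommand.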
Usage

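Once Cuckoo has been installed (Installing Cuckoo) and configured (Cuckoo Working Directory), you can start using it. If you run into problems during installation, see DistributionNotFound / No distribution matching the version.. If you install into a virtualenv, the commands look like this: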

$ virtualenv venv
$ . venv/bin/activate
(venv)$ pip install -U pip setuptools
(venv)$ pip install -U cuckoo
(venv)$ cuckoo --cwd ~/.cuckoo

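Before you start, if you need to change Cuckoo's default configuration, the configuration directory is $CWD/conf/. To add virtual machines or change the database, see Preparing the Guest. To view analysis reports in the web interface, mongodb must be enabled in $CWD/conf/reporting.conf; see Web Interface.

Next, download the Cuckoo Community, which contains more than 300 malware behavior signatures that simplify the analysis of results. Download it as follows:

(venv)$ cuckoo community

Alternatively, if you already have a downloaded community archive (for example from wget https://github.com/cuckoosandbox/community/archive/master.tar.gz), you can import it directly:

(venv)$ cuckoo community --file master.tar.gz

You are now ready to submit samples; see Submission Utility. Several samples can be submitted in one command, for example: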

(venv)$ cuckoo submit /tmp/sample1.exe /tmp/sample2.exe /tmp/sample3.exe
Success: File "/tmp/sample1.exe" added as task with ID #1
Success: File "/tmp/sample2.exe" added as task with ID #2
Success: File "/tmp/sample3.exe" added as task with ID #3
(venv)$ cuckoo submit --url google.com bing.com
Success: URL "google.com" added as task with ID #4
Success: URL "bing.com" added as task with ID #5

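Samples are analyzed by the Cuckoo daemon. By default the daemon runs without a limit on the number of analyses it will process (a limit can be set with the -m option).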

# This command is equal to what used to be "./cuckoo.py -d".
(venv)$ cuckoo -d

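To view analysis results in the web interface, run the Cuckoo web process. For test environments or low levels of concurrency the built-in Django web server is sufficient; for production use we recommend deploying behind a high-performance web server, see Web Deployment.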

(venv)$ cuckoo web
Performing system checks...

System check identified no issues (0 silenced).
March 31, 2017 - 12:10:46
Django version 1.8.4, using settings 'cuckoo.web.web.settings'
Starting development server at http://localhost:8000/
Quit the server with CONTROL-C.

Cuckoo also includes some other commands, such as cuckoo clean (Clean all Tasks and Samples), the Cuckoo Rooter, and the utilities listed under Cuckoo Apps. Beyond that there is not much more to it. So, happy analyzing.

Submit an Analysis

Submission Utility

The easiest way to submit a sample for analysis is the cuckoo submit command, whose usage is:

$ cuckoo submit --help
Usage: cuckoo submit [OPTIONS] [TARGET]...

  Submit one or more files or URLs to Cuckoo.

Options:
  -u, --url           Submitting URLs instead of samples
  -o, --options TEXT  Options for these tasks
  --package TEXT      Analysis package to use
  --custom TEXT       Custom information to pass along this task
  --owner TEXT        Owner of this task
  --timeout INTEGER   Analysis time in seconds
  --priority INTEGER  Priority of this task
  --machine TEXT      Machine to analyze these tasks on
  --platform TEXT     Analysis platform
  --memory            Enable memory dumping
  --enforce-timeout   Don't terminate the analysis early
  --clock TEXT        Set the system clock
  --tags TEXT         Analysis tags
  --baseline          Create baseline task
  --remote TEXT       Submit to a remote Cuckoo instance
  --shuffle           Shuffle the submitted tasks
  --pattern TEXT      Provide a glob-pattern when submitting a
                      directory
  --max INTEGER       Submit up to X tasks at once
  --unique            Only submit samples that have not been
                      analyzed before
  -d, --debug         Enable verbose logging
  -q, --quiet         Only log warnings and critical messages
  --help              Show this message and exit.

cuckoo submit accepts files or directories. A directory is walked and the files inside it are submitted.

The type of each submitted sample is determined automatically later on; see Analysis Packages.

Example: submit a local binary:

$ cuckoo submit /path/to/binary

Example: submit a URL:

$ cuckoo submit --url http://www.example.com

Example: submit a local binary with a higher priority:

$ cuckoo submit --priority 5 /path/to/binary

Example: submit a local binary and limit the analysis to 60 seconds:

$ cuckoo submit --timeout 60 /path/to/binary

Example: submit a local binary and specify the analysis package:

$ cuckoo submit --package <name of package> /path/to/binary

Example: submit a local binary and route its network traffic through tor:

$ cuckoo submit -o route=tor /path/to/binary

Example: submit a local binary, specify the analysis package, and pass arguments to the binary when it runs:

$ cuckoo submit --package exe --options arguments=--dosomething /path/to/binary.exe

Example: submit a local binary to be run on the virtual machine cuckoo1:

$ cuckoo submit --machine cuckoo1 /path/to/binary

Example: submit a local binary to be run on a Windows machine:

$ cuckoo submit --platform windows /path/to/binary

Example: submit a local binary and request a full memory dump:

$ cuckoo submit --memory /path/to/binary

Example: submit a local binary and force the analysis to run for the full timeout:

$ cuckoo submit --enforce-timeout /path/to/binary

Example: submit a local binary and set the system clock of the virtual machine:

$ cuckoo submit --clock "01-24-2001 14:41:20" /path/to/binary

Example: submit a local binary with a memory dump and pass additional options:

$ cuckoo submit --memory --options free=yes /path/to/binary
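
When submissions are driven from a script, it can be convenient to assemble these command lines programmatically. The sketch below only builds the argument list, using flags that appear in the help output above; the function name build_submit_command is hypothetical and is not part of Cuckoo.

```python
def build_submit_command(targets, url=False, package=None, timeout=None,
                         priority=None, machine=None, memory=False,
                         enforce_timeout=False, options=None):
    """Build an argument list for `cuckoo submit` from keyword settings."""
    cmd = ["cuckoo", "submit"]
    if url:
        cmd.append("--url")
    # Flags that take a value are added only when a value was given.
    for flag, value in (("--package", package), ("--timeout", timeout),
                        ("--priority", priority), ("--machine", machine)):
        if value is not None:
            cmd.extend([flag, str(value)])
    if memory:
        cmd.append("--memory")
    if enforce_timeout:
        cmd.append("--enforce-timeout")
    if options:
        # Options are passed as a single key=value,key=value string.
        pairs = ",".join("%s=%s" % (k, v) for k, v in sorted(options.items()))
        cmd.extend(["--options", pairs])
    cmd.extend(targets)
    return cmd
```

The resulting list can be handed to subprocess.check_call on a host where the cuckoo command is installed. Passing a list rather than a shell string avoids quoting problems with unusual file names.
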
API

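See REST API for details on using the REST API.

Distributed Cuckoo

For the distributed setup, see Distributed Cuckoo.

Python Functions

For database compatibility, Cuckoo uses the popular Python ORM SQLAlchemy, which supports a variety of database systems including, but not limited to, SQLite, MySQL or MariaDB, and PostgreSQL.

Cuckoo is designed to be easy to integrate into larger systems. We recommend doing so through the REST API (see REST API). If you want to write your own submission script, you can also use the add_path() and add_url() functions.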
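
As an illustration of such a submission script, the sketch below walks a directory and submits every file through add_path(). The helper name submit_directory is hypothetical; db is assumed to be an object exposing add_path() with the signature documented in this section, for example a connected Database instance.

```python
import os


def submit_directory(db, directory, **task_kwargs):
    """Submit every file under directory via db.add_path and return the task IDs.

    Keyword arguments (e.g. timeout, priority) are forwarded to add_path.
    """
    task_ids = []
    for root, _dirs, files in os.walk(directory):
        # Sorting makes the submission order predictable within a directory.
        for name in sorted(files):
            path = os.path.join(root, name)
            task_ids.append(db.add_path(path, **task_kwargs))
    return task_ids
```

Taking the database object as a parameter keeps the helper independent of how the connection is set up, and lets it be exercised with a stand-in object during testing.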

The function signatures are as follows.

add_path(file_path[, timeout=0[, package=None[, options=None[, priority=1[, custom=None[, owner=""[, machine=None[, platform=None[, tags=None[, memory=False[, enforce_timeout=False[, clock=None]]]]]]]]]]]])

Add a local file to the list of pending analysis tasks. Returns the ID of the newly generated task.

Parameters:
  • file_path (string) – path to the file to submit
  • timeout (integer) – maximum amount of seconds to run the analysis for
  • package (string or None) – analysis package you want to use for the specified file
  • options (string or None) – list of options to be passed to the analysis package (in the format key=value,key=value)
  • priority (integer) – numeric representation of the priority to assign to the specified file (1 being low, 2 medium, 3 high)
  • custom (string or None) – custom value to be passed over and possibly reused at processing or reporting
  • owner (string or None) – task owner
  • machine (string or None) – Cuckoo identifier of the virtual machine you want to use, if none is specified one will be selected automatically
  • platform (string or None) – operating system platform you want to run the analysis on (currently only Windows)
  • tags (string or None) – tags for machine selection
  • memory (True or False) – set to True to generate a full memory dump of the analysis machine
  • enforce_timeout (True or False) – set to True to force the execution for the full timeout
  • clock (string or None) – provide a custom clock time to set in the analysis machine
Return type:

integer

Example usage:

>>> from cuckoo.core.database import Database
>>> db = Database()
>>> db.connect()
>>> db.add_path("/tmp/malware.exe")
1
>>>
add_url(url[, timeout=0[, package=None[, options=None[, priority=1[, custom=None[, owner=""[, machine=None[, platform=None[, tags=None[, memory=False[, enforce_timeout=False[, clock=None]]]]]]]]]]]])

Add a URL to the list of pending analysis tasks. Returns the ID of the newly generated task.

Parameters:
  • url (string) – URL to analyze
  • timeout (integer) – maximum amount of seconds to run the analysis for
  • package (string or None) – analysis package you want to use for the specified URL
  • options (string or None) – list of options to be passed to the analysis package (in the format key=value,key=value)
  • priority (integer) – numeric representation of the priority to assign to the specified URL (1 being low, 2 medium, 3 high)
  • custom (string or None) – custom value to be passed over and possibly reused at processing or reporting
  • owner (string or None) – task owner
  • machine (string or None) – Cuckoo identifier of the virtual machine you want to use, if none is specified one will be selected automatically
  • platform (string or None) – operating system platform you want to run the analysis on (currently only Windows)
  • tags (string or None) – tags for machine selection
  • memory (True or False) – set to True to generate a full memory dump of the analysis machine
  • enforce_timeout (True or False) – set to True to force the execution for the full timeout
  • clock (string or None) – provide a custom clock time to set in the analysis machine
Return type:

integer

Example Usage:

>>> from cuckoo.core.database import Database
>>> db = Database()
>>> db.connect()
>>> db.add_url("http://www.cuckoosandbox.org")
2
>>>
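
Both functions take options as a single key=value,key=value string. The following is a minimal sketch of a helper that builds such a string from a dictionary; the helper name format_options and the example_key entry are hypothetical, and only free=yes comes from the command shown earlier.

```python
def format_options(options):
    """Serialize a dict into the key=value,key=value form used by `options`."""
    parts = []
    for key in sorted(options):
        value = str(options[key])
        # Commas and equals signs would be ambiguous in this flat format,
        # so this sketch refuses them instead of guessing an escaping rule.
        if "," in str(key) or "=" in str(key) or "," in value:
            raise ValueError("unsupported character in option %r" % (key,))
        parts.append("%s=%s" % (key, value))
    return ",".join(parts)


# "example_key" is a made-up option name used purely for illustration.
print(format_options({"free": "yes", "example_key": 1}))
```

The resulting string could then be passed as the options argument of add_path() or add_url().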

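Web Interface

Cuckoo provides a full-fledged web interface for submitting samples, browsing reports, and searching across analysis results.

Configuration

The web interface depends on MongoDB. If MongoDB is not installed, or is not enabled in reporting.conf, the web interface fails to start.

The $CWD/web/local_settings.py file contains the configuration of the web interface.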

# Copyright (C) 2013 Claudio Guarnieri.
# Copyright (C) 2014-2017 Cuckoo Foundation.
# This file is part of Cuckoo Sandbox - http://www.cuckoosandbox.org
# See the file 'docs/LICENSE' for copying permission.

import web.errors

# Maximum upload size (10GB, so there's basically no limit).
MAX_UPLOAD_SIZE = 10*1024*1024*1024

# Override default secret key stored in $CWD/web/.secret_key
# Make this unique, and don't share it with anybody.
# SECRET_KEY = "YOUR_RANDOM_KEY"

# Language code for this installation. All choices can be found here:
# http://www.i18nguy.com/unicode/language-identifiers.html
LANGUAGE_CODE = "en-us"

ADMINS = (
    # ("Your Name", "your_email@example.com"),
)

MANAGERS = ADMINS

# Allow verbose debug error message in case of application fault.
# It's strongly suggested to set it to False if you are serving the
# web application from a web server front-end (i.e. Apache).
DEBUG = False
DEBUG404 = False

# A list of strings representing the host/domain names that this Django site
# can serve.
# Values in this list can be fully qualified names (e.g. 'www.example.com').
# When DEBUG is True or when running tests, host validation is disabled; any
# host will be accepted. Thus it's usually only necessary to set it in production.
ALLOWED_HOSTS = ["*"]

handler404 = web.errors.handler404
handler500 = web.errors.handler500

In production we recommend disabling DEBUG and configuring at least one entry in ADMINS, so that error notifications can be sent by email.

Changed in version 2.0.0: The default maximum upload size has been bumped from 25 MB to 10 GB so that virtually any file should be accepted.
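
As a rough illustration of the production advice, the overrides in $CWD/web/local_settings.py might look like the sketch below. The administrator name, email address, and host name are placeholders rather than values from Cuckoo, and restricting ALLOWED_HOSTS is a general Django practice that the text above does not require.

```python
# Sketch of production-oriented overrides for local_settings.py.
# All names, addresses and hosts below are placeholders.

# Never expose verbose error pages behind a public web server.
DEBUG = False
DEBUG404 = False

# At least one administrator, so error notifications have a recipient.
ADMINS = (
    ("Sandbox Admin", "admin@example.com"),
)
MANAGERS = ADMINS

# Optionally restrict the host names the Django site will serve.
ALLOWED_HOSTS = ["cuckoo.example.com"]
```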

Starting the Web Interface

The web interface can be started with the following command:

$ cuckoo web runserver

To listen on a specific IP address and port, use a command like the following:

$ cuckoo web runserver 0.0.0.0:PORT

Or:

$ cuckoo web -H 0
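Web Deployment

The default way of running the web interface works fine for most purposes. For better performance and stability, however, we recommend a WSGI deployment. This section briefly describes how to deploy the web interface with uWSGI and nginx. The examples assume Ubuntu, but the configuration is similar on other operating systems.

First install the required packages: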

$ sudo apt-get install uwsgi uwsgi-plugin-python nginx
uWSGI setup

First use cuckoo web --uwsgi to generate the uWSGI configuration, and store it in /etc/uwsgi/apps-available/cuckoo-web.ini. The output looks as follows:

$ cuckoo web --uwsgi
[uwsgi]
plugins = python
virtualenv = /home/cuckoo/cuckoo
module = cuckoo.web.web.wsgi
uid = cuckoo
gid = cuckoo
static-map = /static=/home/..somepath..
# If you're getting errors about the PYTHON_EGG_CACHE, then
# uncomment the following line and add some path that is
# writable from the defined user.
# env = PYTHON_EGG_CACHE=
env = CUCKOO_APP=web
env = CUCKOO_CWD=/home/..somepath..

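Most of this configuration is inherited from the default uWSGI configuration, and it imports cuckoo.web.web.wsgi as the application. Because Cuckoo was installed in a virtualenv in this example, the configuration contains a virtualenv line; without a virtualenv there is no such line.

Link the configuration file and start the uWSGI app: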

$ sudo ln -s /etc/uwsgi/apps-available/cuckoo-web.ini /etc/uwsgi/apps-enabled/
$ sudo service uwsgi start cuckoo-web    # or reload, if already running

Note

The uWSGI log file is located at /var/log/uwsgi/app/cuckoo-web.log, and the UNIX socket at /run/uwsgi/app/cuckoo-web/socket.

nginx setup

With the uWSGI app running, configure nginx as a reverse proxy that forwards HTTP requests to it.

Use cuckoo web --nginx to generate the configuration, and store it in the file /etc/nginx/sites-available/cuckoo-web:

$ cuckoo web --nginx
upstream _uwsgi_cuckoo_web {
    server unix:/run/uwsgi/app/cuckoo-web/socket;
}

server {
    listen localhost:8000;

    # Cuckoo Web Interface
    location / {
        client_max_body_size 1G;
        uwsgi_pass  _uwsgi_cuckoo_web;
        include     uwsgi_params;
    }
}

Make sure that nginx is allowed to connect to the uWSGI app. If the app runs under the cuckoo group, add the www-data user to that group:

$ sudo adduser www-data cuckoo

Link the configuration and start nginx:

$ sudo ln -s /etc/nginx/sites-available/cuckoo-web /etc/nginx/sites-enabled/
$ sudo service nginx start    # or reload, if already running

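The web interface is now up and running, listening on port 8000. From here you can tune the configuration further, for example by adjusting nginx performance settings or enabling HTTPS; those topics are beyond the scope of this document.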

REST API

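As mentioned in the 提交分析 (Submit an Analysis) chapter, Cuckoo provides a lightweight REST API server implemented with Flask.

Starting the API server

The API server can be started with:

$ cuckoo api

By default it binds to localhost:8090. To change the listening address and port, use either of the following: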

$ cuckoo api --host 0.0.0.0 --port 1337
$ cuckoo api -H 0.0.0.0 -p 1337
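Web deployment

The default way of running the API works fine for most purposes. For better performance and stability, the API can be deployed with uWSGI and nginx.

The uWSGI deployment requires the following packages: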

$ sudo apt-get install uwsgi uwsgi-plugin-python nginx
uWSGI setup

First, use uWSGI to run the API server as an application.

Use cuckoo api --uwsgi to generate the uWSGI configuration, and store it in /etc/uwsgi/apps-available/cuckoo-api.ini. The output looks as follows:

$ cuckoo api --uwsgi
[uwsgi]
plugins = python
virtualenv = /home/cuckoo/cuckoo
module = cuckoo.apps.api
callable = app
uid = cuckoo
gid = cuckoo
env = CUCKOO_APP=api
env = CUCKOO_CWD=/home/..somepath..

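Most of this configuration is inherited from the default uWSGI configuration, and it imports cuckoo.apps.api as the application. Because Cuckoo was installed in a virtualenv in this example, the configuration contains a virtualenv line; without a virtualenv there is no such line.

Link the configuration file and start the uWSGI app: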

$ sudo ln -s /etc/uwsgi/apps-available/cuckoo-api.ini /etc/uwsgi/apps-enabled/
$ sudo service uwsgi start cuckoo-api    # or reload, if already running

Note

The uWSGI log file is located at /var/log/uwsgi/app/cuckoo-api.log, and the UNIX socket at /run/uwsgi/app/cuckoo-api/socket.

nginx setup

With the uWSGI app running, configure nginx as a reverse proxy that forwards HTTP requests to it.

Use cuckoo api --nginx to generate the configuration, and store it in the file /etc/nginx/sites-available/cuckoo-api:

$ cuckoo api --nginx
upstream _uwsgi_cuckoo_api {
    server unix:/run/uwsgi/app/cuckoo-api/socket;
}

server {
    listen localhost:8090;

    # REST API app
    location / {
        client_max_body_size 1G;
        uwsgi_pass  _uwsgi_cuckoo_api;
        include     uwsgi_params;
    }
}

Make sure that nginx is allowed to connect to the uWSGI app. If the app runs under the cuckoo group, add the www-data user to that group:

$ sudo adduser www-data cuckoo

Link the configuration and start nginx:

$ sudo ln -s /etc/nginx/sites-available/cuckoo-api /etc/nginx/sites-enabled/
$ sudo service nginx start    # or reload, if already running

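The REST API is now up and running, listening on port 8090. From here you can tune the configuration further, for example by adjusting nginx performance settings or enabling HTTPS; those topics are beyond the scope of this document.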

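Resources

The following table lists the currently available resources with a brief description of each. For details, see the section for the corresponding resource below.

Resource                    Description
POST /tasks/create/file     Submits a file and creates an analysis task for it.
POST /tasks/create/url      Submits a URL and creates an analysis task for it.
POST /tasks/create/submit   Submits one or more files and creates analysis tasks for them.
GET /tasks/list             Returns the list of analysis tasks stored in the database. The number of tasks returned can be limited through parameters.
GET /tasks/sample           Returns the list of tasks for the given sample ID.
GET /tasks/view             Returns the details of the task with the given ID.
GET /tasks/reschedule       Reschedules the task with the given ID.
GET /tasks/delete           Deletes the task with the given ID together with its reports.
GET /tasks/report           Returns the report for the given task ID. The report is JSON by default; other formats are available.
GET /tasks/screenshots      Returns screenshots for the given task ID and screenshot number.
GET /tasks/rereport         Regenerates the report for the given task ID.
GET /tasks/reboot           Reboots the analysis of the given task ID.
GET /memory/list            Returns the list of memory dumps for the given task ID.
GET /memory/get             Returns the content of a memory dump for the given task ID and memory dump ID.
GET /files/view             Returns sample information by MD5 hash, SHA256 hash, or sample ID.
GET /files/get              Returns the content of the sample with the given SHA256 hash.
GET /pcap/get               Returns the PCAP file for the given task ID.
GET /machines/list          Returns the list of available analysis machines.
GET /machines/view          Returns the details of the analysis machine with the given name.
GET /cuckoo/status          Returns the current status of Cuckoo, including version information and a task overview.
GET /vpn/status             Returns the VPN status.
GET /exit                   Shuts down the API server.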
/tasks/create/file

POST /tasks/create/file

Submits a file and creates an analysis task for it. Returns the ID of the newly created task.

Example request.

curl -F file=@/path/to/file http://localhost:8090/tasks/create/file

Example request using Python.

import requests

REST_URL = "http://localhost:8090/tasks/create/file"
SAMPLE_FILE = "/path/to/malwr.exe"

with open(SAMPLE_FILE, "rb") as sample:
    files = {"file": ("temp_file_name", sample)}
    r = requests.post(REST_URL, files=files)

# Add your own error checking for r.status_code here.

task_id = r.json()["task_id"]

# Add your own error checking here in case task_id is None.

Example response.

{
    "task_id" : 1
}

Form parameters:

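  • file (required) - sample file (multipart encoded file content)
  • package (optional) - analysis package to use for the analysis
  • timeout (optional) (int) - analysis timeout (in seconds)
  • priority (optional) (int) - priority to assign to the task (1-3)
  • options (optional) - options to pass to the analysis package
  • machine (optional) - name of the analysis machine to use
  • platform (optional) - platform of the analysis machine to use (e.g. "windows")
  • tags (optional) - comma-separated tags used to select the analysis machine; this only takes effect when platform is also set
  • custom (optional) - custom string passed on to the processing and reporting modules
  • owner (optional) - owner of the task
  • clock (optional) - system time to set in the analysis machine (format %m-%d-%Y %H:%M:%S)
  • memory (optional) - enable a full memory dump of the analysis machine
  • unique (optional) - only submit the sample if it has not been submitted before
  • enforce_timeout (optional) - enforce execution for the full analysis timeout

Status codes:

  • 200 - no error
  • 400 - duplicate sample detected (when the unique option is enabled)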
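
The optional form parameters can be assembled and checked before a request is sent. The sketch below is one possible way to do that; the helper name build_file_task_params is hypothetical, and the checks simply mirror the constraints listed above (priority 1-3, the clock format, tags requiring platform).

```python
from datetime import datetime

# Format documented above for the clock parameter.
CLOCK_FORMAT = "%m-%d-%Y %H:%M:%S"


def build_file_task_params(timeout=None, priority=None, clock=None,
                           tags=None, platform=None):
    """Return a dict of optional form fields for POST /tasks/create/file."""
    params = {}
    if timeout is not None:
        params["timeout"] = int(timeout)
    if priority is not None:
        if priority not in (1, 2, 3):
            raise ValueError("priority must be 1, 2 or 3")
        params["priority"] = priority
    if clock is not None:
        # clock is expected to be a datetime object in this sketch.
        params["clock"] = clock.strftime(CLOCK_FORMAT)
    if tags:
        if platform is None:
            raise ValueError("tags only take effect when platform is set")
        params["tags"] = ",".join(tags)
    if platform is not None:
        params["platform"] = platform
    return params


params = build_file_task_params(
    timeout=120,
    priority=2,
    clock=datetime(2017, 1, 31, 8, 30, 0),
    tags=["32bit"],
    platform="windows",
)
print(params)
```

With the requests library, such a dictionary could be passed as data=params next to the files argument of requests.post().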
/tasks/create/url

POST /tasks/create/url

Submits a URL and creates an analysis task for it. Returns the ID of the newly created task.

Example request.

curl -F url="http://www.malicious.site" http://localhost:8090/tasks/create/url

Example request using Python.

import requests

REST_URL = "http://localhost:8090/tasks/create/url"
SAMPLE_URL = "http://example.org/malwr.exe"

data = {"url": SAMPLE_URL}
r = requests.post(REST_URL, data=data)

# Add your own error checking for r.status_code here.

task_id = r.json()["task_id"]

# Add your own error checking here in case task_id is None.

Example response.

{
    "task_id" : 1
}

Form parameters:

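  • url (required) - URL to analyze (multipart encoded content)
  • package (optional) - analysis package to use for the analysis
  • timeout (optional) (int) - analysis timeout (in seconds)
  • priority (optional) (int) - priority to assign to the task (1-3)
  • options (optional) - options to pass to the analysis package
  • machine (optional) - name of the analysis machine to use
  • platform (optional) - platform of the analysis machine to use (e.g. "windows")
  • tags (optional) - comma-separated tags used to select the analysis machine; this only takes effect when platform is also set
  • custom (optional) - custom string passed on to the processing and reporting modules
  • owner (optional) - owner of the task
  • memory (optional) - enable a full memory dump of the analysis machine
  • enforce_timeout (optional) - enforce execution for the full analysis timeout
  • clock (optional) - system time to set in the analysis machine (format %m-%d-%Y %H:%M:%S)

Status codes:

  • 200 - no error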
/tasks/create/submit

POST /tasks/create/submit

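Submits one or more files, or one or more URLs or hashes, and creates analysis tasks for them. Returns the list of IDs of the newly created tasks.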

Example request.

# Submit two executables.
curl http://localhost:8090/tasks/create/submit -F files=@1.exe -F files=@2.exe

# Submit http://google.com
curl http://localhost:8090/tasks/create/submit -F strings=google.com

# Submit http://google.com & http://facebook.com
curl http://localhost:8090/tasks/create/submit -F strings=$'google.com\nfacebook.com'

Example request using Python.

import requests

# Submit one or more files.
r = requests.post("http://localhost:8090/tasks/create/submit", files=[
    ("files", open("1.exe", "rb")),
    ("files", open("2.exe", "rb")),
])

# Add your own error checking for r.status_code here.

submit_id = r.json()["submit_id"]
task_ids = r.json()["task_ids"]
errors = r.json()["errors"]

# Add your own error checking on "errors" here.

# Submit one or more URLs or hashes.
urls = [
    "google.com", "facebook.com", "cuckoosandbox.org",
]
r = requests.post(
    "http://localhost:8090/tasks/create/submit",
    data={"strings": "\n".join(urls)}
)

Example response from the executable submission.

{
    "submit_id": 1,
    "task_ids": [1, 2],
    "errors": []
}

Form parameters:

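  • file (optional) - a single sample, for backwards compatibility with /tasks/create/file
  • files (optional) - one or more samples to add to the analysis queue
  • strings (optional) - newline-separated list of URLs and/or hashes (hashes to be obtained using your VirusTotal API key)
  • timeout (optional) (int) - analysis timeout (in seconds)
  • priority (optional) (int) - priority to assign to the task (1-3)
  • options (optional) - options to pass to the analysis package
  • tags (optional) - comma-separated tags used to select the analysis machine; this only takes effect when platform is also set
  • custom (optional) - custom string passed on to the processing and reporting modules
  • owner (optional) - owner of the task
  • memory (optional) - enable a full memory dump of the analysis machine
  • enforce_timeout (optional) - enforce execution for the full analysis timeout
  • clock (optional) - system time to set in the analysis machine (format %m-%d-%Y %H:%M:%S)

Status codes:

  • 200 - no error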
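
Since this resource reports problems in the errors field of the response body, a client may want to inspect that field before using the task IDs. The following sketch shows one way to prepare the strings field and check a decoded response; both helper names are hypothetical, and the response shape is taken from the example above.

```python
def build_strings_field(entries):
    """Join URLs and/or hashes into the newline-separated `strings` value."""
    cleaned = [entry.strip() for entry in entries if entry and entry.strip()]
    return "\n".join(cleaned)


def check_submit_response(payload):
    """Return the task IDs from a decoded response, or raise on errors."""
    errors = payload.get("errors") or []
    if errors:
        raise RuntimeError(
            "submission errors: %s" % "; ".join(str(e) for e in errors)
        )
    return list(payload.get("task_ids", []))


# Decoded form of the example response shown above.
example = {"submit_id": 1, "task_ids": [1, 2], "errors": []}
print(check_submit_response(example))
print(repr(build_strings_field(["google.com", " facebook.com ", ""])))
```

In practice the payload would come from r.json() after the requests.post() call shown in the Python example.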
/tasks/list

GET /tasks/list/ (int: limit) / (int: offset)

Returns the list of tasks.

Example request.

curl http://localhost:8090/tasks/list

Example response.

{
    "tasks": [
        {
            "category": "url",
            "machine": null,
            "errors": [],
            "target": "http://www.malicious.site",
            "package": null,
            "sample_id": null,
            "guest": {},
            "custom": null,
            "owner": "",
            "priority": 1,
            "platform": null,
            "options": null,
            "status": "pending",
            "enforce_timeout": false,
            "timeout": 0,
            "memory": false,
            "tags": [],
            "id": 1,
            "added_on": "2012-12-19 14:18:25",
            "completed_on": null
        },
        {
            "category": "file",
            "machine": null,
            "errors": [],
            "target": "/tmp/malware.exe",
            "package": null,
            "sample_id": 1,
            "guest": {},
            "custom": null,
            "owner": "",
            "priority": 1,
            "platform": null,
            "options": null,
            "status": "pending",
            "enforce_timeout": false,
            "timeout": 0,
            "memory": false,
            "tags": [
                        "32bit",
                        "acrobat_6"
                    ],
            "id": 2,
            "added_on": "2012-12-19 14:18:25",
            "completed_on": null
        }
    ]
}

Parameters:

  • limit (optional) (int) - maximum number of tasks to return
  • offset (optional) (int) - position in the task list at which to start

Status codes:

  • 200 - no error
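
Because limit and offset are positional path segments rather than query parameters, the request URL has to be built in order. A minimal sketch, assuming the API runs at the default address used in the examples; the helper name tasks_list_url is hypothetical.

```python
def tasks_list_url(base_url, limit=None, offset=None):
    """Build the URL for GET /tasks/list/(limit)/(offset)."""
    url = base_url.rstrip("/") + "/tasks/list"
    if offset is not None and limit is None:
        # offset is the second path segment, so it cannot appear alone.
        raise ValueError("offset can only be given together with limit")
    if limit is not None:
        url += "/%d" % limit
        if offset is not None:
            url += "/%d" % offset
    return url


print(tasks_list_url("http://localhost:8090"))
print(tasks_list_url("http://localhost:8090", limit=10, offset=20))
```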
/tasks/sample

GET /tasks/sample/ (int: sample_id)

Returns the list of tasks for the given sample ID.

Example request.

curl http://localhost:8090/tasks/sample/1

Example response.

{
    "tasks": [
        {
            "category": "file",
            "machine": null,
            "errors": [],
            "target": "/tmp/malware.exe",
            "package": null,
            "sample_id": 1,
            "guest": {},
            "custom": null,
            "owner": "",
            "priority": 1,
            "platform": null,
            "options": null,
            "status": "pending",
            "enforce_timeout": false,
            "timeout": 0,
            "memory": false,
            "tags": [
                        "32bit",
                        "acrobat_6"
                    ],
            "id": 2,
            "added_on": "2012-12-19 14:18:25",
            "completed_on": null
        }
    ]
}

Parameters:

  • sample_id (required) (int) - ID of the sample to list tasks for

Status codes:

  • 200 - no error
/tasks/view

GET /tasks/view/ (int: id)

Returns the details of the task with the given ID.

Example request.

curl http://localhost:8090/tasks/view/1

Example response.

{
    "task": {
        "category": "url",
        "machine": null,
        "errors": [],
        "target": "http://www.malicious.site",
        "package": null,
        "sample_id": null,
        "guest": {},
        "custom": null,
        "owner": "",
        "priority": 1,
        "platform": null,
        "options": null,
        "status": "pending",
        "enforce_timeout": false,
        "timeout": 0,
        "memory": false,
        "tags": [
                    "32bit",
                    "acrobat_6"
                ],
        "id": 1,
        "added_on": "2012-12-19 14:18:25",
        "completed_on": null
    }
}

Note: possible values for the status field:

  • pending
  • running
  • completed
  • reported

Parameters:

  • id (required) (int) - ID of the task to look up

Status codes:

  • 200 - no error
  • 404 - task not found
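
A common use of the status field is to wait until a task reaches reported before fetching its report. The sketch below polls through a caller-supplied function so that it stays independent of any HTTP library; the helper name wait_until_reported and its parameters are hypothetical.

```python
import time


def wait_until_reported(fetch_status, task_id, interval=5.0, max_polls=120):
    """Poll fetch_status(task_id) until it returns "reported".

    Returns True once the task is reported, or False if max_polls
    attempts pass without reaching that state.
    """
    for _ in range(max_polls):
        if fetch_status(task_id) == "reported":
            return True
        time.sleep(interval)
    return False


# Stand-in for the API: walks through the documented status values.
_fake_statuses = iter(["pending", "running", "completed", "reported"])
print(wait_until_reported(lambda task_id: next(_fake_statuses), 1, interval=0))
```

Against a real server, fetch_status could wrap a request to /tasks/view/ and return the status value found under the task key of the JSON response.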
/tasks/reschedule

GET /tasks/reschedule/ (int: id) / (int: priority)

Reschedules the task with the given ID, optionally with a new priority (default 1).

Example request.

curl http://localhost:8090/tasks/reschedule/1

Example response.

{
    "status": "OK"
}

Parameters:

  • id (required) (int) - ID of the task to reschedule
  • priority (optional) (int) - task priority

Status codes:

  • 200 - no error
  • 404 - task not found
/tasks/delete

GET /tasks/delete/ (int: id)

Deletes the task with the given ID together with its reports.

Example request.

curl http://localhost:8090/tasks/delete/1

Parameters:

  • id (required) (int) - ID of the task to delete

Status codes:

  • 200 - no error
  • 404 - task not found
  • 500 - unable to delete the task
/tasks/report

GET /tasks/report/ (int: id) / (str: format)

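Returns the report for the given task ID.

Example request.

curl http://localhost:8090/tasks/report/1

Parameters:

  • id (required) (int) - ID of the task to get the report for
  • format (optional) - report format [json/html/all/dropped/package_files]. Defaults to json. all returns a tar.bz2 archive containing all report files, dropped returns a tar.bz2 archive containing all files dropped by the sample, and package_files returns all files uploaded to the host by the analysis packages.

Status codes:

  • 200 - no error
  • 400 - invalid report format
  • 404 - report not found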
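
To avoid the 400 response for an invalid format, a client can validate the format name before issuing the request. A minimal sketch, with the list of formats copied from the parameter description above; the helper name report_url is hypothetical.

```python
# Formats listed in the parameter description above.
REPORT_FORMATS = ("json", "html", "all", "dropped", "package_files")


def report_url(base_url, task_id, report_format=None):
    """Build the URL for GET /tasks/report/(id)/(format)."""
    url = "%s/tasks/report/%d" % (base_url.rstrip("/"), task_id)
    if report_format is None:
        # Without a format segment the server returns the JSON report.
        return url
    if report_format not in REPORT_FORMATS:
        raise ValueError("unknown report format: %r" % (report_format,))
    return "%s/%s" % (url, report_format)


print(report_url("http://localhost:8090", 1))
print(report_url("http://localhost:8090", 1, "html"))
```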
/tasks/screenshots

GET /tasks/screenshots/ (int: id) / (str: number)

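Returns the screenshots associated with the given task ID.

Example request.

wget http://localhost:8090/tasks/screenshots/1

Parameters:

  • id (required) (int) - ID of the task to get screenshots for
  • screenshot (optional) - number of a single screenshot (e.g. 0001, 0002)

Status codes:

  • 404 - file or folder not found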
/tasks/rereport

GET /tasks/rereport/ (int: id)

Regenerates the report for the given task ID.

Example request.

curl http://localhost:8090/tasks/rereport/1

Example response.

{
    "success": true
}

Parameters:

  • id (required) (int) - ID of the task to re-run reporting for

Status codes:

  • 200 - no error
  • 404 - task not found
/tasks/reboot

GET /tasks/reboot/ (int: id)

Adds a reboot analysis task based on the ID of an existing analysis task.

Example request.

curl http://localhost:8090/tasks/reboot/1

Example response.

{
    "task_id": 1,
    "reboot_id": 3
}

Parameters:

  • id (required) (int) - ID of the existing task

Status codes:

  • 200 - no error
  • 404 - failed to create the reboot task
/memory/list

GET /memory/list/ (int: id)

Returns one or more memory dumps for the given task ID.

Example request.

wget http://localhost:8090/memory/list/1

Parameters:

  • id (required) (int) - ID of the task to list memory dumps for

Status codes:

  • 404 - file not found
/memory/get

GET /memory/get/ (int: id) / (str: number)

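Returns the content of a memory dump, given the task ID and the number of the memory dump.

Example request.

wget http://localhost:8090/memory/get/1/1908

Parameters:

  • id (required) (int) - ID of the task the memory dump belongs to
  • pid (required) - number of the memory dump file (e.g. 205, 1908)

Status codes:

  • 404 - file not found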
/files/view

GET /files/view/md5/ (str: md5)

GET /files/view/sha256/ (str: sha256)

GET /files/view/id/ (int: id)

Returns sample information for the given MD5 hash, SHA256 hash, or sample ID.

Example request.

curl http://localhost:8090/files/view/id/1

Example response.

{
    "sample": {
        "sha1": "da39a3ee5e6b4b0d3255bfef95601890afd80709",
        "file_type": "empty",
        "file_size": 0,
        "crc32": "00000000",
        "ssdeep": "3::",
        "sha256": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
        "sha512": "cf83e1357eefb8bdf1542850d66d8007d620e4050b5715dc83f4a921d36ce9ce47d0d13c5d85f2b0ff8318d2877eec2f63b931bd47417a81a538327af927da3e",
        "id": 1,
        "md5": "d41d8cd98f00b204e9800998ecf8427e"
    }
}

Parameters:

  • md5 (optional) - MD5 hash of the sample
  • sha256 (optional) - SHA256 hash of the sample
  • id (optional) (int) - ID of the sample

Status codes:

  • 200 - no error
  • 400 - invalid lookup term
  • 404 - file not found
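
To look up a local file through this resource, its SHA256 hash can be computed first and placed in the request path. The sketch below uses only the standard library; the helper names are hypothetical, and the empty in-memory stream stands in for a real sample (it yields the same hashes as the empty file in the example response above).

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size=65536):
    """Return the hex SHA256 digest of a binary file-like object."""
    digest = hashlib.sha256()
    # Read in chunks so large samples are never loaded into memory at once.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def files_view_url(base_url, md5=None, sha256=None, sample_id=None):
    """Build a /files/view URL from exactly one lookup term."""
    lookups = [("md5", md5), ("sha256", sha256), ("id", sample_id)]
    given = [(kind, value) for kind, value in lookups if value is not None]
    if len(given) != 1:
        raise ValueError("specify exactly one of md5, sha256 or sample_id")
    kind, value = given[0]
    return "%s/files/view/%s/%s" % (base_url.rstrip("/"), kind, value)


digest = sha256_of_stream(io.BytesIO(b""))
print(files_view_url("http://localhost:8090", sha256=digest))
```

For a file on disk, the stream would be the result of open(path, "rb").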
/files/get

GET /files/get/ (str: sha256)

Returns the content of the sample with the given SHA256 hash.

Example request.

curl http://localhost:8090/files/get/e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 > sample.exe

Status codes:

  • 200 - no error
  • 404 - file not found
/pcap/get

GET /pcap/get/ (int: task)

Returns the content of the PCAP file for the given task ID.

Example request.

curl http://localhost:8090/pcap/get/1 > dump.pcap

Status codes:

  • 200 - no error
  • 404 - file not found
/machines/list

GET /machines/list

Returns a list with details on the available analysis machines.

Example request.

curl http://localhost:8090/machines/list

Example response.

{
    "machines": [
        {
            "status": null,
            "locked": false,
            "name": "cuckoo1",
            "resultserver_ip": "192.168.56.1",
            "ip": "192.168.56.101",
            "tags": [
                        "32bit",
                        "acrobat_6"
                    ],
            "label": "cuckoo1",
            "locked_changed_on": null,
            "platform": "windows",
            "snapshot": null,
            "interface": null,
            "status_changed_on": null,
            "id": 1,
            "resultserver_port": "2042"
        }
    ]
}

Status codes:

  • 200 - no error
/machines/view

GET /machines/view/ (str: name)

Returns the details of the analysis machine with the given name.

Example request.

curl http://localhost:8090/machines/view/cuckoo1

Example response.

{
    "machine": {
        "status": null,
        "locked": false,
        "name": "cuckoo1",
        "resultserver_ip": "192.168.56.1",
        "ip": "192.168.56.101",
        "tags": [
                    "32bit",
                    "acrobat_6"
                ],
        "label": "cuckoo1",
        "locked_changed_on": null,
        "platform": "windows",
        "snapshot": null,
        "interface": null,
        "status_changed_on": null,
        "id": 1,
        "resultserver_port": "2042"
    }
}

Status codes:

  • 200 - no error
  • 404 - machine not found
/cuckoo/status

GET /cuckoo/status/

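Returns the current status of Cuckoo. Version 1.3 added disk space information: the used, free, and total space of the directories listed below (Unix-like systems only). It also added the CPU load average over the past 1, 5, and 15 minutes (Unix-like systems only).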

Diskspace directories:

  • analyses - $CUCKOO/storage/analyses/
  • binaries - $CUCKOO/storage/binaries/
  • temporary - tmppath as specified in conf/cuckoo.conf

Example request.

curl http://localhost:8090/cuckoo/status

Example response.

{
    "tasks": {
        "reported": 165,
        "running": 2,
        "total": 167,
        "completed": 0,
        "pending": 0
    },
    "diskspace": {
        "analyses": {
            "total": 491271233536,
            "free": 71403470848,
            "used": 419867762688
        },
        "binaries": {
            "total": 491271233536,
            "free": 71403470848,
            "used": 419867762688
        },
        "temporary": {
            "total": 491271233536,
            "free": 71403470848,
            "used": 419867762688
        }
    },
    "version": "1.0",
    "protocol_version": 1,
    "hostname": "Patient0",
    "machines": {
        "available": 4,
        "total": 5
    }
}

Status codes:

  • 200 - no error
  • 404 - machine not found
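
The diskspace and machines sections of the response lend themselves to simple health checks. The following sketch derives the disk usage percentage per directory and the number of machines currently in use; the helper name summarize_status is hypothetical, and the input is a trimmed copy of the example response above.

```python
def summarize_status(status):
    """Return (disk usage percent per directory, machines in use)."""
    usage_percent = {}
    for name, usage in status.get("diskspace", {}).items():
        total = usage["total"]
        # Guard against a zero total so the division cannot fail.
        usage_percent[name] = 100.0 * usage["used"] / total if total else None
    machines = status.get("machines", {})
    in_use = machines.get("total", 0) - machines.get("available", 0)
    return usage_percent, in_use


# Trimmed copy of the example response shown above.
example_status = {
    "diskspace": {
        "analyses": {
            "total": 491271233536,
            "free": 71403470848,
            "used": 419867762688,
        },
    },
    "machines": {"available": 4, "total": 5},
}

usage_percent, in_use = summarize_status(example_status)
print("analyses: %.1f%% used, %d machine(s) in use"
      % (usage_percent["analyses"], in_use))
```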
/vpn/status

GET /vpn/status

Returns the VPN status.

Example request.

curl http://localhost:8090/vpn/status

Status codes:

  • 200 - status retrieved successfully
  • 500 - not available
/exit

GET /exit

Shuts down the API server, provided it is running in debug mode and using the werkzeug server.

Example request.

curl http://localhost:8090/exit

Status codes:

  • 200 - no error
  • 403 - this call is only available in debug mode
  • 500 - error

Distributed Cuckoo

As mentioned in the 提交分析 (Submit an Analysis) chapter, Cuckoo provides a REST API for Distributed Cuckoo usage. Distributed Cuckoo allows one to set up a single REST API endpoint to which samples and URLs can be submitted; these are then, in turn, submitted to one of the configured Cuckoo nodes.

A typical setup thus includes a machine on which Distributed Cuckoo is run and one or more machines running an instance of the Cuckoo daemon and the Cuckoo REST API.

A few notes:

  • Using Distributed Cuckoo only makes sense when running at least two cuckoo nodes.
  • Distributed Cuckoo can be run on a machine that also runs a Cuckoo daemon and REST API, however, make sure it has enough disk space if the intention is to submit a lot of samples.
Starting the Distributed REST API

The Distributed REST API has the following command line options:

$ cuckoo distributed server --help
Usage: cuckoo distributed server [OPTIONS]

Options:
  -H, --host TEXT     Host to bind the Distributed Cuckoo server on
  -p, --port INTEGER  Port to bind the Distributed Cuckoo server on
  --uwsgi             Dump uWSGI configuration
  --nginx             Dump nginx configuration
  --help              Show this message and exit.

As may be derived from the help output, starting Distributed Cuckoo may be as simple as running cuckoo distributed server.

The various configuration options are described in the configuration file, and more in-depth descriptions follow below. More advanced usage naturally includes deployment using uWSGI and nginx.

Distributed Cuckoo Configuration
Report Formats

The reporting formats denote which reports you’d like to retrieve later on. Note that all task-related data will be removed from the Cuckoo nodes once the related reports have been fetched so that the machines are not running out of disk space. This does, however, force you to specify all the report formats that you’re interested in, because otherwise that information will be lost.

Reporting formats include, but are not limited to, report.json and report.html, and may also include your own reporting formats.

Samples Directory

The samples directory denotes the directory where the submitted samples will be stored temporarily, until the associated task has been deleted.

Reports Directory

Much like the Samples Directory the Reports Directory defines the directory where reports will be stored until they’re fetched and deleted from the Distributed REST API.

RESTful resources

Following are all RESTful resources. Also make sure to check out the Quick usage section which documents the most commonly used commands.

Resource                          Description
GET /api/node                     Get a list of all enabled Cuckoo nodes.
POST /api/node                    Register a new Cuckoo node.
GET /api/node/<name>              Get basic information about a node.
PUT /api/node/<name>              Update basic information of a node.
POST /api/node/<name>/refresh     Refresh a Cuckoo node's metadata.
DELETE /api/node/<name>           Disable (not completely remove!) a node.
GET /api/task                     Get a list of all (or a part) of the tasks in the database.
POST /api/task                    Create a new analysis task.
GET /api/task/<id>                Get basic information about a task.
DELETE /api/task/<id>             Delete all associated information of a task.
GET /api/report/<id>/<format>     Fetch an analysis report.
GET /api/node

Returns all enabled nodes. For each node the information includes the associated name, its API URL, and machines:

$ curl http://localhost:9003/api/node
{
    "success": true,
    "nodes": {
        "localhost": {
            "machines": [
                {
                    "name": "cuckoo1",
                    "platform": "windows",
                    "tags": []
                }
            ],
            "name": "localhost",
            "url": "http://localhost:8090/"
        }
    }
}
POST /api/node

Register a new Cuckoo node by providing the name and the URL:

$ curl http://localhost:9003/api/node -F name=localhost \
    -F url=http://localhost:8090/
{
    "success": true
}
GET /api/node/<name>

Get basic information about a particular Cuckoo node:

$ curl http://localhost:9003/api/node/localhost
{
    "success": true,
    "nodes": [
        {
            "name": "localhost",
            "url": "http://localhost:8090/",
            "machines": [
                {
                    "name": "cuckoo1",
                    "platform": "windows",
                    "tags": []
                }
            ]
        }
    ]
}
PUT /api/node/<name>

Update basic information of a Cuckoo node:

$ curl -XPUT http://localhost:9003/api/node/localhost -F name=newhost \
    -F url=http://1.2.3.4:8090/
{
    "success": true
}
POST /api/node/<name>/refresh

Refreshes the metadata associated with a Cuckoo node, in particular its machines:

$ curl -XPOST http://localhost:9003/api/node/localhost/refresh
{
    "success": true,
    "machines": [
        {
            "name": "cuckoo1",
            "platform": "windows",
            "tags": []
        },
        {
            "name": "cuckoo2",
            "platform": "windows",
            "tags": []
        }
    ]
}
DELETE /api/node/<name>

Disable a Cuckoo node, therefore not having it process any new tasks, but keeping its history in the Distributed Cuckoo database:

$ curl -XDELETE http://localhost:9003/api/node/localhost
{
    "success": true
}
GET /api/task

Get a list of all tasks in the database. To limit the number of results, the offset, limit, finished, and owner fields are available:

$ curl http://localhost:9003/api/task?limit=1
{
    "success": true,
    "tasks": {
        "1": {
            "clock": null,
            "custom": null,
            "owner": "",
            "enforce_timeout": null,
            "machine": null,
            "memory": null,
            "options": null,
            "package": null,
            "path": "/tmp/dist-samples/tmphal8mS",
            "platform": "windows",
            "priority": 1,
            "tags": null,
            "task_id": 1,
            "timeout": null
        }
    }
}
POST /api/task

Submit a new file or URL to be analyzed:

$ curl http://localhost:9003/api/task -F file=@sample.exe
{
    "success": true,
    "task_id": 2
}
GET /api/task/<id>

Get basic information about a particular task:

$ curl http://localhost:9003/api/task/2
{
    "success": true,
    "tasks": {
        "2": {
            "id": 2,
            "clock": null,
            "custom": null,
            "owner": "",
            "enforce_timeout": null,
            "machine": null,
            "memory": null,
            "options": null,
            "package": null,
            "path": "/tmp/tmpPwUeXm",
            "platform": "windows",
            "priority": 1,
            "tags": null,
            "timeout": null,
            "task_id": 1,
            "node_id": 2,
            "finished": false
        }
    }
}
DELETE /api/task/<id>

Delete all associated data of a task, namely the binary, the PCAP, and the reports:

$ curl -XDELETE http://localhost:9003/api/task/2
{
    "success": true
}
GET /api/report/<id>/<format>

Fetch a report for the given task in the specified format:

# Defaults to the JSON report.
$ curl http://localhost:9003/api/report/2
...
Proposed setup

The following description depicts a Distributed Cuckoo setup with two Cuckoo machines, cuckoo0 and cuckoo1. In this setup the first machine, cuckoo0, also hosts the Distributed Cuckoo REST API.

Configuration settings

Our setup will require a couple of updates with regards to the configuration files.

conf/cuckoo.conf

Update process_results to off as we will be running our own results processing script (for performance reasons).

Update tmppath to something that holds enough storage to store a few hundred binaries. On some servers or setups /tmp may have a limited amount of space and thus this wouldn’t suffice.

Update connection to use a database other than SQLite3, preferably PostgreSQL or MySQL. SQLite3 doesn’t support multi-threaded applications and as such is not a good choice for systems such as Cuckoo (as-is).

You should create a database specifically for the distributed cuckoo setup. Do not be tempted to use any existing cuckoo database in order to avoid update problems with the DB scripts. In the configuration use the new database name. The remaining configuration such as usernames, servers, etc can be the same as for your cuckoo install. Don’t forget to use one DB per node and one for the machine running Distributed Cuckoo (the “management machine” or “controller”).

conf/processing.conf

You may want to disable some processing modules, such as virustotal.

conf/reporting.conf

Depending on which report(s) are required for integration with your system, it might make sense to generate only the report(s) that you’re going to use and to disable the others.

conf/virtualbox.conf

Assuming VirtualBox is the Virtual Machine manager of choice, the mode will have to be changed to headless or you will have some restless nights (this is the default nowadays).

Setup Cuckoo

On each machine you will have to run the Cuckoo Daemon, the Cuckoo API, and one or more Cuckoo Process instances. For more information on setting that up, please refer to Starting Cuckoo.

Setup Distributed Cuckoo

On the Distributed Cuckoo machine you’ll have to setup the Distributed Cuckoo REST API and the Distributed Cuckoo Worker.

As stated earlier, Distributed Cuckoo REST API may be started by running cuckoo distributed server or by deploying it properly with uWSGI and nginx.

The Distributed Cuckoo Worker may be started by running supervisorctl start distributed in the CWD (make sure to start supervisord first as per Cuckoo in the background). This will automatically start the Worker with the correct configuration and arguments.

Register Cuckoo nodes

As outlined in Quick usage the Cuckoo nodes have to be registered with the Distributed Cuckoo REST API:

$ curl http://localhost:9003/api/node -F name=cuckoo0 -F url=http://localhost:8090/
$ curl http://localhost:9003/api/node -F name=cuckoo1 -F url=http://1.2.3.4:8090/

Having registered the Cuckoo nodes all that’s left to do now is to submit tasks and fetch reports once finished. Documentation on these commands can be found in the Quick usage section. In case your Cuckoo node is not on localhost, replace localhost with the IP address of the node where the Cuckoo REST API is running.

If you want to experiment with load balancing between the nodes you may want to try using a lower value for the threshold parameter in the $CWD/distributed/settings.py file as the default value is 500 (meaning tasks are assigned to Cuckoo nodes in batches of 500).

Quick usage

For practical usage the following few commands will be most interesting.

Register a Cuckoo node, in this case a Cuckoo API running on the same machine:

$ curl http://localhost:9003/api/node -F name=localhost -F ip=127.0.0.1

Disable a Cuckoo node:

$ curl -XDELETE http://localhost:9003/api/node/localhost

Submit a new analysis task without any special requirements (e.g., using Cuckoo tags, a particular machine, etc):

$ curl http://localhost:9003/api/task -F file=@/path/to/sample.exe

Get the report of a task that has finished (if it hasn’t finished you’ll get an error with code 420). The following example will default to the JSON report:

$ curl http://localhost:9003/api/report/1

If a Cuckoo node gets stuck and needs a reset, the following steps could be performed to restart it cleanly. Note that this requires usage of our SaltStack configuration and some manual SQL commands (and preferably the Distributed Cuckoo Worker is temporarily disabled, i.e., supervisorctl stop distributed):

$ psql -c "UPDATE task SET status = 'pending' WHERE status = 'processing' AND node_id = 123"
$ salt cuckoo1 state.apply cuckoo.clean
$ salt cuckoo1 state.apply cuckoo.start

If the entire Cuckoo cluster was somehow locked up, i.e., all tasks have been ‘assigned’, are ‘processing’, or have the ‘finished’ status while none of the Cuckoo nodes are currently working on said analyses (e.g., due to numerous resets etc), then the following steps may be used to reset the entire state:

$ supervisorctl -c ~/.cuckoo/supervisord.conf stop distributed
$ salt '*' state.apply cuckoo.stop
$ salt '*' state.apply cuckoo.clean
$ psql -c "UPDATE task SET status = 'pending', node_id = null WHERE status IN ('assigned', 'processing', 'finished')"
$ salt '*' state.apply cuckoo.start
$ supervisorctl -c ~/.cuckoo/supervisord.conf start distributed

If a Cuckoo node has a number of tasks that failed to process, therefore locking up the Cuckoo node altogether, then upgrading the Cuckoo instances with a bugfixed version and re-processing all analyses may do the trick:

$ salt cuckoo1 state.apply cuckoo.update  # Upgrade Cuckoo.
# To make sure there are failed analyses in the first place.
$ salt cuckoo1 cmd.run "sudo -u cuckoo psql -c \"SELECT * FROM tasks WHERE status = 'failed_processing'\""
# Reset each analysis to be re-processed.
$ salt cuckoo1 cmd.run "sudo -u cuckoo psql -c \"UPDATE tasks SET status = 'completed', processing = null WHERE status = 'failed_processing'\""

In order to upgrade the Distributed Cuckoo master, one may want to perform the following steps:

$ /etc/init.d/uwsgi stop
$ supervisorctl -c ~/.cuckoo/supervisord.conf stop distributed
$ pip uninstall -y cuckoo
$ pip install cuckoo==2.0.0         # Specify your version here.
$ pip install Cuckoo-2.0.0.tar.gz   # Or use a locally archived build.
$ cuckoo distributed migrate
$ supervisorctl -c ~/.cuckoo/supervisord.conf start distributed
$ /etc/init.d/uwsgi start
$ /etc/init.d/nginx restart

In order to test your entire Cuckoo cluster, i.e., every machine on every Cuckoo node, one may take the stuff/distributed/cluster-test.py script as an example. As-is it allows one to check for an active internet connection in each and every configured machine in the cluster. This script may be used to identify machines that are incorrect or have been corrupted in one way or another. Example usage may look as follows:

# Assuming Distributed Cuckoo listens on localhost and that you want to
# run the 'internet' script (see also the source of cluster-test.py).
$ python stuff/distributed/cluster-test.py localhost -s internet

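Utilities

Cuckoo includes a set of utilities for automation. These used to live in the utils/ directory, but they have now all been consolidated.

Cuckoo Apps

A Cuckoo App is now a subcommand of the cuckoo command. Each app has its own functionality, but they are all invoked in a similar way, for example: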

$ cuckoo submit --help
$ cuckoo api --help
$ cuckoo clean --help

In these examples we provided the --help parameter which shows the functionality and all available parameters for the particular Cuckoo App.

Submission Utility

Submits samples for analysis. This tool is described in Submit an Analysis.

Web Utility

Cuckoo’s web interface. This tool is described in Web interface.

Processing Utility

Changed in version 2.0.0: We used to have longstanding issues with ./utils/process.py randomly freezing up and ./utils/process2.py only being able to handle PostgreSQL-based databases. These two commands have now been merged into one Cuckoo App and no longer show signs of said issues or limitations.

For bigger Cuckoo setups it is recommended to separate the results processing from the Cuckoo analyses due to performance issues (with multiple threads & the Python GIL). Using cuckoo process it is also possible to re-generate Cuckoo reports; this is mostly used while developing and debugging Cuckoo Processing modules, Cuckoo Signatures, and Cuckoo Reporting modules.

In order to do results processing in one or more separate processes, disable the process_results configuration item in $CWD/conf/cuckoo.conf by setting its value to off. Then start a Cuckoo Processing instance as follows:

$ cuckoo process instance1

If one Cuckoo Processing instance is not enough to handle all the incoming analyses, simply create a second, third, and possibly more instances:

$ cuckoo process instance2

In order to re-generate a Cuckoo report of an analysis task, use the -r switch:

$ cuckoo process -r 1

It is also possible to re-generate multiple or a range of Cuckoo reports at once. The following will reprocess tasks 1, 2, 5, 6, 7, 8, 9, 10:

$ cuckoo process -r 1,2,5-10
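To make the range syntax concrete, the following is a minimal Python sketch of how a specification such as 1,2,5-10 expands into individual task IDs. The function name parse_task_ranges is hypothetical and this is not the parser that cuckoo process itself uses; it only illustrates the expansion described above.

```python
def parse_task_ranges(spec):
    """Expand a spec such as "1,2,5-10" into a sorted list of task IDs."""
    task_ids = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # Tolerate stray commas.
        if "-" in part:
            # A range is inclusive on both ends, e.g. 5-10 covers 5..10.
            start, end = part.split("-", 1)
            task_ids.update(range(int(start), int(end) + 1))
        else:
            task_ids.add(int(part))
    return sorted(task_ids)


print(parse_task_ranges("1,2,5-10"))
```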

For more information see also the help on this Cuckoo App:

$ cuckoo process --help
Usage: cuckoo process [OPTIONS] [INSTANCE]

  Process raw task data into reports.

Options:
  -r, --report TEXT       Re-generate one or more reports
  -m, --maxcount INTEGER  Maximum number of analyses to process
  --help                  Show this message and exit.

In automated mode an instance name is required (e.g., instance1), as seen in the earlier examples.

Community Download Utility

This Cuckoo App downloads Cuckoo Signatures, the latest monitoring binaries, and other goodies from the Cuckoo Community Repository and installs them in your CWD.

To get all the latest and greatest from the Cuckoo Community simply execute as follows and wait until it finishes - it currently doesn’t have any progress indication:

$ cuckoo community

For more usage options, see the help output:

$ cuckoo community --help
Usage: cuckoo community [OPTIONS]

  Utility to fetch supplies from the Cuckoo Community.

Options:
  -f, --force              Overwrite existing files
  -b, --branch TEXT        Specify a different community branch rather than
                           master
  --file, --filepath PATH  Specify a local copy of a community .tar.gz file
  --help                   Show this message and exit.

Database migration utility

Changed in version 2.0.0: This used to be a special process, but has since been integrated properly as a Cuckoo App.

This utility helps migrate your data between Cuckoo releases. It’s developed on top of the Alembic framework and it should provide data migration for both the SQL database and the Mongo database. This tool is already described in Upgrading from a previous release.

Stats utility

Deprecated since version 2.0-rc2: This utility will not be ported to a Cuckoo App as this information can also be retrieved through both the Cuckoo API and the Cuckoo Web Interface.

Machine utility

Changed in version 2.0.0: This used to be a standalone and hacky script directly modifying the Cuckoo configuration. It’s now much better integrated and will be able to somewhat properly interact with Cuckoo.

The machine Cuckoo App is designed to help you automate the configuration of virtual machines in Cuckoo. It takes a list of machine details as arguments and writes them to the configuration file of the machinery module enabled in cuckoo.conf. The available options are as follows:

$ cuckoo machine --help
Usage: cuckoo machine [OPTIONS] VMNAME [IP]

Options:
  --debug              Enable verbose logging
  --add                Add a Virtual Machine
  --delete             Delete a Virtual Machine
  --platform TEXT      Guest Operating System
  --options TEXT       Machine options
  --tags TEXT          Tags for this Virtual Machine
  --interface TEXT     Sniffer interface for this Virtual Machine
  --snapshot TEXT      Specific Virtual Machine Snapshot to use
  --resultserver TEXT  IP:Port of the Result Server
  --help               Show this message and exit.

As an example, a machine may be added to Cuckoo’s configuration as follows:

$ cuckoo machine --add cuckoo1 192.168.56.101 --platform windows --snapshot vmcloak

Distributed scripts

This tool is described in Distributed Cuckoo.

Mac OS X Bootstrap scripts

Deprecated since version 2.0.0: These files will be moved elsewhere in an upcoming update and so should any documentation that references these scripts.

A couple of bootstrap scripts used for Mac OS X analysis are located in the utils/darwin folder. They are used to bootstrap the guest and host system for Mac OS X malware analysis. Some settings are defined as constants inside them, so it is suggested to have a look at them and configure them for your needs.

SMTP Sinkhole

Deprecated since version 2.0.0: This script has been removed since this functionality should be implemented properly using a Postfix setup.

Setup script

Deprecated since version 2.0.0: This script has been replaced by a similar but much more powerful SaltStack state.

Cuckoo Rooter

The Cuckoo Rooter is a new concept, providing root access for various commands to Cuckoo (which itself generally speaking runs as non-root). This command is currently only available for Ubuntu and Debian-like systems.

In particular, the rooter helps Cuckoo out with running network-related commands in order to provide per-analysis routing options. For more information on that, please refer to the Per-Analysis Network Routing document. Cuckoo and the rooter communicate through a UNIX socket for which the rooter makes sure that Cuckoo can reach it.

Its usage is as follows:

$ cuckoo rooter --help
Usage: cuckoo rooter [OPTIONS] [SOCKET]

Options:
  -g, --group TEXT  Unix socket group
  --service PATH    Path to service(8) for invoking OpenVPN
  --iptables PATH   Path to iptables(8)
  --ip PATH         Path to ip(8)
  --sudo            Request superuser privileges
  --help            Show this message and exit.

By default the rooter chowns the UNIX socket to the cuckoo user, as both user and group, as recommended in Installing Cuckoo. If you’re running Cuckoo under a user other than cuckoo, you will have to specify this to the rooter as follows:

$ sudo cuckoo rooter -g <user>

The other options are fairly straightforward - you can specify the paths to specific Linux commands. By default one shouldn’t have to do this though, as the rooter takes the default paths for the various utilities as per a default setup.

Virtualenv

Due to the fact that the rooter must be run as root user, there are some slight complications when using a virtualenv to run Cuckoo. More specifically, when running sudo cuckoo rooter, the $VIRTUAL_ENV environment variable will not be passed along, due to which Python will not be executed from the same virtualenv as it would have been normally.

To resolve this one simply has to execute the cuckoo binary from the virtualenv session directly. E.g., if your virtualenv is located at ~/venv, then running the rooter command could be done as follows:

$ sudo ~/venv/bin/cuckoo rooter

Alternatively one may use the --sudo flag which will call sudo on the correct cuckoo binary with all the provided flags. In turn the user will have to enter his or her password and, assuming all is fine, the Cuckoo Rooter will be started properly, e.g.:

(venv)$ cuckoo rooter --sudo

Cuckoo Rooter Usage

Using the Cuckoo Rooter is actually pretty easy. If you know how to start it, you’re basically good to go. Even though Cuckoo talks with the Cuckoo Rooter for each analysis with a routing option other than None Routing, the Cuckoo Rooter does not keep any state or attach to any Cuckoo instance in particular.

Therefore, once the Cuckoo Rooter has been started you may leave it be - the Cuckoo Rooter will take care of itself from that point onwards, no matter how often you restart your Cuckoo instance.

Cuckoo Feedback

New in version 2.0.0.

The Cuckoo Feedback form allows users to provide instant feedback to the Cuckoo Core Developer team. By doing so, our development team will be able to react more quickly to errors, partially incorrect analysis results, problems that occurred during an analysis or in the web interface, and anything else that our users think requires some extra attention. All in all, this optional feature gives users who are interested in a second opinion a convenient way to get one, for both the user and the team behind Cuckoo Sandbox.

Note

As a user you are able to ping back to us through the Cuckoo Feedback form embedded in most pages of the web interface (e.g., an analysis page or a 404 page not found / 500 internal error page).

The following is a screenshot of a part of the new (as of Cuckoo 2.0.0) analysis results page with the side bar locked in (i.e., permanently open).

_images/side-bar.png

At the bottom of the side bar you’ll see the Feedback button which will pop up the following feedback form. Naturally filling out all of the fields in this form will allow you to send us feedback (in a secure manner).

It should be noted that, should you decide to provide feedback on a regular basis, you can also fill out your name, company, and email address (where you’ll receive any answers) in the $CWD/conf/cuckoo.conf configuration file so those will be auto-filled for you upon opening the feedback form.

_images/feedback-form.png

Analysis Packages

The analysis packages are a core component of Cuckoo Sandbox. They consist of structured Python classes which, when executed in the guest machines, describe how Cuckoo’s analyzer component should conduct the analysis.

Cuckoo provides some default analysis packages that you can use, but you are able to create your own or modify the existing ones. You can find them at analyzer/windows/modules/packages/.

As described in Submit an Analysis, you can specify some options to the analysis packages in the form of key1=value1,key2=value2. The existing analysis packages already include some default options that can be enabled.
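As an illustration of the key1=value1,key2=value2 format, the sketch below turns such a string into a dictionary, which is the shape in which a package later sees its options. The helper name parse_options is hypothetical and the code is a simplified stand-in, not Cuckoo’s own parsing routine.

```python
def parse_options(options):
    """Turn "key1=value1,key2=value2" into a dictionary of strings."""
    parsed = {}
    for field in options.split(","):
        if "=" not in field:
            continue  # Skip empty or malformed fields.
        # Split on the first "=" only, so values may contain "=" themselves.
        key, value = field.split("=", 1)
        parsed[key.strip()] = value.strip()
    return parsed


print(parse_options("key1=value1,key2=value2"))
```

Note that every value is a string in this sketch; converting "yes", "no", or numbers into other types is left to the code that consumes the option.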

Following is a list of the options that work for all analysis packages unless explicitly stated otherwise:

  • free [yes/no]: if enabled, no behavioral logs will be produced and the malware will be executed freely.
  • procmemdump [yes/no]: if enabled, take memory dumps of all actively monitored processes.
  • human: if set to 0, human-like interaction (i.e., mouse movements) will be disabled.

Following is the list of existing packages in alphabetical order:

  • applet: used to analyze Java applets.

    Options:

    • class: specify the name of the class to be executed. This option is mandatory for a correct execution.
  • bin: used to analyze generic binary data, such as shellcodes.

  • cpl: used to analyze Control Panel Applets.

  • dll: used to run and analyze Dynamically Linked Libraries.

    Options:

    • function: specify the function to be executed. If none is specified, Cuckoo will try to run DllMain.
    • arguments: specify arguments to pass to the DLL through commandline.
    • loader: specify a process name to use to fake the DLL launcher name instead of rundll32.exe (this is used to fool possible anti-sandboxing tricks of certain malware)
  • doc: used to run and analyze Microsoft Word documents.

  • exe: default analysis package used to analyze generic Windows executables.

    Options:

    • arguments: specify any command line argument to pass to the initial process of the submitted malware.
  • generic: used to run and analyze generic samples via cmd.exe.

  • ie: used to analyze Internet Explorer’s behavior when opening the given URL or HTML file.

  • jar: used to analyze Java JAR containers.

    Options:

    • class: specify the path of the class to be executed. If none is specified, Cuckoo will try to execute the main function specified in the Jar’s MANIFEST file.
  • js: used to run and analyze Javascript files (e.g., those found in attachments of emails).

  • hta: used to run and analyze HTML Application files.

  • msi: used to run and analyze MSI windows installer.

  • pdf: used to run and analyze PDF documents.

  • ppt: used to run and analyze Microsoft PowerPoint documents.

  • ps1: used to run and analyze PowerShell scripts.

  • python: used to run and analyze Python scripts.

  • vbs: used to run and analyze VBScript files.

  • wsf: used to run and analyze Windows Script Host files.

  • xls: used to run and analyze Microsoft Excel documents.

  • zip: used to run and analyze Zip archives.

    Options:

    • file: specify the name of the file contained in the archive to execute. If none is specified, Cuckoo will try to execute sample.exe.
    • arguments: specify any command line argument to pass to the initial process of the submitted malware.
    • password: specify the password of the archive. If none is specified, Cuckoo will try to extract the archive without password or use the password “infected”.

You can find more details on how to start creating new analysis packages in the Analysis Packages customization chapter.

As you already know, you can select which analysis package to use by specifying its name at submission time (see Submit an Analysis) as follows:

$ cuckoo submit --package <package name> /path/to/malware

If none is specified, Cuckoo will try to detect the file type and select the correct analysis package accordingly. If the file type is not supported by default, the analysis will be aborted. We therefore encourage you to specify the package name whenever possible.

For example, to launch a malware sample and specify some options you can do:

$ cuckoo submit --package dll --options function=FunctionName,loader=explorer.exe /path/to/malware.dll

Analysis Results

Once an analysis is completed, several files are stored in a dedicated directory. All analyses are stored under $CWD/storage/analyses/, each inside a subdirectory named after the incremental numerical ID that represents the analysis task in the database.

Following is an example of an analysis directory structure:

.
|-- analysis.log
|-- binary
|-- dump.pcap
|-- memory.dmp
|-- files
|   |-- 1234567890_dropped.exe
|-- logs
|   |-- 1232.bson
|   |-- 1540.bson
|   `-- 1118.bson
|-- reports
|   |-- report.html
|   |-- report.json
`-- shots
    |-- 0001.jpg
    |-- 0002.jpg
    |-- 0003.jpg
    `-- 0004.jpg

analysis.log

This is a log file generated by the analyzer that contains a trace of the analysis execution inside the guest environment. It reports the creation of processes and files, as well as any errors that occurred during the execution.

dump.pcap

This is the network dump generated by tcpdump or any other corresponding network sniffer.

dump_sorted.pcap

This is a sorted version of dump.pcap that allows the Web Interface to quickly look up TCP streams.

memory.dmp

In case you enabled it, this file contains the full memory dump of the analysis machine.

files/

This directory contains all the files the malware operated on and that Cuckoo was able to dump.

files.json

This file contains a JSON-encoded entry for each dropped file available (i.e., all files in files/, shots/, etc). It contains meta information, where available, about all processes that touched the file, its original file path in the Guest, etc.

logs/

This directory contains all the raw logs generated by Cuckoo’s process monitoring.

reports/

This directory contains all the reports generated by Cuckoo as explained in the Configuration chapter.

shots/

This directory contains all the screenshots of the guest’s desktop taken during the malware execution.

tlsmaster.txt

This file contains the TLS Master Secrets that were captured during the analysis. TLS Master Secrets can be used to decrypt SSL/TLS traffic and are thus used to decrypt HTTPS streams.
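As a rough illustration of the layout described above, the following Python sketch builds the path of an analysis directory and reports which of the commonly expected entries are missing. The helper names and the EXPECTED list are assumptions made for this example; optional files such as memory.dmp or tlsmaster.txt are deliberately left out because they only exist when the corresponding feature was enabled or produced data.

```python
import os
import tempfile

# Entries this sketch assumes a finished analysis usually has (illustrative).
EXPECTED = ["analysis.log", "binary", "dump.pcap", "files", "logs",
            "reports", "shots"]


def analysis_dir(cwd, task_id):
    """Return $CWD/storage/analyses/<task_id> for the given CWD."""
    return os.path.join(cwd, "storage", "analyses", str(task_id))


def missing_artifacts(cwd, task_id, expected=EXPECTED):
    """List the expected entries that do not exist for this task."""
    base = analysis_dir(cwd, task_id)
    return [name for name in expected
            if not os.path.exists(os.path.join(base, name))]


if __name__ == "__main__":
    # Build a throwaway CWD with a partially populated analysis directory.
    cwd = tempfile.mkdtemp()
    os.makedirs(os.path.join(analysis_dir(cwd, 1), "logs"))
    print(missing_artifacts(cwd, 1))
```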

Clean all Tasks and Samples

Changed in version 2.0.0: Turned into a proper Cuckoo App rather than a standalone script.

Since Cuckoo 1.2 a built-in clean feature has been available. It drops all associated information of the tasks and samples in the database, on the hard disk, from MongoDB, and from ElasticSearch. If you submit a task after running clean you’ll start over with Task #1 again.

To clean your setup, run:

$ cuckoo clean

To sum up, this command does the following:

  • Delete analysis results.
  • Delete submitted binaries.
  • Delete all associated information of the tasks and samples in the configured database.
  • Delete all data in the configured MongoDB database (if configured and enabled in $CWD/conf/reporting.conf).
  • Delete all data in the configured ElasticSearch database (if configured and enabled in $CWD/conf/reporting.conf).

Warning

If you use this command you will permanently delete all data stored by Cuckoo in all available storages: the file system, the SQL database, the MongoDB database, and the ElasticSearch database. Use it only if you are sure you want to clean up all the data.

Customization

This chapter explains how to customize Cuckoo. Cuckoo is written in a modular architecture built to be as customizable as it can, to fit the needs of all users.

Auxiliary Modules

Auxiliary modules define some procedures that need to be executed in parallel to every single analysis process. All auxiliary modules should be placed under the cuckoo/cuckoo/auxiliary/ directory, that way the module will fall under the cuckoo.auxiliary module.

The skeleton of a module would look something like this:

from cuckoo.common.abstracts import Auxiliary

class MyAuxiliary(Auxiliary):

    def start(self):
        # Do something.
        pass

    def stop(self):
        # Stop the execution.
        pass

The function start() will be executed before starting the analysis machine and effectively executing the submitted malicious file, while the stop() function will be launched at the very end of the analysis process, before launching the processing and reporting procedures.

For example, an auxiliary module provided by default in Cuckoo is called sniffer.py and takes care of executing tcpdump in order to dump the generated network traffic.

Machinery Modules

Machinery modules define how Cuckoo should interact with your virtualization software (or potentially even with physical disk imaging solutions). Since we decided to not enforce any particular vendor, from release 0.4 you are able to use your preferred solution and, in case it’s not supported by default, write a custom Python module that defines how to make Cuckoo use it.

Every machinery module should be located inside the cuckoo/cuckoo/machinery/ directory so that it will fall under the cuckoo.machinery module.

A basic machinery module would look like this:

from cuckoo.common.abstracts import Machinery
from cuckoo.common.exceptions import CuckooMachineError

class MyMachinery(Machinery):
    def start(self, label):
        try:
            revert(label)
            start(label)
        except SomethingBadHappens:
            raise CuckooMachineError("oops!")

    def stop(self, label):
        try:
            stop(label)
        except SomethingBadHappens:
            raise CuckooMachineError("oops!")

The only requirements for Cuckoo are that:

  • The class inherits from Machinery.
  • It implements the start() and stop() methods.
  • You raise CuckooMachineError when something fails.

As you understand, the machinery module is a core part of a Cuckoo setup, therefore make sure to spend enough time debugging your code and make it solid and resistant to any unexpected error.
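To experiment with the start()/stop() contract outside of a Cuckoo installation, the following self-contained sketch replaces the Cuckoo base class, the exception, and the hypervisor with small stand-ins. Everything here is an assumption made for illustration: FakeHypervisor is a hypothetical in-memory client, and the stand-in Machinery class omits everything the real cuckoo.common.abstracts.Machinery provides (including its own constructor and configuration handling).

```python
class CuckooMachineError(Exception):
    """Stand-in for cuckoo.common.exceptions.CuckooMachineError."""


class Machinery(object):
    """Stand-in for cuckoo.common.abstracts.Machinery."""


class FakeHypervisor(object):
    """Hypothetical in-memory hypervisor client, for illustration only."""

    def __init__(self, labels):
        self.state = dict((label, "poweroff") for label in labels)

    def revert(self, label):
        if label not in self.state:
            raise KeyError(label)
        self.state[label] = "saved"

    def power_on(self, label):
        self.state[label] = "running"

    def power_off(self, label):
        if self.state.get(label) != "running":
            raise RuntimeError("%s is not running" % label)
        self.state[label] = "poweroff"


class MyMachinery(Machinery):
    def __init__(self, hypervisor):
        self.hypervisor = hypervisor

    def start(self, label):
        # Revert to the snapshot first, then power the machine on.
        try:
            self.hypervisor.revert(label)
            self.hypervisor.power_on(label)
        except (KeyError, RuntimeError) as e:
            raise CuckooMachineError("Unable to start %s: %r" % (label, e))

    def stop(self, label):
        try:
            self.hypervisor.power_off(label)
        except (KeyError, RuntimeError) as e:
            raise CuckooMachineError("Unable to stop %s: %r" % (label, e))
```

The point of the sketch is the error handling: whatever the underlying virtualization client raises is translated into CuckooMachineError, so the caller only ever has to deal with one exception type.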

Configuration

Every machinery module should come with a dedicated configuration file located in $CWD/conf/<machinery module name>.conf (which translates to cuckoo/data/conf/<machinery>.conf in the Git repository). For example, for cuckoo/cuckoo/machinery/kvm.py we have a $CWD/conf/kvm.conf.

The configuration file should follow the default structure:

[kvm]
# Specify a comma-separated list of available machines to be used. For each
# specified ID you have to define a dedicated section containing the details
# on the respective machine. (E.g. cuckoo1,cuckoo2,cuckoo3)
machines = cuckoo1

[cuckoo1]
# Specify the label name of the current machine as specified in your
# libvirt configuration.
label = cuckoo1

# Specify the operating system platform used by current machine
# [windows/darwin/linux].
platform = windows

# Specify the IP address of the current machine. Make sure that the IP address
# is valid and that the host machine is able to reach it. If not, the analysis
# will fail.
ip = 192.168.122.105

A main section called [<name of the module>] with a machines field containing a comma-separated list of machines IDs.

For each machine you should specify a label, a platform and its ip.

These fields are required by Cuckoo in order to use the already embedded initialize() function that generates the list of available machines.

If you plan to change the configuration structure you should override the initialize() function (inside your own module, no need to modify Cuckoo’s core code). You can find its original code in the Machinery abstract inside cuckoo/common/abstracts.py.
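The relationship between the main section and the per-machine sections can be sketched with Python’s standard configparser module (Python 3 naming). This is not Cuckoo’s initialize() implementation; the function name load_machines, the second machine cuckoo2, and its IP address are made up for the example.

```python
import configparser

SAMPLE = """
[kvm]
machines = cuckoo1, cuckoo2

[cuckoo1]
label = cuckoo1
platform = windows
ip = 192.168.122.105

[cuckoo2]
label = cuckoo2
platform = linux
ip = 192.168.122.106
"""


def load_machines(text, module_name):
    """Return one dict per machine listed in the [<module_name>] section."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    machines = []
    for machine_id in parser.get(module_name, "machines").split(","):
        machine_id = machine_id.strip()
        if not machine_id:
            continue
        # Each machine ID must have its own section with the three fields.
        section = parser[machine_id]
        machines.append({
            "id": machine_id,
            "label": section["label"],
            "platform": section["platform"],
            "ip": section["ip"],
        })
    return machines


for machine in load_machines(SAMPLE, "kvm"):
    print(machine["id"], machine["platform"], machine["ip"])
```

A machine ID that is listed in machines but has no section of its own raises a KeyError in this sketch, which mirrors why the three fields are mandatory.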

LibVirt

Starting with Cuckoo 0.5 developing new machinery modules based on LibVirt is easy. Inside cuckoo/common/abstracts.py you can find LibVirtMachinery that already provides all the functionality for a LibVirt module. Just inherit this base class and specify your connection string, as in the example below:

from cuckoo.common.abstracts import LibVirtMachinery

class MyMachinery(LibVirtMachinery):
    # Set connection string.
    dsn = "my:///connection"

This works for all the virtualization technologies supported by LibVirt. Just remember to check if your LibVirt package (if you are using one, for example from your Linux distribution) is compiled with the support for the technology you need.

You can check it with the following command:

$ virsh -V
Virsh command line tool of libvirt 0.9.13
See web site at http://libvirt.org/

Compiled with support for:
 Hypervisors: QEmu/KVM LXC UML Xen OpenVZ VMWare Test
 Networking: Remote Daemon Network Bridging Interface Nwfilter VirtualPort
 Storage: Dir Disk Filesystem SCSI Multipath iSCSI LVM
 Miscellaneous: Nodedev AppArmor Secrets Debug Readline Modular

If you don’t find your virtualization technology in the list of Hypervisors, you will need to recompile LibVirt with the specific support for the missing one.

Analysis Packages

As explained in Analysis Packages, analysis packages are structured Python classes that describe how Cuckoo’s analyzer component should conduct the analysis procedure for a given file inside the guest environment.

As you already know, you can create your own packages and add them along with the default ones. Designing new packages is very easy and requires just a minimal understanding of programming and of the Python language.

Getting started

As an example we’ll take a look at the default package for analyzing generic Windows executables, located at $CWD/analyzer/windows/modules/packages/exe.py (which translates to cuckoo/data/analyzer/windows/modules/packages/exe.py in the Git repository):

from lib.common.abstracts import Package

class Exe(Package):
    """EXE analysis package."""

    def start(self, path):
        args = self.options.get("arguments")
        return self.execute(path, args)

It looks really simple, thanks to all the methods inherited from the Package object. Let’s have a look at some of the main methods an analysis package inherits from the Package object:

from lib.api.process import Process
from lib.common.exceptions import CuckooPackageError

class Package(object):
    def start(self, path):
        raise NotImplementedError

    def check(self):
        return True

    def execute(self, path, args):
        dll = self.options.get("dll")
        free = self.options.get("free")
        suspended = True
        if free:
            suspended = False

        p = Process()
        if not p.execute(path=path, args=args, suspended=suspended):
            raise CuckooPackageError(
                "Unable to execute the initial process, analysis aborted."
            )

        if not free and suspended:
            p.inject(dll)
            p.resume()
            p.close()
            return p.pid

    def finish(self):
        if self.options.get("procmemdump"):
            for pid in self.pids:
                p = Process(pid=pid)
                p.dump_memory()
        return True

Let’s walk through the code:

  • Line 1: import the Process API class, which is used to create and manipulate Windows processes.
  • Line 2: import the CuckooPackageError exception, which is used to notify issues with the execution of the package to the analyzer.
  • Line 4: define the main class, inheriting from object.
  • Line 5: define the start() function, which takes as argument the path to the file to execute. It should be implemented by each analysis package.
  • Line 8: define the check() function.
  • Line 13: acquire the free option, which is used to define whether the process should be monitored or not.
  • Line 18: initialize a Process instance.
  • Line 19: try to execute the malware; if that fails, abort the execution and notify the analyzer.
  • Line 24: check if the process should be monitored.
  • Line 25: inject the process with our DLL.
  • Line 26: resume the process from the suspended state.
  • Line 28: return the PID of the newly created process to the analyzer.
  • Line 30: define the finish() function.
  • Line 31: check if the procmemdump option was enabled.
  • Line 32: loop through the currently monitored processes.
  • Line 33: open a Process instance.
  • Line 34: take a dump of the process memory.

start()

In this function you have to place all the initialization operations you want to run. This may include running the malware process, launching additional applications, taking memory snapshots and more.

check()

This function is executed by Cuckoo every second while the malware is running. You can use this function to perform any kind of recurrent operation.

For example if in your analysis you are looking for just one specific indicator to be created (e.g., a file) you could place your condition in this function and if it returns False, the analysis will terminate right away.

Think of it as “should the analysis continue or not?”.

For example:

def check(self):
    if os.path.exists("C:\\config.bin"):
        return False
    else:
        return True

This check() function will cause Cuckoo to terminate the analysis as soon as a check detects that C:\config.bin has been created.
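
The polling behaviour described above can be pictured with a small stand-alone sketch. It does not use Cuckoo's analyzer: the poll helper, its interval, max_polls and on_poll parameters, and the IndicatorPackage class are hypothetical names chosen for illustration, and a temporary file stands in for C:\config.bin.

```python
import os
import tempfile
import time


class IndicatorPackage(object):
    """Hypothetical package that stops once a file exists."""

    def __init__(self, indicator):
        self.indicator = indicator

    def check(self):
        # False means "stop the analysis", True means "keep going".
        return not os.path.exists(self.indicator)


def poll(package, interval=1.0, max_polls=60, on_poll=None):
    """Call package.check() repeatedly; return the polls made."""
    for count in range(1, max_polls + 1):
        if on_poll is not None:
            on_poll(count)
        if not package.check():
            return count
        time.sleep(interval)
    return max_polls


workdir = tempfile.mkdtemp()
indicator = os.path.join(workdir, "config.bin")


def drop_indicator(count):
    # Simulate the malware creating the file before the third poll.
    if count == 3:
        open(indicator, "w").close()


polls = poll(IndicatorPackage(indicator), interval=0.01,
             on_poll=drop_indicator)
print("analysis stopped after %d polls" % polls)
```

The short interval only keeps the demonstration fast; the text above states that Cuckoo runs the check every second.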

execute()

Wraps the malware execution and deals with DLL injection.

finish()

This function is simply called by Cuckoo before terminating the analysis and powering off the machine. By default, this function contains an optional feature to dump the process memory of all the monitored processes.

Options

Every package automatically has access to a dictionary containing all user-specified options (see Submit an Analysis).

These options are made available in the attribute self.options. For example, let’s assume that the user specified the following string at submission:

foo=1,bar=2

The analysis package selected will have access to these values:

from lib.common.abstracts import Package

class Example(Package):

    def start(self, path):
        foo = self.options["foo"]
        bar = self.options["bar"]

    def check(self):
        return True

    def finish(self):
        return True

These options can be used for anything you might need to configure inside your package.
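
As a rough illustration of how a submission string such as foo=1,bar=2 ends up as a dictionary, the following sketch parses it by hand. The parse_options helper is hypothetical and is not Cuckoo's own parser; it only shows the resulting shape, in which values arrive as strings, and why reading optional keys with .get() and a default is safer than indexing.

```python
def parse_options(options):
    """Turn "key=value,key=value" into a dictionary of strings."""
    parsed = {}
    for field in options.split(","):
        if "=" not in field:
            # Skip empty or malformed fields.
            continue
        key, value = field.split("=", 1)
        parsed[key.strip()] = value.strip()
    return parsed


options = parse_options("foo=1,bar=2")
foo = int(options["foo"])          # values are strings until converted
baz = options.get("baz", "unset")  # missing keys fall back to a default
print(options, foo, baz)
```

Inside a package the same pattern applies to self.options: index a key only when the option is mandatory, and use .get() for anything the user may leave out.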

Process API

The Process class provides access to different process-related features and functions. You can import it in your analysis packages with:

from lib.api.process import Process

You then initialize an instance with:

p = Process()

In case you want to open an existing process instead of creating a new one, you can specify one or more of the following arguments:

  • pid: PID of the process you want to operate on.
  • h_process: handle of a process you want to operate on.
  • thread_id: thread ID of a process you want to operate on.
  • h_thread: handle of the thread of a process you want to operate on.

This class implements several methods that you can use in your own scripts.

Methods
Process.open()

Opens a handle to a running process. Returns True or False in case of success or failure of the operation.

Return type: boolean

Example Usage:

p = Process(pid=1234)
p.open()
handle = p.h_process
Process.exit_code()

Returns the exit code of the opened process. If it wasn’t already done before, exit_code() will perform a call to open() to acquire a handle to the process.

Return type: ulong

Example Usage:

p = Process(pid=1234)
code = p.exit_code()
Process.is_alive()

Calls exit_code() and verifies whether the returned code is STILL_ACTIVE, meaning that the given process is still running. Returns True or False.

Return type: boolean

Example Usage:

p = Process(pid=1234)
if p.is_alive():
    print("Still running!")
Process.get_parent_pid()

Returns the PID of the parent process of the opened process. If it wasn’t already done before, get_parent_pid() will perform a call to open() to acquire a handle to the process.

Return type: int

Example Usage:

p = Process(pid=1234)
ppid = p.get_parent_pid()
Process.execute(path[, args=None[, suspended=False]])

Executes the file at the specified path. Returns True or False in case of success or failure of the operation.

Parameters:
  • path (string) – path to the file to execute
  • args (string) – arguments to pass to the process command line
  • suspended (boolean) – enable or disable suspended mode flag at process creation

Return type: boolean

Example Usage:

p = Process()
p.execute(path="C:\\WINDOWS\\system32\\calc.exe", args="Something", suspended=True)
Process.resume()

Resumes the opened process from a suspended state. Returns True or False in case of success or failure of the operation.

Return type: boolean

Example Usage:

p = Process()
p.execute(path="C:\\WINDOWS\\system32\\calc.exe", args="Something", suspended=True)
p.resume()
Process.terminate()

Terminates the opened process. Returns True or False in case of success or failure of the operation.

Return type: boolean

Example Usage:

p = Process(pid=1234)
if p.terminate():
    print("Process terminated!")
else:
    print("Could not terminate the process!")
Process.inject([dll[, apc=False]])

Injects our DLL into the opened process. Returns True or False in case of success or failure of the operation.

Parameters:
  • dll (string) – path to the DLL to inject into the process
  • apc (boolean) – use QueueUserAPC() injection instead of CreateRemoteThread(). Beware that if the process is in suspended mode, Cuckoo will always use QueueUserAPC().

Return type: boolean

Example Usage:

p = Process()
p.execute(path="C:\\WINDOWS\\system32\\calc.exe", args="Something", suspended=True)
p.inject()
p.resume()
Process.dump_memory()

Takes a snapshot of the given process’ memory space. Returns True or False in case of success or failure of the operation.

Return type: boolean

Example Usage:

p = Process(pid=1234)
p.dump_memory()

Processing Modules

Cuckoo’s processing modules are Python scripts that let you define custom ways to analyze the raw results generated by the sandbox and append some information to a global container that will be later used by the signatures and the reporting modules.

You can create as many modules as you want, as long as they follow a predefined structure that we will present in this chapter.

Global Container

After an analysis is completed, Cuckoo will invoke all the processing modules available in the cuckoo/processing/ directory, all of which fall under the cuckoo.processing module. Any additional module you decide to create must be placed inside that directory.

Every module should also have a dedicated section in the $CWD/conf/processing.conf file. For example, if you create a module cuckoo/processing/foobar.py, you will have to append the following section to $CWD/conf/processing.conf:

[foobar]
enabled = yes

Every module will then be initialized and executed, and the data it returns will be appended to a data structure that we’ll call the global container.

This container is simply a big Python dictionary that holds the abstracted results produced by all the modules, indexed by their identification key.
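
To make the idea concrete, here is a minimal sketch of how such a container could be assembled. It is not Cuckoo's actual module runner: the two module classes, their hard-coded data and the build_container helper are invented for illustration. The only behaviour borrowed from this chapter is that each module sets self.key in run() and returns its data.

```python
class FakeTargetInfo(object):
    """Stand-in module; a real one would inherit Processing."""

    def run(self):
        self.key = "target"
        return {"file": {"name": "sample.exe"}}


class FakeStrings(object):
    """Second stand-in module returning a list instead of a dict."""

    def run(self):
        self.key = "strings"
        return ["MZ", "kernel32.dll"]


def build_container(modules):
    """Run every module and file its data under the module's key."""
    container = {}
    for module in modules:
        data = module.run()
        # The key is only known once run() has set it.
        container[module.key] = data
    return container


results = build_container([FakeTargetInfo(), FakeStrings()])
print(sorted(results))
```

Signatures and reporting modules then read from this dictionary, for example results["target"]["file"]["name"] in this sketch.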

Cuckoo already provides a default set of modules which will generate a standard global container. It’s important for the existing reporting modules (HTML report etc.) that these default modules are not modified, otherwise the resulting global container structure would change and the reporting modules wouldn’t be able to recognize it and extract the information used to build the final reports.

The currently available default processing modules are:
  • AnalysisInfo (cuckoo/processing/analysisinfo.py) - generates some basic information on the current analysis, such as timestamps, version of Cuckoo and so on.
  • ApkInfo (cuckoo/processing/apkinfo.py) - generates some basic information on the current APK analysis (Android analysis).
  • Baseline (cuckoo/processing/baseline.py) - baseline results from gathered information.
  • BehaviorAnalysis (cuckoo/processing/behavior.py) - parses the raw behavioral logs and performs some initial transformations and interpretations, including the complete process tracing, a behavioral summary and a process tree.
  • Buffer (cuckoo/processing/buffer.py) - dropped buffer analysis.
  • Debug (cuckoo/processing/debug.py) - includes errors and the analysis.log generated by the analyzer.
  • Droidmon (cuckoo/processing/droidmon.py) - extracts dynamic API call information from Droidmon logs.
  • Dropped (cuckoo/processing/dropped.py) - includes information on the files dropped by the malware and dumped by Cuckoo.
  • DumpTls (cuckoo/processing/dumptls.py) - cross-references TLS master secrets extracted from the monitor and key information extracted from the PCAP to dump a master secrets file.
  • GooglePlay (cuckoo/processing/googleplay.py) - Google Play information about the analysis session.
  • Irma (cuckoo/processing/irma.py) - IRMA connector.
  • Memory (cuckoo/processing/memory.py) - executes Volatility on a full memory dump.
  • Misp (cuckoo/processing/misp.py) - MISP connector.
  • NetworkAnalysis (cuckoo/processing/network.py) - parses the PCAP file and extracts some network information, such as DNS traffic, domains, IPs, HTTP requests, IRC and SMTP traffic.
  • ProcMemory (cuckoo/processing/procmemory.py) - performs analysis of process memory dump. Note: the module is able to process user defined Yara rules from data/yara/memory/index_memory.yar. Just edit this file to add your Yara rules.
  • ProcMon (cuckoo/processing/procmon.py) - extracts events from procmon.exe output.
  • Screenshots (cuckoo/processing/screenshots.py) - screenshot and OCR analysis.
  • Snort (cuckoo/processing/snort.py) - Snort processing module.
  • StaticAnalysis (cuckoo/processing/static.py) - performs some static analysis of PE32 files.
  • Strings (cuckoo/processing/strings.py) - extracts strings from the analyzed binary.
  • Suricata (cuckoo/processing/suricata.py) - Suricata processing module.
  • TargetInfo (cuckoo/processing/targetinfo.py) - includes information on the analyzed file, such as hashes.
  • VirusTotal (cuckoo/processing/virustotal.py) - searches on VirusTotal.com for antivirus signatures of the analyzed file. Note: the file is not uploaded to VirusTotal.com; if it was not previously uploaded to the website, no results will be retrieved.
Getting started

In order to make them available to Cuckoo, all processing modules must be placed inside the cuckoo/processing/ directory.

A basic processing module could look like:

from cuckoo.common.abstracts import Processing

class MyModule(Processing):

    def run(self):
        self.key = "key"
        data = do_something()
        return data
Every processing module should contain:
  • A class inheriting Processing.
  • A run() function.
  • A self.key attribute defining the name to be used as a sub container for the returned data.
  • A set of data (list, dictionary, string, etc.) that will be appended to the global container.

You can also specify an order value, which allows you to run the available processing modules in an ordered sequence. By default all modules are set with an order value of 1 and are executed in alphabetical order.

If you want to change this value your module would look like:

from cuckoo.common.abstracts import Processing

class MyModule(Processing):
    order = 2

    def run(self):
        self.key = "key"
        data = do_something()
        return data
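
The ordering rule can be sketched as a sort on the order value with the name as a tie-breaker. This is an assumption about how "ordered, then alphabetical" might be realised, not a copy of Cuckoo's scheduler; the class names are placeholders and sorting by class name is a choice made for this sketch.

```python
class Alpha(object):
    order = 1  # the default value mentioned above


class Zulu(object):
    order = 1


class MyModule(object):
    order = 2  # raised, so it runs after the default modules


def execution_order(modules):
    """Sort by the order attribute, then alphabetically by name."""
    return sorted(modules, key=lambda m: (m.order, m.__name__))


ordered = execution_order([MyModule, Zulu, Alpha])
names = [m.__name__ for m in ordered]
print(names)
```

A higher order value is therefore a simple way to make a module run after the modules whose output it depends on.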

You can also manually disable a processing module by setting the enabled attribute to False:

from cuckoo.common.abstracts import Processing

class MyModule(Processing):
    enabled = False

    def run(self):
        self.key = "key"
        data = do_something()
        return data

The processing modules are provided with some attributes that can be used to access the raw results for the given analysis:

  • self.analysis_path: path to the folder containing the results (e.g., $CWD/storage/analyses/1)
  • self.log_path: path to the analysis.log file.
  • self.file_path: path to the analyzed file.
  • self.dropped_path: path to the folder containing the dropped files.
  • self.logs_path: path to the folder containing the raw behavioral logs.
  • self.shots_path: path to the folder containing the screenshots.
  • self.pcap_path: path to the network pcap dump.
  • self.memory_path: path to the full memory dump, if created.
  • self.pmemory_path: path to the process memory dumps, if created.

With these attributes you should be able to easily access all the raw results stored by Cuckoo and perform your analytic operations on them.
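
As an example of putting these attributes to work, the sketch below counts the files in the dropped-files folder. So that it runs outside Cuckoo, Processing is replaced by a stand-in base class and dropped_path is pointed at a temporary directory; the DroppedCount module and its "dropped_count" key are hypothetical.

```python
import os
import tempfile


class Processing(object):
    """Stand-in for cuckoo.common.abstracts.Processing (assumption)."""
    dropped_path = None


class DroppedCount(Processing):
    """Hypothetical module reporting how many files were dropped."""

    def run(self):
        self.key = "dropped_count"
        if not os.path.isdir(self.dropped_path):
            # No folder means nothing was dropped.
            return 0
        return len(os.listdir(self.dropped_path))


# Simulate an analysis folder containing two dropped files.
dropped = tempfile.mkdtemp()
for name in ("a.bin", "b.bin"):
    open(os.path.join(dropped, name), "w").close()

module = DroppedCount()
module.dropped_path = dropped
count = module.run()
print(module.key, count)
```

In a real module the attribute is already populated, so only the body of run() would be needed.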

As a last note, a good practice is to use the CuckooProcessingError exception whenever the module encounters an issue you want to report to Cuckoo. This can be done by importing the class like this:

from cuckoo.common.exceptions import CuckooProcessingError
from cuckoo.common.abstracts import Processing

class MyModule(Processing):

    def run(self):
        self.key = "key"

        try:
            data = do_something()
        except SomethingFailed:
            raise CuckooProcessingError("Failed")

        return data

Signatures

With Cuckoo you’re able to create some customized signatures that you can run against the analysis results in order to identify some predefined pattern that might represent a particular malicious behavior or an indicator you’re interested in.

These signatures are very useful for giving context to the analyses: they simplify the interpretation of the results and help to automatically identify malware samples of interest.

Some examples of what you can use Cuckoo’s signatures for:

  • Identify a particular malware family you’re interested in by isolating some unique behaviors (like file names or mutexes).
  • Spot interesting modifications the malware performs on the system, such as installation of device drivers.
  • Identify particular malware categories, such as Banking Trojans or Ransomware by isolating typical actions commonly performed by those.
  • Classify samples into the categories malware/unknown (it is not possible to identify clean samples).

You can find signatures created by us and by other Cuckoo users on our Community repository.

Getting started

Creation of signatures is a fairly simple process and requires just a decent understanding of Python programming.

First things first, all signatures must be located inside the cuckoo/cuckoo/signatures/ directory in Cuckoo or the modules/signatures/ directory of the Community repository (the Community repository is still using legacy directory structuring).

The following is a basic example signature:

from cuckoo.common.abstracts import Signature

class CreatesExe(Signature):
    name = "creates_exe"
    description = "Creates a Windows executable on the filesystem"
    severity = 2
    categories = ["generic"]
    authors = ["Cuckoo Developers"]
    minimum = "2.0"

    def on_complete(self):
        return self.check_file(pattern=".*\\.exe$", regex=True)

As you can see the structure is really simple and consistent with the other modules. We’re going to get into details later, but since version 1.2 Cuckoo provides some helper functions that make the process of creating signatures much easier.

In this example we just walk through all the accessed files in the summary and check if there is anything ending with “.exe”: in that case it will return True, meaning that the signature matched, otherwise return False.
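
The pattern used above is an ordinary regular expression. The stand-alone sketch below shows which paths it would accept; first_match is a hypothetical helper that only imitates the idea of matching a pattern against a list of file paths and makes no claim about how check_file is implemented.

```python
import re


def first_match(paths, pattern):
    """Return the first path matching the expression, or None."""
    compiled = re.compile(pattern)
    for path in paths:
        if compiled.match(path):
            return path
    return None


files = ["C:\\report.txt", "C:\\d.exe", "C:\\d.exe.bak"]
hit = first_match(files, ".*\\.exe$")
print(hit)
```

Because of the trailing $, a path such as C:\d.exe.bak is not accepted. In this sketch the match is also case-sensitive, so a name ending in .EXE would need an adjusted pattern.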

The on_complete function is called at the end of Cuckoo’s signature processing. Other functions are called earlier, on specific events, and help you write more sophisticated and faster signatures.

In case the signature gets matched, a new entry in the “signatures” section will be added to the global container roughly as follows:

"signatures": [
    {
        "severity": 2,
        "description": "Creates a Windows executable on the filesystem",
        "alert": false,
        "references": [],
        "data": [
            {
                "file_name": "C:\\d.exe"
            }
        ],
        "name": "creates_exe"
    }
]
Creating your new signature

In order to make you better understand the process of creating a signature, we are going to create a very simple one together and walk through the steps and the available options. For this purpose, we’re simply going to create a signature that checks whether the malware analyzed opened a mutex named “i_am_a_malware”.

The first thing to do is import the dependencies, create a skeleton and define some initial attributes. These are the ones you can currently set:

  • name: an identifier for the signature.
  • description: a brief description of what the signature represents.
  • severity: a number identifying the severity of the events matched (generally between 1 and 3).
  • categories: a list of categories that describe the type of event being matched (for example “banker”, “injection” or “anti-vm”).
  • families: a list of malware family names, in case the signature specifically matches a known one.
  • authors: a list of people who authored the signature.
  • references: a list of references (URLs) to give context to the signature.
  • enabled: if set to False the signature will be skipped.
  • alert: if set to True can be used to specify that the signature should be reported (perhaps by a dedicated reporting module).
  • minimum: the minimum required version of Cuckoo to successfully run this signature.
  • maximum: the maximum version of Cuckoo on which this signature can successfully run.

In our example, we would create the following skeleton:

from cuckoo.common.abstracts import Signature

class BadBadMalware(Signature): # We initialize the class inheriting Signature.
    name = "badbadmalware" # We define the name of the signature
    description = "Creates a mutex known to be associated with Win32.BadBadMalware" # We provide a description
    severity = 3 # We set the severity to maximum
    categories = ["trojan"] # We add a category
    families = ["badbadmalware"] # We add the name of our fictional malware family
    authors = ["Me"] # We specify the author
    minimum = "2.0" # We specify that in order to run the signature, the user will simply need Cuckoo 2.0

    def on_complete(self):
        return

This is a perfectly valid signature. It doesn’t really do anything yet, so now we need to define the conditions for the signature to be matched.

As we said, we want to match a particular mutex name, so we proceed as follows:

from cuckoo.common.abstracts import Signature

class BadBadMalware(Signature):
    name = "badbadmalware"
    description = "Creates a mutex known to be associated with Win32.BadBadMalware"
    severity = 3
    categories = ["trojan"]
    families = ["badbadmalware"]
    authors = ["Me"]
    minimum = "2.0"

    def on_complete(self):
        return self.check_mutex("i_am_a_malware")

It’s as simple as that: our signature will now return True if the analyzed malware was observed opening the specified mutex.

If you want to be more explicit and directly access the global container, you could translate the previous signature in the following way:

from cuckoo.common.abstracts import Signature

class BadBadMalware(Signature):
    name = "badbadmalware"
    description = "Creates a mutex known to be associated with Win32.BadBadMalware"
    severity = 3
    categories = ["trojan"]
    families = ["badbadmalware"]
    authors = ["Me"]
    minimum = "2.0"

    def on_complete(self):
        for process in self.get_processes_by_pid():
            if "summary" in process and "mutexes" in process["summary"]:
                for mutex in process["summary"]["mutexes"]:
                    if mutex == "i_am_a_malware":
                        return True

        return False
Evented Signatures

Since version 1.0, Cuckoo provides a way to write higher-performance signatures. In the past, every signature had to loop through the whole collection of API calls collected during the analysis, which caused unnecessary performance issues when that collection was large.

Since 1.2 Cuckoo only supports the so-called “evented signatures”. The old signatures based on the run function can be ported to use on_complete. The main difference is that with this new format, all the signatures will be executed in parallel and a callback function called on_call() will be invoked for each signature within one single loop through the collection of API calls.

An example signature using this technique is the following:

from cuckoo.common.abstracts import Signature

class SystemMetrics(Signature):
    name = "generic_metrics"
    description = "Uses GetSystemMetrics"
    severity = 2
    categories = ["generic"]
    authors = ["Cuckoo Developers"]
    minimum = "2.0"

    # Evented signatures can specify filters that reduce the amount of
    # API calls that are streamed in. One can filter Process name, API
    # name/identifier and category. These should be sets for faster lookup.
    filter_processnames = set()
    filter_apinames = set(["GetSystemMetrics"])
    filter_categories = set()

    # This is a signature template. It should be used as a skeleton for
    # creating custom signatures, therefore is disabled by default.
    # The on_call function is used in "evented" signatures.
    # These use a more efficient way of processing logged API calls.
    enabled = False

    def on_complete(self):
        # In the on_complete method one can implement any cleanup code and
        #  decide one last time if this signature matches or not.
        #  Return True in case it matches.
        return False

    # This method will be called for every logged API call by the loop
    # in the RunSignatures plugin. The return value determines the "state"
    # of this signature. True means the signature matched and False it did not this time.
    # Use self.deactivate() to stop streaming in API calls.
    def on_call(self, call, pid, tid):
        # This check would in reality not be needed as we already make use
        # of filter_apinames above.
        if call["api"] == "GetSystemMetrics":
            # Signature matched, return True.
            return True

        # continue
        return None

The inline comments are already self-explanatory.
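
The single-loop dispatch can be illustrated without Cuckoo. In the sketch below, dispatch is a hypothetical stand-in for the loop that streams API calls to signatures: every call is visited once, and each signature only receives the calls that pass its filter_apinames. As a simplification made for this sketch, a signature that returns True is recorded as matched and receives no further calls. The call dictionaries are made-up sample data.

```python
class MetricsSig(object):
    name = "generic_metrics"
    filter_apinames = set(["GetSystemMetrics"])

    def __init__(self):
        self.seen = 0

    def on_call(self, call, pid, tid):
        self.seen += 1
        return True


class SleepSig(object):
    name = "sleeps_twice"
    filter_apinames = set(["NtDelayExecution"])

    def __init__(self):
        self.seen = 0

    def on_call(self, call, pid, tid):
        self.seen += 1
        # Match on the second filtered call; None means "undecided".
        return True if self.seen >= 2 else None


def dispatch(calls, signatures):
    """Stream each call once to every interested, active signature."""
    matched = []
    active = list(signatures)
    for call in calls:
        for sig in list(active):
            names = sig.filter_apinames
            # An empty filter set means "receive every call".
            if names and call["api"] not in names:
                continue
            if sig.on_call(call, call["pid"], call["tid"]):
                matched.append(sig.name)
                active.remove(sig)
    return matched


calls = [
    {"api": "NtDelayExecution", "pid": 4, "tid": 8},
    {"api": "GetSystemMetrics", "pid": 4, "tid": 8},
    {"api": "NtCreateFile", "pid": 4, "tid": 8},
    {"api": "NtDelayExecution", "pid": 4, "tid": 8},
]
metrics, sleeper = MetricsSig(), SleepSig()
matched = dispatch(calls, [metrics, sleeper])
print(matched)
```

The seen counters show the benefit of the filters: neither signature is ever invoked for NtCreateFile, even though the loop passes over that call.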

Another event is triggered when a signature matches.

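def on_signature(self, matched_sig):
    # Names of the signatures that must all have matched.
    required = ["creates_exe", "badmalware"]
    for sig in required:
        if sig not in self.list_signatures():
            # At least one required signature is still missing.
            return
    return True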

This kind of signature can be used to combine several signatures identifying anomalies into one signature classifying the sample (malware alert).

Marks & Helpers

Starting from version 1.2, signatures are able to log exactly what triggered the signature. This allows users to better understand why this signature is present in the log, and to be able to better focus malware analysis.

For examples on marks and helpers please refer to the Cuckoo Community for now - until we write some thorough up-to-date documentation on that.

Reporting Modules

After the raw analysis results have been processed and abstracted by the processing modules and the global container is generated (ref. Processing Modules), it is passed over by Cuckoo to all the reporting modules available, which will make use of it and will make it accessible and consumable in different formats.

Getting Started

All reporting modules must be placed inside the cuckoo/cuckoo/reporting/ directory (which translates to the cuckoo.reporting module).

Every module must also have a dedicated section in the $CWD/conf/reporting.conf file: for example if you create a module cuckoo/cuckoo/reporting/foobar.py you will have to append the following section to $CWD/conf/reporting.conf (and thus cuckoo/data/conf/reporting.conf in the Git repository):

[foobar]
enabled = on

Every additional option you add to your section will be available to your reporting module in the self.options dictionary.

Following is an example of a working JSON reporting module:

import os
import json
import codecs

from cuckoo.common.abstracts import Report
from cuckoo.common.exceptions import CuckooReportError

class JsonDump(Report):
    """Saves analysis results in JSON format."""

    def run(self, results):
        """Writes report.
        @param results: Cuckoo results dict.
        @raise CuckooReportError: if fails to write report.
        """
        path = os.path.join(self.reports_path, "report.json")
        try:
            # The with block closes the file even if json.dump() fails.
            with codecs.open(path, "w", "utf-8") as report:
                json.dump(results, report, sort_keys=False, indent=4)
        except (UnicodeError, TypeError, IOError) as e:
            raise CuckooReportError("Failed to generate JSON report: %s" % e)

This code is very simple: it receives the global container produced by the processing modules, converts it into JSON and writes it to a file.

There are a few requirements for writing a valid reporting module:

  • Declare your class inheriting from Report.
  • Have a run() function performing the main operations.
  • Try to catch most exceptions and raise CuckooReportError to notify the issue.

All reporting modules have access to some attributes:

  • self.analysis_path: path to the folder containing the raw analysis results (e.g. storage/analyses/1/)
  • self.reports_path: path to the folder where the reports should be written (e.g. storage/analyses/1/reports/)
  • self.options: a dictionary containing all the options specified in the report’s configuration section in conf/reporting.conf.
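
The sketch below shows these attributes working together. So that it runs outside Cuckoo, Report is replaced by a stand-in base class and reports_path points at a temporary directory; the CompactJson module, its compact.json file name and its indent option are hypothetical. Returning the path from run() is a convenience of this sketch only.

```python
import json
import os
import tempfile


class Report(object):
    """Stand-in for cuckoo.common.abstracts.Report (assumption)."""
    reports_path = None
    options = {}


class CompactJson(Report):
    """Hypothetical module: JSON report with configurable indent."""

    def run(self, results):
        # Fall back to 4 spaces when the option is not configured.
        indent = int(self.options.get("indent", 4))
        path = os.path.join(self.reports_path, "compact.json")
        with open(path, "w") as handle:
            json.dump(results, handle, sort_keys=True, indent=indent)
        return path


module = CompactJson()
module.reports_path = tempfile.mkdtemp()
module.options = {"indent": "2"}  # as if read from reporting.conf
path = module.run({"info": {"id": 1}})

with open(path) as handle:
    loaded = json.load(handle)
print(loaded)
```

Reading the option through .get() with a default keeps the module usable when the configuration section only contains the enabled line.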

Development

This chapter explains how to write Cuckoo’s code and how to contribute.

Development Notes

Git branches

Cuckoo Sandbox source code is available in our official Git repository.

Up until version 1.0 we used to coordinate all ongoing development in a dedicated “development” branch and we’ve been exclusively merging pull requests in such branch. Since version 1.1 we moved development to the traditional “master” branch and we make use of GitHub’s tags and release system to reference development milestones in time.

Release Versioning

At the moment we utilize three types of releases:

  • 1.2.3, an official release, preferably accompanied by a blogpost
  • 1.2.4a1, an alpha release that showcases functionality that will be present in the upcoming release
  • 1.2.3.1, a hotfix release, meant to fix critical issues, usually found in the latest official release
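
The three release shapes can be told apart mechanically. The classifier below is a hypothetical helper written for this page and based only on the three example version strings given here; it is not part of Cuckoo.

```python
import re

PATTERNS = [
    ("official", re.compile(r"^\d+\.\d+\.\d+$")),
    ("alpha", re.compile(r"^\d+\.\d+\.\d+a\d+$")),
    ("hotfix", re.compile(r"^\d+\.\d+\.\d+\.\d+$")),
]


def release_type(version):
    """Return 'official', 'alpha', 'hotfix' or None otherwise."""
    for label, pattern in PATTERNS:
        if pattern.match(version):
            return label
    return None


for version in ("1.2.3", "1.2.4a1", "1.2.3.1", "2.0-rc1"):
    print(version, release_type(version))
```

Older version strings such as 2.0-rc1 do not follow this scheme, so the helper returns None for them.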

Ticketing system

To submit bug reports or feature requests, please use GitHub’s Issue tracking system.

Contribute

To submit your patch just create a Pull Request from your GitHub fork. If you don’t know how to create a Pull Request, take a look at GitHub help.

Coding Style

In order to contribute code to the project, you must diligently follow the style rules described in this chapter. Having clean and structured code is very important for our development lifecycle. We do help out with code refactoring where required, but please try to do as much as possible on your own.

Essentially Cuckoo’s code style is based on PEP 8 - Style Guide for Python Code and PEP 257 – Docstring Conventions.

Formatting
Indentation

The code must be indented with 4-space soft tabs (spaces, not tab characters). Since Python enforces indentation, make sure to configure your editor properly or your code might malfunction.

Maximum Line Length

Limit all lines to a maximum of 79 characters.

Blank Lines

Separate the class definition and top-level functions with one blank line. Method definitions inside a class are separated by a single blank line:

class MyClass:
    """Doing something."""

    def __init__(self):
        """Initialize"""
        pass

    def do_it(self, what):
        """Do it.
        @param what: do what.
        """
        pass

Use blank lines in functions sparingly, to isolate logical sections. Import blocks are separated from each other by a single blank line, and from classes by one blank line.

Imports

Imports from different modules must be on separate lines. If you’re importing multiple objects from the same package, use a single line:

from lib import a, b, c

NOT:

from lib import a
from lib import b
from lib import c

Always specify explicitly the objects to import:

from lib import a, b, c

NOT:

from lib import *
Strings

Strings must be delimited by double quotes (").

Printing and Logging

We discourage the use of print(): if you need to log an event please use Python’s logging which is already initialized by Cuckoo.

In your module add:

import logging
log = logging.getLogger(__name__)

And use the log handle. More details can be found in the Python documentation; the following is an example:

log.info("Log message")
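
A slightly fuller sketch of the same pattern follows. Inside Cuckoo the logging system is already initialized, so only the getLogger() line and the log calls are needed; the in-memory Collect handler and the logger name "cuckoo.example" are assumptions added here so that the snippet can be run and inspected on its own.

```python
import logging

log = logging.getLogger("cuckoo.example")  # hypothetical logger name
log.setLevel(logging.DEBUG)


class Collect(logging.Handler):
    """Keep formatted messages in memory so they can be inspected."""

    def __init__(self):
        logging.Handler.__init__(self)
        self.messages = []

    def emit(self, record):
        line = "%s %s" % (record.levelname, record.getMessage())
        self.messages.append(line)


handler = Collect()
log.addHandler(handler)

task_id = 7
# Pass values as arguments and let logging do the formatting.
log.info("Starting analysis of task #%d", task_id)
try:
    int("not a number")
except ValueError as e:
    log.error("Conversion failed: %s", e)

print(handler.messages)
```

Passing the values as arguments, rather than formatting the string beforehand, defers the formatting until a handler actually needs the message.
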
Exceptions

Custom exceptions must be defined in the cuckoo/common/exceptions.py file.

The following is the current Cuckoo exceptions chain:

.-- CuckooCriticalError
|   |-- CuckooStartupError
|   |-- CuckooDatabaseError
|   |-- CuckooMachineError
|   `-- CuckooDependencyError
|-- CuckooOperationalError
|   |-- CuckooAnalysisError
|   |-- CuckooProcessingError
|   `-- CuckooReportError
`-- CuckooGuestError

Beware that the use of CuckooCriticalError and its child exceptions will cause Cuckoo to terminate.
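
The practical effect of the hierarchy is that a caller can catch a whole family of errors through its parent class. The sketch below re-declares a small part of the tree as stand-ins so that it runs outside Cuckoo. Deriving the root from Exception is an assumption, and run_module and failing_module are hypothetical helpers.

```python
class CuckooOperationalError(Exception):
    """Stand-in root for the operational error family."""


class CuckooProcessingError(CuckooOperationalError):
    pass


class CuckooReportError(CuckooOperationalError):
    pass


def run_module(func):
    """Run one module; report operational errors without aborting."""
    try:
        return func(), None
    except CuckooOperationalError as e:
        # Catching the parent also catches every child exception.
        return None, "%s: %s" % (type(e).__name__, e)


def failing_module():
    raise CuckooProcessingError("Failed")


result, error = run_module(failing_module)
print(error)
```

Formatting the exception object itself produces the message text, so there is no need to reach for e.message.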

Naming

Custom exception names must start with “Cuckoo” and end with “Error” if they represent an unexpected malfunction.

Exception handling

When catching an exception and accessing its handle, use as e:

try:
    foo()
except Exception as e:
    bar()

NOT:

try:
    foo()
except Exception, something:
    bar()

It’s good practice to use “e” instead of “e.message”.

Documentation

All code must be documented in docstring format, see PEP 257 – Docstring Conventions. Additional comments may be added in logical blocks to make the code easier to understand.

Automated testing

We believe in automated testing to provide high quality code and avoid easily overlooked mistakes.

When possible, all code must be committed with proper unit tests. Particular attention must be placed when fixing bugs: it’s good practice to write unit tests to reproduce the bug. All unit tests and fixtures are placed in the tests folder in the Cuckoo root. We have adopted Pytest as unit testing framework.

Development with the Python Package

With the new Python package, developing and testing code works slightly differently than it used to. As one first has to install Cuckoo before being able to use it, a simple modify-and-test development sequence no longer works out of the box.

Below we outline how to develop and test new features while using the Cuckoo Package.

  • Initialize a new virtualenv. Note that any virtualenv’s in /tmp won’t survive a reboot and as such a more convenient location may be, e.g., ~/venv/cuckoo-development (i.e., place the cuckoo-development virtualenv in a generic ~/venv/ directory for all your virtualenv’s).

    $ virtualenv /tmp/cuckoo-development
    
  • Activate the virtualenv. This has to be done every time you start a new shell session (unless you put the command in ~/.bashrc or similar, of course).

    $ . /tmp/cuckoo-development/bin/activate
    
  • In order to create a Cuckoo distribution package it is required to obtain the matching monitoring binaries from our Community repository for this version of Cuckoo. Fortunately we provide a simple-to-use script to fetch them semi-automatically for you. From the repository root directory one may run as follows to automatically grab the binaries.

    (cuckoo-development)$ python stuff/monitor.py
    
  • Install Cuckoo in development mode, in which files from the current directory (a git clone’d Cuckoo repository on the package branch) will be used during execution.

    (cuckoo-development)$ python setup.py sdist develop
    

You will now be ready to modify and test files. Note that the code files are located in the cuckoo/ directory of the Git repository and that, even though you will be testing a development version of the repository, all the rules from the Cuckoo Working Directory and Cuckoo Working Directory Usage chapters still apply.

Happy development! Please reach out to us if you require additional help to get up-and-running with the latest development tricks.

Frontend

Warning

This documentation is WIP.

The JavaScript code in Cuckoo web is developed in ECMAScript 6. For browser compatibility, it needs to be transpiled back to ECMAScript 5.

Cuckoo makes use of Gulp to build from source to static (frontend sources). Before you can use this, make sure that the following dependencies are installed (required for Node and following assets):

On a Debian based system the package requirements are:

sudo apt-get install build-essential
curl -sL https://deb.nodesource.com/setup_6.x | sudo -E bash -
sudo apt-get install nodejs
sudo npm install gulp -g
sudo apt-get install ruby-full rubygems
sudo gem update --system
sudo gem install sass
npm install
# At this point the required libs are installed.
# Run the following command to build new JS and CSS:
gulp build

After these packages have been installed, navigate to the source folder (cd cuckoo/web/src) and run npm install. This will install all node modules as listed in package.json. After this you should be good to go!
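Before running the build it can be handy to confirm that the tools installed above are actually reachable. The following is a minimal sketch, assuming Python 3 (for shutil.which) and assuming the executables are named node, npm, gulp and sass on your system; the helper name missing_tools is hypothetical.

```python
import shutil

# Executables the build steps above rely on (names are an assumption).
REQUIRED_TOOLS = ("node", "npm", "gulp", "sass")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names from tools that cannot be found on PATH."""
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All frontend build tools were found on PATH.")
```

The which parameter only exists so the lookup can be swapped out, for example in a test.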

NPM Executables

While in the cuckoo/web/src directory (or the directory where package.json is located) you can run the following commands:

  • gulp OR npm start - runs the build processes and starts the watcher
  • gulp build OR npm run-script build - builds source to static once
  • gulp styles - only runs the ‘styles’ task (compiles SCSS to static/css)
Transpiling/Compiling

Try modifying one of the .js files in the cuckoo/web/src/scripts/ directory and confirm that Pycharm transpiles the Javascript to ECMAScript 5.

Creating new tasks

You can easily plug in new tasks by creating a new JavaScript file in cuckoo/web/src/tasks. Gulpfile.js automagically loads these tasks, and they will be available throughout the npm session gulp uses (this means you don’t have to do a lot more).

A gulp module in its basic form looks like this:

- coming!

Developing with Pycharm

Within this section we will cover a vast array of Pycharm configuration options in the context of Cuckoo development. We will try to cover all aspects of running and developing Cuckoo under this IDE.

Cuckoo Web

This section covers the Cuckoo Web interface that runs on Django. The code is quite easy to modify and creating custom features is simple.

Locations and concepts
  • Cuckoo Web provides the web interface and a REST API
  • The Django project root is located at cuckoo/web
  • The configuration is located at cuckoo/web/web/settings.py
  • URL dispatchers are in cuckoo/web/web/urls.py, as well as other locations such as (but not limited to) cuckoo/web/analysis/urls.py
  • The HTML templates use the Django Templating Language.
  • The front-end uses cuckoo/web/static/js/cuckoo/ for the Cuckoo related JavaScript includes, while their sources are in cuckoo/web/static/js/cuckoo/src/ (ECMAScript 5/6) - See paragraph ‘JavaScript transpiling’.
  • So called ‘controllers’ are used instead of class-based views, where a controller is responsible for (usually back-end) actions that don’t belong in view functions. Example: cuckoo/web/controllers/analysis/analysis.py
  • View functions are functions used by views, located in routes.py. Example: cuckoo/web/controllers/analysis/routes.py
  • API functions are functions used by the API, located in api.py. Example: cuckoo/web/controllers/analysis/api.py
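The split between controllers, view functions and API functions can be illustrated without any Django code. The sketch below is purely illustrative: ExampleController, detail_view, detail_api and the task data are hypothetical names and should not be read as Cuckoo’s actual classes. It only shows the idea that the back-end logic lives in one place and is reused by both the HTML-facing and the JSON-facing entry points.

```python
import json


class ExampleController(object):
    """Back-end logic; knows nothing about HTTP requests or templates."""

    # Stand-in for a database lookup (hypothetical data).
    _TASKS = {1: {"id": 1, "target": "sample.exe", "status": "reported"}}

    @staticmethod
    def get_task(task_id):
        task = ExampleController._TASKS.get(task_id)
        if task is None:
            raise LookupError("no such task: %r" % task_id)
        return task


def detail_view(task_id):
    """View function (routes.py style): returns text for a human reader."""
    task = ExampleController.get_task(task_id)
    return "Task #%d (%s): %s" % (task["id"], task["target"], task["status"])


def detail_api(task_id):
    """API function (api.py style): returns the same data as JSON."""
    try:
        return json.dumps(ExampleController.get_task(task_id), sort_keys=True)
    except LookupError as exc:
        return json.dumps({"error": str(exc)})
```

In a real Django project the two entry points would take a request object and return HTTP responses; that part is left out here on purpose.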
Running and debugging

Running and debugging Cuckoo web straight from Pycharm comes down to circumventing the cuckoo launcher and using Pycharm’s built-in Django server. Thankfully, no modifications to the Cuckoo code are necessary in order to do this.

Firstly, it is recommended that you work in a virtualenv to keep the dependencies required by Cuckoo separate from your system-wide installed Python. Secondly, you should install Cuckoo in development mode: python setup.py develop.

Assuming Cuckoo is installed correctly (and has an active working directory; see Cuckoo Working Directory Installation), start Pycharm and open the Cuckoo directory. Go to Run->Edit Configurations and click the + button. Pick ‘Django server’. Use the following values:

  • Name - web
  • Host - 127.0.0.1
  • Port - 8080
  • Environment variables - Click ... and add 2 new values: CUCKOO_APP: web and CUCKOO_CWD: /home/test/.cuckoo/, where the path is the location of your CWD (Cuckoo Working Directory).
  • Python interpreter - Pick the virtualenv you made earlier. If it’s not there, add the virtualenv to this project under File->Settings->Project: Cuckoo->Project Interpreter
  • Working directory - The absolute path to the Django project root. For me this is /home/test/PycharmProjects/virtualenv/cuckoo/cuckoo/web/

Cuckoo web can now be run (and debugged) from Pycharm. Go to Run->Run->web from the menu and the webserver should start.
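The two environment variables from the run configuration can also be expressed in code, which may be useful when starting a process by hand instead of through Pycharm. This is a minimal sketch; the helper name cuckoo_web_env is hypothetical, and only the variable names and example values are taken from the list above.

```python
import os


def cuckoo_web_env(cwd, base=None):
    """Return a copy of the environment with the two Cuckoo variables set."""
    env = dict(os.environ if base is None else base)
    env["CUCKOO_APP"] = "web"
    # CUCKOO_CWD points at the Cuckoo Working Directory; expand "~" so that
    # values such as "~/.cuckoo" also work.
    env["CUCKOO_CWD"] = os.path.expanduser(cwd)
    return env


if __name__ == "__main__":
    env = cuckoo_web_env("/home/test/.cuckoo/")
    for key in ("CUCKOO_APP", "CUCKOO_CWD"):
        print("%s=%s" % (key, env[key]))
```

The returned dictionary can be passed as the env argument of subprocess.Popen or similar calls.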

JavaScript transpiling

Warning

Transpiling JavaScript through Pycharm file watchers is not recommended. The recommended way is explained in the ‘Frontend’ section of the documentation.

The JavaScript code in Cuckoo web is developed in ECMAScript 6. For browser compatibility, this will need to be transpiled back to ECMAScript 5.

Firstly, make Pycharm recognize and understand the ECMAScript 6 syntax. Go to File->Settings->Languages & Frameworks->Javascript and pick ‘ECMAScript 6’ from the ‘Javascript language version’ dropdown. Hit Apply.

Then, use Babel to transpile the Javascript code. Install Babel in the Cuckoo project root (requires npm):

(cuckoo)    test:$ pwd
/home/test/PycharmProjects/virtualenv/cuckoo
(cuckoo)    test:$ npm install --save-dev babel-cli

Which will create a folder called node_modules in the Cuckoo project root.

Switch back to Pycharm and open any .js file in cuckoo/web/static/js/cuckoo/src/. Pycharm will ask you if you want to configure a File watcher for this file. Click Add watcher (if this option is not available to you, find the ‘file watcher’ configuration under File->Settings->Tools->File watchers).

In the following pop-up screen ‘Edit Watcher’, enter these values.

  • Name - Babel ES6->ES5
  • Description - Transpiles ECMAScript 6 code to ECMAScript 5
  • Output filters - None
  • Show console - Error
  • Immediate file synchronisation - yes
  • Track only root files - yes
  • Trigger watcher regardless of syntax errors - no
  • File type - Javascript
  • Scope - Click ... -> Click + (add scope) -> Click local -> Press OK. In the file browser, browse to cuckoo/web/static/js/cuckoo/src/ and, with the src folder selected, click include. The files contained in src should now turn green. Press OK.
  • Program - Should be the absolute path to node_modules/.bin/babel, for me this is /home/test/PycharmProjects/virtualenv/cuckoo/node_modules/.bin/babel. Double check that the path you enter reflects the actual location of the node_modules/.bin/babel file.
  • Arguments - --source-maps --out-file $FileNameWithoutExtension$.js $FilePath$
  • Working directory - Browse and select cuckoo/web/static/js/cuckoo
  • Output paths to refresh - $FileNameWithoutExtension$-compiled.js:$FileNameWithoutExtension$-compiled.js.map
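To see what the Arguments value above turns into for a concrete file, the macro substitution can be mimicked in a few lines. This is only a sketch of the substitution, not of how Pycharm implements it; expand_macros and the file name example.js are hypothetical.

```python
import os

# The Arguments value from the watcher configuration above.
ARGUMENTS = "--source-maps --out-file $FileNameWithoutExtension$.js $FilePath$"


def expand_macros(template, file_path):
    """Substitute the two file watcher macros used above for file_path."""
    name = os.path.splitext(os.path.basename(file_path))[0]
    values = {
        "$FileNameWithoutExtension$": name,
        "$FilePath$": file_path,
    }
    for macro, value in values.items():
        template = template.replace(macro, value)
    return template


if __name__ == "__main__":
    src = ("/home/test/PycharmProjects/virtualenv/cuckoo/"
           "cuckoo/web/static/js/cuckoo/src/example.js")
    print("babel " + expand_macros(ARGUMENTS, src))
```

Because the watcher’s working directory is cuckoo/web/static/js/cuckoo, the relative --out-file path places the transpiled file in that directory, one level above src.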

Finally, a mock manage.py file needs to be created in order for Pycharm to see it as a Django project. Create the file cuckoo/web/web/manage.py with the following contents:

#!/usr/bin/env python
import sys

if __name__ == "__main__":
    from django.core.management import execute_from_command_line
    execute_from_command_line(sys.argv)

Go to File->Settings->Languages & Frameworks->Django and set:

  • Django Project root - cuckoo/web
  • Settings - web/settings.py
  • Manage script - web/manage.py
Testing

The configuration should now be complete. Try running Cuckoo from within Pycharm & happy coding!

Final Remarks

Join the discussion

If you are encountering an issue you can’t solve and are looking for some help, go to our Discussion page and pick a platform of your choice. This is where you can get in contact with the Cuckoo developers and users (our preference goes to Slack and IRC).

Please read the following rules before posting:

  • Before posting, read our GitHub issue tracker, the Cuckoo blog and the documentation, and search Google for your issue. DO NOT post questions that have already been answered over and over everywhere.
  • Messages saying just something like “Doesn’t work, help me” are completely useless. If something is not working, report the error, paste the logs, the configuration files, the information on the virtual machine, the results of the troubleshooting, etc. Give context. We are not wizards and we don’t have a crystal ball.
  • Use a proper title. Stuff like “Doesn’t work”, “Help me”, “Error” are not proper titles.

Support Us

Cuckoo Sandbox is completely open source software, released freely to the public and developed mostly during free time by volunteers. If you enjoy it and want to see it further developed and kept up to date, please consider supporting us.

We are always looking for financial support, hardware support and contributions of any sort. If you’re interested in cooperating, feel free to contact us.

People

Cuckoo Sandbox is an open source project, the result of the efforts and contributions of a lot of people who enjoyed volunteering some of their time for a greater good :).

Active Developers
Name                       Role             Contact
Claudio “nex” Guarnieri    Project Founder  nex at nex dot sx
Alessandro “jekil” Tanasi  Core Developer   alessandro at tanasi dot it
Jurriaan “skier” Bremer    Lead Developer   jbr at cuckoo dot sh
Mark “rep” Schloesser      Core Developer   ms at mwcollect dot org
Contributors

It’s hard at this point to keep track of all individual contributions. In the Cuckoo Contributors page there is the list of people who contributed code to our GitHub repository.

There are a number of friends who provided feedback, ideas and support during the years of development of this project, including but not limited to:

  • Felix Leder
  • Tillmann Werner
  • Georg Wicherski
  • David Watson
  • Christian Seifert