Welcome to Sigil2¶
Sigil2 is a framework for observing and analyzing applications.
Quickstart¶
This document will go through building and running Sigil2.
Building Sigil2¶
Note
The default compiler for CentOS 7 and older (gcc <5) does not support C++14. Install and enable the official Devtoolset before compiling.
Clone and build Sigil2 from source:
$ git clone https://github.com/VANDAL/sigil2
$ cd sigil2
$ mkdir build && cd build
$ cmake .. # on CentOS 7, run cmake3 instead (from the cmake3 package)
$ make -j
This creates a build/bin folder containing the sigil2 executable. It can be run in place, or the entire bin folder can be moved, although it’s not advised to move it to a system location.
Running Sigil2¶
Sigil2 requires at least two arguments: the backend analysis tool, and the executable application to measure:
$ bin/sigil2 --backend=stgen --executable=./mybinary
The backend is the analysis tool that will process all the events in mybinary. In this example, stgen is the backend that processes events into a special event trace that is used in SynchroTrace. More information on backends is in The Analysis Backend.
A third option, frontend, changes the underlying method for observing the application. By default, this is Valgrind:
$ bin/sigil2 --frontend=valgrind --backend=stgen --executable=./mybinary
Available frontends are discussed in The Profiling Frontend.
Dependencies¶
PACKAGE | VERSION |
---|---|
gcc/g++ | 5+ |
cmake | 3.1.3+ |
make | 3.8+ |
automake | 1.13+ |
autoconf | 2.69+ |
zlib | 1.2.7+ |
git | 1.8+ |
Overview¶
Sigil2 is a framework designed to help analyze the dynamic behavior of applications. We call this dynamic behavior, with its given inputs and state, the workload. Having this workload is very useful for debugging, performance profiling, and simulation on hardware. Sigil2 was born from the need to generate application traces for trace-driven simulation, so low-level, detailed traces are the primary use-case.
Workloads¶
One of the main goals behind Sigil2 is providing a straightforward interface to represent and analyze workloads. A workload can be represented in many ways, and each way has different requirements.
...you might represent a workload as a simple assembly instruction trace:
push %rbp
push %rbx
mov %rsi,%rbp
mov %edi,%ebx
sub $0x8,%rsp
callq 4377b0 <_Z17myfuncv>
callq 4261e0 <_ZN5myotherfunc>
mov %rbp,%rdx
mov %ebx,%esi
mov %rax,%rdi
callq 422460 <_ZN5GO>
add $0x8,%rsp
xor %eax,%eax
pop %rbx
pop %rbp
retq
...or you might represent a workload as a call graph:
...or you might represent a workload as a memory trace:
ADDR BYTES
0xdeadbeef 8
0x12345678 4
0x00000000 1
...
...or more complex representations. Each of these representations is made up of the same event categories, albeit at different levels of granularity.
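To make the memory-trace representation concrete, here is a minimal post-processing sketch in Python. It assumes the two-column, whitespace-separated ADDR/BYTES layout shown above; the function name `parse_memory_trace` is hypothetical and is not part of Sigil2.

```python
from typing import List, Tuple


def parse_memory_trace(text: str) -> List[Tuple[int, int]]:
    """Parse 'ADDR BYTES' rows into (address, size_in_bytes) pairs."""
    accesses = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and the '...' elision marker.
        if len(fields) != 2 or not fields[0].startswith("0x"):
            continue
        accesses.append((int(fields[0], 16), int(fields[1])))
    return accesses


# The illustrative rows from the listing above (not real tool output).
TRACE = """\
ADDR        BYTES
0xdeadbeef  8
0x12345678  4
0x00000000  1
...
"""

accesses = parse_memory_trace(TRACE)
total_bytes = sum(size for _, size in accesses)
print(f"{len(accesses)} accesses, {total_bytes} bytes touched")
```

The same rows could just as well feed a reuse-distance or footprint analysis; the point is only that a flat trace is cheap to re-read many times.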
Event Primitives¶
Because of the variety of use-cases for analyzing workloads, Sigil2 presents workloads as a set of extensible primitives.
Event Primitive | Description |
---|---|
Compute | some transformation of data |
Memory | some movement of data |
Control Flow | divergence in an event stream |
Synchronization | ordering between separate event streams |
Context | grouping of events |
The format of these events is not defined, but you can imagine that events would look like:
...
compute FLOP, add, SIMD4
memory write, 4B, <addr1>
memory read, 16B, <addr2>
context func, enter, hello_world_thread
sync create, <TID1>
...
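Since the event format is not defined, the following Python sketch only models the imagined textual form above. The `Primitive` and `Event` types, the `parse_event` helper, and the `control` keyword for control-flow events are all assumptions made for illustration; none of them is a Sigil2 API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Primitive(Enum):
    COMPUTE = "compute"
    MEMORY = "memory"
    CONTROL_FLOW = "control"  # keyword assumed; not shown in the listing
    SYNC = "sync"
    CONTEXT = "context"


@dataclass(frozen=True)
class Event:
    kind: Primitive
    attrs: Tuple[str, ...]


def parse_event(line: str) -> Event:
    """Split 'keyword attr1, attr2, ...' into a typed Event."""
    keyword, _, rest = line.strip().partition(" ")
    attrs = tuple(a.strip() for a in rest.split(",") if a.strip())
    # Primitive(keyword) raises ValueError for an unknown keyword.
    return Event(Primitive(keyword), attrs)


event = parse_event("memory write, 4B, <addr1>")
print(event.kind.name, event.attrs)
```

Keeping the primitive as an enum and the details as an open-ended tuple mirrors the idea of a small fixed set of categories with extensible attributes.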
Todo
More detail is discussed further in ???
Event Generation¶
Many tools exist to capture workloads:
- static instrumentation tools
- dynamic binary instrumentation tools
- hardware performance counter sampling
- architecture-specific
- simulation probes
- and others
Each tool has its merits depending on the desired granularity and source of the event trace. Execution-driven simulators are great for fine-grained, low-level traces, but may be impractical for a large workload. Most DBI tools do a good job of observing the instruction stream of general purpose CPU workloads, but may not be useful when looking at workloads that use peripheral devices like GPUs or third-party IP.
Sigil2 recognizes this and creates an abstraction to the underlying tool that observes the workload. Events are translated into Sigil2 event primitives that are then presented to the user for further processing. The tool used for event generation is a Sigil2 frontend, and the user-defined processing on those events is a Sigil2 backend. Currently, backends are written as C++ static plugins to Sigil2, although there is room for expansion, given enough interest.
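The frontend/backend split can be pictured with a short Python sketch. Real Sigil2 backends are C++ plugins, so this is only a conceptual model: the raw opcode names, the `TOOL_A_TO_PRIMITIVE` table, and the `frontend` and `run` functions are hypothetical.

```python
from typing import Callable, Dict, Iterable, Iterator, Tuple

Event = Tuple[str, str]  # (primitive, detail), e.g. ("memory", "write")

# Hypothetical raw records from one observation tool, mapped to primitives.
TOOL_A_TO_PRIMITIVE: Dict[str, Event] = {
    "st": ("memory", "write"),
    "ld": ("memory", "read"),
    "fadd": ("compute", "FLOP"),
    "call": ("context", "func_enter"),
}


def frontend(raw_ops: Iterable[str]) -> Iterator[Event]:
    """Translate tool-specific records into tool-independent primitives."""
    for op in raw_ops:
        if op in TOOL_A_TO_PRIMITIVE:  # records with no mapping are dropped
            yield TOOL_A_TO_PRIMITIVE[op]


def run(events: Iterable[Event], backend: Callable[[Event], None]) -> None:
    """Deliver every primitive to the user-defined backend."""
    for event in events:
        backend(event)


seen = []
run(frontend(["call", "ld", "fadd", "st", "nop"]), seen.append)
print(seen)
```

Because the backend only ever sees primitives, swapping in a different translation table (a different frontend) would not require any change to the backend.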
User Documentation¶
The Analysis Backend¶
Note
This documentation is still a WIP
Getting Started with Profiling¶
This example will demonstrate how to get started analyzing a workload. Typically it’s easier to analyze a trace file than to directly analyze a workload. That is, it’s easier to generate a trace and post-process it multiple times, instead of analyzing the application on-the-fly. Parsing a trace file containing relevant data is going to be faster and more straightforward than running a workload multiple times and having Sigil2 filter all the potential metadata repeatedly.
Let’s do a simple example that counts each of the event primitives:
Todo
simplecount example
- creating the backend (design to be multithreaded)
- registering as a static plugin
- running
The Profiling Frontend¶
A frontend is the component that generates the event stream. By default, this is Valgrind (mostly due to historical reasons).
While it’s tempting to assume that the event generation just works™, you should be aware of the intrinsic nature of the chosen frontend before making any large assumptions.
Valgrind¶
Valgrind is the default frontend. No additional options are required. The following two command lines are equivalent.
$ bin/sigil2 --backend=simplecount --executable=ls -lah
$ bin/sigil2 --frontend=valgrind --backend=simplecount --executable=ls -lah
Valgrind is a disassemble-and-resynthesize dynamic binary instrumentation tool. This means that the dynamic instruction stream is grouped into blocks, disassembled into Valgrind’s VEX IR, instrumented, and then recompiled just-in-time.
DynamoRIO¶
DynamoRIO is not built with Sigil2 by default. To enable DynamoRIO as a frontend, build Sigil2 using the following cmake build command:
$ cmake .. -DCMAKE_BUILD_TYPE=release -DENABLE_DRSIGIL:bool=true
DynamoRIO can now be invoked as a frontend:
$ bin/sigil2 --frontend=dynamorio --backend=simplecount --executable=ls -lah
DynamoRIO’s IR exists closer to the ISA than the IR used by Valgrind. Sigil2 converts DynamoRIO IR to event primitives by inspection of each opcode.
Todo
mmm475 to fill in more details
FAQ¶
Backend Documentation¶
SimpleCount¶
Synopsis¶
$ bin/sigil2 --frontend=FRONTEND --backend=simplecount --executable=mybinary -myoptions
Description¶
SimpleCount is a demonstrative backend that counts each event type received from a given frontend. These events are aggregated across all threads.
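The counting idea can be sketched in a few lines of Python. This is not the SimpleCount source (which is a C++ backend); the `simple_count` function and the per-thread input layout are assumptions chosen to show what "aggregated across all threads" means.

```python
from collections import Counter
from typing import Dict, Iterable


def simple_count(streams: Dict[int, Iterable[str]]) -> Counter:
    """Count event types over every thread's stream and sum the results."""
    totals: Counter = Counter()
    for events in streams.values():
        totals.update(events)  # per-thread identity is deliberately dropped
    return totals


# Hypothetical event streams keyed by thread ID.
streams = {
    1: ["compute", "memory", "memory", "context"],
    2: ["memory", "sync", "compute"],
}
totals = simple_count(streams)
for kind, count in sorted(totals.items()):
    print(f"{kind:8} {count}")
```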
Options¶
No available options
SynchroTraceGen¶
Synopsis¶
$ bin/sigil2 --frontend=FRONTEND --backend=stgen OPTIONS --executable=mybinary -myoptions
Description¶
SynchroTraceGen is a Sigil2 backend that generates trace files for the SynchroTrace simulation framework. Each thread detected by SynchroTraceGen is given its own output trace file, named sigil.events-#.out. By default, the output is directly compressed since the trace files can grow very large.
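When post-processing the per-thread files, it can help to handle compressed and uncompressed traces uniformly. The Python sketch below assumes that the `#` in the file name is a thread ID and that the compression is gzip (neither is confirmed here), so it checks for the gzip magic bytes instead of relying on a file extension. The helper names are hypothetical.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def trace_name(thread_id: int) -> str:
    """Fill the '#' in the documented pattern with a thread ID (assumed)."""
    return f"sigil.events-{thread_id}.out"


def read_trace(path: str) -> bytes:
    """Return trace contents, decompressing only if the file is gzip."""
    with open(path, "rb") as f:
        is_gzip = f.read(2) == GZIP_MAGIC
    opener = gzip.open if is_gzip else open
    with opener(path, "rb") as f:
        return f.read()


# Demo on a throwaway file with placeholder content, not real stgen output.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, trace_name(1))
    with gzip.open(path, "wb") as f:
        f.write(b"placeholder trace line\n")
    data = read_trace(path)
print(data)
```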
Options¶
Frontend Documentation¶
Each frontend generates one or more event streams to a Sigil2 backend analysis tool. Each frontend has its own internal representation (IR) of events, so the process of converting frontend IR to Sigil2 event primitives is different for each frontend. For example, Valgrind will disassemble each machine instruction into multiple VEX IR statements and expressions; DynamoRIO annotates each instruction in a basic block with specific attributes; the current Perf frontend only supports x86_64 decoding via the Intel XED library.
Valgrind¶
Synopsis¶
$ bin/sigil2 --frontend=valgrind OPTIONS --backend=BACKEND --executable=mybinary -myoptions
Description¶
Uses a heavily modified Callgrind tool, Sigrind, to observe Sigil2 event primitives and pass them to the backend. Valgrind serializes all threads in the target executable, so only one thread’s event stream is passed to the backend at a time. A context switch is signaled with a Sigil2 context event. Because threads are serialized by Valgrind, the target executable is mostly deterministic.
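Because only one thread's events arrive at a time, a backend that wants per-thread streams has to rebuild them from the context-switch markers. The Python sketch below shows one way to do that; the `("context", "thread_switch:<id>")` encoding and the `split_by_thread` function are assumptions for illustration, not the representation Sigil2 uses.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple

Event = Tuple[str, str]  # (primitive, detail)


def split_by_thread(serialized: Iterable[Event]) -> Dict[str, List[Event]]:
    """Rebuild per-thread streams from one serialized stream."""
    per_thread: Dict[str, List[Event]] = defaultdict(list)
    current: Optional[str] = None  # unknown until the first switch event
    for kind, detail in serialized:
        if kind == "context" and detail.startswith("thread_switch:"):
            current = detail.split(":", 1)[1]
            continue
        if current is not None:  # events before any switch are ignored
            per_thread[current].append((kind, detail))
    return dict(per_thread)


serialized = [
    ("context", "thread_switch:1"),
    ("memory", "read"),
    ("compute", "IOP"),
    ("context", "thread_switch:2"),
    ("memory", "write"),
    ("context", "thread_switch:1"),
    ("sync", "join"),
]
print(split_by_thread(serialized))
```

Within each rebuilt stream the original order is preserved, which is what makes the serialized run mostly reproducible from one execution to the next.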
Options¶
Multithreaded Application Support¶
The Valgrind frontend automatically supports synchronization events in applications that use the POSIX threads library and/or the OpenMP library by intercepting relevant API calls.
Pthreads¶
Pthreads should be supported for most versions of GCC/libc, because the Pthread API is quite stable.
Pthreads support exists for any application dynamically linked to the Pthreads library.
See Static Library Support for applications that are statically linked.
OpenMP¶
Only GCC 4.9.2 is officially supported for synchronization event capture, because the implementation of the OpenMP library is more likely to change between GCC versions.
Dynamically linked OpenMP applications are not supported; only statically linked applications are, as described in Static Library Support.
Static Library Support¶
Applications that use a static Pthreads or OpenMP library must be manually linked with the sigil2-valgrind wrapper archive. This can be found in BUILD_DIR/bin/libsglwrapper.a.
For example:
$CC $CFLAGS main.c -Wl,--whole-archive $BUILD_DIR/bin/libsglwrapper.a -Wl,--no-whole-archive
DynamoRIO¶
Synopsis¶
$ bin/sigil2 --num-threads=N --frontend=dynamorio OPTIONS --backend=BACKEND --executable=mybinary -myoptions
Description¶
Note
-DDYNAMORIO_ENABLE=ON must be passed to cmake during configuration to build with DynamoRIO support.
DynamoRIO is a cross-platform dynamic binary instrumentation tool. DynamoRIO runs multithreaded applications natively. This makes results less reproducible than with Valgrind, but analysis is potentially faster on a multi-core architecture, because multiple event streams can be processed at once by setting --num-threads to a value greater than 1.
Intel Processor Trace¶
Synopsis¶
$ bin/sigil2 --frontend=perf --backend=BACKEND --executable=perf.data
Description¶
Note
-DPERF_ENABLE=ON must be passed to cmake during configuration to build with Perf PT support.
Intel Processor Trace (Intel PT) is a CPU feature available on Intel processors that are Broadwell or more recent. The trace is captured via branch results. The entire trace is then reconstructed by perf by replaying the binary, including all shared library loading and context switches. A side effect of capturing only branch results is that runtime information that cannot be recovered by replay is lost, such as some memory access addresses; for example, the Perf ‘replay’ mechanism does not support replaying malloc results.
For more usage details, see: perf design document for Intel PT
For more technical details see: Intel Software Developer’s Manual Volume Three
Options¶
Note
The perf.data file is generated with: perf record -e intel_pt//u ./myexec
If you receive ‘AUX data lost N times out of M!’, try increasing the size of the AUX buffer. Otherwise a significant portion of the trace may not be reproduced:
perf record -m,AUXTRACE_PAGES -e intel_pt//u ./myexec
Todo
options
Developer Documentation¶
Todo
section for developers
event primitives in depth
section for backend writing/multithreaded
frontend event generation
frontend IPC
frontend synchronization capture
About¶
Sigil2 comes from Drexel University’s VLSI & Architecture Lab, headed by Dr. Baris Taskin and in collaboration with Tufts University’s Dr. Mark Hempstead.
The goal of Sigil2 is modular application analysis. It was formed from the need to support multiple projects that study application traces, aimed at data-driven architecture design. This has included early hardware accelerator co-design [SIGIL], as well as uncore design space exploration with multi-threaded workloads [SYNCHROTRACE] [UNCORERPD]. Sigil2 is not interested in instrumenting the behavior of an application, but instead aims to classify events in the application and present those events for further analysis. In this way, Sigil2 does not require that each researcher have an in-depth understanding of the binary instrumentation tools.
Why call it Sigil2?¶
The initial incarnation of Sigil was developed by Dr. Siddharth Nilakantan for his research into software-hardware co-design [SIGIL]. He named it after Sigil, a city in Planescape: Torment. He also pronounced it “sih-gul”. The current maintainer and developer of Sigil2, Michael Lui, has kept the name and pronunciation for historical purposes. However, all of the underlying code and infrastructure has been rewritten and enhanced.
Features¶
- Flexible application analysis
- Use multiple frontends for capturing software workloads like Valgrind and DynamoRIO
- Use custom C++14 libraries for analyzing event streams
- Platform-independent events
- Straightforward and extensible format, simplifying analysis
Installation¶
See the Quickstart for installation instructions.
Contribute¶
Source Code: https://git.io/sigil2
Issue Tracker: https://github.com/VANDAL/sigil2/issues
Support¶
Please contact our mailing list for any issues or concerns: sigil2@googlegroups.com
License¶
This project is licensed under the BSD3 license.