Python bindings for libgccjit¶
This document describes the Python bindings to libgccjit.
The bindings support both CPython 2 and CPython 3 (using Cython).
Note that both libgccjit and the bindings are of “Alpha” quality; the APIs are not yet set in stone, and they shouldn’t be used in production yet.
Tutorial¶
Creating a trivial machine code function¶
Consider this C function:
int square(int i) { return i * i; }
How can we construct this from within Python using libgccjit?
First we need to import the Python bindings to libgccjit:
>>> import gccjit
All state associated with compilation is associated with a gccjit.Context:
>>> ctxt = gccjit.Context()
The JIT library has a system of types. It is statically-typed: every expression is of a specific type, fixed at compile-time. In our example, all of the expressions are of the C int type, so let’s obtain this from the context, as a gccjit.Type:
>>> int_type = ctxt.get_type(gccjit.TypeKind.INT)
The various objects in the API have reasonable __str__ methods:
>>> print(int_type)
int
Let’s create the function. To do so, we first need to construct its single parameter, specifying its type and giving it a name:
>>> param_i = ctxt.new_param(int_type, b'i')
>>> print(param_i)
i
Now we can create the function:
>>> fn = ctxt.new_function(gccjit.FunctionKind.EXPORTED,
... int_type, # return type
... b"square", # name
... [param_i]) # params
>>> print(fn)
square
To define the code within the function, we must create basic blocks containing statements.
Every basic block contains a list of statements, eventually terminated by a statement that either returns, or jumps to another basic block.
Our function has no control-flow, so we just need one basic block:
>>> block = fn.new_block(b'entry')
>>> print(block)
entry
Our basic block is relatively simple: it immediately terminates by returning the value of an expression. We can build the expression:
>>> expr = ctxt.new_binary_op(gccjit.BinaryOp.MULT,
... int_type,
... param_i, param_i)
>>> print(expr)
i * i
This in itself doesn’t do anything; we have to add this expression to a statement within the block. In this case, we use it to build a return statement, which terminates the basic block:
>>> block.end_with_return(expr)
OK, we’ve populated the context. We can now compile it:
>>> jit_result = ctxt.compile()
and get a gccjit.Result.
We can now look up a specific machine code routine within the result, in this case, the function we created above:
>>> void_ptr = jit_result.get_code(b"square")
We can now use ctypes.CFUNCTYPE to turn it into something we can call from Python:
>>> import ctypes
>>> int_int_func_type = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int)
>>> callable = int_int_func_type(void_ptr)
It should now be possible to run the code:
>>> callable(5)
25
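The ctypes step can be tried on its own, without libgccjit. The following is a minimal sketch in which an ordinary Python function (py_square, a hypothetical stand-in) takes the place of the machine code address returned by get_code(); a CFUNCTYPE prototype can be instantiated either from an integer address or from a Python callable.

```python
import ctypes

# Prototype for "int f(int)": the first argument is the return type,
# the remaining arguments are the parameter types.
int_int_func_type = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int)


def py_square(i):
    """Hypothetical stand-in for the JIT-compiled "square" routine."""
    return i * i


# Wrapping a Python callable gives a C-callable function object.  In the
# tutorial the argument is instead the address returned by get_code().
square = int_int_func_type(py_square)

if __name__ == "__main__":
    print(square(5))
```

Arguments and the result are converted to and from C int on each call, so the prototype has to match the signature of the compiled function.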
Options¶
To get more information on what’s going on, you can set debugging flags on the context using gccjit.Context.set_bool_option().
Setting gccjit.BoolOption.DUMP_INITIAL_GIMPLE will dump a C-like representation to stderr when you compile (GCC’s “GIMPLE” representation):
>>> ctxt.set_bool_option(gccjit.BoolOption.DUMP_INITIAL_GIMPLE, True)
>>> jit_result = ctxt.compile()
square (signed int i)
{
signed int D.260;
entry:
D.260 = i * i;
return D.260;
}
We can see the generated machine code in assembler form (on stderr) by setting gccjit.BoolOption.DUMP_GENERATED_CODE on the context before compiling:
>>> ctxt.set_bool_option(gccjit.BoolOption.DUMP_GENERATED_CODE, True)
>>> jit_result = ctxt.compile()
.file "fake.c"
.text
.globl square
.type square, @function
square:
.LFB6:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl %edi, -4(%rbp)
.L14:
movl -4(%rbp), %eax
imull -4(%rbp), %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE6:
.size square, .-square
.ident "GCC: (GNU) 4.9.0 20131023 (Red Hat 0.2-0.5.1920c315ff984892399893b380305ab36e07b455.fc20)"
.section .note.GNU-stack,"",@progbits
By default, no optimizations are performed, the equivalent of GCC’s -O0 option. We can turn things up to e.g. -O3 by calling gccjit.Context.set_int_option() with gccjit.IntOption.OPTIMIZATION_LEVEL:
>>> ctxt.set_int_option(gccjit.IntOption.OPTIMIZATION_LEVEL, 3)
>>> jit_result = ctxt.compile()
.file "fake.c"
.text
.p2align 4,,15
.globl square
.type square, @function
square:
.LFB7:
.cfi_startproc
.L16:
movl %edi, %eax
imull %edi, %eax
ret
.cfi_endproc
.LFE7:
.size square, .-square
.ident "GCC: (GNU) 4.9.0 20131023 (Red Hat 0.2-0.5.1920c315ff984892399893b380305ab36e07b455.fc20)"
.section .note.GNU-stack,"",@progbits
Naturally this has only a small effect on such a trivial function.
Full example¶
Here’s what the above looks like as a complete program:
import ctypes

import gccjit


def create_fn():
    # Create a compilation context:
    ctxt = gccjit.Context()

    # Turn these on to get various kinds of debugging:
    if 0:
        ctxt.set_bool_option(gccjit.BoolOption.DUMP_INITIAL_TREE, True)
        ctxt.set_bool_option(gccjit.BoolOption.DUMP_INITIAL_GIMPLE, True)
        ctxt.set_bool_option(gccjit.BoolOption.DUMP_GENERATED_CODE, True)

    # Adjust this to control optimization level of the generated code:
    if 0:
        ctxt.set_int_option(gccjit.IntOption.OPTIMIZATION_LEVEL, 3)

    int_type = ctxt.get_type(gccjit.TypeKind.INT)

    # Create parameter "i":
    param_i = ctxt.new_param(int_type, b'i')

    # Create the function:
    fn = ctxt.new_function(gccjit.FunctionKind.EXPORTED,
                           int_type,
                           b"square",
                           [param_i])

    # Create a basic block within the function:
    block = fn.new_block(b'entry')

    # This basic block is relatively simple:
    block.end_with_return(
        ctxt.new_binary_op(gccjit.BinaryOp.MULT,
                           int_type,
                           param_i, param_i))

    # Having populated the context, compile it.
    jit_result = ctxt.compile()

    # This is what you get back from ctxt.compile():
    assert isinstance(jit_result, gccjit.Result)

    return jit_result


def test_calling_fn(i):
    jit_result = create_fn()

    # Look up a specific machine code routine within the gccjit.Result,
    # in this case, the function we created above:
    void_ptr = jit_result.get_code(b"square")

    # Now use ctypes.CFUNCTYPE to turn it into something we can call
    # from Python:
    int_int_func_type = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int)
    code = int_int_func_type(void_ptr)

    # Now try running the code:
    return code(i)


if __name__ == '__main__':
    print(test_calling_fn(5))
Loops and variables¶
Consider this C function:
int loop_test (int n)
{
  int sum = 0;
  for (int i = 0; i < n; i++)
    sum += i * i;
  return sum;
}
This example demonstrates some more features of libgccjit, with local variables and a loop.
Let’s construct this from Python. To break this down into libgccjit terms, it’s usually easier to reword the for loop as a while loop, giving:
int loop_test (int n)
{
  int sum = 0;
  int i = 0;
  while (i < n)
    {
      sum += i * i;
      i++;
    }
  return sum;
}
Here’s what the final control flow graph will look like:
As before, we import the libgccjit Python bindings and make a gccjit.Context:
>>> import gccjit
>>> ctxt = gccjit.Context()
The function works with the C int type:
>>> the_type = ctxt.get_type(gccjit.TypeKind.INT)
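though we could equally well make it work on, say, double, by asking for gccjit.TypeKind.DOUBLE instead. The rest of this example sticks with int; a double version would also need ctypes.c_double in place of ctypes.c_int in the ctypes signature used at the end.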
Let’s build the function:
>>> return_type = the_type
>>> param_n = ctxt.new_param(the_type, b"n")
>>> fn = ctxt.new_function(gccjit.FunctionKind.EXPORTED,
... return_type,
... b"loop_test",
... [param_n])
>>> print(fn)
loop_test
The base class of expression is the gccjit.RValue, representing an expression that can be on the right-hand side of an assignment: a value that can be computed somehow, and assigned to a storage area (such as a variable). It has a specific gccjit.Type.
Another important class is gccjit.LValue. A gccjit.LValue is something that can be on the left-hand side of an assignment: a storage area (such as a variable).
In other words, every assignment can be thought of as:
LVALUE = RVALUE;
Note that gccjit.LValue is a subclass of gccjit.RValue: in an assignment of the form:
LVALUE_A = LVALUE_B;
the LVALUE_B implies reading the current value of that storage area, assigning it into the LVALUE_A.
So far the only expressions we’ve seen are i * i:
ctxt.new_binary_op(gccjit.BinaryOp.MULT,
int_type,
param_i, param_i)
which is a gccjit.RValue, and the various function parameters: param_i and param_n, instances of gccjit.Param, which is a subclass of gccjit.LValue (and, in turn, of gccjit.RValue): we can both read from and write to function parameters within the body of a function.
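The relationship between these classes can be pictured with a toy model. The sketch below does not use libgccjit; ToyRValue, ToyLValue, ToyParam and toy_assignment are hypothetical names invented for illustration, and they only mirror the subclassing described above.

```python
class ToyRValue:
    """Anything that can be evaluated to produce a value."""

    def __init__(self, text):
        self.text = text

    def __str__(self):
        return self.text


class ToyLValue(ToyRValue):
    """A storage area: it can be written to, and also read like an rvalue."""


class ToyParam(ToyLValue):
    """A function parameter: a storage area whose initial value is supplied
    by the caller."""


def toy_assignment(target, value):
    """Render "LVALUE = RVALUE;", rejecting a non-lvalue target."""
    if not isinstance(target, ToyLValue):
        raise TypeError("left-hand side must be an lvalue: %s" % target)
    if not isinstance(value, ToyRValue):
        raise TypeError("right-hand side must be an rvalue: %s" % value)
    return "%s = %s;" % (target, value)


if __name__ == "__main__":
    # A parameter is an lvalue, so it is also acceptable as an rvalue.
    print(toy_assignment(ToyLValue("sum"), ToyParam("n")))
```

Because ToyParam derives from ToyLValue, which derives from ToyRValue, a parameter is accepted on either side of the assignment, while a plain expression such as i * i is accepted only on the right.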
Our new example has a couple of local variables. We create them by calling gccjit.Function.new_local(), supplying a type and a name:
>>> local_i = fn.new_local(the_type, b"i")
>>> print(local_i)
i
>>> local_sum = fn.new_local(the_type, b"sum")
>>> print(local_sum)
sum
These are instances of gccjit.LValue - they can be read from and written to.
Note that there is no precanned way to create and initialize a variable like in C:
int i = 0;
Instead, having added the local to the function, we have to separately add an assignment of 0 to local_i at the beginning of the function.
This function has a loop, so we need to build some basic blocks to handle the control flow. In this case, we need 4 blocks:
- before the loop (initializing the locals)
- the conditional at the top of the loop (comparing i < n)
- the body of the loop
- after the loop terminates (return sum)
so we create these as gccjit.Block instances within the gccjit.Function:
>>> entry_block = fn.new_block(b'entry')
>>> cond_block = fn.new_block(b"cond")
>>> loop_block = fn.new_block(b"loop")
>>> after_loop_block = fn.new_block(b"after_loop")
We now populate each block with statements.
The entry block consists of initializations followed by a jump to the conditional. We assign 0 to i and to sum, using gccjit.Block.add_assignment() to add an assignment statement, and using gccjit.Context.zero() to get the constant value 0 for the relevant type for the right-hand side of the assignment:
>>> entry_block.add_assignment(local_i, ctxt.zero(the_type))
>>> entry_block.add_assignment(local_sum, ctxt.zero(the_type))
We can then terminate the entry block by jumping to the conditional:
>>> entry_block.end_with_jump(cond_block)
The conditional block is equivalent to the line while (i < n) from our C example. It contains a single statement: a conditional, which jumps to one of two destination blocks depending on a boolean gccjit.RValue, in this case the comparison of i and n. We build the comparison using gccjit.Context.new_comparison():
>>> guard = ctxt.new_comparison(gccjit.Comparison.LT, local_i, param_n)
>>> print(guard)
i < n
and can then use this to add cond_block’s sole statement, via gccjit.Block.end_with_conditional():
>>> cond_block.end_with_conditional(guard,
... loop_block, # on true
... after_loop_block) # on false
Next, we populate the body of the loop.
The C statement sum += i * i; is an assignment operation, where an lvalue is modified “in-place”. We use gccjit.Block.add_assignment_op() to handle these operations:
>>> loop_block.add_assignment_op(local_sum,
... gccjit.BinaryOp.PLUS,
... ctxt.new_binary_op(gccjit.BinaryOp.MULT,
... the_type,
... local_i, local_i))
The i++ can be thought of as i += 1, and can thus be handled in a similar way. We use gccjit.Context.one() to get the constant value 1 (for the relevant type) for the right-hand side of the assignment:
>>> loop_block.add_assignment_op(local_i,
... gccjit.BinaryOp.PLUS,
... ctxt.one(the_type))
The loop body completes by jumping back to the conditional:
>>> loop_block.end_with_jump(cond_block)
Finally, we populate the after_loop block, reached when the loop conditional is false. At the C level this is simply:
return sum;
so the block is just one statement:
>>> after_loop_block.end_with_return(local_sum)
Note
You can intermingle block creation with statement creation, but given that the terminator statements generally include references to other blocks, I find it’s clearer to create all the blocks, then all the statements.
We’ve finished populating the function. As before, we can now compile it to machine code:
>>> jit_result = ctxt.compile()
>>> void_ptr = jit_result.get_code(b'loop_test')
and use ctypes to turn it into a Python callable:
>>> import ctypes
>>> int_int_func_type = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int)
>>> callable = int_int_func_type(void_ptr)
Now we can call it:
>>> callable(10)
285
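As a cross-check of that result, here is a plain-Python sketch that emulates the same four basic blocks without libgccjit. Each block is modelled as a function that returns the name of the next block to run (or None once the function has returned); this dispatch scheme is an illustration only, not how the library executes code.

```python
def loop_test(n):
    """Emulate the four basic blocks of loop_test in plain Python."""
    state = {"n": n, "i": 0, "sum": 0}

    def entry(s):
        s["i"] = 0
        s["sum"] = 0
        return "cond"               # like end_with_jump(cond_block)

    def cond(s):
        # like end_with_conditional(i < n, loop_block, after_loop_block)
        return "loop" if s["i"] < s["n"] else "after_loop"

    def loop(s):
        s["sum"] += s["i"] * s["i"]  # sum += i * i
        s["i"] += 1                  # i++
        return "cond"                # like end_with_jump(cond_block)

    def after_loop(s):
        return None                  # like end_with_return(sum)

    blocks = {"entry": entry, "cond": cond,
              "loop": loop, "after_loop": after_loop}

    current = "entry"
    while current is not None:
        current = blocks[current](state)
    return state["sum"]


if __name__ == "__main__":
    print(loop_test(10))
```

Every block ends by naming its successor, which matches the rule that each basic block must be terminated by a jump, a conditional or a return.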
Visualizing the control flow graph¶
You can see the control flow graph of a function using gccjit.Function.dump_to_dot():
>>> fn.dump_to_dot('/tmp/sum-of-squares.dot')
giving a .dot file in GraphViz format.
You can convert this to an image using dot:
$ dot -Tpng /tmp/sum-of-squares.dot -o /tmp/sum-of-squares.png
or use a viewer (my preferred one is xdot.py; see https://github.com/jrfonseca/xdot.py; on Fedora you can install it with yum install python-xdot).
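For readers without libgccjit to hand, the shape of the graph can be approximated by writing a GraphViz file by hand. The sketch below is an assumption-laden stand-in: cfg_to_dot and LOOP_TEST_EDGES are hypothetical names, and the text it produces is not the output format of dump_to_dot(), only a valid .dot description of the same four blocks and their edges.

```python
def cfg_to_dot(name, edges):
    """Return GraphViz "dot" text for a list of (src, dst, label) edges."""
    lines = ["digraph %s {" % name]
    for src, dst, label in edges:
        if label:
            lines.append('    %s -> %s [label="%s"];' % (src, dst, label))
        else:
            lines.append('    %s -> %s;' % (src, dst))
    lines.append("}")
    return "\n".join(lines) + "\n"


# Edges of the loop_test function built in this section.
LOOP_TEST_EDGES = [
    ("entry", "cond", ""),
    ("cond", "loop", "true"),
    ("cond", "after_loop", "false"),
    ("loop", "cond", ""),
]

if __name__ == "__main__":
    print(cfg_to_dot("loop_test", LOOP_TEST_EDGES))
```

The resulting text can be saved to a file and passed to dot -Tpng in the same way as the file written by dump_to_dot().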
Full example¶
Here’s what the above looks like as a complete program:
import ctypes

import gccjit


def populate_ctxt(ctxt):
    the_type = ctxt.get_type(gccjit.TypeKind.INT)
    return_type = the_type
    param_n = ctxt.new_param(the_type, b"n")
    fn = ctxt.new_function(gccjit.FunctionKind.EXPORTED,
                           return_type,
                           b"loop_test",
                           [param_n])
    # Build locals
    local_i = fn.new_local(the_type, b"i")
    local_sum = fn.new_local(the_type, b"sum")

    assert str(local_i) == 'i'

    # Build blocks
    entry_block = fn.new_block(b'entry')
    cond_block = fn.new_block(b"cond")
    loop_block = fn.new_block(b"loop")
    after_loop_block = fn.new_block(b"after_loop")

    # entry_block: #########################################

    # sum = 0
    entry_block.add_assignment(local_sum, ctxt.zero(the_type))

    # i = 0
    entry_block.add_assignment(local_i, ctxt.zero(the_type))

    entry_block.end_with_jump(cond_block)

    ### cond_block: ########################################

    # while (i < n)
    cond_block.end_with_conditional(ctxt.new_comparison(gccjit.Comparison.LT,
                                                        local_i, param_n),
                                    loop_block,
                                    after_loop_block)

    ### loop_block: ########################################

    # sum += i * i
    loop_block.add_assignment_op(local_sum,
                                 gccjit.BinaryOp.PLUS,
                                 ctxt.new_binary_op(gccjit.BinaryOp.MULT,
                                                    the_type,
                                                    local_i, local_i))

    # i++
    loop_block.add_assignment_op(local_i,
                                 gccjit.BinaryOp.PLUS,
                                 ctxt.one(the_type))

    # goto cond_block
    loop_block.end_with_jump(cond_block)

    ### after_loop_block: ##################################

    # return sum
    after_loop_block.end_with_return(local_sum)


def create_fn():
    # Create a compilation context:
    ctxt = gccjit.Context()

    if 0:
        ctxt.set_bool_option(gccjit.BoolOption.DUMP_INITIAL_TREE, True)
        ctxt.set_bool_option(gccjit.BoolOption.DUMP_INITIAL_GIMPLE, True)
        ctxt.set_bool_option(gccjit.BoolOption.DUMP_EVERYTHING, True)
        ctxt.set_bool_option(gccjit.BoolOption.KEEP_INTERMEDIATES, True)
    if 0:
        ctxt.set_int_option(gccjit.IntOption.OPTIMIZATION_LEVEL, 3)

    populate_ctxt(ctxt)

    jit_result = ctxt.compile()
    return jit_result


def test_calling_fn(i):
    jit_result = create_fn()

    int_int_func_type = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int)
    code = int_int_func_type(jit_result.get_code(b"loop_test"))

    return code(i)


if __name__ == '__main__':
    print(test_calling_fn(10))
Implementing a “brainf” compiler¶
In this example we use libgccjit to construct a compiler for an esoteric programming language that we shall refer to as “brainf”.
The compiler can run the generated code in-process (JIT compilation), or write the generated code as a machine code executable (classic ahead-of-time compilation).
The “brainf” language¶
brainf scripts operate on an array of bytes, with a notional data pointer within the array.
brainf is hard for humans to read, but it’s trivial to write a parser for it, as there is no lexing; just a stream of bytes. The operations are:
Character | Meaning |
---|---|
> | idx += 1 |
< | idx -= 1 |
+ | data[idx] += 1 |
- | data[idx] -= 1 |
. | output (data[idx]) |
, | data[idx] = input () |
[ | loop until data[idx] == 0 |
] | end of loop |
Anything else | ignored |
Unlike the previous examples, this compiler can also work ahead of time: it reads .bf scripts and can write them out as executables, as well as running them JIT-compiled in-process.
Here’s what a simple .bf script looks like:
[
  Emit the uppercase alphabet
]

cell 0 = 26
++++++++++++++++++++++++++

cell 1 = 65
>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<

while cell#0 != 0
[
 >
 .      emit cell#1
 +      increment cell@1
 <-     decrement cell@0
]
Note
This example makes use of whitespace and comments for legibility, but could have been written as:
++++++++++++++++++++++++++
>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<
[>.+<-]
It’s not a particularly useful language, except for providing compiler-writers with a test case that’s easy to parse.
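To make the table of operations concrete, here is a minimal reference interpreter in plain Python. It is a sketch for checking expectations, not part of the compiler described below: run_brainf is a hypothetical name, cells wrap modulo 256 like an unsigned char, and reading past the end of the input yields 255 on the assumption that getchar() returns -1 at end of file and the result is cast to unsigned char.

```python
def run_brainf(source, input_bytes=b"", num_cells=30000):
    """Interpret a brainf script and return the bytes it outputs."""
    # Match up the brackets first, using a stack of open '[' positions.
    jump = {}
    stack = []
    for pos, ch in enumerate(source):
        if ch == "[":
            stack.append(pos)
        elif ch == "]":
            if not stack:
                raise SyntaxError("unmatched ']' at offset %d" % pos)
            start = stack.pop()
            jump[start] = pos
            jump[pos] = start
    if stack:
        raise SyntaxError("unmatched '[' at offset %d" % stack[-1])

    data = bytearray(num_cells)
    idx = 0
    pc = 0
    out = bytearray()
    inp = iter(input_bytes)
    while pc < len(source):
        ch = source[pc]
        if ch == ">":
            idx += 1
        elif ch == "<":
            idx -= 1
        elif ch == "+":
            data[idx] = (data[idx] + 1) % 256
        elif ch == "-":
            data[idx] = (data[idx] - 1) % 256
        elif ch == ".":
            out.append(data[idx])
        elif ch == ",":
            # Assumption: end of input reads as 255, i.e. (unsigned char)-1.
            data[idx] = next(inp, 255)
        elif ch == "[" and data[idx] == 0:
            pc = jump[pc]            # skip the loop body
        elif ch == "]" and data[idx] != 0:
            pc = jump[pc]            # go round the loop again
        # Any other character is ignored.
        pc += 1
    return bytes(out)


if __name__ == "__main__":
    script = "+" * 26 + ">" + "+" * 65 + "<" + "[>.+<-]"
    print(run_brainf(script).decode("ascii"))
```

The leading [ ... ] comment in the script above works for the same reason: cell 0 starts at zero, so the first loop is skipped entirely.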
Converting a brainf script to libgccjit IR¶
We write simple code to populate a gccjit.Context.
class Paren:
    def __init__(self, b_test, b_body, b_after):
        self.b_test = b_test
        self.b_body = b_body
        self.b_after = b_after


class CompileError(Exception):
    def __init__(self, compiler, msg):
        self.filename = compiler.filename
        self.line = compiler.line
        self.column = compiler.column
        self.msg = msg

    def __str__(self):
        return ("%s:%i:%i: %s"
                % (self.filename, self.line, self.column, self.msg))


class Compiler:
    def __init__(self): #, filename):
        self.ctxt = gccjit.Context()
        if 1:
            self.ctxt.set_int_option(gccjit.IntOption.OPTIMIZATION_LEVEL, 3)
            self.ctxt.set_bool_option(gccjit.BoolOption.DUMP_INITIAL_GIMPLE, 0)
            self.ctxt.set_bool_option(gccjit.BoolOption.DUMP_GENERATED_CODE, 0)
            self.ctxt.set_bool_option(gccjit.BoolOption.DEBUGINFO, 1)
            self.ctxt.set_bool_option(gccjit.BoolOption.DUMP_EVERYTHING, 0)
            self.ctxt.set_bool_option(gccjit.BoolOption.KEEP_INTERMEDIATES, 0)
        self.void_type = self.ctxt.get_type(gccjit.TypeKind.VOID)
        self.int_type = self.ctxt.get_type(gccjit.TypeKind.INT)
        self.byte_type = self.ctxt.get_type(gccjit.TypeKind.UNSIGNED_CHAR)
        self.array_type = self.ctxt.new_array_type(self.byte_type, 30000)
        self.func_getchar = (
            self.ctxt.new_function(gccjit.FunctionKind.IMPORTED,
                                   self.int_type,
                                   b"getchar", []))
        self.func_putchar = (
            self.ctxt.new_function(gccjit.FunctionKind.IMPORTED,
                                   self.void_type,
                                   b"putchar",
                                   [self.ctxt.new_param(self.int_type,
                                                        b"c")]))
        self.func = self.ctxt.new_function(gccjit.FunctionKind.EXPORTED,
                                           self.void_type, b'func', [])
        self.curblock = self.func.new_block(b"initial")
        self.int_zero = self.ctxt.zero(self.int_type)
        self.int_one = self.ctxt.one(self.int_type)
        self.byte_zero = self.ctxt.zero(self.byte_type)
        self.byte_one = self.ctxt.one(self.byte_type)
        self.data_cells = self.ctxt.new_global(gccjit.GlobalKind.INTERNAL,
                                               self.array_type,
                                               b"data_cells")
        self.idx = self.func.new_local(self.int_type, b"idx")

        self.open_parens = []

        self.curblock.add_comment(b"idx = 0;")
        self.curblock.add_assignment(self.idx, self.int_zero)

    def get_current_data(self, loc):
        """Get 'data_cells[idx]' as an lvalue. """
        return self.ctxt.new_array_access(self.data_cells, self.idx, loc)

    def current_data_is_zero(self, loc):
        """Get 'data_cells[idx] == 0' as a boolean rvalue."""
        return self.ctxt.new_comparison(gccjit.Comparison.EQ,
                                        self.get_current_data(loc),
                                        self.byte_zero,
                                        loc)

    def compile_char(self, ch):
        """Compile one bf character."""
        loc = self.ctxt.new_location(self.filename,
                                     self.line,
                                     self.column)

        # Turn this on to trace execution, by injecting putchar()
        # of each source char.
        if 0:
            arg = self.ctxt.new_rvalue_from_int (self.int_type, ch)
            call = self.ctxt.new_call (self.func_putchar, [arg], loc)
            self.curblock.add_eval (call, loc)

        if ch == '>':
            self.curblock.add_comment(b"'>': idx += 1;", loc)
            self.curblock.add_assignment_op(self.idx,
                                            gccjit.BinaryOp.PLUS,
                                            self.int_one,
                                            loc)
        elif ch == '<':
            self.curblock.add_comment(b"'<': idx -= 1;", loc)
            self.curblock.add_assignment_op(self.idx,
                                            gccjit.BinaryOp.MINUS,
                                            self.int_one,
                                            loc)
        elif ch == '+':
            self.curblock.add_comment(b"'+': data[idx] += 1;", loc)
            self.curblock.add_assignment_op(self.get_current_data (loc),
                                            gccjit.BinaryOp.PLUS,
                                            self.byte_one,
                                            loc)
        elif ch == '-':
            self.curblock.add_comment(b"'-': data[idx] -= 1;", loc)
            self.curblock.add_assignment_op(self.get_current_data(loc),
                                            gccjit.BinaryOp.MINUS,
                                            self.byte_one,
                                            loc)
        elif ch == '.':
            arg = self.ctxt.new_cast(self.get_current_data(loc),
                                     self.int_type,
                                     loc)
            call = self.ctxt.new_call(self.func_putchar, [arg], loc)
            self.curblock.add_comment(b"'.': putchar ((int)data[idx]);",
                                      loc)
            self.curblock.add_eval(call, loc)
        elif ch == ',':
            call = self.ctxt.new_call(self.func_getchar, [], loc)
            self.curblock.add_comment(
                b"',': data[idx] = (unsigned char)getchar ();", loc)
            self.curblock.add_assignment(self.get_current_data(loc),
                                         self.ctxt.new_cast(call,
                                                            self.byte_type,
                                                            loc),
                                         loc)
        elif ch == '[':
            loop_test = self.func.new_block()
            on_zero = self.func.new_block()
            on_non_zero = self.func.new_block()

            self.curblock.end_with_jump(loop_test, loc)

            loop_test.add_comment(b"'['", loc)
            loop_test.end_with_conditional(self.current_data_is_zero(loc),
                                           on_zero,
                                           on_non_zero,
                                           loc)
            self.open_parens.append(Paren(loop_test, on_non_zero, on_zero))
            self.curblock = on_non_zero
        elif ch == ']':
            self.curblock.add_comment(b"']'", loc)
            if not self.open_parens:
                raise CompileError(self, "mismatching parens")
            paren = self.open_parens.pop()
            self.curblock.end_with_jump(paren.b_test)
            self.curblock = paren.b_after
        elif ch == '\n':
            self.line += 1
            self.column = 0

        if ch != '\n':
            self.column += 1

    def parse_into_ctxt(self, filename):
        """
        Parse the given .bf file into the gccjit.Context, containing a
        single "main" function suitable for compiling into an executable.
        """
        self.filename = filename
        self.line = 1
        self.column = 0
        with open(filename) as f_in:
            for ch in f_in.read():
                self.compile_char(ch)
        self.curblock.end_with_void_return()

    # Compiling to an executable
Compiling a context to a file¶
In previous examples, we compiled and ran the generated machine code in-process. We can do the same here:
    def run(self):
        import ctypes
        result = self.ctxt.compile()
        py_func_type = ctypes.CFUNCTYPE(None)
        py_func = py_func_type(result.get_code(b'func'))
        py_func()
but this time we’ll also provide a way to compile the context directly to an executable, using gccjit.Context.compile_to_file().
To do so, we need to export a main function. A helper function for doing so is provided by the JIT API:
def make_main(ctxt):
    """
    Make "main" function:
      int
      main (int argc, char **argv)
      {
      ...
      }
    Return (func, param_argc, param_argv)
    """
    int_type = ctxt.get_type(TypeKind.INT)
    param_argc = ctxt.new_param(int_type, b"argc")
    char_ptr_ptr_type = (
        ctxt.get_type(TypeKind.CHAR).get_pointer().get_pointer())
    param_argv = ctxt.new_param(char_ptr_ptr_type, b"argv")
    func_main = ctxt.new_function(FunctionKind.EXPORTED,
                                  int_type,
                                  b"main",
                                  [param_argc, param_argv])
    return (func_main, param_argc, param_argv)
which we can use (as gccjit.make_main) to compile the function to an executable:
    def compile_to_file(self, output_path):
        # Wrap "func" up in a "main" function
        mainfunc, argc, argv = gccjit.make_main(self.ctxt)
        block = mainfunc.new_block()
        block.add_eval(self.ctxt.new_call(self.func, []))
        block.end_with_return(self.int_zero)
        self.ctxt.compile_to_file(gccjit.OutputKind.EXECUTABLE,
                                  output_path)
Finally, here’s the top-level of the program:
import sys


def main(argv):
    from optparse import OptionParser
    parser = OptionParser()
    parser.add_option("-o", "--output", dest="outputfile",
                      help="compile to FILE", metavar="FILE")
    (options, args) = parser.parse_args()
    if len(args) != 1:
        raise ValueError('No input file')
    inputfile = args[0]
    c = Compiler()
    c.parse_into_ctxt(inputfile)
    if options.outputfile:
        c.compile_to_file(options.outputfile)
    else:
        c.run()


if __name__ == '__main__':
    try:
        main(sys.argv)
    except Exception as exc:
        print(exc)
        sys.exit(1)
The overall script examples/bf.py is thus a bf-to-machine-code compiler, which we can use to compile .bf files, either to run in-process,
$ PYTHONPATH=. python examples/bf.py \
emit-alphabet.bf
ABCDEFGHIJKLMNOPQRSTUVWXYZ
or to compile into machine code executables:
$ PYTHONPATH=. python examples/bf.py \
emit-alphabet.bf \
-o a.out
which we can run independently:
$ ./a.out
ABCDEFGHIJKLMNOPQRSTUVWXYZ
Success!
We can also inspect the generated executable using standard tools:
$ objdump -d a.out |less
which shows that libgccjit has managed to optimize the function somewhat (for example, the runs of 26 and 65 increment operations have become integer constants 0x1a and 0x41):
0000000000400620 <main>:
400620: 80 3d 39 0a 20 00 00 cmpb $0x0,0x200a39(%rip) # 601060 <data
400627: 74 07 je 400630 <main
400629: eb fe jmp 400629 <main+0x9>
40062b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
400630: 48 83 ec 08 sub $0x8,%rsp
400634: 0f b6 05 26 0a 20 00 movzbl 0x200a26(%rip),%eax # 601061 <data_cells+0x1>
40063b: c6 05 1e 0a 20 00 1a movb $0x1a,0x200a1e(%rip) # 601060 <data_cells>
400642: 8d 78 41 lea 0x41(%rax),%edi
400645: 40 88 3d 15 0a 20 00 mov %dil,0x200a15(%rip) # 601061 <data_cells+0x1>
40064c: 0f 1f 40 00 nopl 0x0(%rax)
400650: 40 0f b6 ff movzbl %dil,%edi
400654: e8 87 fe ff ff callq 4004e0 <putchar@plt>
400659: 0f b6 05 01 0a 20 00 movzbl 0x200a01(%rip),%eax # 601061 <data_cells+0x1>
400660: 80 2d f9 09 20 00 01 subb $0x1,0x2009f9(%rip) # 601060 <data_cells>
400667: 8d 78 01 lea 0x1(%rax),%edi
40066a: 40 88 3d f0 09 20 00 mov %dil,0x2009f0(%rip) # 601061 <data_cells+0x1>
400671: 75 dd jne 400650 <main+0x30>
400673: 31 c0 xor %eax,%eax
400675: 48 83 c4 08 add $0x8,%rsp
400679: c3 retq
40067a: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
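The effect on the runs of + can be mimicked at the source level. The following sketch is not how libgccjit performs the optimization (that happens inside GCC, on its own intermediate representation); fold_runs is a hypothetical helper that merely collapses repeated brainf operations into (operation, count) pairs, giving the same constants that appear in the disassembly.

```python
from itertools import groupby

# The eight meaningful characters; everything else is a comment.
OPERATIONS = "+-<>.,[]"
# Operations whose repetitions combine into a single "add N" or "move N".
FOLDABLE = "+-<>"


def fold_runs(source):
    """Collapse runs of repeated operations into (op, count) pairs."""
    ops = []
    meaningful = (ch for ch in source if ch in OPERATIONS)
    for ch, run in groupby(meaningful):
        count = len(list(run))
        if ch in FOLDABLE:
            ops.append((ch, count))
        else:
            # I/O and brackets are kept as individual operations.
            ops.extend((ch, 1) for _ in range(count))
    return ops


if __name__ == "__main__":
    script = "+" * 26 + ">" + "+" * 65 + "<" + "[>.+<-]"
    for op, count in fold_runs(script)[:4]:
        print(op, hex(count))
```

The first and third pairs carry the counts 26 and 65, that is 0x1a and 0x41.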
We also set up debugging information (via gccjit.Context.new_location() and gccjit.BoolOption.DEBUGINFO), so it’s possible to use gdb to single-step through the generated binary and inspect the internal state idx and data_cells:
(gdb) break main
Breakpoint 1 at 0x400790
(gdb) run
Starting program: a.out
Breakpoint 1, 0x0000000000400790 in main (argc=1, argv=0x7fffffffe448)
(gdb) stepi
0x0000000000400797 in main (argc=1, argv=0x7fffffffe448)
(gdb) stepi
0x00000000004007a0 in main (argc=1, argv=0x7fffffffe448)
(gdb) stepi
9 >+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<
(gdb) list
4
5 cell 0 = 26
6 ++++++++++++++++++++++++++
7
8 cell 1 = 65
9 >+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<
10
11 while cell#0 != 0
12 [
13 >
(gdb) n
6 ++++++++++++++++++++++++++
(gdb) n
9 >+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<
(gdb) p idx
$1 = 1
(gdb) p data_cells
$2 = "\032", '\000' <repeats 29998 times>
(gdb) p data_cells[0]
$3 = 26 '\032'
(gdb) p data_cells[1]
$4 = 0 '\000'
(gdb) list
4
5 cell 0 = 26
6 ++++++++++++++++++++++++++
7
8 cell 1 = 65
9 >+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<
10
11 while cell#0 != 0
12 [
13 >
Other forms of ahead-of-time compilation¶
The above demonstrates compiling a gccjit.Context directly to an executable. It’s also possible to compile it to an object file, and to a dynamic library. See the documentation of gccjit.Context.compile_to_file() for more information.
Topic Reference¶
Compilation contexts¶
- class gccjit.Context¶
The top-level of the API is the gccjit.Context class.
A gccjit.Context instance encapsulates the state of a compilation.
You can set up options on it, and add types, functions and code. Invoking gccjit.Context.compile() on it gives you a gccjit.Result.
- dump_to_file(path, update_locations)¶
- get_first_error()¶
- new_location(filename, line, column)¶
Make a gccjit.Location representing a source location, for use by the debugger:
loc = ctxt.new_location('web.js', 5, 0)
Note
You need to enable gccjit.BoolOption.DEBUGINFO on the context for these locations to actually be usable by the debugger:
ctxt.set_bool_option(gccjit.BoolOption.DEBUGINFO, True)
Return type: gccjit.Location
- new_global(Type type_, name, Location loc=None)¶
Return type: gccjit.LValue
- new_array_type(Type element_type, int num_elements, Location loc=None)¶
Return type: gccjit.Type
- new_field(Type type_, name, Location loc=None)¶
Return type: gccjit.Field
- new_struct(name, fields=None, Location loc=None)¶
Return type: gccjit.Struct
- new_union(name, fields=None, Location loc=None)¶
Construct a new “union” type.
Parameters: - fields – The fields that make up the union
- loc (gccjit.Location) – The source location, if any, or None
For example, to create the equivalent of:
union u { int as_int; float as_float; };
you can use:
ctxt = gccjit.Context()
int_type = ctxt.get_type(gccjit.TypeKind.INT)
float_type = ctxt.get_type(gccjit.TypeKind.FLOAT)
as_int = ctxt.new_field(int_type, b'as_int')
as_float = ctxt.new_field(float_type, b'as_float')
u = ctxt.new_union(b'u', [as_int, as_float])
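For comparison, the same C union can be declared on the Python side with ctypes, which is one way to inspect a value of this type from Python. This sketch uses only ctypes, not libgccjit; the class name U and the helper float_bits are hypothetical, and the bit pattern shown assumes a 32-bit int and IEEE 754 single-precision float.

```python
import ctypes


class U(ctypes.Union):
    """ctypes counterpart of: union u { int as_int; float as_float; };"""
    _fields_ = [("as_int", ctypes.c_int),
                ("as_float", ctypes.c_float)]


def float_bits(value):
    """Store a float in the union and read the same bytes back as an int."""
    u = U()
    u.as_float = value
    return u.as_int


if __name__ == "__main__":
    print(hex(float_bits(1.0)))
```

Both fields share the same storage, so the union is only as large as its largest member.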
- new_function_ptr_type(return_type, param_types, loc=None, is_variadic=False)¶
Parameters: - return_type (gccjit.Type) – The return type of the function
- param_types (A sequence of gccjit.Type) – The types of the parameters
- loc (gccjit.Location) – The source location, if any, or None
- is_variadic (bool) – Is the function variadic (i.e. accepts a variable number of arguments)
Return type: gccjit.Type
For example, to create the equivalent of:
typedef void (*fn_ptr_type) (int, int, int);
you can use:
>>> ctxt = gccjit.Context()
>>> void_type = ctxt.get_type(gccjit.TypeKind.VOID)
>>> int_type = ctxt.get_type(gccjit.TypeKind.INT)
>>> fn_ptr_type = ctxt.new_function_ptr_type (void_type,
...                                           [int_type, int_type, int_type])
>>> print(fn_ptr_type)
void (*) (int, int, int)
- new_param(Type type_, name, Location loc=None)¶
Return type: gccjit.Param
- new_function(kind, Type return_type, name, params, Location loc=None, is_variadic=False)¶
Return type: gccjit.Function
- get_builtin_function(name)¶
Return type: gccjit.Function
- zero(type_)¶
Given a gccjit.Type, which must be a numeric type, get the constant 0 as a gccjit.RValue of that type.
Return type: gccjit.RValue
- one(type_)¶
Given a gccjit.Type, which must be a numeric type, get the constant 1 as a gccjit.RValue of that type.
Return type: gccjit.RValue
- new_rvalue_from_double(numeric_type, value)¶
Given a gccjit.Type, which must be a numeric type, get a floating-point constant as a gccjit.RValue of that type.
Return type: gccjit.RValue
- new_rvalue_from_int(type_, value)¶
Given a gccjit.Type, which must be a numeric type, get an integer constant as a gccjit.RValue of that type.
Return type: gccjit.RValue
- new_rvalue_from_ptr(pointer_type, value)¶
Given a gccjit.Type, which must be a pointer type, and an address, get a gccjit.RValue representing that address as a pointer of that type:
ptr = ctxt.new_rvalue_from_ptr(int_star, 0xDEADBEEF)
Return type: gccjit.RValue
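One plausible source for such an address is ctypes, which exposes the address of its own objects as a plain Python integer. The sketch below shows only the ctypes side and does not call libgccjit; whether new_rvalue_from_ptr accepts the integer in exactly this form is an assumption, and read_int_at is a hypothetical helper used to show that the address really refers to the stored value.

```python
import ctypes

# Keep a reference to the object for as long as the address is in use;
# once it is garbage-collected the address no longer points at valid data.
value = ctypes.c_int(42)

# addressof() returns the location of the underlying C int as a Python int.
address = ctypes.addressof(value)


def read_int_at(addr):
    """Read a C int back from a raw address."""
    return ctypes.c_int.from_address(addr).value


if __name__ == "__main__":
    print(hex(address), read_int_at(address))
```

Code that embeds a raw address is only valid within the process that produced the address, so this technique suits in-process compilation rather than compiling to a file.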
- null(pointer_type)¶
Given a gccjit.Type, which must be a pointer type, get a gccjit.RValue representing the NULL pointer of that type:
ptr = ctxt.null(int_star)
Return type: gccjit.RValue
- new_string_literal(value)¶
Make a gccjit.RValue for the given string literal value (actually bytes):
msg = ctxt.new_string_literal(b'hello world\n')
Parameters: value (bytes) – the bytes of the string literal
Return type: gccjit.RValue
- new_unary_op(op, result_type, rvalue, loc=None)¶
Make a gccjit.RValue for the given unary operation.
Parameters: - op (gccjit.UnaryOp) – Which unary operation
- result_type (gccjit.Type) – The type of the result
- rvalue (gccjit.RValue) – The input expression
- loc (gccjit.Location) – The source location, if any, or None
Return type: gccjit.RValue
- new_binary_op(op, result_type, a, b, loc=None)¶
Make a gccjit.RValue for the given binary operation.
Parameters: - op (gccjit.BinaryOp) – Which binary operation
- result_type (gccjit.Type) – The type of the result
- a (gccjit.RValue) – The first input expression
- b (gccjit.RValue) – The second input expression
- loc (gccjit.Location) – The source location, if any, or None
Return type: gccjit.RValue
- new_comparison(op, a, b, loc=None)¶
Make a gccjit.RValue of boolean type for the given comparison.
Parameters: - op (gccjit.Comparison) – Which comparison
- a (gccjit.RValue) – The first input expression
- b (gccjit.RValue) – The second input expression
- loc (gccjit.Location) – The source location, if any, or None
Return type: gccjit.RValue
- new_child_context(self)¶
Return type: gccjit.Context
- new_cast(RValue rvalue, Type type_, Location loc=None)¶
Return type: gccjit.RValue
- new_array_access(ptr, index, loc=None)¶
Parameters: - ptr (gccjit.RValue) – The pointer or array
- index (gccjit.RValue) – The index within the array
- loc (gccjit.Location) – The source location, if any, or None
Return type:
- new_call(Function func, args, Location loc=None)¶
Return type: gccjit.RValue
- new_call_through_ptr(fn_ptr, args, loc=None)¶
Parameters: - fn_ptr (gccjit.RValue) – A function pointer
- args (A sequence of gccjit.RValue) – The arguments to the function call
- loc (gccjit.Location) – The source location, if any, or None
Return type: gccjit.RValue
For example, to create the equivalent of:
typedef void (*fn_ptr_type) (int, int, int);
fn_ptr_type fn_ptr;
fn_ptr (a, b, c);
you can use:
block.add_eval(ctxt.new_call_through_ptr(fn_ptr, [a, b, c]))
Debugging¶
- gccjit.Context.dump_reproducer_to_file(self, path)¶
Write C source code into path that can be compiled into a self-contained executable (i.e. with libgccjit as the only dependency). The generated code will attempt to replay the API calls that have been made into the given context, at the C level, eliminating any dependency on Python or on client code or data.
This may be useful when debugging the library or client code, for reducing a complicated recipe for reproducing a bug into a simpler form.
Typically you need to supply -Wno-unused-variable when compiling the generated file (since the result of each API call is assigned to a unique variable within the generated C source, and not all are necessarily then used).
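As a rough sketch of that compile step, the helper below assembles a command line for building a reproducer. The helper name reproducer_build_command, the file names, the choice of gcc as the compiler, and the -lgccjit link flag are assumptions about a typical installation, not part of the bindings; you may also need include and library search paths.

```python
def reproducer_build_command(source_path, output_path, compiler="gcc"):
    """Return an argv list for compiling a file written by
    dump_reproducer_to_file().

    Hypothetical helper: the compiler name and the -lgccjit flag are
    assumptions about a typical installation.
    """
    return [
        compiler,
        "-Wno-unused-variable",  # each API call's result gets its own variable
        source_path,
        "-o", output_path,
        "-lgccjit",              # libgccjit is the reproducer's only dependency
    ]


cmd = reproducer_build_command("reproducer.c", "reproducer")
print(" ".join(cmd))
# To run it (needs gcc and libgccjit installed), pass cmd to
# subprocess.check_call().
```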
- gccjit.Context.set_logfile(self, f)¶
To help with debugging, enable ongoing logging of the context’s activity to the given file object.
For example, the following will enable logging to stderr:
ctxt.set_logfile(sys.stderr)
Examples of information logged include:
- API calls
- the various steps involved within compilation
- activity on any gccjit.Result instances created by the context
- activity within any child contexts
The precise format and kinds of information logged are subject to change.
Unfortunately, enabling logging currently leaks an underlying FILE * object.
There may be a performance cost for logging.
Options¶
String options¶
- gccjit.Context.set_str_option(self, opt, val)¶
Set a string option of the context; see gccjit.StrOption for notes on the options and their meanings.
Parameters: - opt (gccjit.StrOption) – Which option to set
- val (str) – The new value
Boolean options¶
- gccjit.Context.set_bool_option(self, opt, val)¶
Set a boolean option of the context; see gccjit.BoolOption for notes on the options and their meanings.
Parameters: - opt (gccjit.BoolOption) – Which option to set
- val (bool) – The new value
- class gccjit.BoolOption¶
- DEBUGINFO¶
If true, gccjit.Context.compile() will attempt to do the right thing so that if you attach a debugger to the process, it will be able to inspect variables and step through your code.
Note that you can’t step through code unless you set up source location information for the code (by creating and passing in gccjit.Location instances).
- DUMP_INITIAL_TREE¶
If true, gccjit.Context.compile() will dump its initial “tree” representation of your code to stderr (before any optimizations).
Here’s some sample output (from the square example):
<statement_list 0x7f4875a62cc0
   type <void_type 0x7f4875a64bd0 VOID
       align 8 symtab 0 alias set -1 canonical type 0x7f4875a64bd0
       pointer_to_this <pointer_type 0x7f4875a64c78>>
   side-effects head 0x7f4875a761e0 tail 0x7f4875a761f8 stmts 0x7f4875a62d20 0x7f4875a62d00
   stmt <label_expr 0x7f4875a62d20 type <void_type 0x7f4875a64bd0>
       side-effects
       arg 0 <label_decl 0x7f4875a79080 entry type <void_type 0x7f4875a64bd0>
           VOID file (null) line 0 col 0
           align 1 context <function_decl 0x7f4875a77500 square>>>
   stmt <return_expr 0x7f4875a62d00
       type <integer_type 0x7f4875a645e8 public SI
           size <integer_cst 0x7f4875a623a0 constant 32>
           unit size <integer_cst 0x7f4875a623c0 constant 4>
           align 32 symtab 0 alias set -1 canonical type 0x7f4875a645e8 precision 32 min <integer_cst 0x7f4875a62340 -2147483648> max <integer_cst 0x7f4875a62360 2147483647>
           pointer_to_this <pointer_type 0x7f4875a6b348>>
       side-effects
       arg 0 <modify_expr 0x7f4875a72a78 type <integer_type 0x7f4875a645e8>
           side-effects arg 0 <result_decl 0x7f4875a7a000 D.54>
           arg 1 <mult_expr 0x7f4875a72a50 type <integer_type 0x7f4875a645e8>
               arg 0 <parm_decl 0x7f4875a79000 i> arg 1 <parm_decl 0x7f4875a79000 i>>>>>
- DUMP_INITIAL_GIMPLE¶
If true, gccjit.Context.compile() will dump the “gimple” representation of your code to stderr, before any optimizations are performed. The dump resembles C code:
square (signed int i)
{
  signed int D.56;

  entry:
  D.56 = i * i;
  return D.56;
}
- DUMP_GENERATED_CODE¶
If true, gccjit.Context.compile() will dump the final generated code to stderr, in the form of assembly language:
    .file "fake.c"
    .text
    .globl square
    .type square, @function
square:
.LFB0:
    .cfi_startproc
    pushq %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq %rsp, %rbp
    .cfi_def_cfa_register 6
    movl %edi, -4(%rbp)
.L2:
    movl -4(%rbp), %eax
    imull -4(%rbp), %eax
    popq %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE0:
    .size square, .-square
    .ident "GCC: (GNU) 4.9.0 20131023 (Red Hat 0.1-%{gcc_release})"
    .section .note.GNU-stack,"",@progbits
- DUMP_SUMMARY¶
If true, gccjit.Context.compile() will print information to stderr on the actions it is performing, followed by a profile showing the time taken and memory usage of each phase.
- DUMP_EVERYTHING¶
If true, gccjit.Context.compile() will dump copious amounts of information on what it’s doing to various files within a temporary directory. Use gccjit.BoolOption.KEEP_INTERMEDIATES (see below) to see the results. The files are intended to be human-readable, but the exact files and their formats are subject to change.
- SELFCHECK_GC¶
If true, libgccjit will aggressively run its garbage collector, to shake out bugs (greatly slowing down the compile). This is likely to only be of interest to developers of the library. It is used when running the selftest suite.
- KEEP_INTERMEDIATES¶
If true, the gccjit.Context will not clean up intermediate files written to the filesystem, and will display their location on stderr.
Integer options¶
- gccjit.Context.set_int_option(self, opt, val)¶
Set an integer option of the context; see gccjit.IntOption for notes on the options and their meanings.
Parameters: - opt (gccjit.IntOption) – Which option to set
- val (int) – The new value
Types¶
Types can be created in several ways:
fundamental types can be accessed using gccjit.Context.get_type():
int_type = ctxt.get_type(gccjit.TypeKind.INT)
See gccjit.TypeKind for the available types.
You can get int types of specific sizes (in bytes) using gccjit.Context.get_int_type():
int_type = ctxt.get_int_type(4, is_signed=True)
derived types can be accessed by calling methods on an existing type:
const_int_star = int_type.get_const().get_pointer()
int_const_star = int_type.get_pointer().get_const()
by creating structures (see below).
- class gccjit.Type¶
- get_pointer()¶
Given type T get type T*.
Return type: gccjit.Type
- get_const()¶
Given type T get type const T.
Return type: gccjit.Type
- get_volatile()¶
Given type T get type volatile T.
Return type: gccjit.Type
Standard types¶
- gccjit.Context.get_type(self, type_enum)¶
Look up one of the standard types (see gccjit.TypeKind):
int_type = ctxt.get_type(gccjit.TypeKind.INT)
Parameters: type_enum (gccjit.TypeKind) – Which type to lookup
- class gccjit.TypeKind¶
- VOID¶
C’s “void” type.
- VOID_PTR¶
C’s “void *”.
- BOOL¶
C++’s bool type; also C99’s “_Bool” type, aka “bool” if using stdbool.h.
- CHAR¶
- SIGNED_CHAR¶
- UNSIGNED_CHAR¶
C’s “char” (of some signedness) and the variants where the signedness is specified.
- SHORT¶
- UNSIGNED_SHORT¶
C’s “short” (signed) and “unsigned short”.
- INT¶
- UNSIGNED_INT¶
C’s “int” (signed) and “unsigned int”:
int_type = ctxt.get_type(gccjit.TypeKind.INT)
- LONG¶
- UNSIGNED_LONG¶
C’s “long” (signed) and “unsigned long”.
- LONG_LONG¶
- UNSIGNED_LONG_LONG¶
C99’s “long long” (signed) and “unsigned long long”.
- FLOAT¶
- DOUBLE¶
- LONG_DOUBLE¶
Floating-point types
- CONST_CHAR_PTR¶
C type: (const char *):
const_char_p = ctxt.get_type(gccjit.TypeKind.CONST_CHAR_PTR)
- SIZE_T¶
The C “size_t” type.
- FILE_PTR¶
C type: (FILE *)
- gccjit.Context.get_int_type(self, num_bytes, is_signed)¶
Look up an integer type of the given size (in bytes):
int_type = ctxt.get_int_type(4, is_signed=True)
Structures¶
You can model C struct types by creating gccjit.Struct and gccjit.Field instances, in either order:
by creating the fields, then the structure. For example, to model:
struct coord {double x; double y; };
you could call:
field_x = ctxt.new_field(double_type, b'x')
field_y = ctxt.new_field(double_type, b'y')
coord = ctxt.new_struct(b'coord', [field_x, field_y])
(see gccjit.Context.new_field() and gccjit.Context.new_struct()), or
by creating the structure, then populating it with fields, typically to allow modelling self-referential structs such as:
struct node { int m_hash; struct node *m_next; };
like this:
node = ctxt.new_struct(b'node')
node_ptr = node.get_pointer()
field_hash = ctxt.new_field(int_type, b'm_hash')
field_next = ctxt.new_field(node_ptr, b'm_next')
node.set_fields([field_hash, field_next])
- class gccjit.Field¶
Expressions¶
- class gccjit.RValue¶
- dereference_field(Field field, Location loc=None)¶
- dereference(loc=None)¶
- get_type()¶
- class gccjit.LValue¶
- get_address(loc=None)¶
Get the address of this lvalue, as a gccjit.RValue of type T*.
Unary Operations¶
Unary operations are gccjit.RValue instances built using gccjit.Context.new_unary_op() with an operation from one of the following:
Unary Operation | C equivalent |
---|---|
gccjit.UnaryOp.MINUS | -(EXPR) |
gccjit.UnaryOp.BITWISE_NEGATE | ~(EXPR) |
gccjit.UnaryOp.LOGICAL_NEGATE | !(EXPR) |
Binary Operations¶
Binary operations are gccjit.RValue instances built using gccjit.Context.new_binary_op() with an operation from one of the following:
Binary Operation | C equivalent |
---|---|
gccjit.BinaryOp.PLUS | x + y |
gccjit.BinaryOp.MINUS | x - y |
gccjit.BinaryOp.MULT | x * y |
gccjit.BinaryOp.DIVIDE | x / y |
gccjit.BinaryOp.MODULO | x % y |
gccjit.BinaryOp.BITWISE_AND | x & y |
gccjit.BinaryOp.BITWISE_XOR | x ^ y |
gccjit.BinaryOp.BITWISE_OR | x \| y |
gccjit.BinaryOp.LOGICAL_AND | x && y |
gccjit.BinaryOp.LOGICAL_OR | x \|\| y |
- class gccjit.BinaryOp¶
- PLUS¶
Addition of arithmetic values; analogous to:
(EXPR_A) + (EXPR_B)
in C.
For pointer addition, use gccjit.Context.new_array_access().
- MINUS¶
Subtraction of arithmetic values; analogous to:
(EXPR_A) - (EXPR_B)
in C.
- MULT¶
Multiplication of a pair of arithmetic values; analogous to:
(EXPR_A) * (EXPR_B)
in C.
- DIVIDE¶
Quotient of division of arithmetic values; analogous to:
(EXPR_A) / (EXPR_B)
in C.
The result type affects the kind of division: if the result type is integer-based, then the result is truncated towards zero, whereas a floating-point result type indicates floating-point division.
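To make the distinction concrete, here is a small plain-Python sketch of the two behaviours. It does not call libgccjit; the helper name c_style_int_divide is hypothetical, and it ignores the overflow behaviour of fixed-width C integers. Note that Python's own // operator rounds towards negative infinity, so it is not a drop-in model of integer DIVIDE.

```python
def c_style_int_divide(a, b):
    """Integer quotient truncated towards zero, as C's `/` does for ints."""
    quotient = abs(a) // abs(b)
    # The result is negative only when the operands have different signs.
    return quotient if (a < 0) == (b < 0) else -quotient


print(c_style_int_divide(-7, 2))  # -3   (integer result type: truncation)
print(-7 // 2)                    # -4   (Python floor division, for contrast)
print(-7 / 2)                     # -3.5 (floating-point result type)
```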
- MODULO¶
Remainder of division of arithmetic values; analogous to:
(EXPR_A) % (EXPR_B)
in C.
- BITWISE_AND¶
Bitwise AND; analogous to:
(EXPR_A) & (EXPR_B)
in C.
- BITWISE_XOR¶
Bitwise exclusive OR; analogous to:
(EXPR_A) ^ (EXPR_B)
in C.
- BITWISE_OR¶
Bitwise inclusive OR; analogous to:
(EXPR_A) | (EXPR_B)
in C.
- LOGICAL_AND¶
Logical AND; analogous to:
(EXPR_A) && (EXPR_B)
in C.
- LOGICAL_OR¶
Logical OR; analogous to:
(EXPR_A) || (EXPR_B)
in C.
Comparisons¶
Comparisons are gccjit.RValue instances of boolean type built using gccjit.Context.new_comparison() with an operation from one of the following:
Comparison | C equivalent |
---|---|
gccjit.Comparison.EQ | x == y |
gccjit.Comparison.NE | x != y |
gccjit.Comparison.LT | x < y |
gccjit.Comparison.LE | x <= y |
gccjit.Comparison.GT | x > y |
gccjit.Comparison.GE | x >= y |
Functions¶
- class gccjit.Param¶
- class gccjit.Function¶
- new_local(type_, name, loc=None)¶
Add a new local variable to the function:
i = fn.new_local(int_type, b'i')
Return type: gccjit.LValue
- new_block(name)¶
Create a gccjit.Block.
The name can be None, or you can give it a meaningful name, which may show up in dumps of the internal representation, and in error messages:
entry = fn.new_block('entry')
on_true = fn.new_block('on_true')
- get_param(index)¶
- dump_to_dot(path)¶
Write a dump in GraphViz format to the given path.
- class gccjit.Block¶
A gccjit.Block is a basic block within a function, i.e. a sequence of statements with a single entry point and a single exit point.
The first basic block that you create within a function will be the entrypoint.
Each basic block that you create within a function must be terminated, either with a conditional, a jump, or a return.
It’s legal to have multiple basic blocks that return within one function.
- add_eval(rvalue, loc=None)¶
Add evaluation of an rvalue, discarding the result (e.g. a function call that “returns” void), for example:
call = ctxt.new_call(some_fn, args)
block.add_eval(call)
This is equivalent to this C code:
(void)expression;
- add_assignment(lvalue, rvalue, loc=None)¶
Add evaluation of an rvalue, assigning the result to the given lvalue, for example:
# i = 0
entry_block.add_assignment(local_i, ctxt.zero(the_type))
This is roughly equivalent to this C code:
lvalue = rvalue;
- add_assignment_op(lvalue, op, rvalue, loc=None)¶
Add evaluation of an rvalue, using the result to modify an lvalue via the given gccjit.BinaryOp. For example:
# i++
loop_block.add_assignment_op(local_i, gccjit.BinaryOp.PLUS, ctxt.one(the_type))
This is analogous to “+=” and friends:
lvalue += rvalue;
lvalue *= rvalue;
lvalue /= rvalue;
/* etc */
- add_comment(text, Location loc=None)¶
Add a no-op textual comment to the internal representation of the code. It will be optimized away, but will be visible in the dumps seen via gccjit.BoolOption.DUMP_INITIAL_TREE and gccjit.BoolOption.DUMP_INITIAL_GIMPLE and thus may be of use when debugging how your project’s internal representation gets converted to the libgccjit IR.
- end_with_conditional(boolval, on_true, on_false=None, loc=None)¶
Terminate a block by adding evaluation of an rvalue, branching on the result to the appropriate successor block.
This is roughly equivalent to this C code:
if (boolval)
  goto on_true;
else
  goto on_false;
Example:
# while (i < n)
cond_block.end_with_conditional(
    ctxt.new_comparison(gccjit.Comparison.LT, local_i, param_n),
    loop_block,
    after_loop_block)
- end_with_jump(target, loc=None)¶
Terminate a block by adding a jump to the given target block.
This is roughly equivalent to this C code:
goto target;
Example:
loop_block.end_with_jump(cond_block)
- end_with_return(RValue rvalue, loc=None)¶
Terminate a block by adding evaluation of an rvalue, returning the value.
This is roughly equivalent to this C code:
return expression;
Example:
# return sum
after_loop_block.end_with_return(local_sum)
- end_with_void_return(loc=None)¶
Terminate a block by adding a valueless return, for use within a function with “void” return type.
This is equivalent to this C code:
return;
- get_function()¶
Get the gccjit.Function that this block is within.
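The statement and terminator methods above can be combined into a complete loop. The following sketch assumes the bindings are installed and uses only calls that appear in this document. It builds roughly the equivalent of a C function int loop_test(int n) that sums i * i for i from 0 up to n - 1. The function name build_loop_test and the block and variable names are illustrative choices, not part of the API.

```python
def build_loop_test(ctxt):
    """Populate ctxt with int loop_test(int n), summing i*i for i in [0, n)."""
    # Imported here so that this sketch can be defined without the bindings
    # installed; calling it does require gccjit.
    import gccjit

    int_type = ctxt.get_type(gccjit.TypeKind.INT)
    param_n = ctxt.new_param(int_type, b"n")
    fn = ctxt.new_function(gccjit.FunctionKind.EXPORTED,
                           int_type,       # return type
                           b"loop_test",   # name
                           [param_n])      # params
    local_i = fn.new_local(int_type, b"i")
    local_sum = fn.new_local(int_type, b"sum")

    # The first block created is the function's entry point.
    entry_block = fn.new_block(b"entry")
    cond_block = fn.new_block(b"cond")
    loop_block = fn.new_block(b"loop")
    after_loop_block = fn.new_block(b"after_loop")

    # sum = 0; i = 0; goto cond
    entry_block.add_assignment(local_sum, ctxt.zero(int_type))
    entry_block.add_assignment(local_i, ctxt.zero(int_type))
    entry_block.end_with_jump(cond_block)

    # if (i < n) goto loop; else goto after_loop
    cond_block.end_with_conditional(
        ctxt.new_comparison(gccjit.Comparison.LT, local_i, param_n),
        loop_block,
        after_loop_block)

    # sum += i * i; i++; goto cond
    loop_block.add_assignment_op(
        local_sum,
        gccjit.BinaryOp.PLUS,
        ctxt.new_binary_op(gccjit.BinaryOp.MULT, int_type, local_i, local_i))
    loop_block.add_assignment_op(local_i, gccjit.BinaryOp.PLUS,
                                 ctxt.one(int_type))
    loop_block.end_with_jump(cond_block)

    # return sum
    after_loop_block.end_with_return(local_sum)
    return fn
```

Every block here ends with exactly one terminator (a jump, a conditional, or a return), which is what the rules for gccjit.Block require. Calling build_loop_test(gccjit.Context()) and then compiling the context should make loop_test available for lookup by name.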
Source Locations¶
- class gccjit.Location¶
A gccjit.Location encapsulates a source code location, so that you can (optionally) associate locations in your language with statements in the JIT-compiled code, allowing the debugger to single-step through your language.
You can construct them using gccjit.Context.new_location().
You need to enable gccjit.BoolOption.DEBUGINFO on the gccjit.Context for these locations to actually be usable by the debugger:
ctxt.set_bool_option(gccjit.BoolOption.DEBUGINFO, True)
gccjit.Location instances are optional; most API entrypoints accepting one default to None.
Faking it¶
If you don’t have source code for your internal representation, but need to debug, you can generate a C-like representation of the functions in your context using gccjit.Context.dump_to_file():
ctxt.dump_to_file(b'/tmp/something.c', True)
This will dump C-like code to the given path. If the update_locations argument is True, this will also set up gccjit.Location information throughout the context, pointing at the dump file as if it were a source file, giving you something you can step through in the debugger.
Compiling a context¶
Once populated, a gccjit.Context can be compiled to machine code, either in-memory via gccjit.Context.compile() or to disk via gccjit.Context.compile_to_file().
You can compile a context multiple times (using either form of compilation), although any errors that occur on the context will prevent any future compilation of that context.
In-memory compilation¶
- gccjit.Context.compile(self)¶
This calls into GCC and builds the code, returning a gccjit.Result.
Return type: gccjit.Result
- class gccjit.Result¶
A gccjit.Result encapsulates the result of compiling a gccjit.Context in-memory, and the lifetimes of any machine code functions or globals that are within the result.
- get_code(funcname)¶
Locate the given function within the built machine code.
Functions are looked up by name. For this to succeed, a function with a name matching funcname must have been created on the result’s context (or a parent context) via a call to gccjit.Context.new_function() with kind gccjit.FunctionKind.EXPORTED.
The returned value is an int, actually a pointer to the machine code within the address space of the process. This will need to be wrapped up with ctypes to be callable:
import ctypes

# "[int] -> int" functype:
int_int_func_type = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int)
code = int_int_func_type(jit_result.get_code(b"square"))
assert code(5) == 25
The code has the same lifetime as the gccjit.Result instance; the pointer becomes invalid when the result instance is cleaned up.
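The wrapping step and the lifetime rule can be tried out without libgccjit. In the sketch below, a ctypes callback stands in for the compiled code, and its integer address stands in for the value returned by get_code(). The names py_square, owner and address are illustrative assumptions; nothing here calls the bindings.

```python
import ctypes

# "[int] -> int" function type, as in the get_code() example.
int_int_func_type = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int)


def py_square(i):
    return i * i


# Stand-in for a gccjit.Result: this object owns the callable machine code.
owner = int_int_func_type(py_square)

# Stand-in for get_code(): a plain integer address within the process.
address = ctypes.cast(owner, ctypes.c_void_p).value

# Wrap the raw address in the same way as the value returned by get_code().
code = int_int_func_type(address)
result = code(5)
print(result)

# `code` holds only the address and does not keep `owner` alive. Keep a
# reference to the owning object for as long as `code` may be called.
```

The same discipline applies to a real gccjit.Result: store it alongside any ctypes wrappers created from it, so that the wrappers never outlive the machine code they point to.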
Ahead-of-time compilation¶
Although libgccjit is primarily aimed at just-in-time compilation, it can also be used for implementing more traditional ahead-of-time compilers, via the gccjit.Context.compile_to_file() API entrypoint.
- gccjit.Context.compile_to_file(self, kind, path)¶
Compile the context to a file of the given kind:
ctxt.compile_to_file(gccjit.OutputKind.EXECUTABLE, 'a.out')
gccjit.Context.compile_to_file() ignores the suffix of path, and instead uses kind to decide what to do.
Note
This is different from the gcc program, which does make use of the suffix of the output file when determining what to do.
The available kinds of output are:
Output kind | Typical suffix |
---|---|
gccjit.OutputKind.ASSEMBLER | .s |
gccjit.OutputKind.OBJECT_FILE | .o |
gccjit.OutputKind.DYNAMIC_LIBRARY | .so or .dll |
gccjit.OutputKind.EXECUTABLE | None, or .exe |
- class gccjit.OutputKind¶
- ASSEMBLER¶
Compile the context to an assembler file.
- OBJECT_FILE¶
Compile the context to an object file.
- DYNAMIC_LIBRARY¶
Compile the context to a dynamic library.
There is currently no support for specifying other libraries to link against.
- EXECUTABLE¶
Compile the context to an executable.
There is currently no support for specifying libraries to link against.