In the first two articles of this series, I introduced the JitBuilder library and described how to use it to compile native x86-64 code on the Linux platform using a Docker image I built with the Eclipse OMR project. In this article, I show you how to build your own JitBuilder library from the source code in the Eclipse OMR project, and I will switch platforms to OS X. Along the way, I’ll give you a tour of all the various parts of the JitBuilder library and where they reside in the Eclipse OMR project.
Prerequisites
Before you read this post, you should read the first post to understand the basics of how the library works and to download the JitBuilder Docker image and get it up and running. You should also read the second post to learn how to build more complicated methods using the JitBuilder library.
Get the code
Let’s get started! The first step is to clone the Eclipse OMR repository and get the code onto your system:
[code language="C"]
$ git clone https://github.com/eclipse/omr
Cloning into 'omr'...
remote: Counting objects: 10694, done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 10694 (delta 0), reused 0 (delta 0), pack-reused 10692
Receiving objects: 100% (10694/10694), 11.21 MiB | 1.54 MiB/s, done.
Resolving deltas: 100% (7054/7054), done.
[/code]
OMR configuration and build
At this point, you’ve got an omr directory containing the complete Eclipse OMR source code repository. Let’s take a quick look at what’s in there:
$ cd omr && ls
CONTRIBUTING.md   config.guess      gc            omr_glue_static_lib  third_party
GNUmakefile       config.sub        glue          omr_static_lib       thread
INSTALL           configure         include_core  omrmakefiles         tools
LICENSE           configure.ac      install-sh    omrsigcompat         util
README.md         ddr_artifacts.mk  jitbuilder    omrtrace
artwork           doc               lib           perftest
build             example           nls           port
compiler          fvtest            omr           run_configure.mk
That’s a lot of stuff! The Eclipse OMR project currently weighs in at a little over 800,000 lines of code, but you don’t need to pay attention to most of it. I’ll just touch upon a few of the highlights.
There are a number of files and directories that play a part in the OMR project’s configuration (i.e. detecting the platform and system capabilities) and build system. OMR currently uses the autotools configure script to generate static makefiles for the build. Files such as configure, configure.ac, config.guess, config.sub, and run_configure.mk are all part of the OMR configuration and build story. If you’re interested, there are more details in the omr/README.md file, but for this article, and for most users of JitBuilder, you won’t need to delve into them. If you want to jump directly to configuring the OMR project, skip ahead a few paragraphs to “Configuring the OMR project”.
The top-level omr directory contains the usual README.md, LICENSE, and CONTRIBUTING.md files to give you some background and practical information about the project if you want to get involved (and you’re very welcome to!). There are lots of different components in the top-level directory, most of them obviously named (gc, for example, is the garbage collection framework), but for this article I’ll focus on only two directories: compiler and jitbuilder.
The compiler directory contains the OMR compiler component code that will be built into the JitBuilder library. It represents a little more than half of the compiler that powers the IBM J9 JIT compiler for Java. It includes:
- Its own intermediate representation (IR), a topic I touched on in the last article
- Native instruction code generators for three major hardware platforms (X86, POWER, and IBM Z)
- An optimization framework with approximately 70 different optimizations and analyses that should be readily familiar to compiler developers
- Runtime support for executing code and for managing the process of dynamically compiling code
The jitbuilder directory contains an extension of the OMR compiler that augments a few of the OMR compiler classes and implements a simple default lifecycle infrastructure around the use of the OMR compiler, as described in earlier articles in this series. Its internal directory structure purposefully mimics that of the compiler directory so that it can extend the OMR compiler component. That’s all the detail I’ll go into in this post. Let’s get that library built!
Configuring the OMR project
To build and then use the JitBuilder library, you first need to run the OMR top-level configuration step. Here’s the command you need to run:
[code language="C"]
$ make -f run_configure.mk SPEC=osx OMRGLUE=./example/glue
[ lots of output omitted ]
[/code]
run_configure.mk is a makefile that handles most of the complexity of system configuration for you. It only requires two pieces of input:
- SPEC= what platform to configure for, and
- OMRGLUE= where to find the glue code that should be compiled with OMR.
What is glue code? The OMR project contains what are essentially incomplete technology components. To complete each OMR component you decide to use, you need to write glue code to connect the component into your language runtime and to configure it to respect your language semantics and runtime mechanisms.
In fact, JitBuilder doesn’t really need the OMRGLUE directory (because jitbuilder is like a glue directory itself), but run_configure.mk expects it, so you cannot leave it out. The OMR project comes with an example glue implementation in the ./example/glue directory, so the easiest thing to do is simply point OMRGLUE at that directory, as we did in the above command.
SPEC is the platform you want to configure for. Since I’m writing this article for OS X, the option I wrote is SPEC=osx. If you want to run on a 64-bit x86 Linux variant, you could instead set SPEC=linux_x86-64. With either SPEC setting, the rest of this article should work smoothly for you. Please let me know if it doesn’t!
Once you’ve configured OMR, you are ready to build the JitBuilder library.
Building the JitBuilder library
The code specific to the JitBuilder library can be found in the jitbuilder top-level directory. Let’s take a look at what’s in there:
$ cd jitbuilder && ls
Makefile  codegen  control  ilgen      p        runtime  z
build     compile  env      optimizer  release  x
Because the JitBuilder library is actually a compiler project itself that extends the OMR compiler, most of these directories simply mimic the directory layout of the OMR compiler. Because of the way the OMR compiler is constructed, you can create new compiler projects that extend, or subclass, the functionality in OMR to implement a compiler for a specific language (like Java) or for a particular use case (like creating the JitBuilder library). I won’t talk much more about that here. For those who are really interested, there is some documentation in the Eclipse OMR repository.
The main pieces you need to pay attention to in this article are in the release directory. But first, let’s build the JitBuilder library:
That will take a while since it builds all the main source files in the jitbuilder/ directories as well as in the OMR compiler directories, but once it’s done, you’ll have successfully built the JitBuilder library!
JitBuilder release directory
To verify that the library has been built, let’s descend into the release directory:
$ cd release; ls
Dockerfile-JitBuilder-x86_64  README.md         jitbuilder.tgz
LICENSE                       ReplayMethod.cpp  libjitbuilder.a
Makefile                      include           src
This directory contains a standalone JitBuilder library distribution, jitbuilder.tgz, which contains everything needed to work against a particular build of the JitBuilder library. Because we configured and built on OS X, you could move jitbuilder.tgz to any OS X system, expand it, and then use the library to dynamically compile native code using the JitBuilder API on that system. The directory also has a very basic README.md, a copy of the same LICENSE file as the Eclipse OMR project (a dual license: Eclipse Public License v1.0 and Apache License 2.0), a Makefile, a Dockerfile for building a Docker image for x86_64 (the same one that’s used in the first two articles of this series), libjitbuilder.a (the JitBuilder static library itself), an include directory defining the JitBuilder library API, and a src directory that contains a number of code samples showing how to use different parts of the JitBuilder API. There is also ReplayMethod.cpp, which is part of an experimental and incomplete feature of the JitBuilder library, so you can ignore that for now.
Quick tour of the JitBuilder API
As the earlier articles in this series introduced, JitBuilder is a library consisting of a number of useful classes whose API can primarily be found in the jitbuilder/release/include/compiler/ilgen directory. If you’re interested in poking around, the implementation of these classes can be found in the omr/compiler/ilgen directory. In fact, as part of building the jitbuilder/release/include directory, the header files are copied from there.
Here’s a high-level description of the various parts:
jitbuilder/release/include/Jit.hpp contains the high-level interface to the JitBuilder library, which currently describes only the three calls introduced in the first article in this series.
MethodBuilder is an object that represents the ABI (parameters, return value) and the operations that should be performed when a compiled method is called. To compile your own methods, you subclass MethodBuilder, implement its constructor to specify the ABI, and then specify the operations the compiled method should perform by implementing a MethodBuilder member function called buildIL() (usually using either IlBuilder or BytecodeBuilder, see below). After passing a MethodBuilder object to compileMethodBuilder(), you get back a pointer to a function that you can call by casting that pointer to a C function prototype. Each MethodBuilder has a hash map of names (C strings) representing the method’s local variables, and you can define them all at once or make them up as you go.
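To make the last step of that workflow concrete, here is a minimal standalone sketch of calling through an untyped entry point by casting it to a C function prototype. It does not use JitBuilder at all: `increment`, `fakeCompile`, and `callCompiled` are hypothetical names, the `void *` entry type is an assumption, and an ordinary C++ function stands in for the code that compileMethodBuilder() would produce.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for JIT-compiled code: an ordinary function whose
// address is treated as an opaque entry point.
extern "C" std::int32_t increment(std::int32_t value) { return value + 1; }

// The C function prototype that matches the "compiled" method's ABI:
// one 32-bit integer parameter, one 32-bit integer return value.
typedef std::int32_t (IncrementFunction)(std::int32_t);

// Hypothetical helper playing the role of the compile step. It hands back an
// untyped entry point (assumed here to be a void *).
void *fakeCompile() { return reinterpret_cast<void *>(&increment); }

// Cast the untyped entry point back to the prototype and call it.
std::int32_t callCompiled(void *entry, std::int32_t argument) {
    assert(entry != nullptr);
    IncrementFunction *fn = reinterpret_cast<IncrementFunction *>(entry);
    return fn(argument);
}
```

The cast is only safe when the prototype matches the parameters and return type the method was built with, which is why the ABI is declared up front.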
TypeDictionary is an object that you can use to describe the shape of structures, unions, and other types so that JitBuilder knows how to access their fields and values. TypeDictionary provides some useful primitives for constructing pointers and for converting C types (like unsigned long) into JitBuilder types (like Int64) according to how your C/C++ compiler represents those types. In other words, if you build the JitBuilder library with the same C++ compiler used to compile your program, it is very easy to make sure JitBuilder uses type definitions that are consistent with the rest of your code base.
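The idea of mapping a C type to a sized type "according to how your compiler represents it" can be sketched in plain C++. This is a standalone illustration, not the TypeDictionary API: `sizedTypeName` is a hypothetical helper, and the strings it returns are just labels chosen to resemble sized integer type names.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <type_traits>

// Hypothetical helper: pick a sized integer label for a C/C++ integral type
// based on the width the current compiler gives that type.
template <typename T>
std::string sizedTypeName() {
    static_assert(std::is_integral<T>::value,
                  "this sketch only handles integral types");
    switch (sizeof(T)) {
        case 1: return "Int8";
        case 2: return "Int16";
        case 4: return "Int32";
        case 8: return "Int64";
        default:
            // Widths outside 1, 2, 4, and 8 bytes are not modeled here.
            assert(false && "no sized integer label for this width");
            return "unsupported";
    }
}
```

Because the answer comes from `sizeof`, `sizedTypeName<unsigned long>()` yields "Int64" on a typical 64-bit Linux or OS X compiler but could differ elsewhere, which is exactly why building the library and your program with the same compiler keeps the two views of a type consistent.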
IlBuilder is an object that represents the operations to perform when a particular control flow path is reached. The method entry point is one particular control flow path, and so a MethodBuilder object is also an IlBuilder object. But you can create arbitrary IlBuilder objects and connect them together (even nest them) in arbitrary ways using an ever-growing list of services like WhileDoLoop. Each service you call on an IlBuilder appends operations to the control flow path and so describes the intended order of operations to the compiler. There is no restriction on how you add operations to different IlBuilder objects. You can create all the IlBuilder objects up front and then add operations to them individually, or you can start from the MethodBuilder object (function entry) and create IlBuilder objects as you need them to represent the different paths of execution.
IlInjector is the superclass of IlBuilder, providing an even more precise and flexible interface to create OMR compiler IL, but it also requires intimate knowledge of the OMR compiler IL. IlInjector is considered a highly advanced topic for JitBuilder-based JITs and, frankly, this API has not been fleshed out as much as IlBuilder. The code samples provided with JitBuilder (see below) all use IlBuilder and do not reference IlInjector.
IlValue is an object that represents the values that are created and consumed by expressions. If you load a local variable by name, the value that’s loaded is an IlValue. If you create a constant integer, that’s an IlValue. If you pass both of those IlValues into the Add operation, you’ll get back another IlValue that represents their sum. There is an ever-growing list of operations you can apply to create arbitrarily complex IlValue expressions.
BytecodeBuilder is a special kind of IlBuilder that’s designed to simplify writing JIT compilers for bytecode-based languages. You allocate a BytecodeBuilder object for each bytecode in a method and translate the operations needed for that bytecode using that object. You have to specify every flow edge (even fall-through edges) between bytecode builders. BytecodeBuilder also taps into a handy and mostly automatic built-in worklist algorithm provided by MethodBuilder that can traverse all flow edges in a control flow graph, visiting each bytecode at most once. Any bytecodes in the method that aren’t reachable (many parsers generate unreachable code to simplify the process of initial code generation from a parse tree) will not be visited. For a little more background about BytecodeBuilders, you can take a look at the second half of the video from my dwOpen Tech Talk presentation from July 2016.
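The worklist idea described above can be sketched independently of JitBuilder. The following is a generic worklist traversal over a flow graph of bytecode indices, not the MethodBuilder implementation: `visitReachable` and the `successors` representation are assumptions made for illustration, and the visit order of the real algorithm may well differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// successors[i] lists the bytecode indices that bytecode i can flow to,
// including fall-through edges. Returns the bytecodes in the order visited.
std::vector<int> visitReachable(const std::vector<std::vector<int>> &successors,
                                int entry) {
    assert(entry >= 0 && static_cast<std::size_t>(entry) < successors.size());

    std::vector<bool> queued(successors.size(), false);  // ever put on the worklist?
    std::vector<int> worklist;
    std::vector<int> visitOrder;

    worklist.push_back(entry);
    queued[entry] = true;

    while (!worklist.empty()) {
        int current = worklist.back();
        worklist.pop_back();
        visitOrder.push_back(current);  // "translate" this bytecode exactly once

        for (int next : successors[current]) {
            if (!queued[next]) {        // each bytecode is queued at most once
                queued[next] = true;
                worklist.push_back(next);
            }
        }
    }
    return visitOrder;  // unreachable bytecodes never appear here
}
```

For example, with the edges `{{1, 2}, {3}, {3}, {}, {3}}` and entry 0, bytecode 4 has an outgoing edge but no incoming one, so it is never visited, while bytecode 3 is visited once even though two paths lead to it.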
ThunkBuilder is a subclass of MethodBuilder that creates functions that can call any native function with a particular signature by passing an array of appropriately typed arguments (kind of like libffi). You can compile a thunk builder object and then pass to it the address of a C function along with the array of arguments, and it will call that function directly for you, passing those arguments.
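To show what such a thunk does, here is a hand-written equivalent for one fixed signature. This is only a sketch of the concept and does not use ThunkBuilder: `Argument`, `callThroughThunk`, and `subtract` are hypothetical names, and using one 64-bit slot per argument is an assumption of this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical uniform argument slot: every argument travels in 64 bits.
typedef std::int64_t Argument;

// The one signature this thunk knows how to call.
typedef std::int32_t (BinaryInt32Function)(std::int32_t, std::int32_t);

// Hand-written thunk: takes the address of any function with the signature
// above plus an array of arguments, unpacks the array, and makes the call.
std::int32_t callThroughThunk(void *target, const Argument *args) {
    assert(target != nullptr && args != nullptr);
    BinaryInt32Function *fn = reinterpret_cast<BinaryInt32Function *>(target);
    return fn(static_cast<std::int32_t>(args[0]),
              static_cast<std::int32_t>(args[1]));
}

// An ordinary native function to call through the thunk.
std::int32_t subtract(std::int32_t a, std::int32_t b) { return a - b; }
```

A generated thunk saves you from writing one of these by hand for every signature an interpreter needs to call.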
VirtualMachineState is an interface (abstract class) used to describe parts of a Virtual Machine (VM) architecture to JitBuilder so that it can interact with that state. A simple example would be a register holding some virtual state value like a program counter or stack pointer. A more complicated example would be the VM’s operand stack, which is manipulated by the VM while simulating the execution of bytecodes. Naive JIT compilers replicate the steps to push and pop operands onto the virtual operand stack, even though the compiled code does not really need to use an operand stack and would benefit from using registers or local stack frame variables. The VirtualMachineOperandStack class allows a JitBuilder-based JIT to push and pop IlValues just like the virtual machine pushes and pops actual values while executing bytecodes, but the code generated by JitBuilder will not actually use an operand stack. There are implementations of VirtualMachineState for several kinds of state variables, including an operand stack (VirtualMachineOperandStack). This family of classes is typically used by JIT compiler implementations.
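The operand stack trick can be illustrated with a small standalone sketch. It is not the VirtualMachineOperandStack API: `SimulatedOperandStack`, `Bytecode`, and `translate` are hypothetical, and symbolic expression strings stand in for IlValues. The point is that pushes and pops happen while translating, so the resulting "code" contains no stack operations at all.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical translation-time operand stack. It holds symbolic expressions
// (stand-ins for IlValues), not runtime values.
class SimulatedOperandStack {
public:
    void push(const std::string &expr) { _exprs.push_back(expr); }
    std::string pop() {
        assert(!_exprs.empty());
        std::string top = _exprs.back();
        _exprs.pop_back();
        return top;
    }
    std::size_t depth() const { return _exprs.size(); }
private:
    std::vector<std::string> _exprs;
};

enum Opcode { PUSH_LOCAL, ADD, MUL };
struct Bytecode { Opcode op; std::string local; };

// Translate a stack-based bytecode sequence into one expression. The stack
// exists only during translation.
std::string translate(const std::vector<Bytecode> &program) {
    SimulatedOperandStack stack;
    for (const Bytecode &bc : program) {
        switch (bc.op) {
            case PUSH_LOCAL:
                stack.push(bc.local);
                break;
            case ADD:
            case MUL: {
                std::string right = stack.pop();
                std::string left = stack.pop();
                const char *symbol = (bc.op == ADD) ? " + " : " * ";
                stack.push("(" + left + symbol + right + ")");
                break;
            }
        }
    }
    return stack.pop();
}
```

Translating `push a; push b; add; push c; mul` this way produces the single expression `((a + b) * c)`, which a compiler is then free to keep in registers.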
JitBuilder code samples
The JitBuilder library comes with, at the time of this writing, 22 code samples to help you understand how various aspects of the API work or can be used. You can build all of the JitBuilder code samples by simply running make in the jitbuilder/release directory:
[code language="C"]
$ make
g++ -o AtomicOperations.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/AtomicOperations.cpp
g++ -g -fno-rtti -o atomicoperations AtomicOperations.o -L. -ljitbuilder -ldl
g++ -o Call.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/Call.cpp
g++ -g -fno-rtti -o call Call.o -L. -ljitbuilder -ldl
g++ -o ConstString.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/ConstString.cpp
g++ -g -fno-rtti -o conststring ConstString.o -L. -ljitbuilder -ldl
g++ -o ControlFlowTests.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/ControlFlowTests.cpp
g++ -o FieldAddress.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/FieldAddress.cpp
g++ -g -fno-rtti -o fieldaddress FieldAddress.o -L. -ljitbuilder -ldl
g++ -o IsSupportedType.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/IsSupportedType.cpp
g++ -g -fno-rtti -o issupportedtype IsSupportedType.o -L. -ljitbuilder -ldl
g++ -o IterativeFib.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/IterativeFib.cpp
g++ -g -fno-rtti -o iterfib IterativeFib.o -L. -ljitbuilder -ldl
g++ -o LinkedList.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/LinkedList.cpp
g++ -g -fno-rtti -o linkedlist LinkedList.o -L. -ljitbuilder -ldl
g++ -o LocalArray.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/LocalArray.cpp
g++ -g -fno-rtti -o localarray LocalArray.o -L. -ljitbuilder -ldl
g++ -o NestedLoop.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/NestedLoop.cpp
g++ -g -fno-rtti -o nestedloop NestedLoop.o -L. -ljitbuilder -ldl
g++ -o OperandStackTests.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/OperandStackTests.cpp
g++ -g -fno-rtti -o operandstacktests OperandStackTests.o -L. -ljitbuilder -ldl
g++ -o Pointer.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/Pointer.cpp
g++ -g -fno-rtti -o pointer Pointer.o -L. -ljitbuilder -ldl
g++ -o Pow2.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/Pow2.cpp
g++ -g -fno-rtti -o pow2 Pow2.o -L. -ljitbuilder -ldl
g++ -o RecursiveFib.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/RecursiveFib.cpp
g++ -g -fno-rtti -o recfib RecursiveFib.o -L. -ljitbuilder -ldl
g++ -o Simple.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/Simple.cpp
g++ -g -fno-rtti -o simple Simple.o -L. -ljitbuilder -ldl
g++ -o StructArray.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/StructArray.cpp
g++ -g -fno-rtti -o structarray StructArray.o -L. -ljitbuilder -ldl
g++ -o Switch.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/Switch.cpp
g++ -g -fno-rtti -o switch Switch.o -L. -ljitbuilder -ldl
g++ -o Thunk.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/Thunk.cpp
g++ -g -fno-rtti -o thunks Thunk.o -L. -ljitbuilder -ldl
g++ -o ToIlType.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/ToIlType.cpp
g++ -g -fno-rtti -o toiltype ToIlType.o -L. -ljitbuilder -ldl
g++ -o TransactionalOperations.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/TransactionalOperations.cpp
g++ -g -fno-rtti -o transactionaloperations TransactionalOperations.o -L. -ljitbuilder -ldl
g++ -o Union.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/Union.cpp
g++ -g -fno-rtti -o union Union.o -L. -ljitbuilder -ldl
g++ -o Worklist.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/Worklist.cpp
g++ -g -fno-rtti -o worklist Worklist.o -L. -ljitbuilder -ldl
[/code]
You’ve now built the tests! Not all of these tests, however, currently run on all platforms.
Code samples that should work on all platforms
Running the 7 tests that should work everywhere is easy:
[code language="C"]
$ make test
[ output of 7 tests omitted ]
[/code]
The tests are:
- IsSupportedType shows how to tell if a particular C or C++ type can be expressed as a JitBuilder type
- IterativeFib is the code sample used in the second article in this series: it compiles a function that computes Fibonacci numbers using an iterative algorithm
- NestedLoop is one of the language shootout benchmarks, with a 6-deep loop nest containing a variable increment
- Pow2 is a trivial iterative approach to computing a power of 2
- Simple is the code sample from the first article in this series: a function that simply returns its argument + 1
- ToIlType shows how to use the toIlType template functions in TypeDictionary to map C/C++ types to TypeDictionary types
- Worklist gives a (convoluted) example of using the worklist facility in MethodBuilder to traverse a flow graph of bytecodes
Code samples that may not work on all platforms
Not all of the tests work on every platform because JitBuilder is currently missing automatic method trampolines for function calls on platforms with limited direct addressability (a point-in-time limitation). X86-64 and OS X, however, aren’t in this class of platforms, so you can actually run everything with testall:
[code language="C"]
$ make testall
[ output of 22 tests omitted ]
[/code]
In addition to the code samples that run on all platforms, these samples include:
- Call shows how to call an arbitrary C function from a compiled method
- ConstString shows how to use ConstString to manipulate C strings
- DotProduct shows how to use arrays by computing a simple vector dot product
- FieldAddress shows how to work with structs that are embedded in other structs
- LinkedList implements the lookup function for a linked list, showing how to work with structs and pointers
- LocalArray shows how to allocate arrays of data on the stack
- OperandStackTests demonstrates the use of VirtualMachineOperandStack to simulate an operand stack without actually performing pushes and pops, while still being able to recreate the full operand stack if you need to call back into an interpreter
- Pointer demonstrates the use of pointers to several kinds of primitive data, including pointer to pointer to double
- RecursiveFib computes Fibonacci numbers the traditional recursive way and shows that recursive calls are possible
- StructArray shows how to create arrays of structs
- Switch demonstrates how to specify a switch control flow structure
- Thunk shows how to use ThunkBuilder to call native functions
- Union shows examples of how to represent and reference C union types and their fields using TypeDictionary
Code samples for experimental APIs
On a relatively modern x86-64 system (Linux or OS X) you can probably also run the tests targeting the experimental portion of the JitBuilder API:
[code language="C"]
$ make testexperimental
[ output of 2 tests omitted ]
[/code]
There are only two tests in this category so far:
- AtomicOperations shows how to do atomic addition on 32-bit integers
- TransactionalOperations shows how the Transaction() API can be used to exploit hardware transactional memory (HTM)
This article showed you how to clone and configure the Eclipse OMR project, which contains enterprise-quality components for building language runtimes. JitBuilder is a project inside Eclipse OMR, so the next step was to build that library from scratch. I explained all the high-level parts of the current JitBuilder API, and then gave brief descriptions of the 22 different code samples to help you learn how some of the more detailed aspects of the JitBuilder API work.
In the next article in this series, I’m going to show how to use JitBuilder to compile a small language, leveraging an existing set of tutorials (the Kaleidoscope tutorials) written for the LLVM compiler project. Many people ask how the Eclipse OMR project compares to LLVM, so the next article will look at some of the similarities and differences between LLVM and Eclipse OMR’s JitBuilder in the context of something you can do with both projects.