In the first two articles of this series, I introduced the JitBuilder library and described how to use it to compile native x86-64 code on the Linux platform using a Docker image I built with the Eclipse OMR project. In this article, I show you how to build your own JitBuilder library from the source code in the Eclipse OMR project, and I will switch platforms to OS X. Along the way, I’ll give you a tour of all the various parts of the JitBuilder library and where they reside in the Eclipse OMR project.

Prerequisites

Before you read this post, you should read the first post to understand the basics of how the library works and to download the JitBuilder Docker image and get it up and running.

You should also read the second post, which shows how to build more complicated methods using the JitBuilder library.

Get the code

Let’s get started! The first step is to clone the Eclipse OMR repository and get the code onto your system:

$ git clone https://github.com/eclipse/omr
Cloning into 'omr'...
remote: Counting objects: 10694, done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 10694 (delta 0), reused 0 (delta 0), pack-reused 10692
Receiving objects: 100% (10694/10694), 11.21 MiB | 1.54 MiB/s, done.
Resolving deltas: 100% (7054/7054), done.

OMR configuration and build

At this point, you’ve got an omr directory containing the complete Eclipse OMR source code repository. Let’s take a quick look at what’s in there:


$ cd omr && ls
CONTRIBUTING.md   config.guess          gc                  omr_glue_static_lib        third_party
GNUmakefile       config.sub            glue                omr_static_lib             thread
INSTALL           configure             include_core        omrmakefiles               tools
LICENSE           configure.ac          install-sh          omrsigcompat               util
README.md         ddr_artifacts.mk      jitbuilder          omrtrace
artwork           doc                   lib                 perftest
build             example               nls                 port
compiler          fvtest                omr                 run_configure.mk

That’s a lot of stuff! The Eclipse OMR project currently weighs in at a little over 800,000 lines of code, but you don’t need to pay attention to most of it. I’ll just touch upon a few of the highlights.

A number of files and directories play a part in the OMR project’s configuration (i.e., detecting the platform and system capabilities) and build system. OMR currently uses autotools configure to generate static makefiles for the build. GNUmakefile, config.guess, config.sub, configure, configure.ac, omrmakefiles, and run_configure.mk are all part of the OMR configuration and build story. If you’re interested, there are more details in the omr/README.md file, but for this article, and for most users of JitBuilder, you won’t need to delve into them. If you want to jump directly to configuring the OMR project, skip ahead a few paragraphs to “Configuring the OMR project”.

The top-level omr directory contains the usual README.md, LICENSE, and CONTRIBUTING.md files to give you some background and practical information about the project if you want to get involved (and you’re very welcome to!). There are lots of different components in the top-level directory, most with obvious names (gc, for example, is the garbage collection framework), but for this article I’ll focus on only two directories: compiler and jitbuilder.

The compiler directory contains the OMR compiler component code that will be built into the JitBuilder library. It represents a little bit more than half of the compiler that powers the IBM J9 JIT compiler for Java. The compiler defines:

  • Its own intermediate representation (IR), a topic I touched on in the last article
  • Native instruction code generators for three major hardware platforms (X86, POWER, and IBM Z)
  • An optimization framework with approximately 70 different optimizations and analyses that should be readily familiar to compiler developers
  • Runtime support for executing code and for managing the process of dynamically compiling code

The jitbuilder directory contains an extension of the OMR compiler that augments a few of the OMR compiler classes and implements a simple default lifecycle infrastructure around the use of the OMR compiler, as described in earlier articles in this series. Its internal directory structure purposefully mimics that of the compiler directory so that it can extend the OMR compiler component. That’s all the detail I’ll go into about that in this post. Let’s get that library built!

Configuring the OMR project

To build and then use the JitBuilder library, you first need to run the OMR top-level configuration step. Here’s the command you need to run:

$ make -f run_configure.mk SPEC=osx OMRGLUE=./example/glue
[ lots of output omitted ]

run_configure.mk is a makefile that handles most of the complexity of system configuration for you. It only requires two pieces of input: SPEC= what platform to configure for, and OMRGLUE= where to find the glue code that should be compiled with OMR. What is glue code? The OMR project contains what are essentially incomplete technology components. To complete each OMR component you decide to use, you need to write glue code to connect the component into your language runtime and to configure it to respect your language semantics and runtime mechanisms.

In fact, JitBuilder doesn’t really need the OMRGLUE directory (because jitbuilder is like a glue directory itself), but run_configure.mk expects it so you cannot leave it out. The OMR project comes with an example glue implementation in the ./example/glue directory, so the easiest thing to do is simply point OMRGLUE at that directory as we did in the above command.

SPEC is the platform you want to configure for. Since I’m writing this article for OS X, the option I wrote is SPEC=osx. If you want to run on a 64-bit x86 Linux variant, you could instead set SPEC=linux_x64-64. With either platform SPEC setting, the rest of this article should work smoothly for you. Please let me know if it doesn’t!

Once you’ve configured OMR, you are ready to build the JitBuilder library.

Building the JitBuilder library

The code specific to the JitBuilder library can be found in the jitbuilder top-level directory. Let’s take a look at what’s in there:


$ cd jitbuilder && ls
Makefile	codegen		control		ilgen		p		runtime		z
build		compile		env		optimizer	release		x

Because the JitBuilder library is actually a compiler project itself that extends the OMR compiler, most of these directories simply mimic the directory layout of the OMR compiler. Because of the way the OMR compiler is constructed, you can create new compiler projects that extend, or subclass, the functionality in OMR to implement a compiler for a specific language (like Java) or for a particular use case (like creating the JitBuilder library). I won’t talk much more about that here. For those who are really interested, there is some documentation in omr/doc/compiler/extensible_classes.

The main pieces you need to pay attention to in this article are in the release directory. But first, let’s build the JitBuilder library:

$ make
OR
$ make -jN	# if you want to build on N cores
[ lots of output omitted ]

That will take a while since it builds all the main source files in the jitbuilder/ directories as well as in the OMR compiler directories, but once it’s done, you’ll have successfully built the JitBuilder library!

JitBuilder release directory

To verify that the library has been built, let’s descend into the release directory:


$ cd release; ls
Dockerfile-JitBuilder-x86_64	README.md			jitbuilder.tgz
LICENSE				ReplayMethod.cpp		libjitbuilder.a
Makefile			include				src

This directory contains a standalone JitBuilder library distribution, jitbuilder.tgz, which holds everything needed to work against a particular build of the JitBuilder library. Because we configured and built on OS X, you could move jitbuilder.tgz to any OS X system, expand it, and then use the library to dynamically compile native code using the JitBuilder API on that system. It has:

  • README.md, a very basic introduction
  • LICENSE, a copy of the same license file as the Eclipse OMR project (a dual license: Eclipse Public License v1.0 and Apache License 2.0)
  • Makefile, which builds the code samples
  • Dockerfile-JitBuilder-x86_64, for building a Docker image for x86_64 (the same one that’s used in the first two articles of this series)
  • libjitbuilder.a, the JitBuilder static library itself
  • include, a directory defining the JitBuilder library API
  • src, a directory that contains a number of code samples showing how to use different parts of the JitBuilder API

There is also ReplayMethod.cpp, which is part of an experimental and incomplete feature of the JitBuilder library, so you can ignore it for now.

Quick tour of the JitBuilder API

As the earlier articles in this series showed, JitBuilder is a library consisting of a number of useful classes whose API can primarily be found in the jitbuilder/release/include/compiler/ilgen directory. If you’d like to poke around, the implementation of these classes can be found in the omr/compiler/ilgen directory. In fact, as part of building the jitbuilder/release/include directory, the header files are copied from omr/compiler/ilgen.

Here’s a high-level description of the various parts:

jitbuilder/release/include/Jit.hpp contains the high level interface to the JitBuilder library, which currently describes only the three calls introduced in the first article in this series: initializeJit(), shutdownJit(), and compileMethodBuilder().

MethodBuilder is an object that represents the ABI (parameters, return value) and the operations that should be performed when a compiled method is called. To compile your own methods, you subclass MethodBuilder and implement its constructor to specify the ABI and then specify the operations the compiled method should perform by implementing a MethodBuilder member function called buildIL() (usually using either IlBuilder or BytecodeBuilder, see below). After passing a MethodBuilder object to compileMethodBuilder(), you get back a pointer to a function that you can call by casting that pointer to a C function prototype. Each MethodBuilder has a hash map of names (C strings) representing the method’s local variables, and you can define them all at once or make them up as you go.
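
The final step, turning the returned pointer into something you can call, can be sketched without the library at all. In this minimal sketch, fakeCompile() is a hypothetical stand-in for the compile step: instead of compiling anything, it simply hands back the address of an ordinary C++ function as an untyped pointer. The names IncrementFunction, incrementImpl, and callEntry are mine, for illustration only; they are not part of the JitBuilder API.

```cpp
#include <cstdint>

// C function prototype matching the ABI the method declares:
// one 32-bit integer parameter and a 32-bit integer return value.
typedef int32_t (IncrementFunction)(int32_t);

// Ordinary function standing in for JIT-compiled code.
static int32_t incrementImpl(int32_t value) { return value + 1; }

// Hypothetical stand-in for the compile step: returns an untyped entry
// address. Casting between function and object pointers is conditionally
// supported in C++; common compilers on POSIX systems accept it.
uint8_t *fakeCompile() {
    return reinterpret_cast<uint8_t *>(&incrementImpl);
}

// Cast the raw entry address back to the prototype and call through it.
int32_t callEntry(uint8_t *entry, int32_t argument) {
    IncrementFunction *increment = reinterpret_cast<IncrementFunction *>(entry);
    return increment(argument);
}
```

The important point is that the prototype you cast to must agree with the parameters and return type you declared, because nothing checks the cast for you.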

TypeDictionary is an object that you can use to describe the shape of structures, unions, and other types so that JitBuilder knows how to access their fields and values. TypeDictionary provides some useful primitives for constructing pointers and for converting C types (like unsigned long) into JitBuilder types (like Int64) according to how your C/C++ compiler represents those types. In other words, if you build the JitBuilder library with the same C++ compiler used to compile your program, it is very easy to make sure JitBuilder uses type definitions that are consistent with the rest of your code base.
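
To make the idea of mapping C types onto JitBuilder types a little more concrete, here is a toy sketch that picks a primitive type name from the size your compiler gives a C/C++ type. It is not the TypeDictionary implementation, and primitiveTypeName() is a hypothetical helper I made up for this illustration; only the general idea (the mapping follows your compiler’s representation) comes from the description above.

```cpp
#include <cstdint>
#include <string>
#include <type_traits>

// Toy mapping from a C/C++ arithmetic type to the name of a primitive type,
// chosen from the size the host compiler gives that type. Signedness is not
// part of the name in this sketch, so unsigned long and long map the same way.
template <typename T>
std::string primitiveTypeName() {
    static_assert(std::is_arithmetic<T>::value, "only arithmetic types are handled");
    static_assert(sizeof(T) <= 8, "types wider than 64 bits are not handled");
    if (std::is_floating_point<T>::value) {
        return sizeof(T) == 4 ? "Float" : "Double";
    }
    switch (sizeof(T)) {
        case 1: return "Int8";
        case 2: return "Int16";
        case 4: return "Int32";
        default: return "Int64";
    }
}
```

Calling primitiveTypeName<unsigned long>() on a typical 64-bit Linux or OS X compiler yields "Int64", while a compiler with a 32-bit long would yield "Int32", which is why building the library and your program with the same compiler keeps the two consistent.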

IlBuilder is an object that represents the operations to perform when a particular control flow path is reached. The method entry point is one particular control flow path, and so a MethodBuilder object is also an IlBuilder object. But you can create arbitrary IlBuilder objects and connect them together (even nest them) in arbitrary ways using an ever-growing list of services like IfThen or ForLoopUp or WhileDoLoop. Each service you call on an IlBuilder appends operations to the control flow path and so describes the intended order of operations to the compiler. There is no restriction on how you add operations to different IlBuilder objects. You can create all the IlBuilder objects up front and then add operations to them individually, or you can start from the MethodBuilder object (function entry) and create IlBuilder objects as you need them to represent the different paths of execution as you require them.

IlInjector is the superclass of IlBuilder, providing an even more precise and flexible interface to create OMR compiler IL, but it also requires intimate knowledge of the OMR compiler IL. IlInjector is considered a highly advanced topic for JitBuilder based JITs and, frankly, this API has not been fleshed out as much as IlBuilder. The code samples provided with JitBuilder (see below) all use IlBuilder and do not reference IlInjector directly.

IlValue is an object that represents the values that are created and consumed by expressions. If you load a local variable by name, the value that’s loaded is an IlValue. If you create a constant integer, that’s an IlValue. If you pass both of those IlValues into the Add operation, you’ll get back another IlValue that represents their sum. There is an ever growing list of operations you can apply to create arbitrarily complex IlValue expressions.
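
If it helps to picture how values compose, here is a toy model of the load, constant, and add example just described. It is only a sketch: ToyValue, toyLoad(), toyConstInt32(), toyAdd(), and evaluate() are hypothetical names, and this model interprets the expression tree directly, whereas JitBuilder hands its values to the compiler to generate native code.

```cpp
#include <cstdint>
#include <map>
#include <memory>
#include <string>

// Toy value node: a constant, a load of a named local, or the sum of two values.
struct ToyValue {
    enum Kind { Const, Load, Add } kind;
    int32_t constant;                       // used by Const
    std::string name;                       // used by Load
    std::shared_ptr<ToyValue> left, right;  // used by Add
};
typedef std::shared_ptr<ToyValue> ToyValueRef;

ToyValueRef toyConstInt32(int32_t v) {
    return std::make_shared<ToyValue>(ToyValue{ToyValue::Const, v, "", nullptr, nullptr});
}

ToyValueRef toyLoad(const std::string &name) {
    return std::make_shared<ToyValue>(ToyValue{ToyValue::Load, 0, name, nullptr, nullptr});
}

ToyValueRef toyAdd(ToyValueRef left, ToyValueRef right) {
    return std::make_shared<ToyValue>(ToyValue{ToyValue::Add, 0, "", left, right});
}

// Walk the expression tree, looking up locals by name in a map.
// Throws std::out_of_range if a loaded name has no entry.
int32_t evaluate(const ToyValueRef &value, const std::map<std::string, int32_t> &locals) {
    switch (value->kind) {
        case ToyValue::Const: return value->constant;
        case ToyValue::Load:  return locals.at(value->name);
        case ToyValue::Add:   return evaluate(value->left, locals) + evaluate(value->right, locals);
    }
    return 0;  // not reached for well-formed nodes
}
```

With a local named "value" set to 41, evaluating toyAdd(toyLoad("value"), toyConstInt32(1)) gives 42, which mirrors the shape of the Simple code sample.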

BytecodeBuilder is a special kind of IlBuilder that’s designed to simplify writing JIT compilers for bytecode based languages. You allocate a BytecodeBuilder object for each bytecode in a method and translate the operations needed for that bytecode using that object. You have to specify every flow edge (even fall through edges) between bytecode builders. BytecodeBuilder also taps into a handy and mostly automatic built-in worklist algorithm provided by MethodBuilder that can traverse all flow edges in a control flow graph, visiting each bytecode at most once. Any bytecodes in the method that aren’t reachable (many parsers generate unreachable code to simplify the process of initial code generation from a parse tree) will not be visited. For a little more background about BytecodeBuilders, you can take a look at the second half of the video from my dwOpen Tech Talk presentation from July, 2016.
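
The worklist idea can be sketched in a few lines of standalone C++. This is a simplified model under my own assumptions, not the algorithm as implemented in MethodBuilder: ToyBytecode and visitReachable() are hypothetical names, bytecode 0 is assumed to be the method entry, and the traversal order is plain breadth-first.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Toy bytecode: each entry lists the indices of its successors, including
// the fall-through successor, mirroring the need to state every flow edge.
struct ToyBytecode {
    std::vector<std::size_t> successors;
};

// Visit every bytecode reachable from index 0 at most once and return the
// visit order. Unreachable bytecodes never enter the worklist.
std::vector<std::size_t> visitReachable(const std::vector<ToyBytecode> &method) {
    std::vector<std::size_t> order;
    if (method.empty()) return order;

    std::vector<bool> queued(method.size(), false);
    std::deque<std::size_t> worklist;
    worklist.push_back(0);
    queued[0] = true;

    while (!worklist.empty()) {
        std::size_t index = worklist.front();
        worklist.pop_front();
        order.push_back(index);  // a real JIT would translate the bytecode here

        for (std::size_t next : method[index].successors) {
            // Ignore out-of-range edges and anything already queued.
            if (next < method.size() && !queued[next]) {
                queued[next] = true;
                worklist.push_back(next);
            }
        }
    }
    return order;
}
```

Marking a bytecode as queued when it is added, rather than when it is removed, is what guarantees each one is visited at most once even when several edges lead to it.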

ThunkBuilder is a subclass of MethodBuilder that creates functions that can call any native function with a particular signature by passing an array of appropriately typed arguments (kind of like libffi). You can compile a thunk builder object and then pass to it the address of a C function along with the array of arguments and it will call that function directly for you passing those arguments.
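
Here is a hand-written version of what a thunk for one particular signature does, just to show the calling pattern. It is a sketch, not ThunkBuilder itself (which compiles the thunk for you), and callThroughThunk(), addInt32(), and maxInt32() are hypothetical names used only for this illustration.

```cpp
#include <cstdint>

// Signature this toy thunk knows how to call: (int32_t, int32_t) -> int32_t.
typedef int32_t (BinaryInt32Function)(int32_t, int32_t);

// Toy thunk: takes the address of any function with the signature above plus
// an array holding the two arguments, unpacks them, and makes the call.
int32_t callThroughThunk(void *target, const int32_t *arguments) {
    BinaryInt32Function *function = reinterpret_cast<BinaryInt32Function *>(target);
    return function(arguments[0], arguments[1]);
}

// Two unrelated native functions that happen to share the signature.
int32_t addInt32(int32_t a, int32_t b) { return a + b; }
int32_t maxInt32(int32_t a, int32_t b) { return a > b ? a : b; }
```

One thunk serves every function with that signature, so a runtime only needs one thunk per distinct signature rather than one per native function.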

VirtualMachineState is an interface (abstract class) used to describe parts of a Virtual Machine (VM) architecture to JitBuilder so that it can interact with that state. A simple example would be a register holding some virtual state value like a program counter or stack pointer. A more complicated example would be the VM’s operand stack which is manipulated by the VM while simulating the execution of bytecodes. Naive JIT compilers replicate the steps to push and pop operands onto the virtual operand stack, even though the compiled code does not really need to use an operand stack and would benefit from using registers or local stack frame variables. The VirtualMachineOperandStack class allows a JitBuilder based JIT to push and pop IlValues just like the virtual machine pushes and pops actual values while executing bytecodes, but the code generated by JitBuilder will not actually use an operand stack. There are implementations of VirtualMachineState for several kinds of state variables including an operand stack (VirtualMachineOperandStack). This family of classes is typically used by JIT compiler implementations.
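
The operand stack trick is easier to see in a toy model. The sketch below simulates the stack while translating a tiny made-up bytecode ("push <name>" and "add") and emits statements over temporaries, so the output contains no pushes or pops. ToyOperandStack and translate() are hypothetical names, and the string-based output is my simplification; the real VirtualMachineOperandStack works with IlValues.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy compile-time operand stack: it holds the names of values rather than
// the values themselves, so pushes and pops happen while generating code and
// leave no stack traffic in the generated code.
class ToyOperandStack {
public:
    void push(const std::string &valueName) { names_.push_back(valueName); }
    // Caller must ensure the stack is not empty.
    std::string pop() {
        std::string top = names_.back();
        names_.pop_back();
        return top;
    }
    std::size_t depth() const { return names_.size(); }
private:
    std::vector<std::string> names_;
};

// Translate the toy bytecode into three-address statements over temporaries.
// An "add" with fewer than two operands on the stack is skipped.
std::vector<std::string> translate(const std::vector<std::string> &bytecodes) {
    ToyOperandStack stack;
    std::vector<std::string> code;
    int nextTemp = 0;
    for (const std::string &bc : bytecodes) {
        if (bc.compare(0, 5, "push ") == 0) {
            stack.push(bc.substr(5));  // no code emitted for a push
        } else if (bc == "add" && stack.depth() >= 2) {
            std::string right = stack.pop();
            std::string left = stack.pop();
            std::string temp = "t" + std::to_string(nextTemp++);
            code.push_back(temp + " = " + left + " + " + right);
            stack.push(temp);
        }
    }
    return code;
}
```

Five bytecodes (push a, push b, add, push c, add) turn into just two statements, and the names still on the simulated stack are exactly what you would need to rebuild the real operand stack before calling back into an interpreter.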

JitBuilder code samples

The JitBuilder library comes with, at the time of this writing, 22 code samples to help you understand how various aspects of the API work or can be used. You can build all of the JitBuilder code samples by simply running make in the jitbuilder/release directory:

	$ make
	g++ -o AtomicOperations.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/AtomicOperations.cpp
	g++ -g -fno-rtti -o atomicoperations AtomicOperations.o -L. -ljitbuilder -ldl
	g++ -o Call.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/Call.cpp
	g++ -g -fno-rtti -o call Call.o -L. -ljitbuilder -ldl
	g++ -o ConstString.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/ConstString.cpp
	g++ -g -fno-rtti -o conststring ConstString.o -L. -ljitbuilder -ldl
	g++ -o ControlFlowTests.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/ControlFlowTests.cpp
	g++ -o FieldAddress.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/FieldAddress.cpp
	g++ -g -fno-rtti -o fieldaddress FieldAddress.o -L. -ljitbuilder -ldl
	g++ -o IsSupportedType.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/IsSupportedType.cpp
	g++ -g -fno-rtti -o issupportedtype IsSupportedType.o -L. -ljitbuilder -ldl
	g++ -o IterativeFib.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/IterativeFib.cpp
	g++ -g -fno-rtti -o iterfib IterativeFib.o -L. -ljitbuilder -ldl
	g++ -o LinkedList.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/LinkedList.cpp
	g++ -g -fno-rtti -o linkedlist LinkedList.o -L. -ljitbuilder -ldl
	g++ -o LocalArray.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/LocalArray.cpp
	g++ -g -fno-rtti -o localarray LocalArray.o -L. -ljitbuilder -ldl
	g++ -o NestedLoop.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/NestedLoop.cpp
	g++ -g -fno-rtti -o nestedloop NestedLoop.o -L. -ljitbuilder -ldl
	g++ -o OperandStackTests.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/OperandStackTests.cpp
	g++ -g -fno-rtti -o operandstacktests OperandStackTests.o -L. -ljitbuilder -ldl
	g++ -o Pointer.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/Pointer.cpp
	g++ -g -fno-rtti -o pointer Pointer.o -L. -ljitbuilder -ldl
	g++ -o Pow2.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/Pow2.cpp
	g++ -g -fno-rtti -o pow2 Pow2.o -L. -ljitbuilder -ldl
	g++ -o RecursiveFib.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/RecursiveFib.cpp
	g++ -g -fno-rtti -o recfib RecursiveFib.o -L. -ljitbuilder -ldl
	g++ -o Simple.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/Simple.cpp
	g++ -g -fno-rtti -o simple Simple.o -L. -ljitbuilder -ldl
	g++ -o StructArray.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/StructArray.cpp
	g++ -g -fno-rtti -o structarray StructArray.o -L. -ljitbuilder -ldl
	g++ -o Switch.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/Switch.cpp
	g++ -g -fno-rtti -o switch Switch.o -L. -ljitbuilder -ldl
	g++ -o Thunk.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/Thunk.cpp
	g++ -g -fno-rtti -o thunks Thunk.o -L. -ljitbuilder -ldl
	g++ -o ToIlType.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/ToIlType.cpp
	g++ -g -fno-rtti -o toiltype ToIlType.o -L. -ljitbuilder -ldl
	g++ -o TransactionalOperations.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/TransactionalOperations.cpp
	g++ -g -fno-rtti -o transactionaloperations TransactionalOperations.o -L. -ljitbuilder -ldl
	g++ -o Union.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/Union.cpp
	g++ -g -fno-rtti -o union Union.o -L. -ljitbuilder -ldl
	g++ -o Worklist.o -g -std=c++0x -O2 -c -fno-rtti -fPIC -I./include/compiler -I./include src/Worklist.cpp
	g++ -g -fno-rtti -o worklist Worklist.o -L. -ljitbuilder -ldl

You’ve now built the tests! Not all of these tests, however, currently run on all platforms.

Code samples that should work on all platforms

Running the 7 tests that should work everywhere is easy:

$ make test
[ output of 7 tests omitted ]

The tests are:

  • IsSupportedType shows how to tell if a particular C or C++ type can be expressed as a JitBuilder type
  • IterativeFib is the code sample used in the second article in this series: it compiles a function that computes Fibonacci numbers using an iterative algorithm
  • NestedLoop is one of the language shootout benchmarks with a 6 deep loop nest containing a variable increment
  • Pow2 is a trivial iterative approach to computing a power of 2
  • Simple is the code sample from the first article in this series: a function that simply returns its argument + 1
  • ToIlType shows how to use the toIlType template functions in TypeDictionary to map C/C++ types to TypeDictionary types
  • Worklist gives a (convoluted) example of using the worklist facility in MethodBuilder to traverse a flow graph of bytecodes

Code samples that may not work on all platforms

Not all of the tests work on every platform because JitBuilder is currently missing automatic method trampolines for function calls on platforms with limited direct addressability (a point-in-time limitation). x86-64 systems running Linux or OS X, however, aren’t in this class of platforms, so you can actually run everything with the testall target:

	$ make testall
	[ output of 22 tests omitted ]

In addition to the code samples that run on all platforms, these samples include:

  • Call shows how to call an arbitrary C function from a compiled method
  • ConstString shows how to use ConstString to manipulate C strings
  • DotProduct shows how to use arrays by computing a simple vector dot product
  • FieldAddress shows how to work with structs that are embedded in other structs
  • LinkedList implements the lookup function for a linked list, showing how to work with structs and pointers
  • LocalArray shows how to allocate arrays of data on the stack
  • OperandStackTests demonstrates the use of VirtualMachineOperandStack to simulate an operand stack without actually performing pushes and pops but being able to recreate the full operand stack if you need to call back into an interpreter
  • Pointer demonstrates the use of pointers to several kinds of primitive data, including pointer to pointer to double
  • RecursiveFib computes Fibonacci numbers the traditional recursive way and shows that recursive calls are possible
  • StructArray shows how to create arrays of structs
  • Switch demonstrates how to specify a switch control flow structure
  • Thunk shows how to use ThunkBuilder to call native functions
  • Union shows examples of how to represent and reference C union types and their fields using TypeDictionary

Code samples for experimental APIs

On a relatively modern x86-64 system (Linux or OS X) you can probably also run the tests targeting the experimental portion of the JitBuilder API:

	$ make testexperimental
	[ output of 2 tests omitted ]

There are only two tests in this category so far:

  • AtomicOperations shows how to do atomic addition on 32-bit integers
  • TransactionalOperations shows how the Transaction() API can be used to exploit hardware transactional memory (HTM)

Wrap up

This article showed you how to clone and configure the Eclipse OMR project, which contains enterprise-quality components for building language runtimes. JitBuilder is a project inside Eclipse OMR, so the next step was to build that library from scratch. I explained all the high-level parts of the current JitBuilder API and then gave brief descriptions of the 22 different code samples to help you learn how some of the more detailed aspects of the JitBuilder API work.

In the next article in this series, I’m going to show how to use JitBuilder to compile a small language, leveraging an existing set of tutorials (the Kaleidoscope tutorials) written for the LLVM compiler project. Many people ask how the Eclipse OMR project compares to LLVM, so the next article will look at some of the similarities and differences between LLVM and Eclipse OMR’s JitBuilder in the context of something you can do with both projects.
