Interpreters Part II : A Simple Interpreter

At SourceProbe we’re building code search tools. Several of our systems rely on fast code interpreters. This is the second post in a series exploring interpreters and implementation techniques.

Last time we looked at a rough estimate of relative language performance, using a Mandlebrot renderer as a benchmark.

This time we’ll build a simple interpreter in each language, and benchmark the resulting interpreters.

The Language

We will be implementing our interpreter in several language to compare the results of the compilers and runtime. We will also use the same language in future posts on advanced techniques.

We’d like a language that is both simple, but sufficiently complex as to be suggestive of performance on languages used in production.

We’ll use a somewhat infamous language which resembles a bare Turing machine. The program is a series of characters, each character representing one instruction. The program reads instructions in order, and terminates after reaching the end. The program has access to a long tape, where each cell holds a single byte. It also supports primitive comparison / looping. The full set of commands is described in the table below:

Instruction Meaning
+ Increment value under pointer
- Decrement value under pointer
< Move the pointer one cell to the left
> Move the pointer one cell to the right
[ Jump after to matching ] if the value under pointer is 0
] Jump after to matching [, if value under pointer is not 0
. Print ascii character for the value under pointer
, Read character (unimplemented)

The interpreter loop

Lets begin by looking at a basic interpreter loop. The interpreter needs to keep track of the next instruction to run, as well as any state needed by the language / machine. For us, this means a 30k byte array, an integer for the instruction index, and an integer for the read/write pointer.

And just to make program execution simpler, we’ll do a little bit of preprocessing to figure out the jump destinations. We can pass over the program once, keeping a stack to track any unclosed open brackets. When we encounter a closing bracket, we’ll record this in two maps. One for mapping an open brace to its close, and the other doing the reverse.

Finally the interpreter runs the program. It repeatedly looks at the next instruction and does what it says. Below is the Python implementation. Other implementations are linked from the results table later in this post.

while pc < end:
    c = prog[pc]
    if c == '+':
        tape[ptr] = (tape[ptr]+1) % 256
    elif c == '-':
        tape[ptr] = (tape[ptr]-1) % 256
    elif c == '>':
        ptr += 1
    elif c == '<':
        ptr -= 1
    elif c =='[':
        if tape[ptr] == 0:
            pc = forward[pc]
    elif c == ']':
        if tape[ptr] != 0:
            pc = backward[pc]
    elif c == '.':
        print(chr(tape[ptr]), end='')
    else:
        print("unhandled instn: {}".format(c))
        sys.exit(1)
    pc += 1

The benchmark

Continuing in the tradition of our previous post, we’ll run a Mandelbrot renderer written in this tiny language. The program is both familiar, and long running. The running times are thus more representative of steady state performance for interpreter-like programs.

Benchmark source

Results

This time, fewer configurations are listed. For languages that can run interpreted and compiled, only the compiled time is presented. These programs run much longer, so execution time should dominate any one-time costs.

Language & Link to code Run time (min,sec) Multiple vs C++ (lower is better)
C++, default 3m51s 1.0
C++ (O3) 36.95s 0.16
Python3 30m52s 8.01
ChickenScheme 9m36s 2.49
SBCL 2m35s 0.67

Conclusion

Though the basic interpretter is quite simple, this comes at a cost. For every emulated instruction, our interpretter needs to check the loop condition, branch to the corresponding handler, then finally perform the work requested by the instruction.

Next we’ll start looking at some advanced techniques to reduce the overhead associated with interpretting the language.


Join our newsletter to get notified of new articles and product updates. Unsubscribe at any time.