What type of Machine is the C Preprocessor?

27/01/2014

About a year ago I wrote an implementation of Brainfuck in the C Preprocessor.

This reopened the discussion as to whether the C Preprocessor is Turing complete. The main argument against Turing completeness was that my implementation did not support unbounded recursion. It was therefore quite possible to write a computation that my implementation could not simulate: any computation that was intended to run indefinitely.

But more precisely, people complained that, because of this, my implementation did not support an infinite tape, and so was not Turing complete.

This seems intuitive. The fact that the C preprocessor cannot recurse infinitely seems like it must be tied to the infinite tape of the Turing Machine. But this was also a little confusing. I hadn't thought about it at the time, but my Brainfuck implementation did have an infinite tape. We can see this by looking at how the basic operators were defined.

#define LIST(...) (__VA_ARGS__)

Memory was stored as a tuple: essentially a parenthesised list of tokens that looks like the arguments to a function. Because this list looks like the arguments to a function (or macro), we can use the variable arguments feature of macros to define our basic tuple operations.

But first we need a little bit of groundwork. We start by defining a function EVAL, which runs another preprocessor scan on its arguments and evaluates them as a whole. We use this to evaluate an expression that consists of a macro name next to its arguments.

#define EVAL(...) __VA_ARGS__

Then we can define the CURRY function, which can be used to apply arguments to a function as a tuple. This is done by placing the macro name next to the arguments, and evaluating.

#define CURRY(M, L) EVAL(M L)

Finally this can be used to define our familiar HEAD and TAIL functions which take the first element of a tuple, or the rest.

#define ARGS_HEAD(X, ...) X
#define ARGS_TAIL(X, ...) (__VA_ARGS__)
#define HEAD(L) CURRY(ARGS_HEAD, L)
#define TAIL(L) CURRY(ARGS_TAIL, L)
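As a quick sanity check of the definitions so far, here is a minimal sketch that lets the compiler confirm what each macro expands to. The use of C11 static_assert, the example tuples, and the helper second_of_list are my own choices for illustration and are not part of the original implementation.

```c
#include <assert.h> /* static_assert (C11) */

#define EVAL(...) __VA_ARGS__
#define CURRY(M, L) EVAL(M L)

#define ARGS_HEAD(X, ...) X
#define ARGS_TAIL(X, ...) (__VA_ARGS__)
#define HEAD(L) CURRY(ARGS_HEAD, L)
#define TAIL(L) CURRY(ARGS_TAIL, L)

/* HEAD((1, 2, 3)) becomes EVAL(ARGS_HEAD (1, 2, 3)), which rescans to 1. */
static_assert(HEAD((1, 2, 3)) == 1, "HEAD takes the first element");

/* TAIL yields another tuple, here (2, 3), so HEAD can be applied again. */
static_assert(HEAD(TAIL((1, 2, 3))) == 2, "HEAD of TAIL is the second element");

/* Hypothetical helper: the same expansion used in ordinary code. */
static int second_of_list(void)
{
    return HEAD(TAIL((10, 20, 30)));
}
```

The tuples here always keep at least two elements, because strict C99 does not allow ARGS_HEAD to be invoked with nothing for its variadic part.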

We can also use CURRY to create a CONS function, for prepending an element to the front of a list.

#define CONS(X, L) (X, CURRY(EVAL, L))

And that's all there is to it. It should be fairly obvious that at no point are hard limits or bounds introduced. Provided we can apply CONS to a tuple indefinitely, our memory is infinite too.
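To make the growth concrete, the sketch below builds a small tape by applying CONS twice and then unpacks it into a C array so the result can be inspected. UNWRAP, the TAPE0 to TAPE2 names, and the array are hypothetical additions for this illustration only.

```c
#include <assert.h> /* static_assert (C11) */

#define EVAL(...) __VA_ARGS__
#define CURRY(M, L) EVAL(M L)
#define CONS(X, L) (X, CURRY(EVAL, L))

/* Hypothetical helper: strip the parentheses from a tuple. */
#define UNWRAP(L) CURRY(EVAL, L)

/* Each application of CONS adds one cell to the front of the tuple. */
#define TAPE0 (3, 4)
#define TAPE1 CONS(2, TAPE0) /* (2, 3, 4)    */
#define TAPE2 CONS(1, TAPE1) /* (1, 2, 3, 4) */

/* Unpack the tuple into an array initialiser so we can look at it. */
static const int tape[] = { UNWRAP(TAPE2) };
enum { TAPE_LEN = sizeof tape / sizeof tape[0] };

static_assert(TAPE_LEN == 4, "two CONS steps grew a 2-cell tape to 4 cells");
```

Nothing in CONS itself caps the length of the tuple; any limit a particular compiler imposes on macro arguments comes from the implementation, not from these definitions.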

But this memory is different to that of a Turing machine. It is somewhat lazy. With this memory, if we go over the edge of the tape we need to generate new cells on the fly. We imagine Turing Machine memory as an infinite tape which is just there, not a tape that must be generated as the head moves. Perhaps this is the reason why the preprocessor appears to lack an infinite tape.

But, while Turing Machine memory isn't generated on the fly, it isn't exactly random access either. To access the memory at some location we have to move the head a number of times in that direction, so a memory access can still take a number of operations proportional to the distance between the head and that location. If stepping along the tape and reading or writing some memory takes 1 or 10 logical operations it doesn't really matter. This on-the-fly generation of memory is essentially implementation overhead. When it comes down to it, I don't believe Turing Machine memory is really any different to our lazy memory.
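The same idea can be sketched at runtime in ordinary C, as an analogy rather than anything taken from the preprocessor implementation: a tape that only materialises a cell when the head first steps onto it. The Tape type and the tape_* functions are hypothetical names, and for brevity the tape only grows to the right.

```c
#include <assert.h>
#include <stdlib.h>

/* A lazily generated tape: only the cells visited so far exist. */
typedef struct {
    unsigned char *cells;
    size_t len;  /* number of cells materialised so far */
    size_t head; /* index of the cell under the head */
} Tape;

/* Start with a single blank cell. Returns 0 if allocation fails. */
static int tape_init(Tape *t)
{
    t->cells = calloc(1, 1);
    t->len = t->cells ? 1 : 0;
    t->head = 0;
    return t->cells != NULL;
}

/* Move right, creating a blank cell first if we would step off the end. */
static int tape_right(Tape *t)
{
    if (t->head + 1 == t->len) {
        unsigned char *grown = realloc(t->cells, t->len + 1);
        if (!grown)
            return 0;
        grown[t->len] = 0; /* new cells start blank */
        t->cells = grown;
        t->len += 1;
    }
    t->head += 1;
    return 1;
}

static unsigned char tape_read(const Tape *t)
{
    assert(t->head < t->len);
    return t->cells[t->head];
}

static void tape_write(Tape *t, unsigned char v)
{
    assert(t->head < t->len);
    t->cells[t->head] = v;
}

static void tape_free(Tape *t)
{
    free(t->cells);
    t->cells = NULL;
    t->len = t->head = 0;
}
```

A program using this tape cannot tell whether a cell existed before the head arrived; it only ever observes a blank cell. The extra allocation is overhead on each step and does not change what can be computed.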

So maybe it is because of the recursion limit. We know that the preprocessor cannot perform unbounded recursion. If we wished to access memory at location N, we might have to do f(N) operations to get there. And if f(N) is greater than our upper bound on recursion we can't do it. But something is odd about this too. It is more like we have an infinite tape, but we can't reach it all because of a limit on the number of iterations we can perform in one computation.

Just like a Turing Machine, we have an infinite tape, and a fixed number of states the system can be in; it should all be there.

What we are missing is the ability to re-enter old states.

To achieve recursion in the preprocessor each recursive function is augmented with a recursion depth variable $. This counts the number of times we have recursed, and is used to find a function we have not entered yet in the computational call graph. For a Turing machine this would be a bit like duplicating all the states S times, and then augmenting their identifier with an extra number s. For each transition from a state augmented with s, we change the transition to go to the state augmented with s+1.
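A rough sketch of why the depth is needed: a macro that names itself is not re-entered, so each level of recursion has to be a distinct macro. The PUSH_n ladder below hard-codes the depth in the macro name, which is a simplification I am assuming for illustration; it is not the $ mechanism of the actual implementation. LOOP, STR and loop_is_stuck are likewise hypothetical.

```c
#include <assert.h> /* static_assert (C11) */
#include <string.h>

#define EVAL(...) __VA_ARGS__
#define CURRY(M, L) EVAL(M L)
#define CONS(X, L) (X, CURRY(EVAL, L))

/* Turn the fully expanded argument into a string literal. */
#define STR_(...) #__VA_ARGS__
#define STR(...) STR_(__VA_ARGS__)

/* Direct self-reference: the inner LOOP is never expanded again. */
#define LOOP(L) LOOP(CONS(0, L))

/* One distinct macro per depth; PUSH_n prepends n zeros and then stops. */
#define PUSH_0(L) L
#define PUSH_1(L) PUSH_0(CONS(0, L))
#define PUSH_2(L) PUSH_1(CONS(0, L))
#define PUSH_3(L) PUSH_2(CONS(0, L))

/* Three levels of "recursion" turn (7, 7) into (0, 0, 0, 7, 7). */
static const int cells[] = { CURRY(EVAL, PUSH_3((7, 7))) };
static_assert(sizeof cells / sizeof cells[0] == 5, "PUSH_3 added three cells");

/* After one expansion the text still begins with the unexpanded name LOOP. */
static int loop_is_stuck(void)
{
    return strncmp(STR(LOOP((7, 7))), "LOOP", 4) == 0;
}
```

There is no PUSH_4 here, so a computation needing a fourth step has nowhere to go; raising the limit means writing more macros, not changing the technique.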

This avoids any loops in the state graph, but it does create a troublesome situation when the limit S is exhausted. The machine has no state to enter, and must halt with no transitions.
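The augmented machine can be sketched in ordinary C. The example below takes a one-state machine that would normally write 1s to the right forever (q0 goes back to q0) and replaces that loop with transitions from (q0, s) to (q0, s + 1). AugState, run_writer and the limit parameter are hypothetical names for this illustration, with limit playing the role of S.

```c
#include <assert.h>
#include <stddef.h>

/* A state identifier augmented with the depth counter s. */
typedef struct {
    int q; /* original state */
    int s; /* which copy of that state we are in */
} AugState;

/*
 * Run the unrolled "write 1, move right" machine on a caller-supplied tape.
 * Returns the number of cells written before the machine halts.
 */
static size_t run_writer(int limit, unsigned char *tape, size_t tape_len)
{
    AugState st = { 0, 0 };
    size_t head = 0;

    assert(limit >= 0);

    /* A copy (q, s) only exists while s < limit; after that we must halt. */
    while (st.s < limit && head < tape_len) {
        tape[head] = 1; /* action of q0: write a 1 ...               */
        head += 1;      /* ... and move right                        */
        st.q = 0;       /* original transition: q0 -> q0             */
        st.s += 1;      /* augmented: (q0, s) -> (q0, s + 1)         */
    }
    return head;
}
```

With limit set to 8 the machine writes exactly 8 cells and stops, however long the tape is. The tape is not what runs out; the supply of fresh states is.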

So the C preprocessor is a Turing Machine whose transition table cannot contain loops, not one lacking an infinite tape.

If this machine has a name, I'd love to know it, and what its equivalences are. So if you know anything about what this might be, don't hesitate to get in touch.