Last updated 1997-10-31. This paragraph added 1999-02-26.

Running C programs on a Lisp OS

There are a couple of companies that are pushing OSes based on the Lisp pointer security model -- Sun, of course, with Java; Symbolics, with Genera; and there's a group of hackers who are working on building a free, open Lisp OS.

This paper describes a way to run the enormous installed base of programs written in C, assembly, and other similar languages on such an OS. The implementation is by no means without penalty -- every memory access would incur bounds-checking and data-conversion overhead. My wild-ass guess is that this will slow down most programs by a factor of between 10 and 1,000. See the end for why I guess this.

C Pointer Semantics

In order to run a generic C program on an OS which checks all pointer access -- such as most implementations of Lisp or Scheme and all conforming implementations of Java -- C's pointer semantics must somehow be simulated.

In C, every parameter to a function, every local variable, every statically-allocated variable, and every object allocated on the free store has an address. These addresses are interchangeable; it must be possible to write a function that takes the address of any of the above types of variables and writes a value to the pointed-to variable.

Furthermore, most C programs assume that different types of data live in the same memory space, and do things such as copying an array of integers using memcpy().

However, there are two exceptions to this everything-must-go-in-one-memory-space rule: function addresses and return values.

While most C implementations put the addresses of functions in the same namespace as the addresses of data, few C programs -- except those that access hardware functions directly -- assume this to be the case. This is fortunate, since I'm not sure how I could implement this in Java otherwise.

Return values are strictly rvalues, and cannot have their addresses taken.

RAM implementation

In a Lisp or Java OS, C RAM can be implemented as an array of bytes. Pointers into RAM can be simulated by indices into this array; accessing other-than-byte values in memory can be implemented by some small functions that read some number of consecutive bytes and return an integer.

It might be better, under some circumstances, to implement RAM as a sparse array, the way most modern OSes do -- enabling the stack to start at the top of the address space, for instance, and enabling unmapped space to cause errors when accessed. This would be much slower in software than in hardware, however.

However, this approach -- especially if pursued enthusiastically -- would find most pointer bugs sooner or later.

Function call and return

C functions, then, can be compiled into Lisp or Java functions and run directly. They can reasonably be called from Lisp or Java code and return integer values; more-complex return values can be implemented by native-Lisp or native-Java code that unmarshals a more-complex return-value from the C function's memory array.

As an optimization, parameters and locals that never have their addresses taken can be passed directly, rather than being pushed onto the slower stack in the memory array. In general, parameters can be passed directly; if the called function needs the addresses of the parameters, it can push them itself.

In general, the memory array should be passed to all C functions; it should not be bound to an atom. This makes it possible to run more than one C program simultaneously, in separate memory spaces.

Calling native Lisp or Java code is likely to require some sort of argument unmarshalling; it may or may not be necessary to pass the RAM to the native code, and it may be necessary to pass other native objects to it. Also, it's likely to require some sort of return-value marshalling, since C doesn't allow rich return values.

Function pointers

C's memory can hold nothing more than bytes, binary numbers. In an environment where you can't manufacture pointers from numbers, storing function pointers in C's memory is impossible.

However, you can store indices into an array, and (in Lisp) you can store pointers to functions in an array. When linking a C program, you need only insert a reference to each function whose address is taken into this array -- this will work equally well for C and native functions.

In Java, you can't store function pointers, but you can do function indirection through interfaces. Each C function can be declared as a class with a single call() method and no instance variables; an instance of each of these classes can be stored in a function-pointer array. Native-Java methods may require an object to work on; a stub class with a call() function can be generated which takes an object reference as its first argument.

Native objects

Native objects can be inserted into a native-object array at run-time, similar to the function-pointer array, when a pointer to one of them needs to be represented in C's memory array.

This poses a memory-management problem. The environments I've been discussing generally do garbage-collecting memory management; when an object is no longer referenced by any pointers, it is asked to release any resources it controls, and then it gets returned to the pool of free memory.

Once a native object has been inserted into the native-object array, it may not be possible to determine when it is no longer referenced in the C program. I expect the life story of a native-object pointer will go something like this:

C function written to interface with native Lisp/Java facilities calls a native function to get a pointer to a native object.
C function stores it in a data structure in C RAM somewhere.
Another C function, also written to interface with native Lisp/Java facilities, calls some native functions, passing the native object to them. (How this is done is the business of the argument marshalling code.) Repeat as necessary.
Another C function, also written to interface with native Lisp/Java facilities, calls an invalidate-native-object-pointer function, which removes the reference to the object from the native-object array.

If the last step is forgotten, we have a memory leak. C programs are very prone to memory leaks; further discussion of memory leaks and how to fix them could greatly extend this paper. See http://www.geodesic.com/greatcircle/index.html for more information.

Performance

If this method is used to run C programs on an OS that compiles Lisp or Java to native machine code, I expect that the performance penalty for running C programs this way will be comparable to the performance penalty for running a program in an interpreted language. Every memory access by the C code, except to most local variables and parameters, must be bounds-checked.

Interpreted languages vary in efficiency. Pre-8.0 versions of Tcl typically had performance penalty factor of about 1,000; byte-compiled Java typically has a performance penalty factor of about 20; threaded Forth code typically has a performance penalty factor of about 5. All of these are off the top of my head, and I haven't checked any of them.

I'm assuming there's no performance penalty for writing the bounds-checking code in Lisp instead of C or assembly. This is somewhat optimistic.

It might be possible to reduce this performance penalty in a few ways.

Structuring RAM as an array of 32-bit integers would greatly reduce the number of bounds-checks needed to access pieces of memory larger than a byte, and would also speed up access to pointers and 32-bit integers. (Here I assume a 32-bit address architecture.) It would slow down smaller accesses, but I believe them to be less frequent in modern programs.
When running on a machine with an MMU, the MMU could be enlisted to speed the bounds-checking. When running array-access instructions, all accesses could be done directly, without checking in software; if the MMU generates an exception, we can drop into bounds-checking code and perform the appropriate action, which might include adding a mapping for the appropriate page.
There are some problems with this strategy, which is not surprising, since it's essentially the traditional UNIX strategy:
- Other memory-access instructions, such as `car' and `cdr', need to run in privileged mode if we do this. This could slow things down quite a bit.
- The checking it provides is not as strong -- a bad offset into one array might happen to fall in the middle of another, causing havoc.
- It requires messing with the TLB every time you do a function call or return, or weakening the checks further.
The RAM access routines could cache recently-accessed memory elements in a short alist. This slows down cache-misses significantly, though, and may not provide much of a speedup. It might be more worthwhile if one were implementing a sparse memory array.