TAL Overview

  Home TAL Overview Papers Software Members Related Projects

Statically typed intermediate languages are effective tools for staging the compilation of high-level languages. Types express invariants that help programmers understand their programs, and strongly typed languages prevent many common programming errors. Compiler writers can use these properties to debug sophisticated program transformations such as closure conversion and optimizations like datatype specialization. Furthermore, types not only help check the correctness of transformations but enable analyses or optimizations that are extremely difficult without them.

The goal of the TAL project is to extend the paradigm of type-directed compilation to its limit. We compile high-level languages that include features such as higher-order polymorphic functions, datatypes, modules, objects, and subtyping into a series of typed intermediate languages and finally into a Typed Assembly Language (TAL). Unlike any other compiler, we not only use typed intermediate languages but also typed target languages.

TALC, our compiler, generates code for the Intel 32-bit family of assembly languages.   This assembly code is annotated with type information that can be verified by our type checker before the code is assembled by the MASM assembler.  The assembly language type system is powerful enough to express high-level abstractions such as closures and abstract datatypes and yet flexible enough to permit traditional low-level optimizations such as loop invariant removal, strength reduction, and redundant move elimination. Furthermore, just as the more familiar typed source and intermediate languages catch programmer errors, so does our typed assembly language. It helps to ensure that complicated backend transformations such as the optimizations above as well as register allocation and more general instruction selection and scheduling are performed correctly.

The type soundness of our assembly code implies many important security properties such as memory safety. As a result, our assembly code is a form of proof-carrying code. TAL applets, like Java applets, may be downloaded from untrusted sources on the internet, verified, and executed safely without fear they will corrupt the host machine. Furthermore, because TAL is assembly language, there is no interpretation overhead during this process and no just-in-time compiler to run or trust.

The present TAL effort has many applications in the security and compilation domains. Future work involves compiling more expressive typed languages into the language of real machines. We seek to reveal the connections between the typed lambda calculi which give semantics to many modern languages and the Von Neumann machines that implement them.