I've recently been pondering models of concurrent programming. One of the simplest, most beautiful, and most powerful is the π calculus - it's effectively the λ calculus of concurrent programming.

The π calculus is oriented around channels, which can be read from, written to, and themselves passed along channels: that is, channels can be passed between processes (processes are implicit in the calculus). Reading and writing are synchronous: a process that wants to read or write a channel blocks until another process wants to do the opposite, and after the communication both processes proceed with their sequels. The alternative is for writes to be asynchronous: they do not block.
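Go's channels are descended from CSP rather than the π calculus, but an unbuffered Go channel is a fair sketch of both ideas at once: the synchronous rendezvous, and channel mobility, since channels are first-class values that can be sent along other channels. This is only an illustrative sketch; the requests/reply names are made up.

```go
package main

import "fmt"

func main() {
	// A channel that carries channels: channel mobility, π-calculus style.
	requests := make(chan chan string)

	go func() {
		reply := <-requests // receive a channel from a client...
		reply <- "hello"    // ...and answer on it; this send blocks until
		                    // the client is ready to receive (rendezvous)
	}()

	reply := make(chan string)
	requests <- reply    // synchronous: blocks until the server reads
	fmt.Println(<-reply) // both sides proceed after the communication
}
```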

The problem with synchronous writes is that they do not translate well into a distributed setting. An asynchronous write translates directly into a network transmission, but a synchronous write requires a round trip to unblock the writer. As a result it is less efficient and more vulnerable to failure. Even in a simple local setting, asynchronous sends may be preferable: in fact, Pict, a programming language closely based on the π calculus, is restricted to asynchronous writes. On the other hand, with asynchronous writes you may need a queue on the reader's end, and some way to avoid running out of memory because of overflowing queues.
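In Go terms the asynchronous alternative is a buffered channel: a bounded queue sitting at the reader's end. The sketch below shows the overflow problem concretely; the buffer size and the drop-on-full policy are arbitrary choices I've made for illustration, not anything Pict or the π calculus prescribes.

```go
package main

import "fmt"

func main() {
	// A buffered channel behaves like an asynchronous write into a
	// bounded queue: sends don't block until the buffer fills up.
	mailbox := make(chan string, 8)

	send := func(msg string) {
		select {
		case mailbox <- msg:
			// queued without blocking
		default:
			// queue full: drop, block, or push back; asynchronous
			// writes force you to pick some overflow policy
			fmt.Println("mailbox full, dropping:", msg)
		}
	}

	send("ping")
	fmt.Println(<-mailbox)
}
```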

Another design alternative is to focus on processes instead of channels. Messages are sent to processes, and PIDs can be transmitted between processes. In this model the read ends of channels are effectively fixed. This is the actor model as used by Erlang.
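A rough sketch of that model, again in Go: the goroutine owns its mailbox, and the mailbox handle plays the role of a PID, something you can pass around and send to, but whose read end stays put. The Pid and message types here are invented for the example.

```go
package main

import "fmt"

// Pid is the only handle on an actor: you can send to it and pass it
// around, but the read end is fixed inside the actor itself.
type Pid chan message

type message struct {
	from Pid
	body string
}

// spawn starts an actor that echoes every message back to its sender.
func spawn() Pid {
	self := make(Pid, 8) // the actor's private mailbox
	go func() {
		for msg := range self {
			msg.from <- message{from: self, body: "echo: " + msg.body}
		}
	}()
	return self
}

func main() {
	echo := spawn()
	me := make(Pid, 1)
	echo <- message{from: me, body: "hi"} // transmit a PID along with the message
	fmt.Println((<-me).body)
}
```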

The problem with the π calculus's channels, or a distributed-object system like Obliq or E (see §17.4 on p. 126), is that you need distributed garbage collection. If I spawn a process to listen on a channel locally, and send the channel to a remote system, I must keep the process around for as long as that remote system might want to write on the channel. Distributed garbage collection is far too difficult. On the other hand, in the actor model a process is the master of its own lifetime, because it lasts until its thread of execution completes. Thus you avoid relying on tricky algorithms, at the cost of needing some way of dealing with failed message sends. But in a distributed system you have to deal with failure anyway, and the same mechanism can also handle the queue overflow problem.
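To make the trade-off concrete: in the actor-ish Go sketch above, the actor dies when its goroutine returns and nothing needs collecting, but a send to a dead actor simply hangs, so the sender has to treat it like a network send that can fail. A timeout is one crude policy; the 100ms figure is an arbitrary choice for illustration.

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	mailbox := make(chan string) // unbuffered, synchronous handoff

	go func() {
		fmt.Println("got:", <-mailbox)
		// the actor's lifetime ends here: no distributed GC required,
		// but later sends will never find a receiver
	}()

	mailbox <- "first" // delivered by rendezvous

	select {
	case mailbox <- "second":
		fmt.Println("delivered")
	case <-time.After(100 * time.Millisecond):
		// the failure case the actor model makes you handle explicitly
		fmt.Println("send failed: actor is gone")
	}
}
```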

So I conclude that the π calculus is wrong for distributed programming and that Erlang's actor model is right. Those Erlang designers are clever chaps.