Deny Capabilities for Safe, Fast Actors

Deny Capabilities for Safe, Fast Actors – Clebsch et al. 2015

Do you remember Herb Sutter’s ‘The Free Lunch is Over’ article? (Hard to believe that was written over 11 years ago!) Sutter also posted a great update in 2012, ‘Welcome to the Jungle’. In the conclusion of that piece he writes:

Mainstream hardware is becoming permanently parallel, heterogeneous, and distributed. These changes are permanent, and so will permanently affect the way we have to write performance-intensive code on mainstream architectures.

I bring all this up because the publication of the original article caused a wave of interest in programming languages that could make dealing with concurrency easier. Today’s paper is about the fundamental approach to concurrency built into the type system of the Pony programming language that statically ensures freedom from data races, and supports an actor-based programming model.

We provide a type system that ensures data race freedom statically for an actor-model language while also providing a way to type actors themselves, in the mould of active objects, and without placing any restrictions on the structure of messages. In addition, the type system is amenable to efficient implementation, and we have implemented it for the Pony programming language.

This data race freedom is based on a capabilities model – see yesterday’s post for background on capabilities. To be more precise, Pony is based on reference capabilities:

We clarify our use of the term reference capability. Capabilities were introduced to support protection across processes, and have been adopted into several branches of computing since. The term object capabilities has been coined by Mark Miller, to describe the set of operations an object is allowed to apply on some other object. Mark Miller proposes that in order to restrict this set, one should create a new object which only offers these capabilities, and which delegates to the original object. In our work, reference capabilities offer a partition of the operations into those which may read, or write the object, or pass the object on to a different actor. Moreover, our reference capabilities are transitive, e.g. a write capability to an object o grants write access to all its fields, but also to all the objects writeable from o. Pony is both an object capability and reference capability secure language.

The final twist is that Pony’s reference capabilities do not express what a subject is able to do, but what the subject is not able to do (denied). Thus the paper title: “Deny Capabilities for safe, fast actors”.

In the Pony type system, an alias (another reference to the same object) may have a different reference capability to the original reference. The term local alias is used to describe a reference held by an actor, and the term global alias refers to a reference to an object held by another actor. Types are annotated with reference capabilities, and each reference capability indicates what is locally and globally denied. No capability can deny something locally that is permitted globally.

… when the local deny properties and the global deny properties of a reference are the same, the reference can be safely sent as an argument to an asynchronous method call to another actor, i.e. it is sendable. In other words, when the local alias deny properties are the same as the global alias deny properties, it does not matter which actor holds the reference.

The keyword actor is used to indicate a class that can have behaviours (asynchronous methods), and the keyword be is used to define behaviours.

A behaviour is executed asynchronously by the receiving actor, and a given actor executes only one behaviour at a time making behaviours atomic. While executing a behaviour, a receiver sees itself (i.e. this in the behaviour) as ref, and is able to freely read from and write to its own fields.
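To make the actor and behaviour vocabulary concrete, here is a minimal sketch written against the released Pony language rather than the paper's calculus. The Counter actor and its method names are hypothetical, and Env and OutStream come from Pony's standard library, which the paper does not discuss.

```pony
// Hypothetical example: an actor whose state is only ever touched by its own behaviours.
actor Counter
  var _count: U64 = 0

  new create() =>
    None

  // A behaviour: the caller does not wait, and the actor runs it later.
  be increment() =>
    // Inside a behaviour `this` is seen as ref, so the field can be mutated.
    _count = _count + 1

  // OutStream is a tag type, so it is a legal behaviour parameter.
  be report(out: OutStream) =>
    out.print(_count.string())

actor Main
  new create(env: Env) =>
    let counter = Counter      // `counter` has type Counter tag
    counter.increment()        // asynchronous sends; none of these block
    counter.increment()
    counter.report(env.out)
```

Main only ever holds a tag to the counter, so it can send messages but cannot read _count directly. Because an actor runs one behaviour at a time, the two increments never interleave with each other or with report.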

Here’s an example definition of an actor and behaviour, using the iso and tag reference capabilities (we’ll explain what those are in a moment), as well as consume.

actor Dataflow
  be step(list: List iso, flow: Dataflow tag) =>
    flow.step(list, this)         // NOT ALLOWED: would leave a second alias to an iso
    flow.step(consume list, this) // Allowed: consume gives up the local reference

The iso type qualifier on a reference (henceforth, an iso variable) means that no other reference (within the same actor, or globally) can read from or write to that object. The object is isolated, which is why iso is used to pass mutable data between actors.

All mutable reference capabilities deny global read/write aliases, allowing them to be written to because no other actor can read from the object. An iso reference also denies local read/write aliases, which means if the iso reference is sent to another actor, we are guaranteed that the sending actor no longer holds either read or write aliases to the object sent.

A new alias to an iso reference must therefore be neither readable nor writeable (since the meaning of iso is to deny that capability to all other references). A reference that is neither readable nor writeable is a tag reference. What use is a tag variable then? We can alias it (pass it as a parameter, assign it to other tag variables), and we can also invoke behaviours on it. tag is the default reference capability for actors (and ref is the default for classes).

What if, instead of creating a new alias to an iso reference, we wanted to pass ownership to someone else? This is what consume is used for in the code sample above: it is a destructive read which transfers ownership.

I can therefore write code like this:

be step(list1: List iso, list2: List iso, 
        flow: Dataflow tag) =>
 list1.next = consume list2 // give list1.next ownership of list2 iso
 flow.step(consume list1) // give flow.step ownership of list1 iso

Here, we mutate list1 by assigning list2 to its next field, maintaining isolation for both list1 and list1.next. Similarly, we could read from or write to fields of list1 since path traversal is allowed. This also allows calling methods on isolated references and fields of any path depth.
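The ownership transfer described above can be sketched as a complete program. This is an illustration in released Pony, not code from the paper: the Sink actor is hypothetical, and a standard library Array stands in for the paper's List.

```pony
// Hypothetical receiver that takes ownership of a mutable array.
actor Sink
  let _out: OutStream

  new create(out: OutStream) =>
    _out = out

  be take(data: Array[U64] iso) =>
    // This actor now holds the only readable and writable reference.
    data.push(3)
    _out.print(data.size().string())

actor Main
  new create(env: Env) =>
    let sink = Sink(env.out)
    // Build an isolated, mutable array.
    let data: Array[U64] iso = recover iso Array[U64] end
    data.push(1)               // methods can be called through the iso
    data.push(2)
    // sink.take(data)         // rejected by the compiler: this would alias the iso
    sink.take(consume data)    // ownership moves to `sink`; `data` is unusable afterwards
```

The array is mutated first by Main and then by Sink, yet never by both at once: after the consume, Main has no reference left through which it could read or write.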

Isolated references form static regions: mutable references reachable via the iso reference can only be reached via the iso reference, and immutable references reachable via the iso reference are either globally immutable or only reachable via the iso reference.

In addition to iso and tag there are four other reference capabilities: ref, val, box, and trn.

  • ref : a ref variable can be used to read and write the object within an actor, and other variables within the same actor can also be used to read and write the object, but access (both read and write) is denied to any other (global) actor.
  • val : a val variable is a bit like const, it denotes a variable that is globally immutable – it denies write capabilities both locally and globally (and hence by implication, allows reading from anywhere).
  • box : a box variable denies any other actors (global actors) the right to use a variable to write to the object. Other variables within the same actor may be used to write to the object, and other actors may be able to read it (but not both). “The box reference functions as a black box: the underlying object may be mutable (locally) through an alias or it may be immutable through any alias.”
  • trn : a trn variable is in transition from being locally mutable to immutable:

A trn reference makes a novel guarantee: write-uniqueness without read uniqueness. By denying global read/write aliases, but only denying local write aliases, it allows an object to be written to only via the trn reference, but read from via other aliases held by the same actor. This allows the object to be mutable while still allowing it to transition to an immutable reference capability in the future, in order to share it with another actor.
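The trn to val transition can be sketched as follows. The Printer actor and the variable names are hypothetical, and the code targets released Pony with its standard library rather than the paper's formal syntax.

```pony
// Hypothetical receiver for immutable, shareable data.
actor Printer
  let _out: OutStream

  new create(out: OutStream) =>
    _out = out

  be show(items: Array[String] val) =>
    for item in items.values() do
      _out.print(item)
    end

actor Main
  new create(env: Env) =>
    // trn: the only reference that can write, but local read aliases are allowed.
    let building: Array[String] trn = recover trn Array[String] end
    let view: Array[String] box = building    // read-only alias in the same actor
    building.push("first")
    building.push("second")
    env.out.print(view.size().string())       // reading through the box alias
    // Giving up the unique write capability makes the object immutable.
    let frozen: Array[String] val = consume building
    Printer(env.out).show(frozen)             // val is sendable
```

The box alias stays valid after the consume because it only ever allowed reading, which is compatible with the object becoming val. Had building been an iso, the box alias would not have been permitted in the first place.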

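In case you’re finding all of this hard to keep track of, I’ve compiled the table below, which lists what each capability denies to other aliases of the same object.

Capability   Denied to local aliases   Denied to global aliases   Sendable
iso          read, write               read, write                yes
trn          write                     read, write                no
ref          (nothing)                 read, write                no
val          write                     write                      yes
box          (nothing)                 write                      no
tag          (nothing)                 (nothing)                  yes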

The Pony Language Documentation has a good summary too. In the paper you’ll find the full syntax, type system, and operational semantics.

Actors introduce the question of who may read or update the actor’s fields, the possibility of synchronous calls on actors, and the type required for asynchronous calls. Field read and write requires that the actor should see itself as a ref. As a result, any other actor will see it as tag. Therefore no other actor except the current one will be allowed to observe an actor’s fields – a nice consequence of the type system. By a similar argument, because the actor sees itself as ref, any other paths that point to it will do so as box, ref, or tag, and this means that the actor may call synchronous methods on itself, provided that the receiver reference capability of the method declaration is ref, box, or tag. Interestingly, for asynchronous (behaviour) calls, the receiving actor only needs to be seen as a tag…
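The passage above can be illustrated with two actors that exchange messages. This is a sketch under assumptions: Pinger, Ponger, and their members are hypothetical names, and the code uses released Pony rather than the paper's calculus.

```pony
actor Pinger
  var _remaining: U32
  let _out: OutStream

  new create(out: OutStream, rounds: U32) =>
    _out = out
    _remaining = rounds

  be start(peer: Ponger) =>
    peer.ping(this)            // `this` is handed over as a tag

  be pong(peer: Ponger) =>
    _tick()                    // synchronous call on self: `this` is ref here
    if _remaining > 0 then
      peer.ping(this)
    else
      _out.print("done")
    end

  // A ref method: callable only where the receiver is seen as ref.
  fun ref _tick() =>
    if _remaining > 0 then
      _remaining = _remaining - 1
    end

actor Ponger
  new create() =>
    None

  be ping(from: Pinger) =>
    // from._remaining         // rejected: fields cannot be read through a tag
    // from._tick()            // rejected: a tag is not a ref
    from.pong(this)            // behaviour calls only need a tag

actor Main
  new create(env: Env) =>
    let pinger = Pinger(env.out, 3)
    pinger.start(Ponger)
```

Ponger can reply to Pinger but has no way to observe or change its counter. Only Pinger's own behaviours, which see this as ref, can call _tick.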

Table 4 in the paper provides a nice comparison to the type systems of other languages, including Erlang, Scala, and Rust.

To minimise the number of required type annotations, Pony uses default reference capabilities (tag for actors, ref for objects, val for primitives) and the compiler can infer annotations locally. In approximately 10Kloc in the standard library, 89.3% of types require no explicit annotation.

So far we’ve addressed the “Safe” part of the paper title. Now let’s briefly turn our attention to “Fast.”

Deny properties are also amenable to a highly efficient implementation. We have benchmarked our language against other actor-model languages with the CAF benchmark suite and against MPI with the HPCC RandomAccess benchmark. Results are the average of 100 runs, normalised against Erlang performance on a single core such that performance improvement linear to core count would be shown as a straight line sloping up. We chose to normalise against Erlang because it is a mature system with consistent performance across core counts, with little jitter.

The charts below show impressive performance against Scala and Erlang, in a scenario constructed to be worst-case for Pony. (Note that the CAF system is neither garbage collected nor data race free).

The full Pony language as implemented in the compiler includes additional features, such as generic types, traits, structural types, type expressions (unions, intersections and tuples), a non-null type system, sound constructors, pattern matching, exceptions, and garbage collection. The Pony runtime will eventually support distributed computation, without a reduction in single-node performance. The compiler, a web-based development sandbox, and a language tutorial are available at http://ponylang.org.