I’m writing a simulation, or rather, I’m procrastinating, and this blog post is the result of me going off on a side-track from the main quest.
The simulation involves a bunch of tasks that go through a series of
steps with delays in between, and each step can affect some shared
state. I want it to run in fake virtual time so that the delays are
just administrative updates to variables without any real
sleep()ing, and I want to ensure that the mutations happen in the
right order.
I thought about doing this by representing each task as an enum State with a big match state to handle each step. But then I
thought, isn’t async supposed to be able to write the enum State and
match state for me? And then I wondered how much the simulation
would be overwhelmed by boilerplate if I wrote it using async.
Rather than digging around for a crate that solves my problem, I thought I would use this as an opportunity to learn a little about lower-level async Rust.
Turns out, if I strip away as much as possible, the boilerplate can fit on one side of a sheet of paper if it is printed at a normal font size. Not too bad!
But I have questions…
- async fn-damentals
- pin a task
- noop context
- primops, generally
- primops, minimally
- contexts and wakers
- primops, commandingly
- primops, yieldingly
- fake sleep in action
- questions
async fn-damentals
My starting point was to write:
async fn deep_thought() -> u32 {
42
}
fn main() {
deep_thought();
}
When I call deep_thought() I immediately get a Future<Output = u32>.
As the compiler warns, none of the code in deep_thought() runs, it
just constructs a value of an ineffable type which contains the
initial state of deep_thought()’s state machine.
To actually run it, I need to poll() it. The Future::poll()
method has a signature that immediately presents a number of
obstacles:
fn poll(
self: Pin<&mut Self>,
ctx: &mut Context<'_>,
) -> Poll<Self::Output>
pin a task
Unlike normal Rust data structures, a Future can contain references
to itself. (In a Rust function, variables can refer to other
variables, and a Future contains (roughly speaking) function
activation frames, hence it can be self-referential.) So, whereas
normal Rust data types can be moved, a Future must stay at the same
address even when it is not borrowed. The Pin type is used to
immobilize a Future.
For my purposes it’s easiest to Pin the Future in a Box on the
heap. I’ll define a struct Task to wrap the Pinned Future so
that I can define a couple of methods on it. (More elaborate async
frameworks usually have more layers between their version of Task
and its Future.)
This wrapper is generic over the ineffable Fut type and its
ultimate return type Out (which for deep_thought() is u32).
struct Task<Fut> {
future: Pin<Box<Fut>>,
}
impl<Fut, Out> Task<Fut>
where
Fut: Future<Output = Out>,
{
fn spawn(future: Fut) -> Self {
let future = Box::pin(future);
return Task { future };
}
}
Constructing a Task looks like,
let mut task = Task::spawn(deep_thought());
noop context
The second argument to poll() is a Context, which is a
wrapper around a Waker.
The simplest way to make a Context is by using Waker::noop(),
which is enough for us to get deep_thought() to actually run. As in
the fake-time simulation that I am procrastinating, 7.5 million years
pass in the blink of an eye.
let mut ctx = Context::from_waker(Waker::noop());
match task.future.as_mut().poll(&mut ctx) {
Poll::Pending => {
todo!();
}
Poll::Ready(answer) => {
println!("the answer is {answer}");
}
}
primops, generally
An async function can call another async function, and (in async code
just like in normal code) the called async function does nothing but
return a Future. To make it do something, the caller needs to
.await the Future. Under the covers .await compiles down to
poll()ing the Future.
A chain of async .await calls bottoms out in a primitive operation
that interacts with the outside world. A primitive async operation is
an impl Future state machine data structure, written manually
instead of relying on compiler trickery.
Typically, a primitive Future will be poll()ed twice:
-
The first time, it arranges for the operation to happen then returns
Poll::Pending. The async executor suspends thisTaskwhile the operation proceeds. -
After the operation is complete the async executor resumes the
Taskbypoll()ing it, which immediately becomes a secondpoll()on on the primitiveFuture. This time it returnsPoll::Ready()with the result of the operation, which becomes the value returned by.await.
A primitive Future can implement a more complicated state machine
that needs to be poll()ed more, but twice is the minimum necessary
to actually suspend a Task.
primops, minimally
Continuing my approach of doing the least possible thing to illustrate
a point, here’s a stub Future that pretends to sleep. Its trivial
state machine is encoded in the delay value: if it’s zero, the
Future continues without suspending; if it’s non-zero, the Future
suspends, but first resets the delay so that next time it will
continue.
struct Sleep(u32);
impl Future for Sleep {
type Output = ();
fn poll(
mut self: Pin<&mut Self>,
_: &mut Context<'_>
) -> Poll<()> {
if self.0 > 0 {
self.0 = 0;
return Poll::Pending;
} else {
return Poll::Ready(());
}
}
}
As an example of using it, deep_thought() can pretend to spend a
long time by constructing a Sleep() object (which is our minimal
state machine) then .await it to invoke poll().
async fn deep_thought() -> u32 {
Sleep(7_500_000).await;
42
}
And the main loop now needs to poll() the Task twice to run it to
completion.
loop {
match task.future.as_mut().poll(&mut ctx) {
Poll::Pending => {
println!("sleeping for 7.5 million years...");
}
Poll::Ready(answer) => {
println!("the answer is {answer}");
return;
}
}
}
contexts and wakers
In that minimal proof-of-concept, the fake Sleep primitive does not
actually do anything other than suspend the Task, and the top-level
async executor loop blithely assumes it knows why the Task was
suspended.
The purpose of the Context and its inner Waker is to allow a
primitive Future to communicate with the async executor loop: to
arrange for the operation to happen, and suspend the Task while the
operation proceeds.
So for my fake Sleep to account for the passing of fake time, I need
to construct my own Waker that does something more useful than
Waker::noop().
I believe the design intent is that a Waker is roughly speaking a
wrapper round a smart pointer that refers to the current Task. When
a primitive Future suspends a task, it stashes the Waker with the
operation in progress. When the operation completes, the Waker is
told to wake() its Task, which puts it back on the async
executor’s loop to be poll()ed.
To make a Waker, I need to make a RawWaker:
pub const unsafe fn Waker::from_raw(
waker: RawWaker
) -> Waker;
pub const fn RawWaker::new(
data: *const (),
vtable: &'static RawWakerVTable
) -> RawWaker;
This is dismaying, it’s like hand-rolled object-oriented C. Instead
of a type-safe dyn Trait, I have to cruft something together from
a raw pointer, a list of functions in a struct, and unsafe code.
At this point I got stuck, despondently trying to work out how my
Tasks and executor loop should refer to each other, and what kind of
smart pointer I can smuggle through a raw *const() pointer.
Eventually I realised there’s a simpler way.
primops, commandingly
There are a couple of ways that a primitive Future can arrange for
an operation to happen:
-
It can immediately make system calls and mutate global data structures to fire off the operation, before suspending itself by returning
Poll::Pending. This requires that difficult tangle of smart pointers. -
Or instead it can suspend itself first, returning a command that the async executor will carry out on the
Task’s behalf. This is awkward becausePoll::Pendingcannot carry a payload. However, theContextprovides a side-channel that I can use to smuggle out a return value.
In imaginary safe Rust, a Task can return a Command roughly as
follows:
- The executor loop prepares a place-holder variable for the command.
let mut cmd = Command::Run;
- It
poll()s theTask, passing a mutable borrow of the command.
let p = task.future.as_mut().poll(&mut cmd);
- When a primitive
Futurewants to perform an operation, it overwrites the command before suspending theTask.
fn poll(
mut self: Pin<&mut Self>,
cmd: &mut Command
) -> Poll<()> {
*cmd = Command::Example;
return Poll::Pending;
}
- When the async executor loop gets
Poll::Pendingfrompoll(), it looks at the command to decide what to do with theTask.
In real Rust I need to smuggle the borrowed &mut cmd through the
RawWaker’s raw *const() pointer.
Since I’m not using the Waker to revive the Task when its
operation completes, I can reuse the RawWakerVTable from
Waker::noop().
primops, yieldingly
I’ll define a Yield type that combines the primitive commands and
Poll::Ready() in one enum, and I’ll fix the top-level task’s
return type to Future<Output = ()>. (Too much boilerplate is needed
to keep the Output type generic.)
“Yield” has a dual meaning: the result returned (yielded) from an
activity; and the task relinquishing (yielding) the CPU.
#[derive(Copy, Clone, Debug)]
enum Yield {
Run,
Sleep(u32),
// maybe other commands here
Done(),
}
The async executor loop calls poll() on a Task, which creates a
place-holder Yield and stashes a pointer to it in a fresh Context.
impl<Fut> Task<Fut>
where
Fut: Future<Output = ()>,
{
fn poll(&mut self) -> Yield {
let mut yld = Yield::Run;
let data = &mut yld as *mut Yield as *const ();
let vtable = Waker::noop().vtable();
let waker = unsafe { Waker::new(data, vtable) };
let mut ctx = Context::from_waker(&waker);
match self.future.as_mut().poll(&mut ctx) {
Poll::Pending => yld,
Poll::Ready(()) => Yield::Done(),
}
}
}
The Yield type is also used as the direct representation of a
primitive Future. An async function constructs a Yeild and
.awaits it, which causes the Yield to be returned via the
Context to the async executor’s loop. Before suspending, the
Future Yield is reset to Yield::Run so that execution continues
the next time the Task is poll()ed. (Analogous to resetting the
Sleep delay to zero in the previous example.)
impl Future for Yield {
type Output = ();
fn poll(
mut self: Pin<&mut Self>,
ctx: &mut Context<'_>,
) -> Poll<Self::Output> {
if let Yield::Run = *self {
return Poll::Ready(());
} else {
let yld = ctx.waker().data() as *mut Yield;
let yld = unsafe { yld.as_mut().unwrap() };
*yld = *self;
*self = Yield::Run;
return Poll::Pending;
}
}
}
There’s more discussion of the unsafe code below.
fake sleep in action
The async executor loop needs to carry out the commands Yielded by
its tasks. The classic data structure for timers is a min-heap keyed
on the wake-up time; fake time is just a normal timer queue without
any actual sleeping or delays between wake-up times.
After augmenting my Task type with a wake-up time, I can write my
main program roughly like this sketch:
let mut tasks = BinaryHeap::new();
for i in 1..=TASKS {
tasks.push(Task::spawn(activity(i, LIMIT)));
}
while let Some(mut task) = tasks.pop() {
match task.poll() {
Yield::Sleep(delay) => {
task.wake_up += delay;
tasks.push(task);
}
Yield::Done() => {
// drop completed task
}
yld => panic!("unexpected {yld:?}"),
}
}
The main program spawns some activity()s that sleep in a loop for
differing amounts of time. They report their progress to stdout. For
this demo I want the tasks to print synchronously (no async IO!) to
illustrate the progress of their state machines.
This demo is greatly simplified but roughly the same shape as the fake-time simulation that I’m procrastinating.
async fn activity(delay: u32, stop: u32) {
let mut now = 0;
println!("{now} {delay} start");
loop {
Yield::Sleep(delay).await;
now += delay;
if now < stop {
println!("{now} {delay} continue");
continue;
} else {
println!("{now} {delay} return");
return;
}
}
}
You can see the complete demo in action at the Rust playground. Task 1 wakes up every tick, task 2 every other tick, etc.
questions
I don’t know why a Waker isn’t just an abstract generic type
parameter with some trait bounds, so that I could define it using safe
code. As far as I can tell the language and the standard library don’t
depend on its exact shape, so I would expect the details to be punted
to async runtime libraries. I guess there’s something I’m missing that
requires the standard library to partially restrict the shape of a
Waker.
There are some weaknesses in my unsafe code. Miri says the code is
OK, which agrees with my handwavy correctness argument by analogy with
a mutable borrow. However I’m not certain that the compiler is
guaranteed to know that yld can be mutated by poll().
An alternative might be to return the mutated Yield from
Task::poll() by reconstructing the &mut Yield reference from the
Context in the same manner as Yield::poll(). But then I’m not
certain the compiler will know that the borrowed yld needs to live
all the way to the end of the function.
For now I’ve chosen the shorter code.
Having learned how to do it myself, I’m curious to hear of crates that already solve this problem.
Tony Finch – blog