
"async" support in node.js

koush edited this page Apr 26, 2013 · 25 revisions

Intro/Disclaimer

I'm not a V8/node.js/JavaScript guru by any means. I just picked up node.js recently, as it was a great solution for a project that needed tens of thousands of long-running connections. Please go easy on me if my code is not wonderful. :) Continuing on...

Non-Blocking Code is Messy

node.js is a non-blocking i/o platform for JavaScript. Non-blocking i/o is great, because it lets you build highly concurrent systems that run fast with minimal overhead. The downside is that the code often turns into a mess of nested callbacks.

For example, let's look at a common scenario where an HTTP request comes in, a database lookup is performed, and some data is returned:

var db = require('somedatabaseprovider');
app.get('/price', function(req, res) {
  db.openConnection('host', 12345, function(err, conn) {
    conn.query('select * from products where id=?', [req.param('product')], function(err, results) {
      conn.close();
      res.send(results[0]);
    });
  });
});

This is a fairly simple scenario, and it already requires 2 levels of nested callbacks. In some of my projects, it's not uncommon to be 7 or 8 indents deep. A common way to alleviate this problem in other languages is to use the "yield" operator to suspend program flow when an asynchronous operation starts, and then resume it by calling "next" on the iterator when the operation completes. This is a bit of a clunky solution, however.
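To make the yield-based approach concrete, here is a minimal sketch of it, assuming a JavaScript engine with ES2015 generators. Everything in it is made up for illustration: lookupPrice is a hypothetical async operation, and thunk and run are small helpers, not part of node.js or of this fork.

```javascript
// Hypothetical async operation: reports a price through a node-style
// callback (err, value). The timer stands in for real i/o.
function lookupPrice(product, callback) {
  setTimeout(function () {
    callback(null, { product: product, price: 42 });
  }, 10);
}

// Wrap a callback-taking call as a "thunk": a function that only needs the callback.
function thunk(fn, ...args) {
  return function (callback) {
    fn(...args, callback);
  };
}

// Minimal runner: drives the generator, resuming it each time a thunk completes.
function run(generatorFn, done) {
  const it = generatorFn();
  function step(err, value) {
    let result;
    try {
      // Errors are thrown back into the generator so it can use try/catch.
      result = err ? it.throw(err) : it.next(value);
    } catch (e) {
      return done(e);
    }
    if (result.done) return done(null, result.value);
    result.value(step); // the yielded thunk calls step when the operation completes
  }
  step();
}

run(function* () {
  const a = yield thunk(lookupPrice, 'apple');
  const b = yield thunk(lookupPrice, 'pear');
  return a.price + b.price;
}, function (err, total) {
  console.log(err || total); // logs 84
});
```

The code reads top to bottom, but every async call has to be wrapped and yielded, and something like run has to drive the iterator. That extra machinery is the clunky part.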

C# version 5 has solved this quite elegantly with its implementation of async/await. This seemed like an excellent language feature to bring to V8/node.js. Here is what the previous code looks like when rewritten with async/await:

var db = require('somedatabaseprovider');
app.get('/price', $function(req, res) {
  await err, conn = db.openConnection('host', 12345);
  await connErr, results = conn.query('select * from products where id=?', [req.param('product')]);
  conn.close();
  res.send(results[0]);
});

The async/await keywords let you write asynchronous code in a synchronous manner. Under the hood, the parser still generates the same abstract syntax tree (AST) as the original code. At first glance, you may think "big deal, that's not that much less code." But consider what happens when you begin to deal with exception handling across callbacks:

function magic() {
  try {
    // code here
    doSomething(function(foo, bar) {
      try {
        // more code here
        doAnotherThing(function(baz, boo) {
          try {
            // do even more stuff here
          }
          catch (e) {
            // handle the error      
          }
        });
      }
      catch (e) {
        // handle the error      
      }
    });
  }
  catch (e) {
    // handle the error      
  }
}

It quickly turns into an absolute mess. Then consider the awaited version of this code.

$function magic() {
  try {
    // code here
    await foo, bar = doSomething();
    // more code here
    await baz, boo = doAnotherThing();
    // do even more stuff here
  }
  catch (e) {
    // handle the error
  }
}

As mentioned earlier, the modified V8 parser still creates (roughly) the same AST as the non async/await version. But it also handles exceptions across the two callbacks.

Another situation where this is quite useful is a loop with asynchronous code that must run in a serialized fashion. A typical example would be using many of the GDATA APIs, which limit responses to 1000 items at a time, requiring one to do paginated requests. This gets unnecessarily tricky when using normal callback patterns in JavaScript.
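For contrast, here is a small sketch of why stock JavaScript needs a try/catch per callback: an outer try block has already exited by the time a deferred callback runs. The queue and the later function are hypothetical stand-ins for the event loop, used only to keep the example deterministic.

```javascript
// Tiny stand-in for an event loop: callbacks are queued, then run later.
const queue = [];
function later(callback) { queue.push(callback); }

let caughtBy = 'nobody';
try {
  later(function () {
    try {
      throw new Error('boom');
    } catch (e) {
      caughtBy = 'inner handler';
    }
  });
} catch (e) {
  caughtBy = 'outer handler'; // never reached: the try block has already exited
}

// Run the queued callbacks after the outer try/catch has finished.
while (queue.length) queue.shift()();
console.log(caughtBy); // logs "inner handler"
```

Without the inner try/catch, the error would escape from the loop that drains the queue, and the outer handler would never see it.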

function getAllData(callback) {
  var allData = [];
  function getDataChunk(start) {
    getRemoteDataChunk(start, function(next, data) {
      allData = allData.concat(data);
      if (next) {
        getDataChunk(next);
      }
      else {
        callback(allData);
      }
    });
  }
  getDataChunk(0);
}

// and here's the intuitive async/await code
$function getAllData(callback) {
  var allData = [];
  var start = 0;
  while (true) {
    await next, data = getRemoteDataChunk(start);
    allData = allData.concat(data);
    if (!next)
      break;
    start = next;
  }
  callback(allData);
}

The Syntax (and how it works)

The way this was implemented, one can think of "await" as an asynchronous "var". It declares variables that become available once the callback returns. Upon encountering an await, the function suspends and hands control back to its caller, then resumes once the callback fires, with those variables available. These variables are actually the callback function's arguments.

For example, the following pairs of lines are equivalent:

// these two lines are equivalent
await foo = bar();
bar(function(foo) {} );
// these two lines are equivalent
await a, b, c, d = moo(1, 2, 3);
moo(1, 2, 3, function(a, b, c, d) {} );

The "await" keyword assumes that the right-hand side of the assignment is a function call, and that the generated callback should be passed as its last argument. The await keyword can only be used inside of an "async" function, which is declared with "$function":

$function i_can_await() {
  await foo = bar();
}
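To see what the equivalence means for the code that follows an await line, here is a plain-JavaScript sketch that runs on stock node.js. The moo and example functions are hypothetical. It only illustrates the semantics; the fork itself uses coroutines in V8 rather than rewriting the source like this.

```javascript
// Hypothetical callback-taking function: the callback is its last argument.
function moo(x, y, callback) {
  callback(x + y, x * y); // invoked synchronously here to keep the sketch short
}

// Plain-JavaScript reading of:
//   $function example(done) {
//     await sum, product = moo(2, 3);
//     done(sum + product);
//   }
function example(done) {
  moo(2, 3, function (sum, product) {
    // Everything after the await line ends up inside the generated callback.
    done(sum + product);
  });
}

example(function (result) {
  console.log(result); // logs 11
});
```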

The Implementation

async/await is not a new concept in JavaScript by any means, as callback nesting has been a pain point for quite a while. The implementations I am aware of are source-to-source translators (see TameJS). This approach has its upsides and downsides. The upside is that the code can run in any browser just fine (since it is ultimately translated to normal JavaScript). The downside is that it is disorienting to debug, since you are then debugging generated code, and not your original code.

This implementation is not source-to-source, but an actual extension of the language (made available by implementing coroutines in V8). So you can use ndb, node-inspector, etc. to debug it like any other node.js application. Given that one can control a server's deployment environment, deploying a custom version of node.js with the extensions to V8 is not a far-fetched scenario.

Building

Assuming you use node.js, you have most likely built it from source. Use the fork found here: https://github.com/koush/node, on the async-v0.10.x branch.

More Tests and Samples

Here are the tests I run to make sure I didn't break node.js/v8 (much). https://gist.github.com/1249927

TODO

Lots of other stuff too. Currently every branch is turned into a coroutine, even if there is no await statement in the code block. This needs to be optimized. I also want to add support for awaiting multiple callbacks at the same time, such as:

await foo = bar(), baz = moo();

That and clean up the code.
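Until that exists, waiting on several callbacks at once has to be done by hand. Here is a rough sketch of the idea in plain JavaScript. The awaitAll helper is hypothetical and not part of the fork, and bar and moo are stand-ins that take only a callback.

```javascript
// Hypothetical helper: start every task at once, call done when all have reported.
// Each task is a function that takes a single callback argument.
function awaitAll(tasks, done) {
  const results = new Array(tasks.length);
  let pending = tasks.length;
  if (pending === 0) return done(results);
  tasks.forEach(function (task, i) {
    let called = false;
    task(function (...args) {
      if (called) return; // ignore a callback that fires twice
      called = true;
      results[i] = args;  // keep every callback argument, in task order
      if (--pending === 0) done(results);
    });
  });
}

// Stand-ins for bar() and moo(); the timers simulate i/o finishing out of order.
function bar(callback) { setTimeout(function () { callback('foo value'); }, 20); }
function moo(callback) { setTimeout(function () { callback('baz value'); }, 5); }

awaitAll([bar, moo], function (results) {
  const foo = results[0][0];
  const baz = results[1][0];
  console.log(foo, baz); // logs "foo value baz value"
});
```

Results are stored by task position rather than completion order, so foo and baz line up with bar and moo even though moo finishes first.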
