Frank DENIS random thoughts.

WebAssembly doesn’t make unsafe languages safe (yet)

WebAssembly is all the rage these days, and for good reasons. It’s pretty exciting to see many compiled languages adopting a common intermediate representation that can eventually be translated to native code.

WebAssembly was designed with security in mind, and it is a perfect fit for running untrusted code in a web browser.

However, the security guarantees of WebAssembly also make it attractive to run desktop applications, kernel modules, and server-side code.

The WebAssembly memory model

The memory model plays a major role in these security guarantees. Memory is represented as a single linear block, and reading from or writing to it can only happen through dedicated load and store instructions, which systematically check for out-of-bounds accesses (minus optimizations that can elide checks without altering the guarantees).

From the host perspective, the advantages are obvious. Untrusted code can’t touch anything outside the memory region dedicated to it.

Even inherently unsafe languages such as C, when compiled to WebAssembly, immediately become “safer”: no matter what pointer arithmetic is being done, the code should not be able to escape the memory sandbox.

Ignoring side-channel attacks for a moment, hosts can thus run multiple untrusted WebAssembly applications simultaneously, without having to worry too much about them interfering with each other.

Applications only need to know about two things: how big the linear memory currently is, and how to ask the host to grow it if required.

Address 0, as seen by a WebAssembly guest application, simply represents the first byte of the linear memory. These addresses are effectively offsets that get added to a base address, kept in a dedicated register. And by design, applications have no way to change the value of that register.
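To make this model more concrete, here is a minimal sketch of how a host could represent one guest's linear memory. This is not how any particular runtime is implemented: the `linear_memory` type, the `lm_*` function names and the `MAX_PAGES` cap are all made up for illustration. The only facts taken from WebAssembly itself are the 64 KiB page size and the idea that every access is an offset checked against the current size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define WASM_PAGE_SIZE 65536u /* WebAssembly pages are 64 KiB */
#define MAX_PAGES 1024u       /* arbitrary cap chosen for this sketch */

/* Hypothetical host-side view of one guest's linear memory. */
typedef struct {
    uint8_t *base;  /* host address of guest offset 0; the guest never sees it */
    uint32_t pages; /* current size, in 64 KiB pages */
} linear_memory;

/* Grows the memory by `delta` pages.
   Returns the previous size in pages, or -1 on failure. */
int64_t lm_grow(linear_memory *m, uint32_t delta)
{
    assert(m != NULL);
    uint32_t old_pages = m->pages;
    if (delta == 0) return old_pages;
    if (delta > MAX_PAGES - old_pages) return -1;

    size_t old_size = (size_t)old_pages * WASM_PAGE_SIZE;
    size_t new_size = (size_t)(old_pages + delta) * WASM_PAGE_SIZE;
    uint8_t *p = realloc(m->base, new_size);
    if (p == NULL) return -1;

    memset(p + old_size, 0, new_size - old_size); /* new pages start zeroed */
    m->base = p;
    m->pages = old_pages + delta;
    return old_pages;
}

/* True if [offset, offset + len) lies entirely inside the memory. */
bool lm_in_bounds(const linear_memory *m, uint32_t offset, uint32_t len)
{
    uint64_t size = (uint64_t)m->pages * WASM_PAGE_SIZE;
    return (uint64_t)offset + len <= size;
}

/* Checked one-byte store; a real engine would trap instead of returning false. */
bool lm_store8(linear_memory *m, uint32_t offset, uint8_t value)
{
    if (!lm_in_bounds(m, offset, 1)) return false;
    m->base[offset] = value; /* base + offset: the only address computation */
    return true;
}

/* Checked one-byte load. */
bool lm_load8(const linear_memory *m, uint32_t offset, uint8_t *out)
{
    if (!lm_in_bounds(m, offset, 1)) return false;
    *out = m->base[offset];
    return true;
}
```

Starting from an empty memory, growing it by one page makes offsets 0 to 65535 usable, and a store at offset 65536 is rejected until the memory is grown again. The guest only ever handles offsets; `base` stays on the host side.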

From the host point of view, this is a huge win. From a guest point of view, things are slightly different.

WebAssembly and heap allocations

Guests are constrained to offsets within the big linear memory region. This is the only way they can access memory.

But applications don’t expect a single big linear memory region. Applications typically perform many dynamic allocations in a wide spectrum of sizes. Individual objects lead to individual allocations.

So, they need a memory allocator, i.e. something that will manage individual allocations within a large, linear segment.

WebAssembly hosts currently don’t provide anything like that. Guests have to bring their own memory allocator.

And this is suboptimal, to say the least. Besides having multiple implementations of the same thing, allocators being part of the client code has some serious drawbacks.

First, hosts have no visibility on how memory is being managed within a guest. Want to diagnose memory leaks? Good luck with that.

Allocators optimized for a specific platform are also very likely to be faster than a WebAssembly version. Given that memory allocations are extremely frequent in many common applications, this can be significant.

But more importantly, modern memory allocators implement significant mitigations against common bugs that can escalate to vulnerabilities.

Part of what makes a language such as C unsafe is the fact that a memory allocation is represented as a single pointer to the first byte. From here, applications are assumed to stay within the allocated range, but nothing prevents them from straying outside of it:

char *x = malloc(10);
x[15] = 42; // fine!

Since x[15] is not within the allocated area, what do we have at this address? It can be data from other allocations or internal structures from the memory allocator.

As demonstrated by the Heartbleed vulnerability, not staying within the bounds can have dramatic implications. If out-of-bounds data can be replaced with untrusted input, the control flow can change, eventually leading to arbitrary code execution. If out-of-bounds data can be read, this may lead to sensitive information disclosure.
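The read side of this problem can be sketched in a few lines. The example below is not the actual Heartbleed code: the flat `heap` array, the offsets and the `echo_payload()` function are invented for illustration, and the two "allocations" are placed back to back by hand so that the C itself stays well-defined. The bug it mimics is the same in spirit: a length supplied by the peer is trusted instead of being checked against the size of the allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for a heap: one flat array, with two allocations packed together. */
static uint8_t heap[64];

/* Hypothetical layout chosen for this sketch. */
enum { PAYLOAD_OFFSET = 0, PAYLOAD_SIZE = 8, SECRET_OFFSET = 8, SECRET_SIZE = 8 };

/* Fills the two allocations: a request payload, then an unrelated secret. */
void setup_heap(void)
{
    memset(heap, 0, sizeof heap);
    memcpy(heap + PAYLOAD_OFFSET, "hello!!!", PAYLOAD_SIZE);
    memcpy(heap + SECRET_OFFSET, "hunter2", SECRET_SIZE); /* 7 chars + NUL */
}

/* Echoes `claimed_len` bytes of the payload back into `out`.
   The bug: `claimed_len` is never compared against PAYLOAD_SIZE. */
size_t echo_payload(uint8_t *out, size_t out_cap, size_t claimed_len)
{
    /* Keeps the demo inside the array; this is not a fix for the bug. */
    assert(claimed_len <= sizeof heap - PAYLOAD_OFFSET);
    if (claimed_len > out_cap) return 0;
    memcpy(out, heap + PAYLOAD_OFFSET, claimed_len);
    return claimed_len;
}
```

Asking for 16 bytes instead of 8 returns the payload followed by the secret that happened to live next to it. Nothing in the memory layout stands in the way.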

Modern memory allocators have developed excellent mitigations against this class of bugs. As an example, OpenBSD’s omalloc randomizes allocations (leveraging mmap()), keeps data close to guard pages, and has effective detection of double-free and use-after-free.

Applications have also started to adopt stricter allocation strategies for sensitive data. Libsodium’s guarded heap allocations, and their reimplementation in other languages, have become a natural way to protect secrets from Heartbleed-like vulnerabilities.
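For readers who haven't seen a guarded allocation before, here is a rough sketch of the idea on a POSIX system. It is not libsodium's implementation or API, and `guarded_alloc()` / `guarded_free()` are hypothetical names. It only shows the core trick: place the data so that it ends right where an inaccessible page begins, so that an overflow faults immediately instead of silently touching a neighbor.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on some libcs */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns `size` bytes whose last byte sits right before a guard page,
   or NULL on failure. The returned pointer is only byte-aligned. */
void *guarded_alloc(size_t size)
{
    long ps = sysconf(_SC_PAGESIZE);
    if (ps <= 0 || size == 0) return NULL;
    size_t page = (size_t)ps;
    if (size > SIZE_MAX - 2 * page) return NULL; /* avoid overflow below */

    size_t data_pages = (size + page - 1) / page;
    size_t total = (data_pages + 1) * page; /* data pages + one guard page */

    void *region = mmap(NULL, total, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED) return NULL;

    uint8_t *guard = (uint8_t *)region + data_pages * page;
    if (mprotect(guard, page, PROT_NONE) != 0) { /* any access now faults */
        munmap(region, total);
        return NULL;
    }
    return guard - size; /* data ends exactly where the guard page starts */
}

/* Releases an allocation made by guarded_alloc() with the same `size`. */
int guarded_free(void *ptr, size_t size)
{
    assert(ptr != NULL && size != 0);
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t data_pages = (size + page - 1) / page;
    uint8_t *guard = (uint8_t *)ptr + size;
    return munmap(guard - data_pages * page, (data_pages + 1) * page);
}
```

With `char *x = guarded_alloc(10);`, writing `x[9]` works and writing `x[10]` crashes the process on the spot. This relies on the operating system's page protections, which is precisely what a guest confined to a linear memory cannot reach.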

These exploit mitigation techniques have also proven to be very effective at finding bugs in popular applications. Bugs that could have been turned into actual exploits without early detection.

A major issue with WebAssembly is that most of these mitigation techniques suddenly become impossible to implement. And the few of them that could be implemented are not, for the sake of keeping execution speed acceptable. Applications designed for WebAssembly tend to ship with very primitive malloc() implementations that are simple, small and fast, but assume bug-free code.
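As an illustration of what "primitive" can mean, here is a deliberately naive bump allocator. It is not taken from any specific toolchain; `bump_malloc()`, `bump_free()` and the fixed-size arena are assumptions made for this sketch. It hands out consecutive chunks of one flat buffer, keeps no metadata, and its free function does nothing at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ARENA_SIZE 4096u

/* Backing storage; declared as 64-bit words so chunks are 8-byte aligned. */
static uint64_t arena_storage[ARENA_SIZE / sizeof(uint64_t)];
static size_t arena_used;

/* Returns the next free chunk, or NULL when the arena is exhausted. */
void *bump_malloc(size_t size)
{
    size_t aligned = (size + 7) & ~(size_t)7; /* round up to 8 bytes */
    assert(arena_used <= ARENA_SIZE);
    if (aligned < size || aligned > ARENA_SIZE - arena_used) return NULL;

    void *p = (uint8_t *)arena_storage + arena_used;
    arena_used += aligned;
    return p;
}

/* No bookkeeping at all: double frees and invalid frees go unnoticed. */
void bump_free(void *p)
{
    (void)p;
}
```

Two 10-byte allocations end up 16 bytes apart, so a write to index 16 of the first one lands on the first byte of the second, and freeing the same pointer twice is silently accepted. Small and fast, but only correct if the application never makes a mistake.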

NULL pointers

Another surprising fact about WebAssembly is that, addresses being just offsets, address 0 is a completely valid location, which can be read or written to.

This can be a little bit disturbing, since virtually all operating systems from the past 20 years ensured that accessing NULL would immediately stop the execution flow.

A NULL pointer dereference is a common symptom of a logic error, usually due to uninitialized pointers or flawed pointer arithmetic. There is no valid reason to intentionally dereference NULL ever, and applications crashing when this happens is invaluable for developers to find and fix the relevant logic flaws.

Not to mention that a solid number of vulnerabilities classified as simple denial-of-service would have been way more critical if the application didn’t crash on accessing NULL.

The program is not in an expected state, and its execution has become unpredictable. Immediately stopping its execution is by all means the best thing to do.

A NULL pointer dereference suddenly becoming a silent operation in WebAssembly is concerning.
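The following sketch models the situation from the guest's side. The `guest_ptr` type, the `GUEST_NULL` constant and the helper functions are hypothetical names, and the fixed 64 KiB array only stands in for a one-page linear memory. The point is simply that the bounds check has no reason to treat offset 0 differently from any other offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a one-page linear memory. */
static uint8_t guest_memory[65536];

/* In the guest, a pointer is just an offset into the linear memory. */
typedef uint32_t guest_ptr;
#define GUEST_NULL ((guest_ptr)0)

/* Bounds-checked store: the only rule is "stay inside the memory". */
bool guest_store8(guest_ptr p, uint8_t value)
{
    if (p >= sizeof guest_memory) return false; /* would trap in a real engine */
    guest_memory[p] = value;
    return true;
}

/* Bounds-checked load. */
bool guest_load8(guest_ptr p, uint8_t *out)
{
    assert(out != NULL);
    if (p >= sizeof guest_memory) return false;
    *out = guest_memory[p];
    return true;
}
```

Both `guest_store8(GUEST_NULL, 42)` and the matching load succeed. What would have been a segmentation fault in a native process turns into a regular write to the first byte of memory.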

Host vs guest safety

There is a paradox here. We have a fantastic and highly secure execution environment from a host perspective. But from the guest perspective, that very same environment looks like MS-DOS, where memory is a giant playground with no rules.

Developers cannot leverage the tools they are used to in order to ensure that heap allocations are used safely any more. This just doesn’t work in WebAssembly, and the WebAssembly host can’t help either.

As a result, applications specifically written for WebAssembly in unsafe languages are likely to be less reliable than if they had been designed for a native environment.

But more importantly, vulnerabilities that would have been mitigated in a native environment are not mitigated any more. To some extent, this is a significant regression.

Is it an issue for web browsers? Not so much, as they already trust foreign code, and the only secrets they might leak are their own.

Is it an issue for desktop applications and kernel modules? Definitely, if they process untrusted data, which is one of the main justifications for going through a WebAssembly transform.

The idea of running WebAssembly server-side relies on the fact that if WebAssembly is safe to run in a web browser, it should be safe to run on servers as well. However, for the reasons listed above, this is not necessarily the case. While escaping the sandbox itself may be difficult to achieve, application security doesn’t benefit from the mitigations commonly found in traditional environments.

So, what can we do?

With the goal of providing better support for languages other than C, C++ and Rust, allocating and manipulating garbage-collected objects is a feature being actively considered for inclusion in the WebAssembly design.

However, there would be clear benefits in also delegating basic dynamic memory management to the host.

Technically, this can be implemented already. However, for successful adoption, a standard interface has to be defined. Granted, I haven’t closely followed the recent WebAssembly proposals, but I don’t think this has been considered so far.
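To give an idea of what such an interface could look like, here is a purely hypothetical sketch. No such standard exists as far as I know, and `host_malloc()`, `host_free()` and `host_live_allocations()` are names invented for this post. The allocator is trivial (it never reuses memory), because the interesting part is elsewhere: the bookkeeping lives on the host side, out of the guest's reach, so the host can detect double frees and report leaks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GUEST_MEM_SIZE 65536u /* one page of guest memory, for the sketch */
#define MAX_ALLOCS 64

typedef enum {
    HOST_FREE_OK,
    HOST_FREE_DOUBLE_FREE,
    HOST_FREE_INVALID_POINTER
} host_free_result;

/* Host-side record of one allocation; the guest cannot read or alter it. */
typedef struct {
    uint32_t offset;
    uint32_t size;
    bool live;
} host_alloc_record;

static host_alloc_record records[MAX_ALLOCS];
static size_t record_count;
static uint32_t next_offset = 16; /* offset 0 is never handed out */

/* Hypothetical import: returns a guest offset, or 0 on failure. */
uint32_t host_malloc(uint32_t size)
{
    assert(next_offset % 16 == 0 && next_offset <= GUEST_MEM_SIZE);
    if (size == 0 || record_count == MAX_ALLOCS) return 0;
    if (size > GUEST_MEM_SIZE - next_offset) return 0;

    uint32_t offset = next_offset;
    records[record_count++] = (host_alloc_record){ offset, size, true };
    next_offset += (size + 15u) & ~15u; /* keep chunks 16-byte aligned */
    return offset;
}

/* Hypothetical import: the host validates the offset before releasing it. */
host_free_result host_free(uint32_t offset)
{
    for (size_t i = 0; i < record_count; i++) {
        if (records[i].offset == offset) {
            if (!records[i].live) return HOST_FREE_DOUBLE_FREE;
            records[i].live = false;
            return HOST_FREE_OK;
        }
    }
    return HOST_FREE_INVALID_POINTER;
}

/* What a host could use to diagnose leaks in a guest. */
size_t host_live_allocations(void)
{
    size_t n = 0;
    for (size_t i = 0; i < record_count; i++) n += records[i].live;
    return n;
}
```

Freeing the same offset twice returns `HOST_FREE_DOUBLE_FREE`, and freeing an offset that was never allocated returns `HOST_FREE_INVALID_POINTER`. A real host could choose to terminate the guest in both cases, much like hardened native allocators do.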

In the meantime, keep in mind that while WebAssembly is a huge step forward, it is not a silver bullet, and running it in a non-browser environment requires extra security considerations.