Meltdown strikes back: the L1 terminal fault vulnerability
The Meltdown CPU vulnerability, first disclosed in early January, was frightening because it allowed unprivileged attackers to easily read arbitrary memory in the system. Spectre, disclosed at the same time, was harder to exploit but made it possible for guests running in virtual machines to attack the host system and other guests. Both vulnerabilities have been mitigated to some extent (though it will take a long time to even find all of the Spectre vulnerabilities, much less protect against them). But now the newly disclosed "L1 terminal fault" (L1TF) vulnerability (also going by the name Foreshadow) brings back both threats: relatively easy attacks against host memory from inside a guest. Mitigations are available (and have been merged into the mainline kernel), but they will be expensive for some users.
Page-table entries
Understanding L1TF requires an understanding of the x86 page-table entry (PTE) format. Remember that, in a virtual-memory system, the memory addresses used by both user space and the kernel do not point directly into physical memory. Instead, the hierarchical page-table structure is used to translate between virtual and physical addresses. At the bottom level of this structure, the PTE tells the processor whether the page is actually present in physical memory, where it is, and a few other details. For a 4KB page on an x86-64 system, the PTE contains a page-frame number plus a set of control bits, described below.
The page-frame number (PFN) tells the processor where to find the page in physical memory. The other bits control which memory protection key is assigned to the page, access permissions, whether and how the page is cached, whether it is dirty, and more. All of this, though, depends on the present ("P") bit in the least-significant position. If that bit is not set, the page is not actually present in physical memory, and any attempt to reference it will generate a page fault.
For non-present pages, none of the other bits in the page-table entry are meant to be used by the processor, so the kernel can use those bits to store useful information; for example, for pages that have been swapped out, the location in the swap area is stored in the PTE. In other cases, the data left in non-present PTEs is essentially random.
Ignoring the present bit
If the present bit in a given PTE is not set, the PFN field of that PTE has no defined meaning and the CPU has no business trying to use it. So, naturally, Intel CPUs do exactly that during speculative execution (it would appear that Intel is the only vendor to make this particular mistake). During speculative execution, non-present PTEs are treated as if they were valid, so they can be used to speculatively read whatever data lives in the indicated PFN — but, importantly, only if that data is in the processor's L1 cache. The access is speculative only; the processor will eventually notice that the page is not actually present and generate a page fault instead. But, by the time that happens, the usual sorts of covert channels can be used to exfiltrate the data in whatever page the PTE might have pointed to.
Since this attack goes directly to a physical address, it can in theory read any memory in the system. Notably, that includes data kept within an SGX encrypted enclave, which is supposed to be protected from this kind of thing.
Exploiting this vulnerability requires the ability to run code on the target system. Even then, on its face, this bug is somewhat hard to exploit. Attackers cannot directly create non-present PTEs pointing to a page of interest, so they must depend on such PTEs already existing in their address space. By filling the address space with pages that will eventually get reclaimed or by playing tricks with PROT_NONE mappings, an attacker can essentially throw darts at the system and hope that one hits in an interesting place, but it's a non-deterministic process where it's even hard to tell if one has succeeded.
Nonetheless, the potential for the extraction of important secrets exists, and thus this bug must be defended against. The approach taken here is to simply invert all of the bits in a PTE when it is marked as being not present; that will cause that PTE to point into a nonexistent region of memory. The fix is easy, and the performance cost is almost zero. A quick kernel upgrade, and this problem is solved.
Virtualization
At least, the problem is solved on systems where virtualization is not in use. On systems with virtualized guests, at a minimum, those guests must also run a kernel using the PTE-inversion technique to protect against attacks. If the guests are trusted, or if they cannot install their own kernels, the problem stops here.
But if the system is running with untrusted guests and, in particular, if that system allows those guests to provide their own kernels (as many hosting services do), the situation changes. An attacker can then run a kernel that creates arbitrary non-present PTEs on demand, turning a shot-in-the-dark attack into something that can be targeted with precision. To make an attacker's life even easier, the speculative data reference bypasses the extended page tables in the guest, allowing direct access to physical memory. So an attacker who can install a kernel in a guest instance can attack the host (or other guests) with relative ease. In this context, L1TF can be seen as a limited form of Meltdown that can escape virtualization.
Protecting against hostile guests is a harder task, and the correct answer will depend on the specifics of the workload being run. The first step is to take advantage of the fact that L1TF can only read data that is in the processor's L1 cache. If that cache is cleared every time the kernel transfers control to a virtual machine, there will be no data available for the attacker to read. That is indeed what the kernel will do. This mitigation will be rather more costly, needless to say; how much it costs will depend on the workload. On systems where entries into (and exits from) guests are relatively rare, the cost will be low. On systems where those events are common, the cost could approach a 50% performance hit.
Unfortunately, just clearing the L1 cache is not a complete solution if the CPU is running symmetric multi-threading (SMT or "hyperthreads"). The threads running on that processor share the L1 cache. So, while the hostile guest is running in one thread, an unrelated process could be repopulating the L1 cache with interesting data in the other thread. That clearly reopens the can of worms.
The obvious solution here is to disable SMT, which can potentially protect against other security issues as well. But that comes with a significant performance cost of its own. It is not as bad as simply removing half of the system's processors, but, in a virtual sense, that is exactly what is happening. An alternative is to use CPU affinities to restrict guests to specific processors and to not allow anything else (including, for example, kernel functionality like interrupt handling) to run on those processors. This approach might gain back some performance for specific workloads, but it requires a lot of administrator knowledge about what those workloads are and a lot of manual configuration. It also seems somewhat error-prone.
There is another approach that can be taken to protect hosts from hostile guests: rather than do all of the above, simply disable the use of the extended page-table feature. That forces the system back to the older "shadow page table" mechanism, where the hypervisor retains the ultimate control over all PTEs. This, too, will slow things down significantly, but it provides complete protection since the attacker is no longer able to create non-present PTEs pointing to pages of interest.
As an aside, it's worth pointing out an interesting implication of this vulnerability. Virtualization is generally seen as being more secure than containers due to the extra level of isolation used. But, as we see here, virtualization also requires an extra level of processor complexity that can be the source of security problems in its own right. Systems running container workloads will be only lightly affected by L1TF, while those running virtualization will pay a heavy cost.
Kernel settings
Patched kernels will perform the inversion on non-present PTEs automatically. Since there is no real cost to this technique, there is no reason (and no ability) to turn it off. The flushing of the L1 cache on entry to virtual guests will be done if extended page tables are enabled. The disabling of SMT, though, will not be done by default; administrators of systems running untrusted guests will have to examine the tradeoffs and decide on the best approach to protect their systems. For people faced with this kind of choice, more information can be found in Documentation/admin-guide/l1tf.rst. The 4.19 kernel will contain the mitigations, of course. As of this writing, the 4.18.1, 4.17.15, 4.14.63, 4.9.120, and 4.4.148 updates, containing the fixes, are in the review process, with releases planned for August 16.
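For reference, the administrative knobs look roughly like this; the l1tf= values are taken from Documentation/admin-guide/l1tf.rst, and the sysfs file is present on kernels carrying the mitigations:

```shell
# Query the mitigation status on a patched kernel:
cat /sys/devices/system/cpu/vulnerabilities/l1tf

# Boot-time control via the kernel command line:
#   l1tf=full        flush L1D on VM entry and disable SMT
#   l1tf=flush       flush L1D, leave SMT enabled (the default)
#   l1tf=off         disable the hypervisor mitigations
```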
As was the case with the previous rounds, the mitigations for L1TF were worked out under strict embargo. The process appears to have worked a little better this time around, with no real leakage of information to force an early disclosure. One can only wonder how many more of these are known and under embargo now — and how many are yet to be discovered. It seems likely that we will be contending with speculative-execution vulnerabilities for some time yet.
Index entries for this article:
Kernel: Security/Meltdown and Spectre
Security: Hardware vulnerabilities
Security: Meltdown and Spectre
Posted Aug 14, 2018 18:32 UTC (Tue) by corbet (editor, #1) [Link]
That is my understanding, yes; this only affects Intel.
Posted Aug 15, 2018 15:31 UTC (Wed) by Curan (subscriber, #66186) [Link]
Yes, only Intel is affected, according to the kernel documentation.
Posted Aug 14, 2018 19:16 UTC (Tue) by smoogen (subscriber, #97) [Link]
This may be obvious, but are you talking about containers in any environment, or only if they are running on bare metal? Many of the container systems I have seen run them inside a virtualized environment sitting on top of bare metal, to allow one layer to do what it's best at, and the other to do something else.
Posted Aug 17, 2018 0:01 UTC (Fri) by Rearden (subscriber, #35172) [Link]
Of course some further privilege-escalation vulnerability could expose the VM host OS to this, but such a vulnerability would likely also expose all sorts of other things as well, this vulnerability being just one of many.
Big-picture security comes down to risk mitigation through a layered approach, depending on the resources available and the risk associated with a particular breach. Some future, possible privilege-escalation vulnerability must be planned for outside of the remedy for this specific vulnerability. What I mean is: if the workload and risk for a system where you own both the VM and the host OS are high enough that a compromise of one could affect important data, you probably need to be taking the steps associated with "untrusted" guest VMs anyway.
Posted Aug 14, 2018 22:37 UTC (Tue) by Sesse (subscriber, #53779) [Link]
OpenBSD “solves” this by simply not supporting hyperthreading, but that seems too heavy-handed to me.
Posted Aug 15, 2018 0:52 UTC (Wed) by ncm (guest, #165) [Link]
I wonder, though. If we zero out the empty page table entries, how do we know where to look for backing store, in swap, when we fault?
Posted Aug 15, 2018 2:03 UTC (Wed) by corbet (editor, #1) [Link]
PTEs are not zeroed out, they are bitwise inverted, so the information is still there. Sorry if that wasn't clear.
Posted Aug 15, 2018 7:09 UTC (Wed) by HIGHGuY (subscriber, #62277) [Link]
(I'm sure this was thought through, I just couldn't find why this is ok to do)
Posted Aug 15, 2018 7:41 UTC (Wed) by marcH (subscriber, #57642) [Link]
So for decades, hardware has tried really hard to hide crazy optimizations like instruction-level parallelism, out-of-order execution and, of course, speculative execution from software.
Now software is more and more hiding data from hardware to indirectly block some of that.
I just can't stop admiring the irony.
> Sorry if that wasn't clear.
It was all there, just not super mega obvious why.
Posted Aug 15, 2018 10:51 UTC (Wed) by roc (subscriber, #30627) [Link]
> Was the remote attestation protocol affected by Foreshadow?
> Yes. Using Foreshadow we have successfully extracted the attestation keys, used by the Intel Quoting Enclave to vouch for the authenticity of enclaves. As a result, we were able to generate "valid" attestation quotes. Using these counterfeit quotes, we successfully "proved" to a remote party that a "genuine" enclave was running while, in fact, the code was running outside of SGX, under our complete control.
> Is SGX long-term storage affected by Foreshadow?
> Yes. As Foreshadow enables an attacker to extract SGX sealing keys, previously sealed data can be modified and re-sealed. With the extracted sealing key, an attacker can trivially calculate a valid Message Authentication Code (MAC), thus depriving the data owner from the ability to detect the modification.
The ecosystem has to be effectively rebooted by distrusting all attestations from enclaves running on non-patched processors, and all sealed data produced by those enclaves.
This attack also allows people to bypass Intel's licensing restrictions and launch arbitrary production enclaves on non-patched processors.
Posted Aug 16, 2018 9:28 UTC (Thu) by nix (subscriber, #2304) [Link]
> This attack also allows people to bypass Intel's licensing restrictions and launch arbitrary production enclaves on non-patched processors.

Can we keep this valuable feature while blocking the rest, I wonder? (No doubt we can't.)
Posted Aug 15, 2018 12:13 UTC (Wed) by MarcB (subscriber, #101804) [Link]
So far, it has proven correct about dates (May and August) as well as impact: "Specifically, an attacker could launch exploit code in a virtual machine (VM) and attack the host system from there... Intel's Software Guard Extensions (SGX), which are designed to protect sensitive data on cloud servers, are also not Spectre-safe".
So, this provides a lower boundary.
Posted Aug 15, 2018 14:02 UTC (Wed) by adam820 (subscriber, #101353) [Link]
Even better would be to spend more time with more researchers looking for this kind of stuff before these devices ever get released. Can't catch 'em all, though.
Posted Aug 15, 2018 22:14 UTC (Wed) by marcH (subscriber, #57642) [Link]
I get your actual point, it's just that you could have chosen a real example as opposed to propagating the American security myth that confuses login and password.
Posted Aug 16, 2018 15:51 UTC (Thu) by k8to (guest, #15413) [Link]
This is mostly sane to do if you set policies that control what software can access what data, but this type of exploit is about circumventing that.
So I don't really see going private cloud as a solution for this type of problem.
You could go "non-converged" and isolate workloads, but I don't think that's on the cards.
Posted Aug 23, 2018 10:20 UTC (Thu) by davidgerard (guest, #100304) [Link]
At this stage I think it'd be remarkable if IT in general goes back to in-house hosting from the cloud providers. Renting compute just makes IT management so ridiculously easier. Particularly when you get into Terraform etc, where you can literally program what infrastructure you have. I haven't been in a machine room for over five years now, and have no plans to go back.
Posted Aug 20, 2018 17:45 UTC (Mon) by abatters (✭ supporter ✭, #6932) [Link]
The Cascade Lake server platform, shipping later this year, should contain the first round of Intel's hardware mitigations.
Anandtech: Intel at Hot Chips 2018: Showing the Ankle of Cascade Lake
Anandtech: An Interview with Lisa Spelman, VP of Intel’s DCG: Discussing Cooper Lake and Smeltdown
Posted Aug 28, 2018 20:16 UTC (Tue) by loch (guest, #113644) [Link]
I'm a bit confused, what are the other cases? In what situation would a page be non-present, but also not in swap?
Non-present, not in swap
Posted Aug 28, 2018 21:10 UTC (Tue) by corbet (editor, #1) [Link]
File-backed pages are the most common example of pages that can be non-present but not in swap.