The Short Version
The shortest version of the issue is this: Linux and Windows are undergoing major reworks to cope with security vulnerabilities that are present in the last ten years of Intel CPUs, with the Spectre exploit discovered on AMD, ARM, and Intel CPUs. Everyone is affected in at least some capacity, but the exploits affecting each vendor vary. The hardware itself is not physically compromised, but it requires software changes to close the security holes that are present. The concern from the community has been how those security updates will impact performance, as initial reports from GR Security indicated a performance deficit of between 5% and 29% in some tasks. Note that the commonly cited 29% number was derived from an i7-6700 test bench with page-table isolation, one of the proposed hardening techniques for increasing security. It was found specifically with a RAP Linux benchmark, and it is not a blanket number for all performance changes. We’ll get to that momentarily.
What Is the Vulnerability?
Project Zero, a Google team, reports that there is a system call issue with the kernel, which could lead to security vulnerabilities when virtual memory allocation is read. Project Zero reported this issue not only to Intel, but also to AMD and ARM, back in June of 2017. Two separate attacks have been developed around this security vulnerability, codenamed “Meltdown” and “Spectre.” Meltdown is a breakout attack, capable of exiting the confines of virtual environments, and Spectre is a speculative execution attack. Meltdown is the worse of the two attacks, and is presently known to affect Intel, with indeterminate effect on AMD and ARM. The team is still researching. Spectre affects Intel, AMD, and ARM alike.
Both attacks are capable of intercepting user data that is currently being read, particularly when virtual memory allocation is involved. These attacks give access to data stored in memory, which could include passwords, usernames, and other data actively moving between memory and the CPU. This is particularly concerning from an enterprise or server standpoint, as one of the attacks lets an attacker running as a KVM guest use branch prediction exploits to read the memory of the host kernel. This has severe implications for virtual machine users, primarily those who may slice servers into virtualized environments for customer use. Meltdown, for example, is capable of giving an attacker access to the full contents of the physical memory on the machine, breaking out of the boundaries of virtual machines.
Project Zero notes that Meltdown, quote, “breaks the mechanism that keeps applications from accessing arbitrary system memory,” and that Spectre “tricks other applications into accessing arbitrary memory locations in their memory,” stating further that “both attacks use side channels to obtain information from the accessed memory location.” The researchers behind Meltdown and Spectre have published papers on these exploits, and have also published an FAQ for consumers. The very first question is simple: “Am I affected by the bug?” The answer is even simpler: “Most certainly, yes.”
The researchers note that the Meltdown exploit has worked on Intel CPUs dating back to 2011, and have also noted that they are not yet clear on whether the Meltdown exploit will work on ARM and AMD processors. When we asked Intel for a statement, we were sent this page, where the company alleges that these exploits may also affect AMD and ARM. We have asked AMD for a statement, but are currently in a holding pattern. AMD did, however, publish its own short note about these exploits. As of now, AMD has mostly noted that they’re aware of the vulnerabilities, and that the company is investigating further. Critically, AMD notes that they do not think these exploits have been used in “the public domain,” though the Meltdown researchers state that they are uncertain whether Meltdown-like attacks have been deployed.
As for the Spectre attack, the team notes that this exploit has been verified on Intel, AMD, and ARM processors, and notes that it will work against nearly every type of computer – including smartphones and cloud servers.
Google has confirmed that Android is affected and has issued a security advisory about the attack; Intel and AMD have each issued a statement. Neither of the latter has much information, while Google has published some of the most detailed material on the subject, if you are interested in further reading.
Why Did This Happen?
The Meltdown whitepaper points to a root cause in branch prediction on CPUs, particularly speculative execution, which is the foundation for Spectre’s name. Speculative execution is something we have talked about before, primarily with GPU architectures. With branch prediction, a CPU or GPU guesses which commands are coming and executes them before they are explicitly issued. The idea is to reduce wait time: maybe the command never comes and the work is wasted, but the potential upside is worth it, as pipelines are sped up and tasks can execute with lower latency. In the instance of this attack, some data that should be protected can get left behind in L1 cache. The exploit is able to gain access to this orphaned information, giving attackers access to potentially sensitive data.
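To make the "left behind in cache" idea more concrete, here is a minimal toy model of a cache timing side channel. It is a sketch under loud assumptions, not a working exploit and not the researchers' code: the cache is a Python set, the latencies are made-up constants, and every name (ToyCache, speculative_victim, recover_byte, leak) is hypothetical. It only shows why discarded work can still reveal a secret through which cache line ends up fast to access.

```python
# Toy model of a cache side channel. Nothing here touches real hardware:
# the "cache" is a Python set and the "timings" are made-up constants.

CACHE_HIT_NS = 10     # hypothetical latency for a line already in cache
CACHE_MISS_NS = 200   # hypothetical latency for a line fetched from memory


class ToyCache:
    def __init__(self):
        self.lines = set()

    def flush(self):
        """Evict everything, as an attacker would before measuring."""
        self.lines.clear()

    def access(self, line):
        """Return a simulated access time and leave the line cached."""
        latency = CACHE_HIT_NS if line in self.lines else CACHE_MISS_NS
        self.lines.add(line)
        return latency


def speculative_victim(cache, secret_byte):
    """Model of work whose result is later thrown away.

    Nothing is returned to the caller, but the cache line indexed by the
    secret value stays behind in the cache.
    """
    cache.access(secret_byte)


def recover_byte(cache):
    """Time all 256 candidate lines; the single fast one reveals the byte."""
    timings = {value: cache.access(value) for value in range(256)}
    return min(timings, key=timings.get)


def leak(secret):
    """Recover a byte string one byte at a time using only timings."""
    cache = ToyCache()
    recovered = bytearray()
    for byte in secret:
        cache.flush()
        speculative_victim(cache, byte)
        recovered.append(recover_byte(cache))
    return bytes(recovered)


print(leak(b"hunter2"))
```

The attacker in this model never reads the secret directly. It only observes which of the 256 lines comes back quickly, which is the kind of side channel the researchers describe.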
This issue doesn’t come down to sacrificing security for speed, either: kernel-level security would indicate that memory should be protected by other models, like address-space layout randomization. GN Patreon backer Steve Streza was able to provide a great example of this. Quoting Steve: “[There is] address-space layout randomization, or ASLR, [which] basically re-links a program at random locations at launch time, so you can’t just say ‘give me the memory at 0xDEADBEEF because I know that’s the user’s password,’ as well as Kernel ASLR, or KASLR, which does the same thing for the kernel’s memory. Since the kernel’s memory isn’t protected anymore, it can be read at will by the attacker. It’s not a speed versus security thing; the security was supposed to be handled elsewhere.”
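The ASLR point in the quote above can be sketched with a small simulation. This is an illustration under stated assumptions, not how any real loader works: the memory is a Python dict, the offset and base ranges are invented, and launch and read are hypothetical helpers. It shows a hardcoded address working against a fixed layout, failing against a randomized one, and randomization no longer helping once the attacker has learned the layout some other way.

```python
import random

PAGE = 0x1000
SECRET_OFFSET = 0xEEF          # hypothetical offset of the secret in its image
HARDCODED_GUESS = 0xDEADBEEF   # the address used in the quote


def launch(randomize, rng):
    """Simulate one program launch and return (base, memory)."""
    if randomize:
        # Random page-aligned base. The range is kept below 0x80000000 so the
        # toy never lands on the hardcoded guess and the demo stays deterministic.
        base = rng.randrange(0x10000, 0x80000) * PAGE
    else:
        base = HARDCODED_GUESS - SECRET_OFFSET  # same base on every launch
    return base, {base + SECRET_OFFSET: "user password"}


def read(memory, address):
    """Return what lives at address, or None if the guess missed."""
    return memory.get(address)


rng = random.Random(2018)

# 1. No randomization: the hardcoded address works on every launch.
_, memory = launch(randomize=False, rng=rng)
print("fixed layout:   ", read(memory, HARDCODED_GUESS))

# 2. Randomization on: the same hardcoded address finds nothing.
base, memory = launch(randomize=True, rng=rng)
print("randomized:     ", read(memory, HARDCODED_GUESS))

# 3. Randomization on, but the attacker has learned the base some other way.
print("base was leaked:", read(memory, base + SECRET_OFFSET))
```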
What Happens Next?
Expect a large dump of information on January 9th, which is when the embargo lifts on everything that’s been kept behind closed doors. This will be the next major milestone, and will be the point at which the more general consumer community should obtain a better understanding of what’s going on and whether it changes the way their PCs perform. The most immediate steps are being taken by the hardware and software manufacturers, with Microsoft fast-tracking updates to Windows for security. This is a software-level solution, as the hardware itself is not physically compromised. If you own affected CPUs, and you do – even if it’s Spectre affecting AMD – you can expect software-level patches to help resolve some of these concerns.
ARM has already developed software solutions that can mitigate the effect of Spectre. Linux kernel virtual memory systems are already being overhauled. Microsoft is working toward a January 9th patch, and has already issued improvements to fast-ring users. The question is whether or not any of these fixes will impact performance.
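For Linux users who want to see what a patched kernel reports about itself, the sketch below reads the per-issue status files that kernels carrying these mitigations generally expose under sysfs. Treat the path as an assumption about the reader's system rather than something the sources above describe: older or unpatched kernels, and non-Linux systems, will not have it, in which case the function simply returns an empty result. The function name is our own.

```python
from pathlib import Path

# Assumption: the running kernel exposes this sysfs directory. Older or
# unpatched kernels, and non-Linux systems, will not have it.
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def read_mitigation_status(directory=VULN_DIR):
    """Return {issue_name: status_line}, or an empty dict if unavailable."""
    directory = Path(directory)
    if not directory.is_dir():
        return {}
    status = {}
    for entry in sorted(directory.iterdir()):
        try:
            status[entry.name] = entry.read_text().strip()
        except OSError:
            status[entry.name] = "unreadable"
    return status


for name, line in read_mitigation_status().items():
    print(f"{name}: {line}")
```

On a system without that directory, the loop prints nothing, which only means the kernel does not report a status this way. It says nothing about whether the machine is affected.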
Early claims by GR Security have gone through a game of telephone. Initial reports showed the potential for a 5% to 30% performance deficit in some specific tasks, with different tasks suffering in different ways. The performance loss comes by way of added latency, a natural result of adding more layers of security. What this does not mean, however, is that every CPU in every application will instantly be 30% slower. Most of the major performance slowdowns have been reported with enterprise-level software, not necessarily consumer-level software. Phoronix, for instance, has already published some preliminary gaming benchmarks on Linux, and has shown performance deltas that are largely within margin of error. This is only one operating system with a few games, of course, so there is room for other games or for Windows to be impacted in greater ways.
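A rough back-of-envelope model helps show why the same patch can be invisible in one workload and painful in another. Every number below is made up for illustration, and the model itself is our simplifying assumption: runtime is treated as time spent in the application plus a fixed cost for each trip into the kernel, and the patch is treated as adding a fixed extra cost to each of those trips. The function name and parameters are hypothetical.

```python
def slowdown_percent(user_seconds, transitions, base_cost_us, added_cost_us):
    """Percent increase in runtime when each kernel transition gets slower.

    user_seconds   time spent in application code, unaffected by the patch
    transitions    number of trips into the kernel (system calls, etc.)
    base_cost_us   cost of one transition before the patch, in microseconds
    added_cost_us  extra cost per transition after the patch, in microseconds
    """
    before = user_seconds + transitions * base_cost_us / 1e6
    after = user_seconds + transitions * (base_cost_us + added_cost_us) / 1e6
    return (after / before - 1.0) * 100.0


# Hypothetical workloads; none of these figures come from a benchmark.
game = slowdown_percent(user_seconds=10, transitions=50_000,
                        base_cost_us=0.2, added_cost_us=0.4)
server = slowdown_percent(user_seconds=8, transitions=5_000_000,
                          base_cost_us=0.2, added_cost_us=0.4)

print(f"game-like workload:   {game:.1f}% slower")
print(f"server-like workload: {server:.1f}% slower")
```

With these invented inputs, the workload that rarely enters the kernel barely moves, while the one that does so constantly lands in double digits. That is the general shape of the early reports, not a prediction for any particular application.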
What we’re most curious about is the impact to workstation and production-type applications, as they straddle the line between consumer and enterprise. We’re also unclear on whether these performance losses can be mitigated with more time; for example, if these are panicked fixes that were thrown in place, they may be inefficient solutions.
This isn’t to downplay the significance of the exploit, nor the significance of lost performance, but to restore some sanity to the discussion: When citing numbers, know what they represent – it’s not just a flat 30% hit to every application on every CPU, for instance.
We are waiting on further performance testing and information. The updates will coincide with CES, which makes it difficult to get hands-on until after the show.
Editorial: Steve Burke
Video: Andrew Coleman