[Weekly SQLpassion Newsletter] Understanding the Meltdown exploit – in my own simple words

As you might (hopefully) know, a serious CPU exploit (called Meltdown) was exposed this week. It affects all Intel CPUs from the past 20 years and allows an attacker to read critical, private data from other applications or even from other VMs in a virtualized environment.

The terrible thing about Meltdown is that the exploit is possible because of the underlying CPU architecture and can’t be fixed within the CPU itself. It is more or less “by design”. Therefore the exploit must be fixed in the software layer, which can also slow down the throughput of specific workloads. Fortunately, Microsoft, Linux, Apple, and VMware have already provided the necessary patches for their operating systems to make sure that the Meltdown exploit can’t be used.

In this blog posting I want to give you an overview of how the Meltdown exploit works internally. There is already a really great whitepaper available which describes the inner workings of Meltdown at a deep technical level. Here I want to show you, in my own simple words, how Meltdown works and how private data can be retrieved with it.

CPU Architectures

Before we talk about the Meltdown exploit itself, we have to know some basic things about how a modern CPU is architected and how a CPU executes instructions. When you look at a CPU at a high level, it contains 2 very important components:

  • Execution Units
  • Registers

Execution Units are the brain of a CPU, because they do the real work. One example of an Execution Unit is the ALU – the Arithmetic Logic Unit – which performs arithmetic operations like adding and subtracting numbers. Most of the time when you run an application (like when I’m writing this blog posting), the ALU of your CPU is heavily involved.

In addition to the various Execution Units, the CPU also needs some scratch space where data is stored temporarily. This scratch space is provided by the various CPU registers. An x64 CPU has a lot of different registers available, in which data is stored temporarily so that the CPU can perform the actual operations on that data.

As soon as some data is needed for an operation, the data is moved from main memory (RAM) into a register, and after the operation the register content is written back into main memory. As I have written, the data in a register is only stored temporarily during an operation. Imagine now that you want to add 2 numbers which are stored in main memory:

int c = a + b;

If you take that simple C statement and compile it down to some (pseudo) assembly code, the CPU has to execute the following instructions (a concrete sketch follows after the list):

  • LOAD Register1 from MemoryLocationA
  • LOAD Register2 from MemoryLocationB
  • Register3 = ADD(Register1, Register2)
  • STORE Register3 in MemoryLocationC
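
To make this a bit more concrete, here is a minimal, self-contained version of the example above. The comments sketch roughly the x64 instructions an unoptimizing compiler could emit for the addition (Intel syntax); the exact register names and addressing modes are illustrative only and will differ between compilers.

```c
/* Minimal example: add two numbers that live in main memory.
   The comments show, roughly, the instructions an unoptimizing
   x64 compiler could emit (Intel syntax); register choices and
   addressing modes are illustrative only. */
#include <stdio.h>

int a = 2;   /* MemoryLocationA */
int b = 3;   /* MemoryLocationB */
int c;       /* MemoryLocationC */

int main(void)
{
    c = a + b;   /* mov edx, DWORD PTR [a]   ; LOAD Register1 from MemoryLocationA
                    mov eax, DWORD PTR [b]   ; LOAD Register2 from MemoryLocationB
                    add eax, edx             ; Register3 = ADD(Register1, Register2)
                    mov DWORD PTR [c], eax   ; STORE Register3 in MemoryLocationC */

    printf("c = %d\n", c);
    return 0;
}
```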

So far, so good. The problem with this simple approach is that accessing main memory introduces high latency. Main memory is based on DRAM cells, and accessing such DRAM cells takes some time – around 100 – 200 nanoseconds. Every memory access will therefore slow down your program execution.
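
You can make this latency gap visible with a few lines of C. The following is only a rough sketch (it assumes an x86-64 CPU and a GCC/Clang toolchain, and uses __rdtsc, _mm_clflush and _mm_lfence from <x86intrin.h>): it times one read of a value after it has been flushed out of the caches, and one read while it is still cached. This difference between a cache hit and a cache miss is also the timing side channel that Meltdown relies on, as we will see later.

```c
/* Sketch: measure the latency of a cache miss vs. a cache hit.
   Assumes x86-64 with GCC or Clang, using __rdtsc, _mm_clflush
   and _mm_lfence from <x86intrin.h>. The absolute cycle counts
   will vary from CPU to CPU; only the relative gap matters. */
#include <stdio.h>
#include <x86intrin.h>   /* __rdtsc, _mm_clflush, _mm_lfence */

static volatile int data = 42;

static unsigned long long time_read(volatile int *p)
{
    unsigned long long start, end;

    _mm_lfence();          /* keep earlier instructions out of the timed region */
    start = __rdtsc();
    (void)*p;              /* the memory access we want to time */
    _mm_lfence();          /* make sure the read has finished before we stop the clock */
    end = __rdtsc();

    return end - start;
}

int main(void)
{
    _mm_clflush((const void *)&data);                 /* force the next read to go to DRAM */
    unsigned long long uncached = time_read(&data);
    unsigned long long cached   = time_read(&data);   /* now served from the CPU cache */

    printf("uncached read: ~%llu cycles, cached read: ~%llu cycles\n",
           uncached, cached);
    return 0;
}
```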
