Intel processor design flaw forces Linux, Windows redesign

Thom Holwerda 2018-01-03 Intel 57 Comments

A fundamental design flaw in Intel’s processor chips has forced a significant redesign of the Linux and Windows kernels to defang the chip-level security bug.
Programmers are scrambling to overhaul the open-source Linux kernel’s virtual memory system. Meanwhile, Microsoft is expected to publicly introduce the necessary changes to its Windows operating system in an upcoming Patch Tuesday: these changes were seeded to beta testers running fast-ring Windows Insider builds in November and December.

Crucially, these updates to both Linux and Windows will incur a performance hit on Intel products. The effects are still being benchmarked, however we’re looking at a ballpark figure of five to 30 per cent slow down, depending on the task and the processor model. More recent Intel chips have features – such as PCID – to reduce the performance hit.

That’s one hell of a bug.

About The Author

Thom Holwerda

Follow me on Mastodon @thomholwerda@exquisite.social

57 Comments

2018-01-03 1:03 am
BushLin
From the article, quoting a developer.
“There is presently an embargoed security bug impacting apparently all contemporary [Intel] CPU architectures that implement virtual memory…”
Maybe this is lost in translation but I understood the 286 to introduce Virtual Memory so this presumably affects every Intel CPU made since the 1970s? Yikes

2018-01-03 1:25 am
dnebdal
From the article, quoting a developer.
“There is presently an embargoed security bug impacting apparently all contemporary [Intel] CPU architectures that implement virtual memory…”
Maybe this is lost in translation but I understood the 286 to introduce Virtual Memory so this presumably affects every Intel CPU made since the 1970s? Yikes
AMD made an LKML post about this, explaining in more detail what sort of bug this is and that they’re not vulnerable.
If I understand it correctly (big if), the problem is in the way intel CPUs do speculative execution. Apparently, the speculative branches don’t fully respect memory protection, and someone has found a way to turn that into real-world effects. It seems to have been surprising that this was possible.
LKML link: https://lkml.org/lkml/2017/12/27/2
Edited 2018-01-03 01:29 UTC

2018-01-03 1:30 am
BushLin
Was just about to edit, yes; so just every Intel x86 from the Pentium Pro onwards, not so bad.
Edited 2018-01-03 01:41 UTC

2018-01-03 1:30 am
bhtooefr
Contemporary CPUs, though, and it’s implied that it’s related to speculative execution (in the old definition where it meant branch prediction and executing ahead of a stall).
That means 286, 386, and 486 cannot be affected, as they don’t have branch prediction – they just stall on branches where the available data isn’t present.
P5 Pentium and Bonnell Atom could be affected, but being in-order execution, are less likely to be affected even if the bug is present, they can’t get as far.
And, the major changes to the memory model, AFAIK, were 286 (added segmented MMU), 386 (32-bit MMU with flat addressing), P6 (36-bit segmentation added to MMU), Dothan (NX), Prescott (hackish 40-bit EM64T implementation), and Core 2 (full 48-bit, IIRC, EM64T implementation).
Here’s my guesses as far as where the bug would’ve likely been introduced:
* P6 (Pentium Pro) – if that’s the case, Pentium 4 and all Atoms/Atom-derived CPUs are likely unaffected, as they were separate clean-sheet redesigns (although elements were exchanged between designs)
* Dothan Pentium M – if that’s the case, Prescott Pentium 4s are possibly unaffected, Atoms/Atom-derived CPUs are likely unaffected, pre-Prescott Pentium 4s are almost certainly unaffected
* Prescott Pentium 4 – if that’s the case, then everything with NX support or everything with AMD64 support is likely affected (it wouldn’t be a big deal if only some old P4s were affected, after all, which would mean that some design reuse from P4 to later CPUs happened in the MMU)
* Core 2 – if that’s the case, then Atoms/Atom-derived CPUs are likely unaffected
* Something later – same deal about Atoms/Atom-derived CPUs likely being unaffected.

2018-01-03 1:14 am
Licaon_Kter
Phoronix already has some benchmarks for this on Linux: https://www.phoronix.com/scan.php?page=article&item=linux-415-x86pti…
2018-01-03 3:22 am
Brendan
Hi,
The minor timing quirk in Intel CPUs (that does not break documented behaviour, expected behaviour or any guarantee, and therefore can NOT be considered a bug in the CPU); allows an attacker to determine which areas of kernel space are used and which aren’t.
It does not allow an attacker to read or modify the contents of any memory used by the kernel, and doesn’t even tell the attacker what the areas of kernel space are being used for, and by itself is not a security problem at all. It only means that if there are massive security holes somewhere else, those massive security holes might or might not be a little bit easier to exploit. In other words; the main effect is that it makes “kernel address space randomisation” more ineffective at providing “security through obscurity” than it previously was.
Note that the insane hackery to avoid this non-issue adds significant overhead to kernel system calls; ironically, making the performance of monolithic kernels worse than the performance of micro-kernels (while still providing inferior security than micro-kernels). The insane hackery doesn’t entirely fix the “problem” either (a small part of kernel must remain mapped, and an attacker can still find out where in kernel space that small part of the kernel is and use this information to infer where the rest of the kernel is).
Fortunately the “malicious performance degradation attack” (the ineffective work-around for the non-issue) is easy for end users to disable.
– Brendan

2018-01-03 4:32 am
panzi
In any case, would that be exploitable via JavaScript? If not I don’t care at all. Anything else I run already deliberately on my machine and it can access all my files anyway. And that is what matters to me, my files. Root cannot do more damage to me than a user process.

2018-01-03 4:58 am
Brendan
Hi,
In any case, would that be exploitable via JavaScript? If not I don’t care at all. Anything else I run already deliberately on my machine and it can access all my files anyway. And that is what matters to me, my files. Root cannot do more damage to me than a user process.
If you have poorly designed hardware (e.g. that is susceptible to “rowhammer”) and a poorly designed kernel (e.g. a monolithic kernel where there’s a strong relationship between virtual addresses used by the kernel and physical addresses); then in theory this minor timing quirk can make it a little easier for an attacker to exploit huge gaping security holes that should never of existed.
Javascript probably can’t use the minor timing quirk (it relies on a strict access pattern involving the use of a “dodgy pointer” that will cause a page fault; and Javascript is designed not to support pointers or raw addresses); so an attacker using Javascript will exploit the gaping security holes without using the minor timing quirk.
– Brendan

2018-01-03 5:21 am
Alfman verbose=1
Brendan,
The minor timing quirk in Intel CPUs (that does not break documented behaviour, expected behaviour or any guarantee, and therefore can NOT be considered a bug in the CPU); allows an attacker to determine which areas of kernel space are used and which aren’t.
…
Note that the insane hackery to avoid this non-issue adds significant overhead to kernel system calls; ironically, making the performance of monolithic kernels worse than the performance of micro-kernels (while still providing inferior security than micro-kernels). The insane hackery doesn’t entirely fix the “problem” either (a small part of kernel must remain mapped, and an attacker can still find out where in kernel space that small part of the kernel is and use this information to infer where the rest of the kernel is). [/q]
Regarding address layout randomization, that is my position as well; it is nothing more than security by obscurity under a different name. Security that depends on keeping (non cryptographic) secrets like memory addresses is inherently flawed. However to play devil’s advocate, it’s proponents would argue it’s intended as a secondary security measure and not a primary one. Security by obscurity, as flawed as it is, is arguably more secure than not having a second line of defense at all when an exploit for the primary security is found. ASLR was introduced as a very low overhead way to increase the failure modes for code exploits. However given the increasingly known failures of ASLR on non-theoretical hardware and the costs of repairing ASLR’s broken assumptions, I’d say it’s days as an effective countermeasure are numbered. In other words I don’t think the community can justify the high costs that would be required to make ASLR strong against side channel attacks (like cache-hit testing).
This was the topic of a 2016 blackhat paper:
https://www.blackhat.com/docs/us-16/materials/us-16-Jang-Breaking-Ke…
Countermeasures are discussed at the very end…
Modifying CPU to eliminate timing channels
– Difficult to be realized
Using separated page tables for kernel and user processes
– High performance overhead (~30%) due to frequent TLB flush
Fine-grained randomization
– Difficult to implement and performance degradation
Coarse-grained timer?
– Always suggested, but no one adopts it.
As we know, nothing has been done to address sidechannel attacks against ASLR. Even a javascript version of the exploit was demonstrated last year just to illustrate how effective sidechannel attacks can be.
https://www.theregister.co.uk/2017/02/14/aslr_busting_javascript_hac…
This, the VU team says, is what makes this vulnerability so significant. An attacker could embed the malicious JavaScript code in a webpage, and then run the assault without any user notification or interaction.
Because the assault does not rely on any special access or application flaws, it works on most browsers and operating systems. The researchers say that when tested on up-to-date web browsers, the exploit could fully unravel ASLR protections on 64-bit machines in about 90 seconds.
Infecting a system does not end there, of course: it just means ASLR has been defeated.
[q]Fortunately the “malicious performance degradation attack” (the ineffective work-around for the non-issue) is easy for end users to disable.
The problem is there’s very little published info on this newest attack. The little bits that are around suggest to me this is much more significant than merely broken ASLR. It sounds like intel’s out of order branch prediction may be executing speculative code prior to checking the full credentials in such a way that they found a way to exploit the deferment, which does not happen on AMD processors. Apparently the temporary software fix is to reload the page table every kernel invocation. This invalidates the caches and happens to fix ASLR as well, but I think fixing ASLR was just a side effect – there’s not enough information to know for sure. I could be completely wrong but this media hush now would make very little sense if they had merely broken ASLR again given that ASLR is already publicly cracked and has been for ages already. I believe the sense of urgency and the deployment of high performance-cost workarounds in macos, windows, and linux, and planned service outages at amazon strongly suggest something much more critical was found to directly compromise kernel security on intel processors.
Hopefully Thom will post an update when all is finally revealed.
Edited 2018-01-03 05:27 UTC

2018-01-03 6:14 am
Brendan
Hi,
The problem is there’s very little published info on this newest attack. The little bits that are around suggest to me this is much more significant than merely broken ASLR. It sounds like intel’s out of order branch prediction may be executing speculative code prior to checking the full credentials in such a way that they found a way to exploit the deferment, which does not happen on AMD processors. Apparently the temporary software fix is to reload the page table every kernel invocation. This invalidates the caches and happens to fix ASLR as well, but I think fixing ASLR was just a side effect – there’s not enough information to know for sure. I could be completely wrong but this media hush now would make very little sense if they had merely broken ASLR again given that ASLR is already publicly cracked and has been for ages already. I believe the sense of urgency and the deployment of high performance-cost workarounds in macos, windows, and linux, and planned service outages at amazon strongly suggest something much more critical was found to directly compromise kernel security on intel processors.
As I understand it:
a) Program tries to do a read from an address in kernel space
b) CPU speculatively executes the read and tags the read as “will generate page fault” (so that a page fault will occur at retirement), but also (without regard to permission checks and likely in parallel with permission checks) either speculatively reads the data into a temporary register (if the page is present) or pretends that data being read will be zero (if the page is not present) for performance reasons (so that other instructions can be speculatively executed after a read). Note that the data (if any) in the temporary register can not be accessed directly (it won’t become “architecturally visible” when the instruction retires).
c) Program does a read from an address that depends on the temporary register set by the first read, which is also speculatively executed, and because it’s speculatively executed it uses the “speculatively assumed” value in the temporary register. This causes a cache line to be fetched for performance reasons (to avoid a full cache miss penalty if the speculatively executed instruction is committed and not discarded).
d) Program “eats” the page fault (caused by step a) somehow so that it can continue (e.g. signal handler).
e) Program detects if the cache line corresponding to “temporary register was zero” was pre-fetched (at step c) by measuring the amount of time a read from this cache line takes (a cache hit or cache miss).
In this way (or at least, something vaguely like it); the program determines if a virtual address in kernel space corresponds to a “present” page or a “not present” page (without any clue what the page contains or why it’s present or if the page is read-only or read/write or executable or even if the page is free/unused space on the kernel heap).
– Brendan

2018-01-03 6:41 am
galvanash
There has to be more to it than that. I mean I’m not saying your analysis is wrong, but it has to be incomplete. Someone has either demonstrated a reliable attack using this exploit to compromise and/or crash affected systems from low privilege user space code, or there is more to it than there appears to be.
No way would everyone issue fixes like this in such a cloak and dagger fashion, especially a fix that causes a significant performance regression, if it wasn’t scaring the crap out of some people…

2018-01-03 7:40 am
Brendan
Hi,
There has to be more to it than that. I mean I’m not saying your analysis is wrong, but it has to be incomplete. Someone has either demonstrated a reliable attack using this exploit to compromise and/or crash affected systems from low privilege user space code, or there is more to it than there appears to be.
No way would everyone issue fixes like this in such a cloak and dagger fashion, especially a fix that causes a significant performance regression, if it wasn’t scaring the crap out of some people…
You’re right – there’s something I’ve overlooked.
For a sequence like “movzx edi,byte [kernelAddress]” then “mov rax,[buffer+edi*8]”, if the page is present, the attacker could find out which cache line (in their buffer) got fetched and use that to determine 5 bits of the byte at “kernelAddress”.
With 3 more individual attempts (e.g. with “mov rax,[buffer+4+edi*8]”, “mov rax,[buffer+2+edi*8]” and “mov rax,[buffer+1+edi*8]”) the attacker could determine the other 3 bits and end up knowing the whole byte.
Note: It’s not that easy – with a single CPU the kernel’s page fault handler would pollute the caches a little (and could deliberately invalidate or completely pollute caches as a defence) before you can measure which cache line was fetched. To prevent that the attacker would probably want/need to use 2 CPUs that share caches (and some fairly tight synchronisation between the CPUs so the timing isn’t thrown off too much).
– Brendan
Edited 2018-01-03 07:40 UTC
2018-01-03 9:27 am
Alfman verbose=1
Brendan,
In this way (or at least, something vaguely like it); the program determines if a virtual address in kernel space corresponds to a “present” page or a “not present” page (without any clue what the page contains or why it’s present or if the page is read-only or read/write or executable or even if the page is free/unused space on the kernel heap). [/q]
The thing is similar results have been achieved in the past using different techniques with hardily anyone blinking an eye.
For instance:
https://gruss.cc/files/prefetch.pdf
Indeed, prefetch instructions leak timing information on the exact translation level for every virtual address. More severely, they lack a privilege check and thus allow fetching inaccessible privileged memory into various CPU caches. Using these two properties, we build two attack primitives: the translation-level oracle and the address-translation oracle. Building upon these primitives, we then present three different attacks. Our first attack infers the translation level for every virtual address, effectively defeating ASLR. Our second attack resolves virtual addresses to physical addresses on 64-bit Linux systems and on Amazon EC2 PVM instances in less than one minute per gigabyte of system memory. This allows an attacker to perform ret2dir-like attacks. On modern systems, this mapping can only be accessed with root or kernel privileges to prevent attacks that rely on knowledge of physical addresses. Prefetch Side-Channel Attacks thus render existing approaches to KASLR ineffective. Our third attack is a practical KASLR exploit. We provide a proof-of-concept on a Windows 10 system that enables return-oriented programming on Windows drivers in memory. We demonstrate our attacks on recent Intel x86 and ARM Cortex-A CPUs, on Windows and Linux operating systems, and on Amazon EC2 virtual machines.
Incidentally this paper from 2016 recommends the exact same (slow) countermeasures that the linux kernel and others are now implementing, so if merely getting this meta information was such a big deal on it’s own, then why didn’t they act sooner when ASLR weaknesses first started being published?
I believe/assume the answer to this lies in the possibility that someone actually managed to breach a kernel boundary to either read or write memory without proper access. I don’t know if this is true, but IMHO it would explain the urgency we’re seeing.
galvanash,
There has to be more to it than that. I mean I’m not saying your analysis is wrong, but it has to be incomplete. Someone has either demonstrated a reliable attack using this exploit to compromise and/or crash affected systems from low privilege user space code, or there is more to it than there appears to be.
We’ll have to wait and see, but as long as we’re speculating, here’s mine Assuming that the speculative engine doesn’t check page permissions for rights before executing speculative branches (as some sources have mentioned), then I wonder if maybe intel is leaking protected memory by a side channel attack on indirect jumps?
Pseudo code example:
[q]
// 4GB filled with nothing but “ret” opcodes
char*dummy=malloc(0x100000000ULL);
memset(dummy, 0xC3, 1<<32);
// ignore the page faults…
sigaction (SIGSEGV, IGNORE);
long best_clocks=~0;
long best_x;
long clocks;
clearcache();
asm(
“mov eax, [FORBIDDEN_MEMORY_ADDRESS]”
“call [$dummy+eax]”);
for(long long x=0; x<0x100000000ULL; x+=CACHE_WIDTH) {
clocks = cpuclocks();
char ignore = dummy[x];
clocks = cpuclocks() – clocks;
if (best_clocks>clocks) {
best_clocks = clocks;
best_x = x;
}
}
printf(“The value at %lx is near %ld!\n”, FORBIDDEN_MEMORY_ADDRESS, best_x);
Bear in mind this is just a very rough idea, but the theory is that if the branch predictor speculatively follows the branch, then page corresponding to the hidden kernel value should get loaded into cache. The call will inevitably trigger a fault, which is expected, but the state of the cache will not get reverted and will therefor leak information about the value in kernel memory. Scanning the dummy memory under clock analysis should reveals which pages are in cache. Different variations of this idea could provide more information.
Edited 2018-01-03 09:39 UTC

2018-01-03 7:23 am
le_c
Would it be possible to slow down page fault notifications? For example, if the page fault was not on kernel space, halt the application for the time offset of a kernel read. In this way all segfaults would be reported at the same time.
Are there any sane apps that depends on timely segfault handling and thus would be affected by such a workaround?

2018-01-03 8:02 am
oiaohm
Note that the insane hackery to avoid this non-issue adds significant overhead to kernel system calls; ironically, making the performance of monolithic kernels worse than the performance of micro-kernels (while still providing inferior security than micro-kernels). The insane hackery doesn’t entirely fix the “problem” either (a small part of kernel must remain mapped, and an attacker can still find out where in kernel space that small part of the kernel is and use this information to infer where the rest of the kernel is).
Please do not claim bogus. Microkernels fixed against this defect will in fact take a higher hit than a monolithic kernel. Kernel to usermode switching cost increases when you have to change complete page tables every single time changing from kernel to userspace and back. Microkernel do this way more often than Monolithic. The advantage of running drivers in kernel space.
This is not exactly a non issue. The fact userspace ring is interfering was able to detect kernel space pages was found in 2016. Now kernel space pages with wrong protective bits for any reason are also exposed due to that fault.
Small fragment mapped of kernel mapped in userspace does not provide enough information to work out the randomisation. Complete kernel mapped into userspace as Intel CPU have been doing down right does.
Small fragment mapped into user-space on independent page tables does not mean there is any relationship of that information once you enter kernel mode and switch to kernel mode TLB.
Also this mapping fix to work around Intel MMU/CPU design issue applied to a Microkernel in fact hurts worst. It some ways it explains why AMD cpu have been slightly slower in particular benchmarks.
Yes AMD MMU/CPU if you attempt to access ring 0 pages from ring 1-3 and they have not been mapped for you its not happening. Same with ring 1 from ring 2 and 3.
So that KASLR attack from 2016 did not work on AMD CPUs. So finding extra protection flaws also have no effect on AMD CPU because the KASLR attack from 2016 does not work. Its not that the AMD has different timing is better page memory rules enforced by hardware so most of the kernel ring 0 pages are basically non existent to the userspace code.
Really this is another reason why in the past it was require by the USA Mil for anything they acquired to come 3 vendors using compatible sockets. So a vendor glitch like this could have been fixed by changing chips.
There is security by obscurity and there is no be there. No be there is better. AMD cpu/mmu is doing no be there so userspace could not see all of the kernel space ring 0 pages and this makes solving the address randomisation of kernel next to impossible.
Intel was depending on obscurity that no one would hunt the memory and find out that complete kernel space pages and userspace pages were in fact exposed to userspace program. Then Intel prayed that memory protection settings would always be enforced and correct. There turned up a few case in 2017 where the memory protections were not always on. So now you have kernel pages exposed to userspace and userspace able to write them how can you say mega failure.
Implementing two page tables one for kernel space and one for userspace is about the only valid way to work around intel goof up. Please note this kind of fault dates back to the 286. All the AMD processors with built in MMU have had different behaviour so preventing the problem.

2018-01-03 10:26 am
Brendan
Hi,
“Note that the insane hackery to avoid this non-issue adds significant overhead to kernel system calls; ironically, making the performance of monolithic kernels worse than the performance of micro-kernels (while still providing inferior security than micro-kernels). The insane hackery doesn’t entirely fix the “problem” either (a small part of kernel must remain mapped, and an attacker can still find out where in kernel space that small part of the kernel is and use this information to infer where the rest of the kernel is).”
Please do not claim bogus. Microkernels fixed against this defect will in fact take a higher hit than a monolithic kernel. Kernel to usermode switching cost increases when you have to change complete page tables every single time changing from kernel to userspace and back. Microkernel do this way more often than Monolithic. The advantage of running drivers in kernel space. [/q]
Completely changing page tables is expensive. Windows and Linux will be doing this for every system call and every IRQ. For things that don’t involve a driver (all scheduling, all memory management, etc) a micro-kernel won’t change page tables (because there’s “almost nothing” in kernel-space, and certainly no encryption keys or passwords or other sensitive data, and therefore there’s no reason to bother) but the monolithic kernels will. This is the main reason that (after patches) monolithic kernels will be slower than micro-kernels.
For things that do involve a device driver, a micro-kernel and a monolithic kernel both change page tables and end up “similar”. However, micro-kernels are *designed* knowing that there will be context switching costs and therefore the designers design everything to reduce/mitigate those costs (including things like postponing task switches in the hope of “one task switch instead of many” and things like having a single “do this list of things” message rather than doing one little thing per message). Monolithic kernels were never designed to mitigate these costs. This is the other reason that monolithic kernels will be slower than micro-kernels.
Note: the “address space ID” feature reduces the cost of changing page tables for both monolithic and micro-kernel; and therefore doesn’t make monolithic (with these patches) “less worse” in comparison.
This is not exactly a non issue. The fact userspace ring is interfering was able to detect kernel space pages was found in 2016. Now kernel space pages with wrong protective bits for any reason are also exposed due to that fault.
Those were exposed anyway (just use a “for each page (try something)” loop and ignore any page faults).
Small fragment mapped of kernel mapped in userspace does not provide enough information to work out the randomisation. Complete kernel mapped into userspace as Intel CPU have been doing down right does.
That’s not how it works. You have a virtual address space where user-space is in one area of the virtual address space and kernel is in another area, and where these areas are separated by privilege levels. Kernel has never been mapped into user-space.
[q]Small fragment mapped into user-space on independent page tables does not mean there is any relationship of that information once you enter kernel mode and switch to kernel mode TLB.
Also this mapping fix to work around Intel MMU/CPU design issue applied to a Microkernel in fact hurts worst. It some ways it explains why AMD cpu have been slightly slower in particular benchmarks.
Yes AMD MMU/CPU if you attempt to access ring 0 pages from ring 1-3 and they have not been mapped for you its not happening. Same with ring 1 from ring 2 and 3.
So that KASLR attack from 2016 did not work on AMD CPUs. So finding extra protection flaws also have no effect on AMD CPU because the KASLR attack from 2016 does not work. Its not that the AMD has different timing is better page memory rules enforced by hardware so most of the kernel ring 0 pages are basically non existent to the userspace code.
Really this is another reason why in the past it was require by the USA Mil for anything they acquired to come 3 vendors using compatible sockets. So a vendor glitch like this could have been fixed by changing chips.
There is security by obscurity and there is no be there. No be there is better. AMD cpu/mmu is doing no be there so userspace could not see all of the kernel space ring 0 pages and this makes solving the address randomisation of kernel next to impossible.
Intel was depending on obscurity that no one would hunt the memory and find out that complete kernel space pages and userspace pages were in fact exposed to userspace program. Then Intel prayed that memory protection settings would always be enforced and correct. There turned up a few case in 2017 where the memory protections were not always on. So now you have kernel pages exposed to userspace and userspace able to write them how can you say mega failure.
Implementing two page tables one for kernel space and one for userspace is about the only valid way to work around intel goof up. Please note this kind of fault dates back to the 286. All the AMD processors with built in MMU have had different behaviour so preventing the problem.
Intel wasn’t depending on obscurity – they didn’t invent or implement “kernel address space randomisation”. This probably came from one of the “hardened Linux” groups (SELinux? Grsecurity?) before being adopted in “mainline Linux” (and cloned by Microsoft).
As far as I know this problem *might* effect Pentium III and newer (and might only effect Broadwell and newer – details are hard to find). It does not effect 80286, 80386, 80486 or Pentium. 80286 doesn’t even support paging (and doesn’t do speculative execution either).
Don’t forget that recent ARM CPUs are also effected – it’s not just “Intel’s goof up” (this time).
– Brendan

2018-01-03 3:47 pm
Kochise
So it finally proves Adrew S Tanenbaum was right all along, Minix a superior OS from the very beginning with a clever architecture
Btw, is that from finding Minix used as hypervisor last october, and subsequent pocs/hacks that the flaw was discovered in Intel cpus ?
Would a similar flaw be found in AMD chips if they also used Minix to perform similar tricks as well ? What about ARM chips and their ‘TrustedZone’ ?

2018-01-03 12:32 pm
Carewolf
Note that the insane hackery to avoid this non-issue adds significant overhead to kernel system calls; ironically, making the performance of monolithic kernels worse than the performance of micro-kernels (while still providing inferior security than micro-kernels). The insane hackery doesn’t entirely fix the “problem” either (a small part of kernel must remain mapped, and an attacker can still find out where in kernel space that small part of the kernel is and use this information to infer where the rest of the kernel is).
Please do not claim bogus. Microkernels fixed against this defect will in fact take a higher hit than a monolithic kernel.
No. Let me put is like this: Microkernels are already talking a similar hit as their context switches are from user to user, now Macrokernels get a similar hit by having to dump the virtual tables before going to user mode, just like each user’s table are dropped when moving to another user.

2018-01-03 2:53 pm
Alfman verbose=1
Carewolf,
No. Let me put is like this: Microkernels are already talking a similar hit as their context switches are from user to user, now Macrokernels get a similar hit by having to dump the virtual tables before going to user mode, just like each user’s table are dropped when moving to another user.
Indeed, this workaround will make a macrokernel perform like a “naive” microkernel, which could be potentially worse than a microkernel that’s undergone design efforts to mitigate the userspace transition overhead (like vectored IO and memory mapped IPC, etc).
Intel will hopefully fix the flaw (whatever it is) for future CPUs, but realistically new CPUs could end up being cost prohibitive for many consumers who typically are multiple generations behind intel’s latest architectures, even after purchasing new computers since most of us cannot afford to pay several hundred dollars for intel’s latest CPU offerings. So unless intel gives some kind of credit to replace faulty CPUs previously sold & inventory, many consumers are going to be negatively impacted for the medium to long term.
Edit:
It’s too early to know what’s going on, but assuming one’s workloads aren’t terribly effected by this workaround, it could potentially be good news for people wanting to buy the faulty systems at a discounted price. For example this could instantly render tons of enterprise equipment completely worthless to their original owners. It may no longer be good enough for them, but it might be good for a home lab.
Edited 2018-01-03 15:09 UTC

2018-01-03 7:55 pm
tidux
There’s already an exploit in the wild using this to read kernel memory as a non-root user. All it takes is a bit of JavaScript downloading and executing such a binary and you’re pwned.
https://twitter.com/brainsmoke/status/948561799875502080

2018-01-03 4:30 am
raom
Damn. This should be great for AMD. I just hope it can be disabled in Microsoft’s case. No one needs to take 30% less performance on an offline machine.

2018-01-03 5:51 am
Alfman verbose=1
raom,
Damn. This should be great for AMD. I just hope it can be disabled in Microsoft’s case. No one needs to take 30% less performance on an offline machine. [/q]
What they’re saying is that AMD processors are not vulnerable, and AMD itself posted patches to disable the performance crippling workaround on it’s processors.
https://mail-archive.com/linux-kernel@vger.kernel.org/msg1572418.htm…
[q]static void __init early_identify_cpu(struct cpuinfo_x86 *c)
setup_force_cpu_cap(X86_FEATURE_ALWAYS);
– /* Assume for now that ALL x86 CPUs are insecure */
– setup_force_cpu_bug(X86_BUG_CPU_INSECURE);
+ if (c->x86_vendor != X86_VENDOR_AMD)
+ setup_force_cpu_bug(X86_BUG_CPU_INSECURE);
fpu__init_system(c);
However at least as of right now, the current mainline kernel (4.15-rc6) does not include it! Meaning that AMD users will be punished as well if they use that kernel.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/d…
BTW if you want to browse the changes applied to the kernel source code in order to support this, here’s a handy link. All references to “PTI” functions and/or files are referring to this change.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/d…

2018-01-03 7:44 am
benoitb
I didn’t see anyone mention it but there will be an ecological price to pay for that bug.
In my work we have about 100 big servers crunching data for a network operator, that is only for 1 customer. A 20% hit means we will have to buy extra servers to compensate as we cannot afford to loose data.
This will cost not just money and deployment work time but 20% extra power usage from now on…

2018-01-03 8:06 am
grat
Since I’m assuming you’re using fairly modern CPU’s that support the ‘PCID’ feature, which minimizes the performance impact of the fix (~5%) it shouldn’t be too drastic.
Obviously, if you’re at the edge of the performance cliff already, you’ll be affected, but if you’re riding the edge, you’re already screwed and should be in the process of buying more hardware.
Or perhaps you should make a pitch for ThreadRipper / Epyc based systems.
Edited 2018-01-03 08:06 UTC

2018-01-03 12:43 pm
MobyTurbo
Which CPUs have PCID? I googled, was it introduced with Westmere, or was it later?
2018-01-03 2:05 pm
chewey2003
https://www.fool.com/investing/2017/12/19/intels-ceo-just-sold-a-lot…
On Nov. 29, Brian Krzanich, the CEO of chip giant Intel (NASDAQ:INTC), reported several transactions in Intel stock in a Form 4 filing with the SEC.
2018-01-03 3:00 pm
kwan_e
So why kernel patches and not microcode? I’m assuming it’s just a short term solution and there will be a better long term solution.

2018-01-03 3:16 pm
Alfman verbose=1
kwan_e,
So why kernel patches and not microcode? I’m assuming it’s just a short term solution and there will be a better long term solution.
There’s only so much you can do in microcode to alter the behavior of some opcodes, but features like branch prediction are still hardwired and require new silicon designs. My understanding is that the engineers tried to avoid this, this was a last resort, but they found no other solution.

2018-01-03 9:15 pm
kwan_e
but features like branch prediction are still hardwired and require new silicon designs.
That is surprising to me. I’d have thought you’d want to make something like branch prediction modifiable (well, just like other instructions/features) so fixes can be applied.
So my question is, why is the lack of security check hardwired, or why it was designed in such a way that not even a microcode update could fix it?

2018-01-03 9:41 pm
Kochise
Well, a cpu is not a fpga, the whole logic is not reprogrammable. The microcode allows to modify/patch the isa, but the main ‘engine’ (composed of the ‘alu’, the ‘execution unit’, …) have to be hardwired somehow.
Good explanation here : http://dsearls.org/courses/C391OrgSys/CPU/CPU.htm

2018-01-03 9:54 pm
kwan_e
The microcode allows to modify/patch the isa, but the main ‘engine’ (composed of the ‘alu’, the ‘execution unit’, …) have to be hardwired somehow.
I would hardly call speculative execution as part of the main engine, since processors can get along fine without it. I would have thought speculative execution would be one of the killer features of modern microcode-based designs.
AMD, at least claims, to not have this security hole hardwired into their processors, so it’s not impossible to not hardwire this stuff into the processor as to be unfixable.
2018-01-03 10:57 pm
Kochise
Well, cpu architecture is private ip and the special recipe, the salt and pepper a company like intel, amd or arm (to name a few) promote as unique and groundbreaking and tremendous and butterflies to put to shame competition.
But not only, it’s also tricky engineer stuff to get the job done a little way better, faster, frugalier than the competition. These (speculative) execution units are engraved in stone (well, etched in silicon) and are not subject to change until next architecture iteration.
Just like motor engine, the basic principle remains the same, but that doesn’t prevent to have modifications and improvements. Yet if the problem is about the particle exhaustion of the combustion engines, which is common across all the whole industry is doomed.
Edited 2018-01-03 22:58 UTC
2018-01-03 11:02 pm
Alfman verbose=1
kwan_e,
That is surprising to me. I’d have thought you’d want to make something like branch prediction modifiable (well, just like other instructions/features) so fixes can be applied.
So my question is, why is the lack of security check hardwired, or why it was designed in such a way that not even a microcode update could fix it? [/q]
Microcode is pretty basic and it’s main purpose is to transform chaotic CISC instructions into normalized, structured, and predictable RISC ones, which greatly reduces the complexity of execution pipelines. However the pipelines themselves aren’t implemented with microcode.
You might envision an architecture where the pipelines are implemented in microcode too, but let’s try to examine that approach using a hypothetical example. Assume everything were designed in such a way that microcode could fix everything, including the pipelines. This would resemble more and more a software CPU implementation rather than a hardwired one, it would give us much greater flexibility to fix & improve the CPU after fabrication. This might seem good in many ways, but there are a few issues with it. Software algorithms are always going to be slower than transistors hardwired to produce the logic. A more fundamental problem is that moving all features into software doesn’t actually obviate the need for complex and potentially buggy hardware pipelines, those merely gets shifted another layer down because you’ll still want speculative branch prediction just to run the software based CPU, which itself is running the target x86 code. Technically it’s all feasible, but at what complexity, cost and performance?
I would hardly call speculative execution as part of the main engine, since processors can get along fine without it. I would have thought speculative execution would be one of the killer features of modern microcode-based designs.
Well, I personally don’t know whether there’s a way for intel to disable speculative execution pipelines on modern intel processors or not, but even if you could, that would be turning off decades worth of faster-than-clockspeed performance bumps. Disabling these features on a modern 2.4GH processor could cause regressions back to the 2.4GH performance of much older processors (like y2k era pentiums).
[q]AMD, at least claims, to not have this security hole hardwired into their processors, so it’s not impossible to not hardwire this stuff into the processor as to be unfixable.
Given that AMD’s x86 implementation is different from intel’s there’s no reason a hardware flaw in one must be present on the other. Unfortunately it looks like we’ll have to wait until next week at the earliest before we get more details…grr.
2018-01-03 11:14 pm
Kochise
But even if the execution units were into microcode, what final unit would interpret at the very end of the chain ? There must be spinning gears somewhere getting the job done.
But… Transmeta Crusoe was (in factory) upgradable ?
2018-01-03 11:58 pm
Alfman verbose=1
Kochise,
But even if the execution units were into microcode, what final unit would interpret at the very end of the chain ? There must be spinning gears somewhere getting the job done. [/q]
That’s kind of what I meant, even if you have a software driven pipeline, you would still need a hardware pipeline somewhere. Just for fun, you could solve this problem recursively to the Nth degree, but unless you could come up with layers that are exactly 100% efficient, each layer would impose it’s own overhead.
[q]But… Transmeta Crusoe was (in factory) upgradable ?
I don’t know much about it.
2018-01-03 11:41 pm
kwan_e
However the pipelines themselves aren’t implemented with microcode.
You might envision an architecture where the pipelines are implemented in microcode too,
I was only wondering about the speculative execution aspect, where something like (the lack of) security checks could have been implemented softly. Of course some pipelines need to be hardwired for performance reasons.
———–
Either way, it seems like a strange design decision overall to not honour security checks by default. No safe language can protect against that
2018-01-04 12:12 am
Alfman verbose=1
kwan_e,
I was only wondering about the speculative execution aspect, where something like (the lack of) security checks could have been implemented softly. Of course some pipelines need to be hardwired for performance reasons.
———–
Either way, it seems like a strange design decision overall to not honour security checks by default. No safe language can protect against that
If it’s a check that has to happen on every single memory request, then the overhead of solving it with code rather than wires might not be justified.
I’m sure they’ve got thousands of tests cases scanning for recurrences of things like the FDIV bugs and whatnot, it would seem that nobody came up with a test case for this flaw at intel before now, so it slipped through the cracks. However without knowing what “it” is, we don’t know how egregious their failure was.

2018-01-03 3:43 pm
Kochise
Article says :
Similar operating systems, such as Apple’s 64-bit macOS, will also need to be updated â€“ the flaw is in the Intel x86-64 hardware, and it appears a microcode update can’t address it

2018-01-03 5:07 pm
leech
I was going to mention also that MacOS is affected as well, it isn’t just a Windows and Linux issue, as the article here states.

2018-01-03 9:08 pm
kwan_e
Apparently some people on this site don’t like it when you ask an earnest question because you don’t know something.
Sorry, next time I’ll pretend to know everything.

2018-01-03 9:37 pm
Kochise
You asked a question, I (we) replied, why are you getting angry ?

2018-01-03 9:48 pm
kwan_e
You asked a question, I (we) replied, why are you getting angry ?
First of all, how do you know I am angry?
Second, it isn’t the replies to my question that I was referring to. Given that you couldn’t see even that, refer to the first point.

2018-01-03 10:48 pm
Kochise
[q]Apparently some people on this site don’t like it when you ask an earnest question because you don’t know something.
Sorry, next time I’ll pretend to know everything.
You asked a question, I (we) replied, why are you getting angry ? [/q]
First of all, how do you know I am angry?
Second, it isn’t the replies to my question that I was referring to. Given that you couldn’t see even that, refer to the first point.
I’d say your sarcastic tone is revealing of your true self. Now pretend whatever you want to feel right.
2018-01-03 11:30 pm
kwan_e
Now I’m sarcastic.
How about this? I can write anything in any tone, and you’ll just read whatever emotion you want into it regardless of the truth.
Never does it once enter your mind that there are also cultural differences that may actually come across differently, and that you’re not an expert in all cultures.
2018-01-04 6:23 am
Kochise
Come on, don’t hide yourself behind a supposed cultural difference, I can actually detect earned up/down votes easier than you think, and there is an obvious bias of your groupies here.
Now pretend what you like, I made my own mind about it already.

2018-01-04 9:05 pm
whartung
Sounds like has some solution, and I assume it’s via Microcode.
https://newsroom.intel.com/news-releases/intel-issues-updates-protec…

2018-01-05 12:46 am
Alfman verbose=1
whartung,
Sounds like has some solution, and I assume it’s via Microcode. [/q]
That’s interesting.
I’ll take any good news we can right now, but it sounds like they are relying on the software based workarounds that OS vendors are working on. What intel has done to update CPU behavior is vague and makes me wonder what CPU specific updates they could offer? I’m really curious.
[q]Intel has already issued updates for the majority of processor products introduced within the past five years. By the end of next week, Intel expects to have issued updates for more than 90 percent of processor products introduced within the past five years. In addition, many operating system vendors, public cloud service providers, device manufacturers and others have indicated that they have already updated their products and services.
Taken at face value though, it seems many consumers won’t be covered because many desktop computers are sold with older cpus. My newest computer (i7-3770) that I bought two years ago is already outside their specified support window
Edited 2018-01-05 00:48 UTC

2018-01-03 5:01 pm
christian
While I seat here reading this, I have apt updating the software on my laptop, which is promptly burning a hole in my lap as the CPU spins with the company mandatory virus scanner scans each and every updated file, while at the same time making my laptop less responsive.
Seems we’re all too happy to pay a significant price for security, so it’ll be business as usual within 3 months once the furore has died down.

2018-01-03 5:09 pm
Kochise
Yeah, looks like everyone is making a bit fuss and bragging about a possible cpu slowdown while we already shove our over powerful computers with flash animations and real time virus scanners. Would have we noticed if not informed ?

2018-01-03 6:04 pm
whartung
Yeah, looks like everyone is making a bit fuss and bragging about a possible cpu slowdown while we already shove our over powerful computers with flash animations and real time virus scanners. Would have we noticed if not informed ?
We just bumped our DB box from 8 to 10 CPUs, like two weeks ago, as we were running higher and higher overall CPU loads.
This patch will effectively negate those CPUs and now we’ll probably have to allocate 2 more just to compensate and get us back to where we were.
However, because of our reliability and failover requirements, we also need to allocate more CPUs to the back up machines as well. Due to our need to ensure that our Staging and Production systems are equal for testing and rollout issues, we also have to upgrade our Staging infrastructure (which also has a hot spare machine).
So, this bug is going to “cost” us 8 more CPUs. We had to scavenge under used VMs to reclaim them in order to free up those CPUs for our upgrade. I honestly don’t know if we have 8 CPUs to spare.
Thankfully, we won’t need to do this to the rest of the infrastructure, as the CPU load isn’t as much of a problem but we’re certainly excited that everything (notably response times) are just going to be 10-20% slower across the board. Yay us.
So, yea, this is a big deal for us.

2018-01-03 7:30 pm
PJBonoVox
It comes down to use case I guess. There are folks on OSNews (and the internet as a whole) who run 1000+ build farms at work, there are some who tinker as hobbyists and some who probably only use x86 when absolutely necessary.
It’s gonna impact different people in different ways (if at all). Some folks are thinking about their brand new gaming rig, others about their company’s 8-figure cloud operations.
That’s why I love this site, it’s a whole mix of backgrounds.

2018-01-03 8:44 pm
grat
The impact of the security fix varies depending on what your workload is.
Gaming, which all runs in user space, rather than kernel space, seems largely unaffected.
Similarly, I would question whether a well tuned DB would be heavily affected, since it’s largely IO bound as a rule, rather than heavy kernel CPU.
Best plan is to start preparing for increased CPU counts, but wait to verify it’s a problem first.
… or buy AMD.

2018-01-03 8:38 pm
l3v1
so it’ll be business as usual within 3 months once the furore has died down
Well, there’s more to consider. In scientific circles it’s quite common to also include runtime results for algorithms, specifying the used cpu along with some relevant details – however, listing the kernel is not something generally done. It seems there’ll be the need to also do that from now on. Would be quite some “fun” (not) to keep around unpatched kernels just to do comparable tests to compare with published numbers of earlier works.

2018-01-05 5:10 am
BlueofRainbow
It is presumed that hobby and alternate operating systems will also be impacted.
There has been some discussion about the issue in Haiku’s forums. However, there was no mention of the issue at this time for MenuetOS, ReactOS, and SyllableOS (no longer active?).
There are probably many lesser known operating systems which may potentially be impacted. It will take some time to sort-out the legacy of this security flaw.

2018-01-05 4:58 pm
whartung
It is presumed that hobby and alternate operating systems will also be impacted.
There has been some discussion about the issue in Haiku’s forums. However, there was no mention of the issue at this time for MenuetOS, ReactOS, and SyllableOS (no longer active?).
There are probably many lesser known operating systems which may potentially be impacted. It will take some time to sort-out the legacy of this security flaw.
Well, obviously the onus is on the OS authors to take the proper steps for mitigation, but at the same time, they may well take in to account the simple odds of attack.
When I saw the potential performance impact on our dedicated DB server, I gave serious consideration to not wanting the patch. The argument is simply that if someone managed to get a process running on our DB machine (which is necessary to exploit this in the first place), we have far graver issues from much simpler paths of exposure than this thing.
The real threat of this thing, to me, is the public clouds, which, without mitigation, are patently unsafe now. But our current deployment model doesn’t leverage public infrastructure.
I will most likely be overruled by corporate in the end, however.

2018-01-08 4:19 pm
BlueofRainbow
Just wondering if UEFI is potentially impacted by this flaw notably with respect to the insertion of a root-kit?
Indirectly, how about the Intel Management Engine which has been running Minix at the “Ring -3”?

2018-01-08 5:06 pm
Alfman verbose=1
BlueofRainbow,
Just wondering if UEFI is potentially impacted by this flaw notably with respect to the insertion of a root-kit? [/q]
Off the top of my head, I wouldn’t think UEFI’s keys themselves are vulnerable to these attacks because they’re just public keys, which don’t require keeping secrets from the user/hacker.
Indirectly, how about the Intel Management Engine which has been running Minix at the “Ring -3”?
The minux kernel used by intel’s AMT is running on a completely different CPU core.
https://en.wikipedia.org/wiki/Intel_Active_Management_Technology#Har…
[q]Starting with ME 11, it is based on the Intel Quark x86-based 32-bit CPU and runs the MINIX 3 operating system.
Other bugs aside, this core isn’t normally accessible to users, Even if they could access it, I don’t think these simple low performance cores have speculative branching to begin with.
https://en.wikipedia.org/wiki/Intel_Quark
Intel system management SMM is a different matter though, it runs in a higher privilege level than an OS kernel. It’s normally used by the BIOS for mundane tasks. In theory it might contain code patterns that are vulnerable to timing attacks. In practice I’m not sure if the conditions required for an attack are being met. I guess we’d have to reverse engineer the BIOS code responsible for communicating with SMM to see if one could exploit that channel.
SMM has already been hacked in the past, although cache timing attacks could make for new kinds of exploits.
https://www.computerworld.com/article/2531246/malware-vulnerabilitie…