Microsoft’s CrowdStrike post-mortem

Microsoft has published a post-mortem of the CrowdStrike incident, and goes into great depth to describe where, exactly, the error lies, and how it could lead to such massive problems. I can’t offer anything insightful on the technical details and code they show to illustrate all of this – I’ll leave that discussion up to you – but Microsoft also spends a considerable amount of time explaining why security vendors choose to use kernel-mode drivers.

Microsoft lists three major reasons why security vendors opt for kernel drivers, and none of them will come as a great surprise to OSNews readers: kernel drivers provide more visibility into the system than a userspace tool would, there are performance benefits, and they’re more resistant to tampering. The downsides are legion, too, of course, as any crash or similar issue in kernel mode has far-reaching consequences. The goal, then, according to Microsoft, is to balance the need for greater insight, performance, and tamper resistance with stability.

And while the company doesn’t say it directly, this is clearly where CrowdStrike failed – and failed hard. While you would want a security tool like CrowdStrike to do as little as possible in kernelspace, and conversely as much as possible in userspace, that’s not what CrowdStrike did. They are running a lot of stuff in kernelspace that really shouldn’t be there, such as the update mechanism and related tools. In total, CrowdStrike loads four kernel drivers, and much of their functionality could run in userspace instead.

It is possible today for security tools to balance security and reliability. For example, security vendors can use minimal sensors that run in kernel mode for data collection and enforcement limiting exposure to availability issues. The remainder of the key product functionality includes managing updates, parsing content, and other operations can occur isolated within user mode where recoverability is possible. This demonstrates the best practice of minimizing kernel usage while still maintaining a robust security posture and strong visibility.

Windows provides several user mode protection approaches for anti-tampering, like Virtualization-based security (VBS) Enclaves and Protected Processes that vendors can use to protect their key security processes. Windows also provides ETW events and user-mode interfaces like Antimalware Scan Interface for event visibility. These robust mechanisms can be used to reduce the amount of kernel code needed to create a security solution, which balances security and robustness.

↫ David Weston, Vice President, Enterprise and OS Security at Microsoft
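
To make the "parse content in user mode, where recoverability is possible" idea from that quote a bit more concrete, here is a rough sketch of my own. It is not how CrowdStrike or Microsoft actually implement anything: the parser, the three-field content format, and the names `parse_isolated` and `apply_update` are all made up for illustration. The only point is that when a buggy parser runs in its own process, a crash costs you one update instead of the whole machine.

```python
import subprocess
import sys

# Hypothetical content parser, run in its own process. It expects three
# comma-separated fields; with fewer, reading fields[2] raises IndexError,
# which stands in here for a parsing bug.
PARSER_SOURCE = r"""
import sys
fields = sys.stdin.read().strip().split(",")
print(fields[0], fields[1], fields[2])
"""


def parse_isolated(content, timeout=5.0):
    """Parse content in a child process; return None if the parser fails."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", PARSER_SOURCE],
            input=content,
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return None  # a hung parser is treated like a crashed one
    if result.returncode != 0:
        return None  # the parser crashed; the caller keeps running
    return result.stdout.strip()


def apply_update(current, content):
    """Keep the last known good rules if the new content fails to parse."""
    parsed = parse_isolated(content)
    return parsed if parsed is not None else current


if __name__ == "__main__":
    rules = "last known good"
    rules = apply_update(rules, "a,b")    # malformed update, parser crashes
    print(rules)
    rules = apply_update(rules, "a,b,c")  # well-formed update
    print(rules)
```

The same bug inside a kernel driver has no supervisor to fall back on, which is exactly the recoverability argument Microsoft is making.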

In what is surely an unprecedented event, I agree with the CrowdStrike criticism bubbling under the surface of this post-mortem by Microsoft. Everything seems to point towards CrowdStrike stuffing way more things in kernelspace than is needed, and as such creating a far larger surface for things to go catastrophically wrong. While Microsoft obviously isn’t going to openly and publicly throw CrowdStrike under the bus, it’s very clear what they’re hinting at here, and this is about as close to a public flogging as we’re going to get.

Microsoft’s post-mortem further details a ton of work Microsoft has recently done, is doing, and will soon be doing to further strengthen Windows’ security and to lessen the need for kernelspace security drivers even more. This includes adding support for Rust to the Windows kernel, which should also aid in mitigating some common problems present in other, older programming languages (while not being a silver bullet either, of course).
