158 points kenjackson | 1 comment
roblabla ◴[] No.41031699[source]
This is some very poor journalism. The Linux issues are so, so very different from the Windows BSOD issue.

The Red Hat kernel panics were caused by a bug in the kernel eBPF implementation, likely a regression introduced by a RHEL-specific patch. Blaming CrowdStrike for this is stupid (just like blaming Microsoft for the CrowdStrike BSOD is stupid).

For background, I also work on a product using eBPF, and have had kernel updates cause kernel panics in my eBPF probes.

In my case, the panic happened because a kernel update changed an LSM hook interface, adding a new argument in front of the others. When the probe gets loaded, the kernel doesn’t typecheck the arguments, and so doesn’t realise the probe isn’t compatible with the new kernel. When the probe runs, it reads its arguments from the wrong positions, shit happens and you end up with a kernel panic.
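
To illustrate the failure mode described above, here is a toy userspace model in Python. It is not real BPF or kernel code: the hook names, the argument layout, the addresses, and the `deref` helper are all hypothetical, and an exception stands in for the kernel panic. It only shows how a probe that unpacks arguments by position breaks when a new argument is inserted in front and nothing checks the signature.

```python
# Toy model only: no real kernel or BPF API is used here.
# Hook arguments are modeled as a flat list of integers, one slot per argument.

# Hypothetical "memory": the only addresses that are safe to dereference.
FAKE_MEMORY = {0x1000: "file object", 0x2000: "credentials object"}


def deref(addr):
    """Stand-in for a pointer dereference; an unmapped address models a panic."""
    if addr not in FAKE_MEMORY:
        raise MemoryError(f"bad dereference of {addr:#x}")
    return FAKE_MEMORY[addr]


def probe(ctx):
    """Probe written against the old hook signature: (file_ptr, mask)."""
    file_ptr, mask = ctx[0], ctx[1]  # positional unpacking, no type information
    return deref(file_ptr), mask


def old_kernel_hook(file_ptr, mask):
    """Old kernel: the argument layout matches what the probe expects."""
    return probe([file_ptr, mask])


def new_kernel_hook(flags, file_ptr, mask):
    """New kernel: an argument was added in front; the probe is attached unchanged."""
    return probe([flags, file_ptr, mask])
```

With these definitions, `old_kernel_hook(0x1000, 4)` returns the file object and the mask, while `new_kernel_hook(0x7, 0x1000, 4)` makes the probe treat `flags` as a pointer and raises `MemoryError`, the toy equivalent of the panic.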

eBPF probes causing kernel panics are almost always an indication of a kernel bug, not a bug in the eBPF vendor's code. There are exceptions of course (such as an eBPF program denying access to a resource, causing PID 1 to crash). But they’re very few.

replies(4): >>41031896 #>>41032164 #>>41032610 #>>41034621 #
mbesto ◴[] No.41032164[source]
> just like blaming Microsoft for the CrowdStrike BSOD is stupid

Wait, how is this stupid? Unless I'm missing something, wasn't the patch part of a Microsoft payload that included an update to CrowdStrike? Surely CrowdStrike is culpable, but that doesn't completely absolve Microsoft of any responsibility, as it's their payload.

replies(7): >>41032197 #>>41032249 #>>41032287 #>>41032415 #>>41032517 #>>41032630 #>>41032666 #
1. GuB-42 ◴[] No.41032415[source]
What I understand is that some Azure VMs are running CrowdStrike, and like any other computer running CrowdStrike on Windows, they crashed. Totally not Microsoft's fault: CrowdStrike messed with the kernel. The only thing we can blame Microsoft for is allowing such software to exist.

Where Microsoft is to blame, however, is the unrelated Azure outage in the Central US region that happened (and was fixed) just before the faulty CrowdStrike update.