158 points kenjackson | 18 comments
roblabla ◴[] No.41031699[source]
This is some very poor journalism. The Linux issues are so, so very different from the Windows BSOD issue.

The Red Hat kernel panics were caused by a bug in the kernel eBPF implementation, likely a regression introduced by a RHEL-specific patch. Blaming CrowdStrike for this is stupid (just like blaming Microsoft for the CrowdStrike BSOD is stupid).

For background, I also work on a product that uses eBPF, and I have had kernel updates cause kernel panics in my eBPF probes.

In my case, the panic happened because the kernel decided to change an LSM hook interface, adding a new argument in front of the others. When the probe gets loaded, the kernel doesn’t typecheck the arguments, and so doesn’t realise the probe isn’t compatible with the new kernel. When the probe runs, shit happens and you end up with a kernel panic.
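
A toy illustration of that failure mode, in plain Python rather than real eBPF. The hook names, the argument layout, and the `probe` function are all hypothetical; the sketch only shows how a probe that reads its arguments by position silently picks up the wrong value once a new argument is inserted in front.

```python
# Toy model only: real LSM hooks pass raw registers, not Python tuples.
# Every name below is hypothetical.

def probe(ctx):
    """Probe written against the OLD hook layout: (file, mask)."""
    file, mask = ctx[0], ctx[1]   # arguments are read by position, unchecked
    return file["path"]           # treats the first argument as a file object


def old_kernel_hook(file, mask):
    # Old layout: the file is the first argument.
    return probe((file, mask))


def new_kernel_hook(cred, file, mask):
    # New layout: an argument was added in FRONT of the others.
    # The probe still "loads", because nothing compares the two layouts.
    return probe((cred, file, mask))


f = {"path": "/etc/passwd"}
print(old_kernel_hook(f, 0o4))        # the probe sees the file, as intended

try:
    new_kernel_hook(1000, f, 0o4)     # the first argument is now an int
except TypeError as exc:
    # Python raises an exception; the kernel analogue is a bad dereference.
    print("probe misread its arguments:", exc)
```

In Python the mismatch surfaces as a catchable exception. In the kernel there is no such safety net at run time, which is why a check at load time matters so much.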

eBPF probes causing kernel panics are almost always an indication of a kernel bug, not a bug in the eBPF vendor's code. There are exceptions of course (such as an eBPF program denying access to a resource and causing pid 1 to crash). But they're very few.

replies(4): >>41031896 #>>41032164 #>>41032610 #>>41034621 #
mbesto ◴[] No.41032164[source]
> just like blaming Microsoft for the CrowdStrike BSOD is stupid

Wait, how is this stupid? Unless I'm missing something, wasn't the patch part of a Microsoft payload that included an update to CrowdStrike? Surely CrowdStrike is culpable, but that doesn't completely absolve Microsoft of any responsibility, as it's their payload.

replies(7): >>41032197 #>>41032249 #>>41032287 #>>41032415 #>>41032517 #>>41032630 #>>41032666 #
sschueller ◴[] No.41032517[source]
Microsoft should revoke the CrowdStrike driver signature and should do an internal check as to why CrowdStrike's driver was approved when it can execute arbitrary code at the kernel level without any checks. If your "driver" requires this feature, MS should require CrowdStrike to submit the entire source, and CrowdStrike should have to pay MS to review the code.

What is the point of driver signing if a vendor can basically build in a back door and Microsoft doesn't validate that this back door is at least somewhat reasonable?

replies(6): >>41032936 #>>41033093 #>>41033397 #>>41033699 #>>41034816 #>>41034838 #
1. _flux ◴[] No.41032936[source]
Do you think Microsoft customers using CrowdStrike would then be happier, being unable to run the software at all, due to an action Microsoft took?

Backdoors of all kinds can be installed to most any operating system without vendor co-operation. That is the nature of general-purpose operating systems.

replies(3): >>41033775 #>>41033840 #>>41034227 #
2. sigseg1v ◴[] No.41033775[source]
I'm a customer that is forced to use CrowdStrike via IT policies and I would be giddy with delight if something came along and caused the removal of it from my systems. I don't need programs sitting on my computer preventing me from installing code that I've literally just compiled, preventing me from deleting or modifying folders on my machine, and causing extreme lag for many basic system operations even when it does work. At this point, the time in lost productivity (via normal operation) and downtime (via their recent bug) has easily exceeded a thousand times over the aggregate sum of all benefits that CrowdStrike will ever have provided from threat detection and prevention. It's time to remove the malware.
replies(3): >>41033889 #>>41034040 #>>41034857 #
3. mbreese ◴[] No.41033840[source]
At this point… yes.

It would be one thing Microsoft could do to focus 100% of the attention/blame away from Windows and onto CrowdStrike. And customers will want their pound of flesh from somewhere.

Really, this should serve as a wake up call w/in Microsoft to start to harden the kernel against such vulnerabilities.

Was the crash the fault of Windows? No. But did a Windows design decision make this possible? Yes.

I’m sure the design decision made sense at the time (at least business sense). Keeping the kernel more open makes it easier for others to write and add drivers, but it also makes the system more vulnerable. This is a good opportunity within Microsoft to get support for changing that.

replies(1): >>41034028 #
4. _flux ◴[] No.41033889[source]
You are not the customer, though, your employer is the customer.

Perhaps you should push this change up in the food chain, then, and if the company is good the request will be taken seriously. As I understand it, while CrowdStrike is the biggest name in EDR, it's far from the only one, if that's what your company requires to pass some checkboxes in certifications.

replies(1): >>41034900 #
5. _flux ◴[] No.41034028[source]
Ultimately this would have been almost a non-issue if there had been better deployment strategies in place for the data file updates as well.
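
To make "better deployment strategies" concrete, here is a minimal sketch of a ring-based rollout that halts as soon as one ring shows too many crashes. The ring names, sizes, and the 1% threshold are made-up assumptions, and this says nothing about how CrowdStrike actually ships channel files.

```python
from dataclasses import dataclass


@dataclass
class Ring:
    name: str
    hosts: int


def staged_rollout(rings, crashes_in_ring, max_crash_rate=0.01):
    """Deploy ring by ring and stop at the first ring that crashes too often.

    crashes_in_ring: callable(ring) -> number of hosts that crashed after
    receiving the update (a stand-in for real telemetry).
    Returns (names of rings that received the update, ring that halted it).
    """
    deployed = []
    for ring in rings:
        crashed = crashes_in_ring(ring)
        deployed.append(ring.name)
        if crashed / ring.hosts > max_crash_rate:
            return deployed, ring.name   # halt: later rings never get it
    return deployed, None


rings = [
    Ring("internal", 100),
    Ring("canary", 1_000),
    Ring("everyone", 1_000_000),
]

# A bad update that crashes every host it reaches.
deployed, halted_at = staged_rollout(rings, lambda ring: ring.hosts)
print(deployed, halted_at)
```

With an update that crashes every host, only the smallest ring is affected before the rollout stops, instead of the whole fleet.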

If by changing the system you mean adding some kind of in-kernel isolation to it, then I don't think it would be worth the effort to make that kind of major change to the way operating systems work just to give arguably a minor risk reduction to systems—in particular if CrowdStrike and other vendors take some learnings from this event.

Microsoft might improve their system rollback mechanism to also include files that are not strictly integrated into the system, merely used by the parts that are (the channel files loaded by the driver).

Actually I think we can just be happy that the incident was a mistake, not an attack. Had this kind of "first ever" situation been an attack, it could be extremely difficult to recover from it. I wonder how well EDRs deal with "attacks from within"..

CrowdStrike pulled the update within 1.5 hours. I wonder if they actually use Falcon themselves? But then somehow missed the problem? Doesn't seem like they eat their own dog food :). (Or at least their own channel files.)

replies(2): >>41034669 #>>41034911 #
6. ◴[] No.41034040[source]
7. CaptainZapp ◴[] No.41034227[source]
> Backdoors of all kinds can be installed to most any operating system without vendor co-operation

Not on Kernel level. Not without active support by the vendor.

replies(1): >>41034340 #
8. _flux ◴[] No.41034340[source]
How much does it really help you if your complete user-space can still be messed up by an offending Windows SYSTEM process? As I understand it, they are able to hurt the system e.g. by killing processes, uninstalling applications, replacing binaries, allocating memory, starting too many processes, ..

Actually, I could easily see a buggy remote system management update just deciding to uninstall everything and nuke the system because it thinks the machine is stolen. And that would be designed functionality.

9. mbreese ◴[] No.41034669{3}[source]
If many things had gone differently, this could have been avoided. But I’m looking at this from the Microsoft perspective. No matter how loudly people scream that it was a CrowdStrike issue and not Microsoft’s fault, Microsoft is still getting blamed. It’s a Windows BSOD.

I talked to my dad (retired enterprise operations/IT) this weekend and he was telling me that the next computer he buys will probably be a Mac, largely because he doesn’t want to deal with the possibility of a crash like this. Does he run CrowdStrike? Not at all. Does he know who they are? Nope. (He’s been retired for a while.) What he does know (well, thinks) is that Windows now has an unstable kernel.

And Microsoft has no control over distribution policies for other vendors. How those vendors distribute updates is up to them. Even if a sane deployment strategy could have avoided the larger global problems, Microsoft can’t control that.

So, if you have Microsoft dealing with negative publicity and public sentiment, with no way to control errors like this in the future, what can you do? To me, the best they can do is kneecap CrowdStrike, put the full blame on them, and use this as an excuse to change the kernel/driver model to one where they can have more control over the stability of the OS.

replies(1): >>41034932 #
10. hello_moto ◴[] No.41034857[source]
Sounds like your IT (sec team, specifically) hasn't set up the software correctly.

I've worked for a company that installs Falcon across its whole fleet, and I never ran into issues like yours.

11. hello_moto ◴[] No.41034900{3}[source]
Vendors are competing with one another to win contracts.

CIO/CISO don't select vendors lightly.

There seems to be a typical/classical Engineer's mindset of "make a claim first, ask later" around the subject lately.

"My boss plays golf with the sales rep" might need more proof, because if they selected a less capable vendor and got hit with ransomware, bet my ass your boss would never play golf with any sales rep again.

replies(1): >>41043441 #
12. roblabla ◴[] No.41034911{3}[source]
There's a simple thing Microsoft could do to avoid this, and it doesn't require anything too crazy. EDRs work in kernel-land because that's the only place where you can block certain things, like process creation, driver loading, etc.

macOS has a userland API for this, called EndpointSecurity, which allows doing all the things an EDR needs, without ever touching kernelland. Microsoft could introduce a similar API, and EDRs would no longer need a driver.
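
A rough sketch of why the userland model limits the blast radius. This is not the EndpointSecurity API: `authorize_exec`, `buggy_edr`, `healthy_edr`, and the fail-open default are all made up for illustration. The point is only that when the policy decision runs outside the kernel, a crash in the EDR's logic can be contained and turned into a default verdict instead of taking the machine down.

```python
def buggy_edr(event):
    # Hypothetical EDR logic that chokes on a malformed content update.
    rules = None
    return event["path"] in rules     # raises TypeError: rules is None


def healthy_edr(event):
    # Hypothetical rule: block anything launched from /tmp/malware.
    return not event["path"].startswith("/tmp/malware")


def authorize_exec(event, edr_handler, default_allow=True):
    """Ask the userland handler for a verdict and contain any failure.

    The try/except stands in for process isolation: a real system would
    see the handler process die rather than catch an exception.
    """
    try:
        return bool(edr_handler(event))
    except Exception:
        return default_allow          # assumption: fail open, keep running


event = {"path": "/usr/bin/ls"}
print(authorize_exec(event, healthy_edr))   # verdict from working logic
print(authorize_exec(event, buggy_edr))     # handler crashed, system did not
```

Whether the default should be fail-open or fail-closed is a policy question; the sketch picks fail-open only to keep the example short.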

replies(2): >>41035024 #>>41035439 #
13. hello_moto ◴[] No.41034932{4}[source]
That would kneecap the security industry and open up another can of worms: insecure Windows back on the menu.
replies(1): >>41035376 #
14. _flux ◴[] No.41035024{4}[source]
I suppose that's what CrowdStrike's system on Mac uses as well, then. Apparently on Linux they use eBPF, and Microsoft is researching that for Windows as well: https://github.com/microsoft/ebpf-for-windows. So maybe that's actually the solution they'll go with?

It would certainly help solve this particular problem, even if not the kernel-integration problem in general.

15. mbreese ◴[] No.41035376{5}[source]
There are other vendors.

Microsoft could even reinstate CrowdStrike at some point, but only after an extensive review process. And then probably require similar process reviews/checks for any other vendor that requires the same kernel access.

Or just remove the need for kernel access at all and migrate to a better driver architecture at the sacrifice of backwards compatibility. Security software doesn’t need to run in kernel space… there are other ways.

replies(1): >>41035499 #
16. mbreese ◴[] No.41035439{4}[source]
This is exactly what I’d advocate for. There are many things that run in kernel space that don’t need to. The Mac model with user-land hooks is one model. eBPF from Linux (and Windows?) is another.

I’m sure the reason why Apple migrated was because of all of the bugs/crashes security companies kept introducing into the kernel with kexts. Apple had the ability to change their architecture on a whim because they aren’t quite as beholden to backwards compatibility as Windows.

Microsoft could take this as an opportunity to make some major changes that would be more readily accepted by the market.

17. hello_moto ◴[] No.41035499{6}[source]
That could potentially invite a lawsuit against MSFT, since their own Microsoft Defender is in this space and is potentially doing the same thing. Otherwise it would be far less effective at catching attacks, no?
18. CRConrad ◴[] No.41043441{4}[source]
> Vendors are competing with one another to win contracts.

Sure, in a well-functioning market economy without any distortions. But there are lots of those at play, so competition is severely hampered (by network effects, regulatory capture, and on and on... Up to and including, I suspect, mere ephemeral fashion). What we actually have in many areas of the "tech market" are oligopolies and near-monopolies, not perfect competition.

> CIO/CISO don't select vendors lightly.

Muahaha. Seems rather more like they're at least as naïve as any Web-surfing consumer on their sofa, easily bamboozled by trendy buzzwords and slick marketing campaigns.