Most active commenters
  • bastawhiz(6)
  • viraptor(3)
  • lmeyerov(3)


277 points jwilk | 19 comments
arp242 ◴[] No.44382233[source]
A lot of these "security bugs" are not really "security bugs" in the first place. Denial of service does not result in people's bank accounts being emptied or nude selfies being spread all over the internet.

Things like "panics on certain content", as in [1] or [2], are "security bugs" now. By that standard anything that fixes a potential panic is a "security bug". I've probably fixed hundreds if not thousands of "security bugs" in my career by that standard.

These barely qualify as "security bugs", yet they're rated "6.2 Moderate" and "7.5 HIGH". To say nothing of the gazillion "high severity" "regular expression DoS" nonsense and whatnot.
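
For the unfamiliar, most "regular expression DoS" reports boil down to a pattern with nested quantifiers that backtracks catastrophically on input crafted to almost match. A minimal Python sketch (hypothetical, not taken from the linked advisories):

    import re
    import time

    # Nested quantifiers: the engine tries every way of splitting the run of
    # "a"s between the inner and outer "+", so a failed match explodes.
    pattern = re.compile(r"^(a+)+$")

    for n in (16, 20, 24):
        subject = "a" * n + "!"      # the trailing "!" forces the match to fail
        start = time.perf_counter()
        pattern.match(subject)       # returns None after exponentially many backtracking steps
        print(n, f"{time.perf_counter() - start:.3f}s")

    # Runtime roughly doubles with each extra "a", so a modestly longer
    # attacker-supplied string can pin a CPU core.

That's the entire class: slow matching on pathological input, not data exposure.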

And the worst part is all of this makes it so much harder to find actual high-severity issues. It's not harmless spam.

[1]: https://github.com/gomarkdown/markdown/security/advisories/G...

[2]: https://rustsec.org/advisories/RUSTSEC-2024-0373.html

replies(13): >>44382268 #>>44382299 #>>44382855 #>>44384066 #>>44384368 #>>44384421 #>>44384513 #>>44384791 #>>44385347 #>>44385556 #>>44389612 #>>44390124 #>>44390292 #
viraptor ◴[] No.44382855[source]
> Denial of service is not resulting in ...

DoS results in whatever the system happens to do. It may well result in bad things happening, for example stopping AV from scanning new files, breaking rate limiting systems to allow faster scanning, hogging all resources on a shared system for yourself, etc. It's rarely a security issue in isolation, but libraries are never used in isolation.

replies(2): >>44383029 #>>44383134 #
1. bastawhiz ◴[] No.44383134[source]
An AV system stopping because of a bug in a library is bad, but that's not because the library has a security bug. It's a security problem because the system itself does security. It would be wild if any bug that leads to a crash or a memory leak was a "security" bug because the library might have been used by someone somewhere in a context that has security implications.

A bug in a library that does rate limiting arguably is a security issue because the library itself promises to protect against abuse. But if I make a library for running Lua in redis that ends up getting used by a rate limiting package, and my tool crashes when the input contains emoji, that's not a security issue in my library if the rate limiting library allows emails with punycode emoji in them.

"Hogging all of the resources on a shared system" isn't a security bug, it's just a bug. Maybe an expensive one, but hogging the CPU or filling up a disk doesn't mean the system is insecure, just unavailable.

The argument that downtime or runaway resource use is a security issue, but only if the problem is in someone else's code, is some Big Brained CTO way of passing the buck onto open source software. If that were true, Postgres autovacuuming due to its unpleasant default configuration would be up there with Heartbleed.

Maybe we need a better way of alerting downstream users of packages when important bugs are fixed. But jamming these into CVEs and giving them severities above 5 is just alert noise and makes it confusing to understand what issues an organization should actually care about and fix. How do I know that the quadratic time regexp in a string formatting library used in my logging code is even going to matter? Is it more important than a bug in the URL parsing code of my linter? It's impossible to say because that responsibility was passed all the way downstream to the end user. Every single person needs to make decisions about what to upgrade and when, which is an outrageous status quo.

replies(3): >>44383193 #>>44383817 #>>44384248 #
2. viraptor ◴[] No.44383193[source]
> An AV system stopping because of a bug in a library is bad, but that's not because the library has a security bug.

(And other examples) That's the fallacy of looking for a single root cause. The library had an issue, the system had an issue, and together they resulted in a problem for you. Some issues will be more likely to result in security problems than others, so we classify them as such. We'll always be dealing with probabilities here, not clear lines. Otherwise we just end up playing a blame game: "sure, this had a memory overflow, but it's the package's fault for not enabling protections that would downgrade it to a crash", "no, it's the deployment's fault for not limiting that exploit to just this user's data partition", "no, it's the OS's fault for not implementing detailed security policies for every process", ...

replies(1): >>44384117 #
3. lmeyerov ◴[] No.44383817[source]
Traditional security follows the CIA triad: Confidentiality (ex: data leaks), Integrity (ex: data deletion), and Availability (ex: site down). Something like SOC 2 compliance, for example, typically has you define where you stand on each of these.

Does availability not matter to you? Great. For others, maybe it does: if you're some medical device, segfaulting or OOMing in an unmanaged way on a config upload is not good. 'Availability' has been a pretty common security concern for maybe 40 years now from an industry view.

replies(4): >>44383876 #>>44384078 #>>44384219 #>>44385894 #
4. int_19h ◴[] No.44383876[source]
We're talking about what's reasonable to expect as a baseline. A higher standard isn't wrong, obviously, but those who need it shouldn't be expecting others to provide it by default, and most certainly not for free.
5. bastawhiz ◴[] No.44384117[source]
But it's not treated as dealing in probabilities. The CVEs (not that I think they're even worthwhile) are given scores that ignore the likelihood of an issue being used in a security-sensitive context. They're scored for the worst-case scenario. And if we're dealing with probabilities, it puts less onus on the people who actually do things where security matters and spams everyone else, for whom those probabilities are unrealistic, which they are in a huge majority of cases.

This is worse for essentially everyone except the people who should be doing more diligence around the code that they use. If you need code to be bug free (setting aside the fact that "bug free" code is a delusion), you're just playing the blame game when you don't put protections in place. And I'm not talking about memory safety, I'm talking about a regexp with pathological edge cases or a panic on user input. If you're not handling unexpected failure modes from code you didn't write and inspect, why does that make it a security issue where the onus is on the library maintainer?

replies(1): >>44384590 #
6. bastawhiz ◴[] No.44384219[source]
> some medical device segfaulting or OOMing in an unmanaged way

Memory safety is arguably always a security issue. But a library segfaulting when NOT dealing with arbitrary external input wouldn't be a CVE in any case; it's just a bug. An external third party would need to be able to push a crafted config to induce a segfault. I'm not sure what kind of medical device, short of a pacemaker that accepts Bluetooth connections, might fall into such a category, but I'd argue that if a crash in your dependencies' code prevents someone's heart from beating properly, relying on CVEs to understand the safety of your system is on you.

Should excessive memory allocation in OpenCV for certain visual patterns be a CVE because someone might have built an autonomous vehicle with it that could OOM and (literally) crash? Just because you put the code in the critical path of a sensitive application doesn't mean the code has a vulnerability.

> 'Availability' is a pretty common security concern for maybe 40 years now from an industry view.

Of course! It's a security problem for me in my usage of a library because I made the failure mode of the library have security implications. I don't want my service to go offline, but that doesn't mean I'm entitled to have my application's exposure to availability-affecting failure modes treated on equal footing with memory corruption or an RCE or a permissions bypass.

replies(2): >>44384385 #>>44384401 #
7. comex ◴[] No.44384248[source]
This is a tangent from your main argument about DoS.

But when you talk about URL parsing in a linter or a regexp in logging code, I think you're implying that the bugs are unimportant, in part, because the code only handles trusted input.

Which is valid enough. The less likely some component is to receive untrusted input, the lower the severity should be.

But beware of going all the way and saying "it's not a bug because we assume trusted input". Whenever you do that, you're also passing down a responsibility to the user: the responsibility to segregate trusted and untrusted data!

Countless exploits have arisen when some parser never designed for untrusted input ended up being exposed to it. Perhaps that's not the parser's fault. But it always happens.

If you want to build secure systems, the only good approach is to stop using libraries that have those kinds of footguns.

replies(2): >>44384365 #>>44390342 #
8. scott_w ◴[] No.44384365[source]
> But when you talk about URL parsing in a linter or a regexp in logging code, I think you're implying that the bugs are unimportant, in part, because the code only handles trusted input.

It is a bug but it’s not necessarily a security hole in the library. That’s what OP is saying.

replies(1): >>44384900 #
9. pjmlp ◴[] No.44384385{3}[source]
Yes, it should; software will eventually carry liability like any other industry that has been around for centuries.
replies(1): >>44390156 #
10. lmeyerov ◴[] No.44384401{3}[source]
I agree on the first part, but it's useful to be more formal on the latter --

1. Agreed it's totally fine for a system to have some bugs or CVEs, and likewise fine for OSS maintainers to not feel compelled to address them. If someone cares, they can contribute.

2. Conversely, it's very useful to divorce some application's use case from the formal understanding of whether third-party components are 'secure', because that's how we stand on the shoulders of giants. First, it lets us make composable systems: if we use CIA parts, with some common definition of CIA, we get to carry that through to bigger parts and applications. Second, on a formal basis, 10-20 years after this stuff was understood to be useful, the program analysis community further realized we can even define these properties mathematically in many useful ways, where different definitions lead to different useful properties, which enables us to provably verify them rather than just test for them.

So when I say CIA nowadays, I'm actually thinking both mathematically, irrespective of downstream application, and from the choose-your-own-compliance view. If some library is C+I but not A... that can be fine for both the library and the downstream apps, but it's useful to have objective definitions. Likewise, something can have gradations of all this -- like maybe it preserves confidentiality in typical threat models & definitions, but not under something like "quantitative information flow" models: also ok, but good for everyone to know what the heck they all mean if they're going to make security decisions on it.

replies(1): >>44385469 #
11. viraptor ◴[] No.44384590{3}[source]
The score assigned to issues has to be the worst case one, because whoever is assessing it will not know how people use the library. The downstream users can then evaluate the issue and say it does/doesn't/kinda affects them with certainty and lower their internal impact. People outside that system would be only guessing. And you really don't want to guess "nobody would use it this way, it's fine" if it turns out some huge private deployment does.
replies(2): >>44385113 #>>44390221 #
12. comex ◴[] No.44384900{3}[source]
Yes, that’s the OP’s main point, but their choice of examples suggests that they are also thinking about trusted input.
13. tsimionescu ◴[] No.44385113{4}[source]
> The downstream users can then evaluate the issue and say it does/doesn't/kinda affects them with certainty and lower their internal impact.

Unfortunately that's not how it happens in practice. People run security scanners, and those report that you're using library X version Y which has a known vulnerability with a High CVSS score or whatever. Even if you provide a reasoned explanation of why that vulnerability doesn't impact your use case and you convince your customer's IT team of this, this is seen as merely a temporary waiver: very likely, you'll have the same discussion next time something is scanned and found to contain this.

The whole security audit system and industry is problematic, and often leads to huge amounts of busy work. Overly pessimistic CVEs are not the root cause, but they're still a big problem because of this.

14. holowoodman ◴[] No.44385469{4}[source]
> So when I say CIA nowadays, I'm actually thinking both mathematically irrespective of downstream application, and from the choose-your-own-compliance view.

That doesn't help anyone, because it is far too primitive.

A medical device might have a deadly availability vulnerability. That in itself doesn't tell you anything about the actual severity of the vulnerability, because the exploit path might need "the same physical access as pulling the power plug". So not actually a problem.

Or the fix might need a long downtime which harms a number of patients. So maybe a problem, but the cure would be worse than the disease.

Or the vulnerability involves sending "I, Eve Il. Attacker, identified by badge number 666, do want to kill this patient" to the device. So maybe not a problem because an attacker will be caught and punished for murder, because the intent was clear.

replies(1): >>44385602 #
15. lmeyerov ◴[] No.44385602{5}[source]
We're talking about different things. I agree CVE ratings and risk/severity/etc levels in general for third party libraries are awkward. I don't have a solution there. That does not mean we should stop reporting and tracking C+I+A violations - they're neutral, specific, and useful.

Risk, severity, etc. are careful terms that are typically defined contextually, relative to the application... yet CVEs do want some sort of prioritization level reported too for usability reasons, so it feels shoehorned. Those words are useful in an operational context where a team can prioritize based on them, and agreed, a third-party rating must be reinterpreted for the application's rating. CVE ratings are an area where it seems "something is better than nothing", and I don't think about it enough to have an opinion on what would be better.

Conversely, saying a library has a public method with an information flow leak is a statement that we can compositionally track (e.g., dataflow analysis). It's useful info that lets us stand on the shoulders of giants.
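
To make "compositionally track" concrete, here's a toy Python sketch (purely illustrative, no real analyzer or library implied) of the kind of flow fact a dataflow-style analysis carries across a library boundary:

    # A value carrying a confidentiality label.
    class Tainted(str):
        pass

    def concat(a, b):
        # Composition rule: the result is tainted if either input is tainted.
        out = str(a) + str(b)
        return Tainted(out) if isinstance(a, Tainted) or isinstance(b, Tainted) else out

    def public_sink(value):
        # A sink that must only ever see untainted data (e.g. a public log).
        if isinstance(value, Tainted):
            raise ValueError("confidentiality violation: tainted data reached a public sink")
        print(value)

    secret = Tainted("ssn=123-45-6789")
    try:
        public_sink(concat("request from user ", secret))
    except ValueError as err:
        # Flagged as a C violation even though integrity and availability are fine.
        print(err)

The point is that the library-level fact ("concat propagates taint") composes with the application-level fact ("this sink is public") without either side knowing the other's context.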

FWIW, in an age of LLMs, both kinds of information will be getting even more accessible and practical for many more people. I can imagine flipping my view on risk/severity to being more useful as the LLM can do the compositional reasoning in places the automated symbolic analyzers cannot.

16. ◴[] No.44385894[source]
17. bastawhiz ◴[] No.44390156{4}[source]
Who should be liable? The person who sells you the software? Or the person who put some code on GitHub that the first guy used?
18. bastawhiz ◴[] No.44390221{4}[source]
> The downstream users can then evaluate the issue and say it does/doesn't/kinda affects them with certainty and lower their internal impact.

If you score everything for the worst-case lowest common denominator, it biases nearly everything towards Critical, and the actually critical stuff gets lost in a sea of prioritization. It's spam. If I get fifty emails about critical issues and only two of them are actually critical, I'm going to miss far more important ones than if I only got ten emails about critical issues.

If we all had infinite time and motivation, this wouldn't be a problem. But by being all-or-nothing purists, everything is worse in general.

19. bastawhiz ◴[] No.44390342[source]
> But when you talk about URL parsing in a linter or a regexp in logging code, I think you're implying that the bugs are unimportant, in part, because the code only handles trusted input.

You proved my point, though. URL parsing is scary and it's a source of terrible security bugs. Not in a linter! Does it even have a means of egress? Is someone fetching the misparsed URLs from its output? How could you even deliver untrusted data to it?

In isolation, the issue is Bad On Paper. In context, the ability to actually exploit it meaningfully is vanishingly small if it even practically exists.

> Countless exploits have arisen when some parser never designed for untrusted input ended up being exposed to it. Perhaps that's not the parser's fault. But it always happens.

Yes! The CVE should be for the tool that trusted code to do something it wasn't expected to do, not for the code that failed under unexpected circumstances. That's the point.