
182 points yarapavan | 7 comments
neuroelectron ◴[] No.43616167[source]
Very suspicious article. Sounds like the "nothing to see here folks, move along" school of security.

Reproducibility is more like a security smell in reverse: a symptom that you're doing things right. Determinism is the correct target, and it is subtly different.

The focus on supply chain is a distraction. The "trusting trust" attack Ken Thompson described in 1984 is still among the most elegant and devastating: infected development toolchains can spread horizontally to "secure" builds.

Just because it’s open doesn’t mean anyone’s been watching closely. "50 years of security"? Important pillars of OSS have been touched by thousands of contributors with varying levels of oversight. Many commits predate strong code-signing or provenance tracking. If a compiler was compromised at any point, everything it compiled—including future versions of itself—could carry that compromise forward invisibly. This includes even "cleanroom" rebuilds.
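The propagation mechanism Thompson described can be illustrated with a toy model, treating "compilation" as a string-to-string function. Everything here (names, trigger strings, payload) is invented for illustration; the real attack lives in a compiler binary, not in any source you can read:

```python
# Toy model of the "trusting trust" attack: a trojaned "compiler" that
# (1) backdoors programs that look like a login check, and (2) re-inserts
# its own injection logic whenever it compiles a clean compiler source.
# All names and trigger strings here are invented for illustration.

BACKDOOR = "grant_root()  # injected"

def evil_compile(source: str) -> str:
    """Stand-in for a compromised compiler binary."""
    if "def check_password" in source:
        # Payload 1: backdoor anything that looks like a login check.
        return source + "\n" + BACKDOOR
    if "def compile" in source:
        # Payload 2: when compiling a compiler, silently reproduce the
        # injection logic in the output, even though the source is clean.
        return source + "\n# (injection logic silently propagated)"
    return source  # everything else compiles "faithfully"

clean_login = "def check_password(pw): ..."
clean_compiler = "def compile(src): return src"

# A clean compiler source yields another compromised compiler, and a
# clean login source yields a backdoored login; auditing the source
# text alone finds nothing.
assert BACKDOOR in evil_compile(clean_login)
assert "propagated" in evil_compile(clean_compiler)
```

The point of the sketch is that the backdoor survives a full rebuild from pristine sources, which is why the thread below turns to bootstrapping the toolchain itself.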

replies(4): >>43616257 #>>43617725 #>>43621870 #>>43622202 #
lrvick ◴[] No.43616257[source]
The best defense we have against the Trusting Trust attack is full source bootstrapping, now done by two distros: Guix and Stagex.
replies(2): >>43616330 #>>43625793 #
1. AstralStorm ◴[] No.43616330[source]
No, you do not. If you have not actually validated each and every source package, your trust extends only to the correspondence between the generated binaries and the sources you had. The trusting trust attack was deployed against the compiler itself, poisoning specific binaries. Do you know that GCC 6.99 or 7.0 doesn't insert a backdoor under some specific condition?

There's no static or dynamic analysis deployed to enhance this level of trust.

The initial attempts are simulated execution, as in Valgrind; all the sanitizer work; and perhaps diffing at the functional level, beyond the text of the source code, where it's too easy to smuggle things through... (e.g. on an abstracted conditional graph.)

We cannot even correctly compare binaries or executables produced by differing compiler revisions.

replies(4): >>43616446 #>>43616959 #>>43617254 #>>43618041 #
2. neuroelectron ◴[] No.43616446[source]
Besides full source bootstrapping (which could adopt progressive verification of hardware features and assume untrusted hardware), integrating formal verification into the lowest levels of bootstrapping is a must. Bootstrap security with the compiler.

This won't protect against more complex attacks like ROP or unverified state. For that we need simple artifacts that are verifiable and mapped: return to simpler return states (pass/error), handle errors externally to the compiled binaries, and automate state mapping combined with targeted fuzzing. systemd is a perfect example of what not to do: internal logs and error states handled by a web of interdependent systems.

replies(1): >>43616594 #
3. AstralStorm ◴[] No.43616594[source]
ROP and unverified state would at least be highlighted by such an analysis. Generally it's a lot of work, and we cannot quite trust fully automated systems to flag it for us, especially when some optimizer changes between versions of the compiler. Even a single compile flag can turn the abstract language upside down, much less the execution graph...

Fuzzing is good but probabilistic. It is unlikely to hit on a deliberate backdoor. Solid for finding bugs though.

replies(1): >>43616968 #
4. lrvick ◴[] No.43616959[source]
So for example, Google uses a goobuntu/bazel-based toolchain to produce their Go compiler binaries.

The full-source-bootstrapped Go compiler binaries in stagex exactly match the hashes of the ones Google releases, giving us as much confidence as we can get in the source->binary chain, which until very recently had no solution at all.
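The "exactly match the hashes" claim above reduces to comparing digests of independently produced artifacts. A minimal sketch of that check (the file paths are hypothetical, not stagex's actual layout):

```python
import hashlib

def sha256_file(path: str) -> str:
    """Stream a file through SHA-256 and return its hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large compiler binaries don't load into memory.
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

# Hypothetical paths: an independently bootstrapped build vs. the vendor binary.
# reproducible = sha256_file("stagex/bin/go") == sha256_file("upstream/bin/go")
```

If the two digests match, any trojan would have to be present identically in both build pipelines, which is exactly the assurance a reproducible bootstrap buys you.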

Go has unique compiler design choices that make it very self-contained, which is what makes this possible, though we can also deterministically build Rust, or any other language, from any OCI-compatible toolchain.

You are talking about one layer down from that, the source code itself, which is our next goal as well.

Our plan is this:

1. Be able to prove all released artifacts came from hash locked source code (done)

2. Develop a universal normalized identifier for all source code regardless of origin (treehash of all source regardless of git, tar file etc, ignoring/removing generated files, docs, examples, or anything not needed to build) (in progress)

3. Build a distributed code review system to coordinate multiple signed reviews by reputable security researchers for every source package, keyed by its universal identifier (planning stages)

We are the first distro to reach step 1, and have a reasonably clear path to steps 2 and 3.

We feel step 2 would be a big leap forward on its own, as it would have fully eliminated the xz attack, where the malicious code hid in the tar archive but not in the actual git tree.
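The exact normalization rules for step 2 are still being designed, but the core idea of an origin-independent identifier can be sketched as a deterministic walk that hashes paths and contents while ignoring material not needed to build. The skip list below is purely illustrative:

```python
import hashlib
import os

# Directories excluded from the identifier. The real rule set (generated
# files, docs, examples, etc.) is still in progress upstream; this list
# is illustrative only.
SKIP = {"docs", "examples", ".git"}

def tree_hash(root: str) -> str:
    """Hash relative paths and file contents in sorted order, so the same
    source tree yields the same identifier whether it arrived via git, a
    tarball, or any other channel."""
    h = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories and fix traversal order in place.
        dirnames[:] = sorted(d for d in dirnames if d not in SKIP)
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            h.update(rel.encode())
            with open(path, "rb") as f:
                h.update(f.read())
    return h.hexdigest()
```

Under this scheme, a tarball padded with extra attack code outside the build-relevant tree would produce a different identifier than the clean git tree, surfacing exactly the xz-style discrepancy.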

Pointing out these classes of problem is easy. I know, did it for years. Actually dramatically removing attack surface is a lot more rewarding.

Help welcome!

5. lrvick ◴[] No.43616968{3}[source]
I agree here. Use automated tools to find low-hanging fruit or mistakes.

There is unfortunately no substitute for a coordinated, documented review effort by capable security researchers on our toolchain sources.

6. rcxdude ◴[] No.43617254[source]
That's a different problem. The threat in Trusting Trust is that the backdoor may not ever appear in public source code.
7. pabs3 ◴[] No.43618041[source]
Code review systems like CREV are the solution to backdoors being present in public source code.

https://github.com/crev-dev/