218 points miketheman | 14 comments
belval ◴[] No.42137562[source]
I have a bit of uneasiness about how this is heavily pushing GitHub actions as the correct way to publish to PyPI. I had to check PEP740 to make sure it was not directly supported by Microsoft.

> The generation and publication of attestations happens by default, and no changes are necessary for projects that meet all of these conditions: publish from GitHub Actions; via Trusted Publishing; and use the pypa/gh-action-pypi-publish action to publish.

If you then click on "The manual way" it adds a big disclaimer:

> STOP! You probably don't need this section; it exists only to provide some internal details about how attestation generation and uploading work. If you're an ordinary user, it is strongly recommended that you use one of the official workflows described above.

Where the only official workflow is "Use GitHub Actions".

I guess I am an idealist but as a maintainer this falls short of my expectations for the openness of Python and PyPI.

replies(9): >>42137628 #>>42137831 #>>42138035 #>>42138967 #>>42140525 #>>42140881 #>>42142188 #>>42144001 #>>42144423 #
woodruffw ◴[] No.42137628[source]
> Where the only official workflow is "Use GitHub Actions".

The standard behind this (PEP 740) supports anything that can be used with Trusted Publishing[1]. That includes GitLab, Google Cloud, ActiveState, and can include any other OIDC IdP if people make a good case for including it.

It's not tied to Microsoft or GitHub in any particular way. The only reason it emphasizes GitHub Actions is because that's where the overwhelming majority of automatic publishing traffic comes from, and because it follows a similar enablement pattern as Trusted Publishing did (where we did GitHub first, followed by GitLab and other providers).

[1]: https://docs.pypi.org/trusted-publishers/

replies(6): >>42137658 #>>42137713 #>>42139209 #>>42140207 #>>42140433 #>>42143213 #
belval ◴[] No.42137713[source]
I get that, that's why I didn't go "This is Embrace Extend Extinguish", but as constructive feedback I would recommend softening the language and to replace:

> STOP! You probably don't need this section;

In https://docs.pypi.org/attestations/producing-attestations/#t...

Perhaps also add a few of the providers you listed as well?

> The only reason it emphasizes GitHub Actions is because that's where the overwhelming majority of automatic publishing traffic comes from

GitHub being popular is a self-reinforcing process, if GitHub is your first class citizen for something as crucial as trusted publishing then projects on GitHub will see a higher adoption and become the de-facto "secure choice".

replies(2): >>42137810 #>>42145611 #
woodruffw ◴[] No.42137810[source]
> but as constructive feedback I would recommend softening the language and to replace:

I can soften it, but I think you're reading it excessively negatively: that warning is there to make sure people don't try to do the fiddly, error-prone cryptographic bits if they don't need to. It's a numerical fact that most project owners don't need that section, since most are either using manual API tokens or are publishing via GitHub Actions.

> Perhaps also add a few of the providers you listed as well?

They'll be added when they're enabled. Like I said in the original comment, we're using a similar enablement pattern as happened with Trusted Publishing: GitHub was enabled first because it represents the majority of publishing traffic, followed by GitLab and the others.

> GitHub being popular is a self-reinforcing process, if GitHub is your first class citizen for something as crucial as trusted publishing then projects on GitHub will see a higher adoption and become the de-facto "secure choice".

I agree, but I don't think this is PyPI's problem to solve. From a security perspective, PyPI should prioritize the platforms where the traffic is.

(I'll note that GitLab has been supported by Trusted Publishing for a while now, and they could make the publishing workflow more of a first class citizen, the way it is on GHA.)

replies(3): >>42138119 #>>42138610 #>>42140447 #
belval ◴[] No.42138119[source]
> I agree, but I don't think this is PyPI's problem to solve. From a security perspective, PyPI should prioritize the platforms where the traffic is.

To me that's a bit of a weird statement. PyPI is part of the Python Software Foundation, so making sure that the project remains true to its open-source nature is reasonable?

My concern is that these types of things ultimately play out as "we are doing the right thing to limit supply chain attacks", which is good and defendable, but in ~5 years PyPI will have an announcement that they are sunsetting PyPI package upload in favor of the trusted provider system. pip (or other tooling) will add warnings whenever I install a package that is not "trusted". Maybe I am simply pessimistic.

That being said we can agree to disagree, I am not part of the PSF and I did preface my first comment with "I guess I am an idealist".

replies(1): >>42138248 #
woodruffw ◴[] No.42138248[source]
> making sure that the project remains true to its open-source nature is reasonable?

What about this, in your estimation, undermines the open-source nature of PyPI? Nothing about this is proprietary, and I can't think of any sane definition of OSS in which PyPI choosing to verify OIDC tokens from GitHub (among other IdPs!) meaningfully subverts PyPI's OSS commitment.

> PyPI package upload in favor of the trusted provider system. pip (or other tooling) will add warnings whenever I install a package that is not "trusted". Maybe I am simply pessimistic.

Let me put it this way: if PyPI disables API tokens in favor of mandatory Trusted Publishing, I will eat my shoe on a livestream.

(I was one of the engineers for both API tokens and Trusted Publishing on PyPI. They're complementary, and neither can replace the other.)

replies(2): >>42141048 #>>42145152 #
1. mananaysiempre ◴[] No.42141048[source]
> What about this, in your estimation, undermines the open-source nature of PyPI?

Absence of support for self-hosting, in the spirit of freedom 0 = OSD 5&6? Or, for that matter, for any provider whose code is fully open source?

replies(1): >>42141092 #
2. woodruffw ◴[] No.42141092[source]
> Absence of support for self-hosting, or for that matter for any non-proprietary service?

This has nothing to do with self-hosting, whatsoever. You can upload to PyPI with an API token; that will always work and will not do anything related to Trusted Publishing, which exists entirely because it makes sense for large services.

PyPI isn't required to federate with the server in my basement through OpenID Connect to be considered open source.

replies(1): >>42145751 #
3. takluyver ◴[] No.42145751[source]
I believe you that token uploads will continue to be possible, but it seems likely that in a couple of years trusted publishing & attestations will be effectively required for all but the tiniest project. You'll get issues and PRs to publish this way, and either you accept them, or you have to repeatedly justify what you've got against security.

And maybe that's a good thing? I'm not against security, and supply chain attacks are real. But it's still kind of sad that the amazing machines we all own are more and more just portals to the 'trusted' corporate clouds. And I think there are things that could be done to improve security with local uploads, but all the effort seems to go into the cloud path.

replies(3): >>42146801 #>>42147153 #>>42152403 #
4. eesmith ◴[] No.42146801{3}[source]
> or you have to repeatedly justify what you've got against security.

The only reason I started using PyPI was because I had a package on my website that someone else uploaded to PyPI, and I started getting support questions about it. The person did transfer control over to me - he was just trying to be helpful.

I stopped caring about PyPI with the 2FA requirement since I only have one device - my laptop - while they seem to expect that everyone is willing to buy a hardware device or has a smartphone, and I frankly don't care enough to figure it out since I didn't want to be there in the first place and no one paid me enough to care.

Which means there is a security issue whenever I make a new package available only on my website should someone decide to upload it to PyPI, perhaps along with a certain something extra, since people seem to think PyPI is authoritative and doesn't need checking.

replies(1): >>42150569 #
5. woodruffw ◴[] No.42147153{3}[source]
Thank you for being the first person to make a non-conspiratorial argument here! I agree with your estimation: PyPI is not going to mandate this, but it’s possible that there will be social pressure from individual package consumers to adopt attestations.

This is an unfortunate double effect, and one that I’m aware of. That’s why the emphasis has been on enabling them by default for as many people as possible.

I also agree about the need for a local/self-hosted story. We’ve been thinking about how to enable similar attestations with email and domain identities, since PyPI does or could have the ability to verify both.

replies(1): >>42150463 #
6. takluyver ◴[] No.42150463{4}[source]
If there is time for someone to work on local uploads, a good starting point would be a nicer workflow for uploading with 2FA. At present you either have to store a long lived token somewhere to use for many uploads, and risk that it is stolen, or fiddle about creating & then removing a token to use for each release.
7. takluyver ◴[] No.42150569{4}[source]
The 2FA requirement doesn't need a smartphone. You can generate the same one time passwords on a laptop. I know Bitwarden has this functionality, and there are other apps out there if that's not your cup of tea. Sorry that you feel pressured, but it is significantly easier to express a dependency on a package if it's on PyPI than a download on your own site.
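For readers wondering how a laptop can produce the same one-time passwords as a phone app: the codes are typically standard TOTP (RFC 6238), which only needs the shared secret and the current time. The sketch below is a minimal illustration using the Python standard library, assuming the common defaults of HMAC-SHA1, a 30-second step, and 6 digits; the function names `hotp` and `totp` are made up for this example and it is not a replacement for a maintained authenticator.

```python
import base64
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226) for a given counter value."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low nibble of the last byte selects a 4-byte window.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret_b32: str, now: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password (RFC 6238) from a base32-encoded secret."""
    cleaned = secret_b32.replace(" ", "").upper()
    # Secrets are often shown without padding; base32 decoding needs it restored.
    cleaned += "=" * (-len(cleaned) % 8)
    secret = base64.b32decode(cleaned)
    timestamp = time.time() if now is None else now
    return hotp(secret, int(timestamp // step), digits)
```

Calling `totp("<base32 secret shown at 2FA setup>")` would print the current code. The secret is as sensitive as a password, so where it is stored matters more than which tool computes the digits.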
replies(1): >>42151346 #
8. eesmith ◴[] No.42151346{5}[source]
Sure. But PyPI provides zero details on the process, I don't use 2FA for anything else in my life, no one is paying me to care, I find making PyPI releases tedious because I inevitably make mistakes in my release process, and I have a strong aversion to centralization and dependencies[1][2].

I tell people to "pip install -i $MY_SITE $MY_PACKAGE". I can tell from my download logs that this is open to dependency confusion attacks as I can see all the 404s from attempts to, for example, install NumPy from my server. To be clear, the switch to 2FA was only the triggering straw - I was already migrating my packages off of PyPI.

Finally, I sell a source license for a commercial product (which is not the one which got me started with PyPI). My customers install it via their internally-hosted PyPI mirrors.

I provide a binary package with a license manager for evaluation purposes, and as a marketing promotion. As such, I really want them to come to my web site, see the documentation and licensing options, and contact me. I think making it easier to express as a dependency via PyPI does not help my sales, and actually believe the extra intermediation likely hinders my sales.

[1] I dislike dependencies so much that I figured out how to make a PEP 517 compatible version that doesn't need to contact PyPI simply to install a local package. Clearly I will not become a Rust developer.

[2] PyPI support depends on GitHub issues. I regard Microsoft as a deeply immoral company, and a threat to personal and national data sovereignty, which means I will not sign up for a GitHub account. When MS provides IT support for the upcoming forced mass deportations, I will have already walked away from Omelas.

replies(1): >>42167862 #
9. ryan29 ◴[] No.42152403{3}[source]
> I believe you that token uploads will continue to be possible, but it seems likely that in a couple of years trusted publishing & attestations will be effectively required for all but the tiniest project.

That's what I think will happen.

> And maybe that's a good thing? I'm not against security, and supply chain attacks are real.

The problem is the attestation is only for part of the supply chain. You can say "this artifact was built with GitHub Actions" and that's it.

If I'm using Gitea and Drone or self-hosted GitLab, I'm not going to get trusted publisher attestations even though I stick to best practices everywhere.

Contrast that with someone that runs as admin on the same PC they use for pirating software, has a passwordless GPG key that signs all their commits, and pushes to GitHub (Actions) for builds and deployments. That person will have more "verified" badges than me and, because of that, would out-compete me if we had similar looking projects.

The point being that knowing how part of the supply chain works isn't sufficient. Security considerations need to start the second your finger touches the power button on your PC. The build tool at the end of the development process is the tip of the iceberg and shouldn't be relied on as a primary indicator of trust. It can definitely be part of it, but only a small part IMO.

The only way a trusted publisher (aka platform) can reliably attest to the security of the supply chain is if they have complete control over your development environment which would include a boot-locked PC without admin rights, forced MFA with a trustworthy (aka their) authenticator, and development happening 100% on their cloud platform or with tools that come off a safe-list.

Even if everyone gets onboard with that idea it's not going to stop bad actors. It'll be exactly the same as bad actors setting up companies and buying EV code signing certificates. Anyone with enough money to buy into the platform will immediately be viewed with a baseline of trust that isn't justified.

replies(1): >>42156341 #
10. takluyver ◴[] No.42156341{4}[source]
As I understand it, the point of these attestations is that you can see what goes into a build on GitHub - if you look at the recorded commit on the recorded repo, you can be confident that the packages are made from that (unless your threat model is GitHub itself doing a supply chain attack). And the flip side of that is that if attestations become the norm, it's harder to slip malicious code into a package without it being noticed.

That's not everything, but it is a pretty big step. I don't love the way it reinforces dependence on a few big platforms, but I also don't have a great alternative to suggest.

replies(1): >>42168049 #
11. zvr ◴[] No.42167862{6}[source]
Have you maybe documented what you have done, so that others who want to follow the same path can look up some information?
replies(1): >>42171837 #
12. ryan29 ◴[] No.42168049{5}[source]
Yeah, if the commit record acts like an audit log I think there’s a lot of value. I wonder how hard it is to get the exact environment used to build an artifact.

I’m a big fan of this style [1] of building base containers and think that keeping the container where you’ve stacked 4 layers (up to resources) makes sense. Call it a build container and keep it forever.

1. https://phauer.com/2019/no-fat-jar-in-docker-image/

13. eesmith ◴[] No.42171837{7}[source]
No, I haven't. The main idea is to create your own in-tree build backend, described at https://peps.python.org/pep-0517/#in-tree-build-backends .

In short, use "backend-path" to include a subdirectory which contains your local copies of setuptools, wheel, etc. Create a file with the build hooks appropriate for "backend-path". Have those hooks import the actual hooks in setuptools. Finally, set your "requires" to [].
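The in-tree backend described above can be sketched roughly as follows. This is not the code from that project, only an illustration of the idea: the directory name `_build_backend`, the module name `local_backend`, and both helper functions are hypothetical, and it assumes a vendored copy of setuptools sits in the directory named by "backend-path".

```python
# Hypothetical shim, e.g. saved as _build_backend/local_backend.py and
# referenced from pyproject.toml roughly as:
#   [build-system]
#   requires = []
#   build-backend = "local_backend"
#   backend-path = ["_build_backend"]
import importlib

# Hook names defined by PEP 517; only the first two are mandatory.
PEP517_HOOKS = (
    "build_wheel",
    "build_sdist",
    "get_requires_for_build_wheel",
    "get_requires_for_build_sdist",
    "prepare_metadata_for_build_wheel",
)


def reexport_hooks(real_backend, namespace):
    """Copy every PEP 517 hook that real_backend defines into namespace."""
    exported = []
    for name in PEP517_HOOKS:
        hook = getattr(real_backend, name, None)
        if hook is not None:
            namespace[name] = hook
            exported.append(name)
    return exported


def load_vendored_setuptools_hooks(namespace):
    """Re-export hooks from the setuptools copy made importable by backend-path."""
    real_backend = importlib.import_module("setuptools.build_meta")
    return reexport_hooks(real_backend, namespace)
```

In the shim file itself, a module-level call such as `load_vendored_setuptools_hooks(globals())` would expose the hooks to the build frontend. If the real backend's `get_requires_for_build_*` hooks ask for extra packages, the shim may need to replace them with versions that return an empty list so nothing is fetched from an index.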

Doing this means taking on a support burden of maintaining setuptools, wheel, etc. yourself. You'll also need to include their copyright statements in any distribution, even though the installed code doesn't use them.

(As I recall, that "etc" is hiding some effort to track down and install the full list of packages dragged in, but right now I don't have ready access to that code base.)

replies(1): >>42184754 #
14. zvr ◴[] No.42184754{8}[source]
Thanks for the info.