←back to thread

107 points joouha | 1 comments | | HN request time: 0s | source

I've invented a new alternative to forking / vendoring / monkey-patching packages in Python.

It's a bit like OverlayFS for Python modules - it allows you write modifications for a target module (lower) in a new module (upper), and have these combined in a new virtual module (mount).

It works by rewriting imports using AST transformations, then running both the lower and upper module's code in the new Python module.

This prevents polluting the global namespace when monkey-patching, and means if you want to make changes to a third-party package, you don't have to take on the maintenance burden of forking, you can package and distribute just your changes.

Show context
nbadg ◴[] No.45666046[source]
For context: one of the several projects I'm working on right now is an automated extraction system for literate-code-style documentation in python. This isn't the place nor time to talk about the why of it (especially compared to other existing similar solutions). The important thing is the how: it uses a temporary import hook to stub out all module imports, allowing the docs generator to process each module independently at runtime, track imports between them, etc. At the end of the process, it also cleans itself up nicely.

Point being, it's a lot of really complicated fiddling with the python import system. And a lesson I have learned is that messing around with import internals in python is extremely tricky to get right. Furthermore, trying to coordinate correctly between modules that do and don't get modified my the hook is very finicky. Not to mention that supply side attacks on the import system itself could be a terrifying attack vector that would be absurdly difficult to detect.

All this to say, I'm not a big fan of monkeypatching, but I know exactly how it behaves, its edge cases, and what to expect if I do it. It is, after all, pretty standard practice to patch things during python unit tests. And even with all its warts, I would prefer patching to import fiddling any day of the week and twice on Sunday.

Feedback for the author: you need to explain the "why" of your project more thoroughly. I'm sure you had a good reason to strike out in this direction, and maybe this is a super elegant solution. But you've failed to explain to me under what circumstances I might also encounter the same problems with patching that you've encountered, in order to explain to me why the risk of an import hook is justified.

replies(4): >>45666315 #>>45666523 #>>45669215 #>>45673083 #
OJFord ◴[] No.45666523[source]
I didn't really get why I'd want to actually use it (vs. just a cool demo) either, until:

> means if you want to make changes to a third-party package, you don't have to take on the maintenance burden of forking, you can package and distribute just your changes.

That's a big win. I've seen and done my share of `# this file from github.com/blah with minor change X to L123` etc.

replies(1): >>45666761 #
1. nbadg ◴[] No.45666761[source]
If the goal is to actually package and distribute the changes via import hook, that makes the supply chain attack question particularly relevant. And it still doesn't explain why you couldn't just package and distribute the monkeypatch itself, instead of creating a whole new import ecosystem surrounding hooks.

I've done my fair share of that too, but I'm still not seeing the benefit vs patching.