Even when the 'spawn' strategy is used (the default on Windows, and it can be chosen explicitly on Linux), the overhead can largely be avoided. (Why choose it on Linux? Apparently forking can cause problems if you also use threads.) Python imports can be deferred (`import` is a statement, not a compiler or pre-processor directive), and child processes (regardless of the creation strategy) name the main module `__mp_main__` rather than `__main__`, allowing the programmer to tell the two apart. (Being able to distinguish is of course necessary here to avoid making a fork bomb - since top-level code runs automatically when the module is re-imported, and the `if __name__ == '__main__':` guard is itself top-level code.)
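A minimal sketch of both ideas - the names here are made up, and the `json` import just stands in for whatever module is genuinely slow to load:

```python
# demo.py - hypothetical example of deferring a heavy import under 'spawn'.
import multiprocessing as mp


def work(x):
    # Deferred import: the "heavy" module is only loaded in the process that
    # actually calls work(), not at re-import time in every spawned child.
    import json as slow_to_import  # stand-in for a genuinely expensive import
    return slow_to_import.dumps(x)


if __name__ == '__main__':
    # In a spawned child the main script is re-imported as '__mp_main__',
    # so this block does not run again and we avoid a fork bomb.
    mp.set_start_method('spawn')
    with mp.Pool(2) as pool:
        print(pool.map(work, [{'a': 1}, {'b': 2}]))
```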
Keep in mind, too, that cleanup when a Python process exits takes time as well, and that part is harder to trace.
Refs:
https://docs.python.org/3/library/multiprocessing.html#conte...
https://stackoverflow.com/questions/72497140