←back to thread

253 points rbanffy | 2 comments | | HN request time: 0s | source
Show context
sgarland ◴[] No.44004897[source]
> Instead, many reach for multiprocessing, but spawning processes is expensive

Agreed.

> and communicating across processes often requires making expensive copies of data

SharedMemory [0] exists. Never understood why this isn’t used more frequently. There’s even a ShareableList which does exactly what it sounds like, and is awesome.

[0]: https://docs.python.org/3/library/multiprocessing.shared_mem...

replies(7): >>44004956 #>>44005006 #>>44006103 #>>44006145 #>>44006664 #>>44006670 #>>44007267 #
chubot ◴[] No.44006145[source]
Spawning processes generally takes much less than 1 ms on Unix

Spawning a PYTHON interpreter process might take 30 ms to 300 ms before you get to main(), depending on the number of imports

It's 1 to 2 orders of magnitude difference, so it's worth being precise

This is a fallacy with say CGI. A CGI in C, Rust, or Go works perfectly well.

e.g. sqlite.org runs with a process PER REQUEST - https://news.ycombinator.com/item?id=3036124

replies(7): >>44006287 #>>44007950 #>>44008877 #>>44009754 #>>44009755 #>>44009805 #>>44010011 #
charleshn ◴[] No.44007950[source]
> Spawning processes generally takes much less than 1 ms on Unix

It depends on whether one uses clone, fork, posix_spawn etc.

Fork can take a while depending on the size of the address space, number of VMAs etc.

replies(2): >>44009524 #>>44009676 #
crackez ◴[] No.44009524[source]
Fork on Linux should use copy-on-write vmpages now, so if you fork inside python it should be cheap. If you launch a new Python process from let's say the shell, and it's already in the buffer cache, then you should only have to pay the startup CPU cost of the interpreter, since the IO should be satisfied from buffer cache...
replies(1): >>44010504 #
1. charleshn ◴[] No.44010504{3}[source]
> Fork on Linux should use copy-on-write vmpages now, so if you fork inside python it should be cheap.

No, that's exactly the point I'm making, copying PTEs is not cheap on a large address space, woth many VMAs.

You can run a simple python script allocating a large list and see how it affects fork time.

replies(1): >>44011430 #
2. charleshn ◴[] No.44011430[source]
See e.g. https://www.alibabacloud.com/blog/async-fork-mitigating-quer...