
495 points | guntars | 1 comment
butterisgood ◴[] No.44985475[source]
Where do people get the idea that one thread per core is correct on a system that deals with time slices?

In my experience “oversubscribing” threads to cores (more threads than cores) provides a wall-clock time benefit.

I think one thread per core would work better without preemptive scheduling.

But then we aren’t talking about Unix.
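A minimal sketch of the wall-clock effect being described, assuming a purely blocking workload (the 50 ms sleep is a stand-in for any blocking call such as a `read()` or `recv()`; the task counts are arbitrary):

```python
# Sketch: oversubscribing threads beyond core count can cut wall-clock
# time when threads spend their time slices blocked rather than computing.
import os
import time
from concurrent.futures import ThreadPoolExecutor

def blocking_task(_):
    time.sleep(0.05)  # stands in for a blocking I/O call

def run(n_workers, n_tasks):
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        list(pool.map(blocking_task, range(n_tasks)))
    return time.perf_counter() - start

cores = os.cpu_count() or 1
n_tasks = cores * 8
t_per_core = run(cores, n_tasks)      # one thread per core: ~8 batches
t_oversub = run(cores * 4, n_tasks)   # 4x oversubscribed: ~2 batches
print(f"{cores} workers: {t_per_core:.2f}s, {cores * 4} workers: {t_oversub:.2f}s")
```

With a CPU-bound `blocking_task` the comparison flips: extra threads only add context-switch and scheduler overhead, which is the other side of the argument.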

replies(4): >>44985631 #>>44986628 #>>44988220 #>>44988584 #
gorset ◴[] No.44985631[source]
Isolating a core and then pinning a single thread is the way to go to get both low latency and high throughput, sacrificing efficiency.

This works fine on Linux, and it's a common approach for trading systems, where it's acceptable to dedicate a bunch of cores to this kind of spinning. The cores are mostly busy spinning and doing nothing, so it's very inefficient in terms of actual work, but great for latency and throughput when you need it.
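A Linux-only sketch of the pin-and-spin pattern, assuming a hypothetical worker that doubles items; true isolation also requires reserving the core at boot (e.g. via `isolcpus`), which is an operational step this code cannot do itself:

```python
# Sketch (Linux-only): pin the calling thread to one core and busy-poll
# a work list instead of blocking, trading a burned core for low latency.
import os
import threading

def pinned_spin_worker(core_id, inbox, outbox, stop):
    # pid 0 targets the calling task, i.e. this thread on Linux
    os.sched_setaffinity(0, {core_id})
    while not stop.is_set():
        try:
            item = inbox.pop()
        except IndexError:
            continue  # busy-poll: no sleep, no blocking wait
        outbox.append(item * 2)  # placeholder for real work
```

The spin loop avoids the scheduler wakeup latency a blocking queue would add, at the cost of pegging the pinned core at 100%.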

replies(1): >>44986485 #
butterisgood ◴[] No.44986485[source]
I just wish people who give this advice for 1 thread per core would "expand their reasoning" or "show the work".

It's not blanket good advice for all things.

replies(2): >>44987171 #>>44989341 #
thinkharderdev ◴[] No.44989341[source]
It is definitely not good advice for all things. For workloads at either end of the CPU/IO spectrum (e.g. almost all waiting on IO, or almost all doing CPU work) it can be a huge win: you get very good L1 cache utilization, you aren't context-switching, and you don't need thread synchronization in your code because no state is shared between threads.

For workloads that are a mix of IO and non-trivial CPU work, it can still work but is much, much harder to get right.
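The "no state shared between threads" part can be sketched as sharding: each worker owns a private piece of the state and requests are routed to it by key hash, so the data itself needs no locks. This is an illustrative in-process version (the shard count, sentinel protocol, and doubling-free key/value API are all made up for the example; a real thread-per-core design would also pin each worker):

```python
# Sketch of share-nothing sharding: each worker thread exclusively owns
# one dict shard; all access goes through that worker's message queue.
import queue
import threading

N_SHARDS = 4  # in a real system: one per (pinned) core

def shard_worker(shard, inbox):
    while True:
        op = inbox.get()
        if op is None:  # shutdown sentinel
            return
        kind, key, value, reply = op
        if kind == "put":
            shard[key] = value
            reply.put(None)
        else:  # "get"
            reply.put(shard.get(key))

inboxes = [queue.Queue() for _ in range(N_SHARDS)]
threads = [threading.Thread(target=shard_worker, args=({}, q)) for q in inboxes]
for t in threads:
    t.start()

def route(key):
    # Same key always hashes to the same shard within one run
    return inboxes[hash(key) % N_SHARDS]

def put(key, value):
    reply = queue.Queue()
    route(key).put(("put", key, value, reply))
    reply.get()

def get(key):
    reply = queue.Queue()
    route(key).put(("get", key, None, reply))
    return reply.get()
```

The queues here reintroduce synchronization at the routing layer, which hints at the point above: once work stops partitioning cleanly (mixed IO and CPU), the model gets much harder to get right.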