
rwmj No.42752085
Slightly off topic, but if I'm aiming to get the fastest 'make -jN' for some random C project (such as the kernel) should I set N = #P [threads] + #E, or just the #P, or something else? Basically, is there a case where using the E cores slows a compile down? Or is power management a factor?

I timed it on the single Intel machine I have access to with E-cores and setting N = #P + #E was in fact the fastest, but I wonder if that's a general rule.
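For reference: on Linux, GNU coreutils' nproc reports the number of online logical CPUs, which on a hybrid part like this is exactly #P threads + #E cores, so that setting can be written without hard-coding the count (assuming nproc is installed):

  make -j"$(nproc)"

nproc also respects the process's CPU affinity mask, so it stays correct under taskset or inside a container with a restricted cpuset.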

replies(3): >>42752108, >>42752141, >>42752456
saurik No.42752108
Did you test at least N+1, if not N*1.5 or something? I would expect you to occasionally get blocked on disk I/O and would want some spare work sitting hot to switch in.
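In shell terms (assuming GNU nproc, and integer-only arithmetic, so 1.5x is approximated), those variants would be roughly:

  make -j"$(( $(nproc) + 1 ))"        # threads + 1
  make -j"$(( $(nproc) * 3 / 2 ))"    # ~1.5 * threads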
replies(1): >>42752119
rwmj No.42752119
Let me test that now. Note I only have 1 Intel machine so any results are very specific to this laptop.

  -j           time (mean ± σ)
  12 (#P+#E)   130.889 s ±  4.072 s
  13 (..+1)    135.049 s ±  2.270 s
   4 (#P)      179.845 s ±  1.783 s
   8 (#E)      141.669 s ±  3.441 s
Machine: 13th Gen Intel(R) Core(TM) i7-1365U; 2 x P-cores (4 threads), 8 x E-cores
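Aside: the mean ± σ layout looks like hyperfine output; if that's what produced the table, a parameter scan along these lines would run the whole sweep in one go (the clean step and the job list here are assumptions):

  hyperfine --prepare 'make clean' \
    --parameter-list jobs 4,8,12,13 \
    'make -j{jobs}'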
replies(1): >>42752557
wtallis No.42752557
Your processor has two P cores, and ten cores total, not twelve. HyperThreading (SMT) does not turn the two P cores into four. Your experiment with 4 threads most likely used both P cores and two E cores, since no sane OS would double up threads on the P cores before each E core had a thread of its own.
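To see exactly how the logical CPUs map onto cores on a machine like this, util-linux's lscpu can print the topology (the exact columns available vary a little between versions):

  lscpu -e=CPU,CORE,MAXMHZ
  # logical CPUs sharing a CORE number are SMT siblings of a P core;
  # E cores show up as one CPU per core, typically with a lower max clock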
replies(2): >>42752610, >>42752692
rwmj No.42752610
Hyperthreading should help hide memory latency, since the workload (compiling qemu) might not fit into L3 cache. Although I take your point that it doesn't magically create two core-equivalents.
replies(1): >>42753268
gonzo No.42753268
“Hyperthreading” is a write pipe hack.

If the core stalls on a write then the other thread gets run.

replies(1): >>42756450
atq2119 No.42756450
It's much more than that. It also allows one thread to make progress while the other is waiting for memory loads, or filling in instruction slots while the other thread is recovering from a branch mispredict.

Compilers tend to do a lot of pointer chasing and branching, so it's expected that they would benefit decently from hyperthreading.
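One way to put a rough number on that: pin a two-job compile to the two SMT siblings of a single P core, then to two separate physical cores, and compare wall-clock times. The CPU numbers below are only a guess at this machine's numbering — check them against lscpu -e first:

  make clean && time taskset -c 0,1 make -j2   # assumed: CPUs 0,1 = both threads of one P core
  make clean && time taskset -c 0,2 make -j2   # assumed: CPUs 0,2 = two different physical cores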