
Multi-Core by Default

(www.rfleury.com)
70 points kruuuder | 6 comments
jerf ◴[] No.45539056[source]
If the author has not already, I would commend to them a search of the literature (or the relevant blog summaries) for the term "implicit parallelism". This was an academic topic from a few years back (my brain does not do this sort of thing very well, but I want to say 10-20 years ago) where the hope was that we could just fire some sort of optimization technique at normal code, which would automatically extract all the implicit parallelism in the code and parallelize it, resulting in massive, essentially free gains.

In order to do this, the first step was to analyze existing source code and determine the maximum amount of implicit parallelism it contained, assuming extraction was free. The attempt basically failed right there. Intuitively we all expect that our code has tons of implicit parallelism that can be exploited. It turns out our intuition is wrong: the maximum amount of parallelism that could be extracted was often in the 2x range, which, even if the parallelization were free, would be only a marginal improvement.
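One way to get a feel for why a ceiling around 2x is so disappointing is Amdahl's law. To be clear, this is my own illustration and not necessarily the method those studies used: it assumes a hypothetical program where exactly half the runtime can be overlapped and coordination costs nothing. The function name and the 0.5 fraction are assumptions made up for the sketch.

```python
def amdahl_speedup(parallel_fraction: float, workers: float) -> float:
    """Best-case speedup when only `parallel_fraction` of the runtime can be
    spread across `workers`, with zero coordination cost (Amdahl's law)."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Hypothetical program: half of the runtime is parallelizable (assumed value).
for workers in (2, 8, 64, 1_000_000):
    print(workers, round(amdahl_speedup(0.5, workers), 3))
```

Under that assumption the speedup creeps toward 2x and never reaches it, however many cores are added, and this is before any coordination cost is charged.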

Moreover, it is also often not something terribly amenable to human optimization either.

A game engine might be the best-case scenario for this sort of code, but once you start putting the coordination costs back into the charts, those charts start looking a lot less impressive in practice. I have a sort of rule of thumb that the key to high-performance multithreading is that the cost of the payload of a given bit of coordination overhead needs to be substantially greater than the cost of the coordination, and a game engine will not necessarily have that characteristic... it may have lots of tasks to be done in parallel, but if they
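The payload-versus-coordination rule of thumb can be sketched with a toy cost model. This is a deliberately simplified assumption, not a measurement: every task pays a fixed coordination cost (queueing, locking, waking a thread), tasks are fully independent, and the workers overlap perfectly. The function name and the microsecond figures are hypothetical.

```python
def effective_speedup(payload_us: float, coordination_us: float, workers: int) -> float:
    """Speedup over one thread for many independent tasks, assuming each task
    pays a fixed coordination cost when dispatched to a worker (toy model)."""
    serial_time = payload_us  # single thread: no coordination needed
    parallel_time = (payload_us + coordination_us) / workers
    return serial_time / parallel_time


# Tiny payload: coordination dominates, so 8 workers lose to 1 thread.
print(effective_speedup(payload_us=1.0, coordination_us=10.0, workers=8))

# Large payload: the same overhead is amortized and speedup nears 8x.
print(effective_speedup(payload_us=1000.0, coordination_us=10.0, workers=8))
```

In this model the speedup only approaches the worker count when the payload is much larger than the overhead, which is the property a stream of very small per-entity tasks may lack.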

replies(4): >>45539945 #>>45540340 #>>45541602 #>>45542254 #
1. jayd16 ◴[] No.45540340[source]
I guess you can argue that instruction reordering and SMT/Hyper-Threading are already eating the easy wins there. And as you said, it seems like the gains taper off at 2x.

I'm not sure why games would be a good target. They're traditionally very much tied to a single thread because, ironically, passing data to the graphics and display hardware and to multithreaded subroutines like physics all has to be synchronized.

The easiest way to do that without locking a bunch of threads is to let a single thread go as fast as possible through all that main thread work.

If you really want a game focused parallelization framework, look into the Entity Component System pattern. The developer defines the data and mutability flow of various features in the game.

Because the execution ordering is fully known, the frameworks can chunk, schedule, reorder, and fan out the work across threads with less waiting and fewer cache misses.
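As a rough sketch of the idea, assuming each system declares which components it reads and writes, a scheduler can pack systems that do not conflict into the same stage and run each stage's systems on separate threads. This is not any particular framework's API: `System`, `conflicts`, `schedule`, and the component names are all hypothetical, and the sketch only computes the stages without spawning threads.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class System:
    name: str
    reads: frozenset = frozenset()
    writes: frozenset = frozenset()


def conflicts(a: System, b: System) -> bool:
    """Two systems conflict if either writes a component the other touches."""
    return bool(a.writes & (b.reads | b.writes)) or bool(b.writes & (a.reads | a.writes))


def schedule(systems):
    """Greedily pack systems into ordered stages. Systems inside one stage
    have no conflicting access, so they could run in parallel; conflicting
    systems keep their declared order across stages."""
    stages = []
    for system in systems:
        earliest = 0  # first stage index that comes after every conflict
        for i, stage in enumerate(stages):
            if any(conflicts(system, other) for other in stage):
                earliest = i + 1
        if earliest == len(stages):
            stages.append([])
        stages[earliest].append(system)
    return stages


# Hypothetical systems and components, for illustration only.
systems = [
    System("movement", reads=frozenset({"Velocity"}), writes=frozenset({"Position"})),
    System("animation", reads=frozenset({"AnimationClock"}), writes=frozenset({"Pose"})),
    System("render_prep", reads=frozenset({"Position", "Pose"}), writes=frozenset({"DrawList"})),
    System("audio", reads=frozenset({"Position"}), writes=frozenset({"AudioEmitter"})),
]
stages = schedule(systems)
for stage in stages:
    print([s.name for s in stage])
```

Here `movement` and `animation` touch disjoint data and share a stage, while `render_prep` and `audio` both only read `Position` and so can share the next one. The catch is that the developer has to declare all of this up front, so the parallelism is explicit rather than implicit.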

replies(3): >>45541199 #>>45541223 #>>45541462 #
2. jerf ◴[] No.45541199[source]
"I'm not sure why games would be a good target..."

"If you really want a game focused parallelization framework, look into the Entity Component System pattern."

Exactly that. You can break a lot of modern games nicely into a lot of little things being done to discrete entities very quickly. But there the problem is that it's easy for the things to be too small, meaning you don't have a lot of time to be "clever" in the code.

I'm ignoring the GPU and just looking at CPU for this. GPU is in its own world where parallelization is forced on you comprehensively, in a space forced to be amenable to that.

replies(1): >>45542940 #
3. rcxdude ◴[] No.45541223[source]
Yeah, ECS is the approach I've heard can get games to have sufficient parallelism, though only if you are careful about sticking to it properly and carefully manage the dataflow between systems. I think the main thing is, apart from the difficulty of doing the analysis in the first place, if you're writing the code without thinking about parallelism, you will tend to introduce data dependencies all over the place, and changing that structure is very difficult.
replies(1): >>45543013 #
4. _aavaa_ ◴[] No.45541462[source]
ECS is not a game-specific thing. It’s a redrawing of encapsulation boundaries which can be applied to systems broadly.

https://youtube.com/watch?v=wo84LFzx5nI

5. jayd16 ◴[] No.45542940[source]
Sure, there's work you can throw into ECS, but it's a paradigm shift that is not implicit and also highlights how much doesn't work.
6. jayd16 ◴[] No.45543013[source]
Agreed. It exemplifies what parallelism is on the table but also how many more guarantees need to be enforced to get it.

You're almost swinging the pendulum back to a fixed pipeline, and I don't think you can get that for free.