
Pixar's Render Farm

(twitter.com)
382 points by brundolf | 39 comments
1. klodolph ◴[] No.25615970[source]
My understanding (I am not an authority) is that for a long time, the time it takes Pixar to render one frame of film has stayed roughly constant: something on the order of 24 hours. I don’t know what the real units are, though (core-hours? machine-hours? simple wall clock?)

I am not surprised that they “make the film fit the box”, because managing compute expenditures is such a big deal!

(Edit: When I say "simple wall clock", I'm talking about the elapsed time from start to finish for rendering one frame, disregarding how many other frames might be rendering at the same time. Throughput != 1/latency, and all that.)

replies(6): >>25615994 #>>25616015 #>>25616474 #>>25617115 #>>25617883 #>>25618498 #
2. brundolf ◴[] No.25615994[source]
Well it can't just be one frame total every 24 hours, because an hour-long film would take 200+ years to render ;)
replies(5): >>25616010 #>>25616035 #>>25616054 #>>25616125 #>>25616154 #
3. chrisseaton ◴[] No.25616010[source]
I’m going to guess they have more than one computer rendering frames at the same time.
replies(1): >>25616073 #
4. ChuckNorris89 ◴[] No.25616015[source]
Wait, what? 24 hours per frame?!

At the standard 24 fps, that's 24 days per second of film, which works out to about 473 years for an average two-hour film. That can't be right.

replies(7): >>25616045 #>>25616061 #>>25616115 #>>25616213 #>>25616559 #>>25616561 #>>25617639 #
5. riffic ◴[] No.25616035[source]
You solve that problem with massively parallel batch processing. Look at schedulers like Platform LSF or HTCondor.
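
For a rough illustration of the per-frame batch idea (nothing Pixar- or scheduler-specific; the frame count and the fake render step are made up), here's a minimal Python sketch. A real farm would hand these jobs to something like LSF or HTCondor rather than a local process pool.

    # Minimal sketch: each frame is an independent job, so throughput scales
    # with the number of workers even though per-frame latency stays the same.
    from concurrent.futures import ProcessPoolExecutor
    import time

    FRAMES = range(1, 1001)  # hypothetical 1000-frame sequence

    def render_frame(frame):
        time.sleep(0.01)     # stand-in for the real renderer invocation
        return frame

    if __name__ == "__main__":
        with ProcessPoolExecutor(max_workers=8) as pool:
            for done in pool.map(render_frame, FRAMES):
                print(f"frame {done} finished")
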
replies(1): >>25616368 #
6. dralley ◴[] No.25616045[source]
24 hours on a single normal computer, not 24 hours of the entire farm working on one frame.
7. onelastjob ◴[] No.25616054[source]
True. With render farms, when they say X minutes or hours per frame, they mean the time it takes 1 render node to render 1 frame. Of course, they will have lots of render nodes working on a shot at once.
replies(1): >>25616066 #
8. noncoml ◴[] No.25616061[source]
Maybe they mean 24 hours per frame per core
9. ◴[] No.25616066{3}[source]
10. brundolf ◴[] No.25616073{3}[source]
Yeah, I was just (semi-facetiously) pointing out the obvious that it can't be simple wall-clock time
replies(2): >>25616150 #>>25616184 #
11. mattnewton ◴[] No.25616115[source]
Not saying it's true, but I assume this is all parallelizable, so 24 cores would complete that 1 second in 1 day, and 3600*24 cores would complete the first hour of the film in a day, etc. And each frame might have parallelizable processes to get it under 1 day of wall time, while still costing 1 "day" of core-hours.
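
For what it's worth, a quick back-of-the-envelope version of that arithmetic in Python (assuming the contested "24 hours per frame" figure really is core-hours per frame):

    # Back-of-envelope: how much hardware keeps up if a frame costs 24 core-hours?
    FPS = 24
    CORE_HOURS_PER_FRAME = 24                                      # the figure under discussion
    core_hours_per_film_second = FPS * CORE_HOURS_PER_FRAME        # 576
    print(core_hours_per_film_second / 24, "core-days per second of film")      # 24.0
    cores_for_an_hour_overnight = 3600 * FPS * CORE_HOURS_PER_FRAME // 24       # 86400
    print(cores_for_an_hour_overnight, "cores to render an hour of film in a day")
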
12. welearnednothng ◴[] No.25616125[source]
They almost certainly render two frames at a time, thus bringing the render time down to only 100+ years per film.
13. chrisseaton ◴[] No.25616150{4}[source]
Why can’t it be simple wall-clock time? Each frame takes 24 hours of real wall-clock time to render start to finish. But they render multiple frames at the same time. Doing so does not change the wall-clock time of each frame.
replies(1): >>25616231 #
14. ◴[] No.25616154[source]
15. masklinn ◴[] No.25616184{4}[source]
It could still be wallclock per-frame, but you can render each frame independently.
16. dagmx ◴[] No.25616213[source]
It's definitely not 24 hours per frame outside of gargantuan shots, at least by wall time. If you're going by core time, then that figure assumes you're rendering serially, which is never the case.

That also doesn't include rendering multiple shots at once. It's all about parallelism.

Finally, those frame counts for a film only assume the final render. There's a whole slew of work-in-progress renders too, so a given shot may be rendered 10-20 times. Often they'll render every other frame to spot check, and render at lower resolutions to get it back quickly.

17. brundolf ◴[] No.25616231{5}[source]
In my (hobbyist) experience, path-tracing and rendering in general are enormously parallelizable. So if you can render X frames in parallel such that they all finish in 24 hours, that's roughly equivalent to saying you can render one of those frames in 24h/X.

Of course I'm sure things like I/O and art-team-workflow hugely complicate the story at this scale, but I still doubt there's a meaningful concept of "wall-clock time for one frame" that doesn't change with the number of available cores.
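
To put made-up numbers on that amortized view:

    # Made-up numbers: 24 h wall clock per frame on one machine (latency),
    # but with X machines running in parallel the farm still delivers
    # X frames per day, i.e. an amortized 24/X hours per frame.
    latency_hours = 24
    machines_X = 1000
    frames_per_day = machines_X                        # one frame finishes per machine per day
    amortized_hours_per_frame = latency_hours / machines_X
    print(frames_per_day, "frames/day,", amortized_hours_per_frame, "h/frame amortized")
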

replies(3): >>25616259 #>>25616605 #>>25617310 #
18. chrisseaton ◴[] No.25616259{6}[source]
Wall-clock usually refers to time actually taken, in practice, with the particular configuration they use, not time could be taken if they used the configuration to minimise start-to-finish time.
19. sandermvanvliet ◴[] No.25616368{3}[source]
Haven’t heard those two in a while, played around with those while I was in uni 15 years ago :-O
20. joshspankit ◴[] No.25616474[source]
Maybe it’s society or maybe it’s intrinsic human nature, but there seems to be an overriding attitude of “only use resources to make it faster up to a point; otherwise just make it better [more impressive?]”.

Video games, desktop apps, web apps, etc. And now confirmed that it happens to movies at Pixar.

replies(1): >>25617761 #
21. klodolph ◴[] No.25616559[source]
Again, I'm not sure whether this is core-hours, machine-hours, or wall clock. And to be clear, when I say "wall clock", what I'm talking about is latency between when someone clicks "render" and when they see the final result.

My experience running massive pipelines is that there's a limited amount of parallelization you can do. It's not like you can just slice the frame into rectangles and farm them out.

replies(1): >>25617401 #
22. berkut ◴[] No.25616561[source]
In high-end VFX, 12-36 hours (wall clock) per frame is a roughly accurate time frame for a final 2k frame at final quality.

36 is at the high end of things, and the histogram is skewed more towards the lower end than towards >30 hours, but times like that are relatively common.

Frames can be parallelised, so multiple frames in a shot/sequence are rendered at once, on different machines.

replies(1): >>25631262 #
23. klodolph ◴[] No.25616605{6}[source]
I suspect hobbyist experience isn't relevant here. My experience running workloads at large scale (similar to Pixar's scale) is that as you increase scale, thinking of it as "enormously parallelizable" starts to fall apart.
24. CyberDildonics ◴[] No.25617115[source]
Not every place talks about frame rendering times the same way. Some talk about the time it takes to render one frame of every pass sequentially; some talk more about the time of the hero render or the longest dependency chain, since that is the latency to turn around a single frame. Core-hours are usually quoted separately, because most of the time you want to know whether something will be done overnight or whether broken frames can be rendered during the day.

24 hours of wall clock time is excessive and the reality is that anything over 2 hours starts to get painful. If you can't render reliably over night, your iterations slow down to molasses and the more iterations you can do the better something will look. These times are usually inflated in articles. I would never accept 24 hours to turn around a typical frame as being necessary. If I saw people working with that, my top priority would be to figure out what is going on, because with zero doubt there would be a huge amount of nonsense under the hood.

25. dodobirdlord ◴[] No.25617310{6}[source]
Ray tracing is embarrassingly parallel, but it requires having most if not all of the scene in memory. If you have X,000 machines and X,000 frames to render in a day, it almost certainly makes sense to pin each render to a single machine, to avoid moving a ton of data around the network and in and out of memory on a bunch of machines. In which case the actual wall-clock time to render a frame on a single machine devoted to the render becomes the number to care about and to talk about.
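
As a toy illustration of that pinning idea (machine names and counts invented), the scheduling can be as dumb as round-robin, because the point is just that each machine loads the scene once and keeps it resident for its whole frame:

    # Toy frame-to-machine pinning: one whole frame per machine, so the
    # scene data is loaded once per machine and never shuffled mid-frame.
    MACHINES = [f"node{i:04d}" for i in range(2000)]   # hypothetical farm
    FRAMES = range(1, 2001)                            # hypothetical frame list

    assignment = {frame: MACHINES[(frame - 1) % len(MACHINES)] for frame in FRAMES}
    print(assignment[1], assignment[2], assignment[2000])  # node0000 node0001 node1999
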
replies(1): >>25617362 #
26. chrisseaton ◴[] No.25617362{7}[source]
Exactly - move the compute to the data, not the data to the compute.
27. capableweb ◴[] No.25617401{3}[source]
> It's not like you can just slice the frame into rectangles and farm them out.

Funny thing, you sure can! Distributed rendering of single frames has been a thing for a long time already.

replies(1): >>25618153 #
28. KaiserPro ◴[] No.25617639[source]
yup, you've also got to remember that a final frame will have been rendered many times.

Each and every asset, animation, lighting, texturing, sim, and final comp will go through a number of revisions before being accepted.

So in all actuality that final frame could have been rendered 20+ times.

VFX farms are huge. In 2014 I worked on one that had 36K CPUs and about 15 PB of storage. It's probably now around the 200K CPU mark.

29. dahart ◴[] No.25617761[source]
You can come at this from multiple directions.

On the one hand, it’s wise to only expend effort making something faster up to a point. At some point, unless a human has to wait for the result, there is no reason to make something faster [1].

On the other hand, once something takes more than a minute or two and the person who started it goes off and does something else, it doesn’t matter how long it takes, as long as it’s done before they get back. Film shots usually render overnight, so as long as they’re done in the morning, and as long as they don’t prevent something else from being rendered by the morning, they don’t necessarily need to go faster. Somewhere out there is a blog post I remember about writing renderers and how artists behave; it posits that perhaps there are a couple of thresholds: if something takes longer than ten seconds to render, they’re going to leave to get coffee; if something takes longer than ten minutes to render, they’re going to start it at night and check on it in the morning.

[1] I always like the way Michael Abrash framed it:

“Understanding High Performance: Before we can create high-performance code, we must understand what high performance is. The objective (not always attained) in creating high-performance software is to make the software able to carry out its appointed tasks so rapidly that it responds instantaneously, as far as the user is concerned. In other words, high-performance code should ideally run so fast that any further improvement in the code would be pointless.

“Notice that the above definition most emphatically does not say anything about making the software as fast as possible. It also does not say anything about using assembly language, or an optimizing compiler, or, for that matter, a compiler at all. It also doesn’t say anything about how the code was designed and written. What it does say is that high-performance code shouldn’t get in the user’s way—and that’s all.” (From the “Graphics Programming Black Book”)

replies(1): >>25617952 #
30. dahart ◴[] No.25617883[source]
I don’t believe the average is 24 hours of wall clock. I do think average render times have increased a bit over time, but FWIW, I think the average render time needs to be “overnight”. The shot just needs to be rendered before dailies in the morning. If it takes longer than maybe 6-8 hours, it risks not being done by the next day, and that means each iteration with the director takes two days instead of one. There is significant pressure to avoid that, so when shots don’t finish overnight, people generally start optimizing.

When I was doing CG production shot work ~15 years ago, there were occasionally shots that ran 24 hours, but the average was more like 3 or 4 hours. The shots that took 24 hours or more usually caused people to investigate whether something was wrong.

I worked on one such shot that was taking more than 24 hours. A scene in the film Madagascar where the boat blows a horn and all the trees on the island blow over. The trees and plants were modeled for close-ups, including flowers with stamens and pistils, but the shot was a view of the whole island. One of my co-workers wrote a pre-render filter with only a few lines of code, to check if pieces of the geometry were smaller than a pixel, and if so just discard them. IIRC, render times immediately dropped from 24 hours to 8 hours.
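
Purely as a hedged guess at what such a filter might look like (this is not the actual code; the camera model and numbers are invented), the core check is just “project the object’s bounds to screen space and drop it if it comes out smaller than about a pixel”:

    # Sketch of a sub-pixel culling pre-render filter (illustrative guess).
    import math

    def projected_size_px(bound_radius, distance, fov_deg=50.0, image_width_px=2048):
        # Rough pinhole-camera estimate of an object's on-screen size.
        half_width_at_distance = distance * math.tan(math.radians(fov_deg) / 2.0)
        return (bound_radius / half_width_at_distance) * image_width_px

    def cull_subpixel(objects, threshold_px=1.0):
        # objects: iterable of (name, bound_radius, distance_to_camera)
        return [o for o in objects if projected_size_px(o[1], o[2]) >= threshold_px]

    # A flower stamen a few millimetres across, seen from across the island,
    # gets dropped; a nearby tree trunk survives.
    scene = [("stamen", 0.002, 500.0), ("trunk", 1.5, 60.0)]
    print([name for name, *_ in cull_subpixel(scene)])   # ['trunk']
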

31. joshspankit ◴[] No.25617952{3}[source]
Excellent points, but I have two counters:

- Feedback loops

- Cumulative end-to-end latency

The second is especially challenging as only a handful of coders saying “that’s good enough” can add up to perceptibly massive latency for the end user.

replies(1): >>25618069 #
32. dahart ◴[] No.25618069{4}[source]
Oh I’d agree, the blog post about coffee was somewhat tongue-in-cheek. One shouldn’t presume it’s fast enough, one should always measure. And your points echo Abrash... if a human is waiting for the computer, then the computer could be made faster. That includes any and all human-computer interactions and workflows.

Recalling a bit more now, the actual point of the blog post I was thinking of, and not summarizing super accurately, was to try to make things faster to prevent the artists from getting out of their seat, precisely because the tool it was referring to was primarily a feedback loop interaction. The tool in question was the PDI lighting tool “Light”, which received an Academy award a few years back. https://www.oscars.org/sci-tech/ceremonies/2013

33. klodolph ◴[] No.25618153{4}[source]
What about GI? You can't just slice GI into pieces.
replies(3): >>25618297 #>>25620501 #>>25621006 #
34. dahart ◴[] No.25618297{5}[source]
Why are you thinking GI wouldn’t work? Slicing the image plane pretty much works for parallelizing GI just as well as it does for raster. It does help to use small-ish tiles, that way you get some degree of automatic load balancing.
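
A minimal sketch of that tiling, assuming a path tracer where each pixel only needs read access to a shared scene (the per-tile “trace” here is a stand-in, not a real renderer):

    # Tile-parallel rendering sketch: small tiles give cheap load balancing,
    # since any idle worker can grab the next tile regardless of its cost.
    from concurrent.futures import ProcessPoolExecutor
    from itertools import product

    WIDTH, HEIGHT, TILE = 1920, 1080, 32

    def render_tile(origin):
        x0, y0 = origin
        w = min(TILE, WIDTH - x0)
        h = min(TILE, HEIGHT - y0)
        # Stand-in for tracing rays through each pixel of the tile.
        return origin, [[0.0] * w for _ in range(h)]

    if __name__ == "__main__":
        tiles = list(product(range(0, WIDTH, TILE), range(0, HEIGHT, TILE)))
        with ProcessPoolExecutor() as pool:
            for (x0, y0), pixels in pool.map(render_tile, tiles):
                pass  # composite the tile into the framebuffer at (x0, y0)
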
35. quelsolaar ◴[] No.25618498[source]
In computer graphics it's known as Blinn's Law. It states that no matter how fast your hardware gets, artists put more detail into shots, and therefore render times remain roughly the same. It's been roughly true for 30-ish years.
replies(1): >>25619499 #
36. mroche ◴[] No.25619499[source]
And it hurts. But man are the images gorgeous!
37. capableweb ◴[] No.25620501{5}[source]
From how I've seen it work in the past, it totally works with GI (and more generally, ray tracing). If the frame to be rendered is CPU-bound rather than I/O-bound (because of heavy scenes), the whole project is farmed out to the workers, so each has a full copy of what's to be rendered, and each is then assigned the part of the frame it's responsible for. Normally this happens locally: if you have 8 CPU cores, each of them is responsible for a small slice of the frame. If you're doing distributed rendering, replace "CPU core" with a full machine and you have the same principle.

Obviously this doesn't work for every frame/scene/project, only when the main time is spent on actual rendering with the CPU/GPU. Most of the time when doing distributed rendering, the CPU isn't actually the bottleneck, but rather transferring the necessary stuff for the rendering (the project/scene data structures that each worker needs).

38. Hard_Space ◴[] No.25621006{5}[source]
This has been possible even for CGI tinkerers like me with C4D for more than ten years.
39. franzb ◴[] No.25631262{3}[source]
Hi Berkut, I'd love to get in touch with you, unfortunately I couldn't find any contact info in your profile. You can find my email in my profile. Cheers!