Basically though, Pixar doesn't have the scale to make custom chips (the entire Pixar and even "Disney all up" scale is pretty small compared to say a single Google or Amazon cluster).
Until recently GPUs also didn't have enough memory to handle production film rendering, particularly the amount of textures used per frame (which even on CPUs are handled out-of-core with a texture cache, rather than "read it all in up front somehow"). I think the recent HBM-based GPUs will make this a more likely scenario, especially when/if OptiX/RTX gains a serious texture cache for this kind of usage. Even so, those GPUs are extremely expensive. For folks that can squeeze into the 16 GiB per card of the NVIDIA T4, it's just about right.
tl;dr: The economics don't work out. You'll probably start seeing more and more studios using GPUs (particularly with RTX) for shot work, especially in VFX or shorts or simpler films, but until the memory per card (here now!) and $/GPU (nope) is competitive it'll be a tradeoff.
High end VFX/CG usually tessellates geometry down to micropolygons, so you roughly have 1 quad (or two triangles) per pixel in terms of geometry density, which means you can often have > 150,000,000 polys in a scene, along with per-vertex primvars to control shading, and many textures (which can be paged fairly well with shade-on-hit).
Using ray tracing pretty much means having all of that in memory at once (paging of geo and accel structures generally sucks; it's been tried in the past) so that intersection / traversal is fast.
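To put very rough numbers on that (the byte counts below are my own back-of-envelope assumptions, not any particular renderer's data layout), here's a quick sketch of why a micropolygon-dense scene alone, before any textures, already strains a typical GPU's memory:

    # Back-of-envelope estimate with assumed sizes; real layouts vary per renderer.
    GiB = 1024 ** 3

    quads = 150_000_000            # ~1 quad per pixel at high-end scene density
    verts = quads                  # with shared vertices, roughly one vertex per quad
    tris = quads * 2

    position_bytes = verts * 3 * 4          # float3 positions
    primvar_bytes = verts * 8 * 4           # assume ~8 float primvars per vertex for shading
    index_bytes = tris * 3 * 4              # triangle indices
    bvh_bytes = tris * 32                   # very rough: ~32 bytes of BVH per triangle

    total = position_bytes + primvar_bytes + index_bytes + bvh_bytes
    print(f"geometry + accel estimate: {total / GiB:.1f} GiB")   # ~18 GiB before any textures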
Doing lookdev on individual assets (i.e. turntables) is one place where GPU rendering can be used, as the memory requirements are much smaller, but only if the look you get is identical to the one you get using CPU rendering, which isn't always the case (some of the algorithms are hard to get working correctly on GPUs, e.g. volumetrics).
RenderMan (the renderer Pixar uses, and develops in Seattle) isn't really GPU ready yet (they're attempting to release XPU this year, I think).
There's a huge I/O bottleneck as well, since you're reading huge textures (I've seen textures as big as 1 TB) and constantly writing the renderer's output to disk.
Other than that, most of the tooling that modern studios use is off the shelf, for example Autodesk Maya for modelling or SideFX Houdini for simulations. If you had a custom architecture, you would have to ensure that every piece of software you use works with it and is optimized for it.
There are studios using GPUs for some workflows but most of it is CPUs.
I'd imagine another reason is that Pixar uses off-the-shelf Digital Content Creation apps (DCCs) like Houdini and Maya in addition to their proprietary software, so while they could optimize some portions of their pipeline, it's probably better to develop for more general computing tasks. They also mention the ability to "ramp up" and "ramp down" as compute use changes over the course of a show [1][2].
[1] https://devmesh.intel.com/projects/supercharging-pixar-s-ren...
[2] https://nvidianews.nvidia.com/news/pixar-animation-studios-l...
https://graphics.pixar.com/library/SoulRasterizingVolumes/pa...
On a personal note, I had a pretty visceral "anti-" reaction to the movie Soul. I just felt it was too trite in its handling of themes that humankind has wrestled with since the dawn of time. And jazz is probably the most cinematic of musical genres. Think of the intros to Woody Allen's Manhattan or Midnight in Paris. But it felt generic here.
That said, the physically based rendering is state of the art! If you've ever taken the LIE toward the Queensboro Bridge as the sun sets across the skyscraper canyons of the city, you know it is one of the most surreal tableaus in modern life. It's just incredible to see a pixel-perfect, globally illuminated rendering of it in an animated film, if only for the briefest of seconds ;)
Can you speak to any competitive advantages a vfx-centric gpu cloud provider may have over commodity AWS? Even the RenderMan XPU looks to be OSL / Intel AVX-512 SIMD based. Thanks!
Supercharging Pixar's RenderMan XPU™ with Intel® AVX-512
So that can easily end up being several hundred gigabytes of source image data. At rendertime, only the textures that are needed to render what's visible in the camera are loaded into memory and utilized, which typically ends up being a fraction of the source data.
Large-scale terrains and environments typically make more use of procedural textures, which may be cached temporarily in memory during rendering to speed up calculations.
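A minimal sketch of the on-demand loading described above, in the spirit of a tiled texture cache (the class and callback names here are made up for illustration, not any particular renderer's API):

    from collections import OrderedDict

    class TileCache:
        # Lazily load texture tiles on first shading access; evict LRU tiles over budget.
        def __init__(self, budget_bytes, load_tile):
            self.budget = budget_bytes
            self.load_tile = load_tile      # callable: (path, mip, tx, ty) -> bytes
            self.tiles = OrderedDict()      # key -> tile data, kept in LRU order
            self.used = 0

        def fetch(self, path, mip, tx, ty):
            key = (path, mip, tx, ty)
            if key in self.tiles:           # cache hit: mark as most recently used
                self.tiles.move_to_end(key)
                return self.tiles[key]
            data = self.load_tile(path, mip, tx, ty)   # miss: hit the filesystem
            self.tiles[key] = data
            self.used += len(data)
            while self.used > self.budget:  # evict least recently used tiles
                _, old = self.tiles.popitem(last=False)
                self.used -= len(old)
            return data

Only the tiles (at whatever mip level shading asks for) that actually get touched ever get loaded, which is how hundreds of gigabytes of source textures can render within a much smaller memory budget.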
A GPU enabled version of RenderMan is just coming out now. I imagine their farm usage after this could change.
https://gfxspeak.com/2020/09/11/animation-studios-renderman/
I’m purely speculating, but I think the main reason they haven’t been using GPUs until now is that RenderMan is very full featured, extremely scalable on CPUs, has a lot of legacy features, and it takes a metric ton of engineering to port and re-architect well established CPU based software over to the GPU.
This is probably true today, but leaves the wrong impression IMHO. The clear trend is moving toward GPUs, and surprisingly quickly. Maya & Houdini have released GPU simulators and renderers. RenderMan is releasing a GPU renderer this year. Most other third party renderers have already gone or are moving to the GPU for path tracing - Arnold, Vray, Redshift, Clarisse, etc., etc.
And GPU rendering has been gaining momentum over the past few years, but the biggest bottleneck until recently was available VRAM. Big budget VFX scenes can often take 40-120 GB of memory to keep everything accessible during the raytrace process, and unless a renderer supports out-of-core memory access, the speedup you may have gained from the GPU gets thrown out the window from swapping data.
This is no longer true, and hasn't been for around a decade. This is a left-over memory of when GPUs weren't using IEEE 754 compatible floating point. That changed a long time ago, and today all GPUs are absolutely up to par with the IEEE standards. GPUs even took the lead for a while with the FMA instruction, which was more accurate than what CPUs had, and Intel and others have since added FMA instructions to their CPUs.
From a different perspective, think about supercomputers - many supercomputers do indeed do relatively specific things (and I would assume some do run custom hardware), but the magic is in the interconnects - getting the data around effectively is where the black magic is.
Also, if you aren't particularly time bound, why bother? FPGAs require completely different types of engineers, and are generally a bit of a pain to program for even ignoring how horrific some vendor tools are - your GPU code won't fail timing, for example.
I disagree with this takeaway. But full disclosure I’m biased: I work on OptiX. There is a reason Pixar and Arnold and Vray and most other major industry renderers are moving to the GPU, because the trends are clear and because it has recently become ‘worth it’. Many renderers are reporting factors of 2-10 for production scale scene rendering. (Here’s a good example: https://www.youtube.com/watch?v=ZlmRuR5MKmU) There definitely are tradeoffs, and you’ve accurately pointed out several of them - memory constraints, paging, micropolygons, etc. Yes, it does take a lot of engineering to make the best use of the GPU, but the scale of scenes in production with GPUs today is already firmly well past being limited to turntables, and the writing is on the wall - the trend is clearly moving toward GPU farms.
But, very soon after I left Lucas, ILM started pushing ray tracing a lot harder. Getting good quality results per ray is very difficult. Much easier to throw hardware at the problem and just cast a whole lot more rays. So, they moved over to being heavily GPU-based around that time. I do not know the specifics.
An entire movie at 2K, in uncompressed floating-point RGB, would be about 4 terabytes.
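For anyone who wants the arithmetic behind that (assuming a roughly 100-minute film at 24 fps, 2048x1080 frames, 3 channels of 32-bit float, no compression):

    frames = 100 * 60 * 24                  # ~144,000 frames
    bytes_per_frame = 2048 * 1080 * 3 * 4   # ~26.5 MB per frame
    print(f"{frames * bytes_per_frame / 1e12:.1f} TB")   # ~3.8 TB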
So I'm well aware of the trade-offs. As I mentioned, for lookdev and small scenes, GPUs do make sense currently (if you're willing to pay the penalty of getting code to work on both CPU and GPU, and GPU dev is not exactly trivial in terms of debugging / building compared to CPU dev).
But until GPUs exist with > 64 GB RAM, for rendering large scale scenes it's just not worth it given the extra burdens (increased development costs, heterogeneous sets of machines in the farm, extra debugging, support), so for high-end scale, we're likely 3-4 years away yet.
Uncompressed, it’s 93 GB of render data, plus 130 GB of animation data if you want to render the entire shot instead of a single frame.
From what I’ve seen elsewhere, that’s not unusual at all for a modern high end animated scene.
(I'm aware of the OSL batching/GPU work that's taking place, but it remains to be seen how well that's going to work).
From what I've heard from friends in the industry (at other companies) who are using GPU versions of Arnold, the numbers are nowhere near as good as the upper numbers you're claiming when rendering at final fidelity (i.e. with AOVs and Deep output). So again, the use-cases - at least for high-end VFX with GPU - are still mostly for lookdev and lighting blocking iterative workflows from what I understand. That's still an advantage and provides clear benefits in terms of iteration time over CPU renderers, but it's not a complete win, and so far only the smaller studios have started dipping their toes in the water.
Also, the advent of AMD Epyc has finally thrown some competitiveness back to CPU rendering, so it's now possible to get a machine with 2x as many cores for close to half the price, which has given CPU rendering a further shot in the arm.
The trend is pretty clear, though. The size of scenes that can be done on the GPU today is large and growing fast, both because of improving engineering and because of increasing GPU memory speed & size. It’s just a fact that a lot of commercial work is already done on the GPU, and that most serious commercial renderers already support GPU rendering.
It’s fair to point out that the largest production scenes are still difficult and will remain so for a while. There are decent examples out there of what’s being done in production with GPUs already:
https://www.chaosgroup.com/vray-gpu#showcase
They are really expensive, though. But chassis and rackspace also aren't free. If one beefy node with a couple of GPUs can replace half a rack of CPU-only nodes, the GPUs are totally worth it.
I'm not too familiar with 3D rendering, but in other workloads the GPU speedup is so huge that if it's possible to offload to the GPU, it makes sense to do it from an economic perspective.
Not only that, they are massive and kick out a whole bunch of heat in new and interesting ways. Worse still, they depreciate like a mofo.
The tip-top renderbox of today is next year's comp box. A two-generation-old GPU is a pointless toaster.
Are GPUs starting to be used at earlier points in the pipeline? Yes, absolutely, but they always were to a degree in previs and modelling (via rasterisation). They are gradually becoming more usable at more steps in pipelines, but they're not there yet for high-end studios.
In some cases, if a studio's happy using an off-the-shelf renderer with the stock shaders (so no custom shaders at all - at least until OSL is doing batching and GPU stuff, or until MDL actually supports production renderer stuff), studios can use GPUs further down the pipeline, and currently that's smaller scale stuff from what I gather talking to friends who are using Arnold GPU. Certainly the hero-level stuff at Weta / ILM / Framestore isn't being done with GPUs, as they require custom shaders, and they aren't going to be happy with just using the stock shaders (which are much better than stock shaders from 6/7 years ago, but still far from bleeding edge in terms of BSDFs and patterns).
Even from what I hear at Pixar with their lookdev Flow renderer things aren't completely rosy on the GPU front, although it is at least getting some use, and the expectation is XPU will take over there, but I don't think it's quite ready yet.
Until a studio feels GPU rendering can be used for a significant amount of the renders they do (for smaller studios the fidelity will be less, so the threshold will be lower for them), I think it's going to be a chicken-and-egg problem of not wanting to invest in GPUs on the farms (or even local workstations).
Isn’t there a better Vray or Arnold comparison somewhere?
As in my summary comment, an A100 can now run real scenes, but will cost you ~$10k per card. For $10k, you get a lot more threads from AMD.
The folks at Framestore and many other shops already don’t do more than XX GiB per frame for their rendering. So for me, this comes down to “can we finally implement a good enough texture cache in optix/the community” which I understand Mark Leone is working on :).
The shader thing seems easy enough. I’m not worried about an OSL compiled output running worse than the C-side. Divergence is a real issue, but so many studios are now using just a handful of BSDFs with lots of textures to drive, that as long as you don’t force the shading to be “per object group” but instead “per shader, varying inputs is fine”, you’ll still get high utilization.
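A rough sketch of what I mean by "per shader, varying inputs is fine" (hypothetical wavefront-style batching, written in NumPy purely for clarity; a real renderer would do this on the GPU): sort hit points by shader/BSDF id and shade each run as one vectorized batch, so the code stays coherent while the texture-driven inputs vary freely per hit.

    import numpy as np

    def shade_hits(shader_ids, hit_inputs, shaders):
        # shader_ids: (N,) int array of per-hit shader/BSDF ids
        # hit_inputs: (N, K) float array of varying, texture-driven inputs
        # shaders:    dict mapping shader id -> vectorized function (M, K) -> (M, 3)
        radiance = np.zeros((len(shader_ids), 3))
        order = np.argsort(shader_ids, kind="stable")     # group hits by shader
        sorted_ids = shader_ids[order]
        uniques = np.unique(sorted_ids)
        starts = np.searchsorted(sorted_ids, uniques, side="left")
        ends = np.append(starts[1:], len(sorted_ids))
        for sid, s, e in zip(uniques, starts, ends):
            idx = order[s:e]
            radiance[idx] = shaders[sid](hit_inputs[idx])  # one coherent batch per shader
        return radiance

Grouping by shader rather than by object group is the point: with only a handful of BSDFs in play, each batch stays large enough to keep utilization high.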
The 80 GiB parts will make it so that some shops could go fully in-core. I expect we’ll see that sooner than you’d think, just because people will start doing interactive work, never want to give it up, and then say “make that but better” for the finals.
An AMD Rome/Milan part will give you 256 decent threads on a 2S box with a ton of RAM for say $20-25k at list price (e.g., a Dell power edge without any of their premium support or lots of flash). By comparison, the list price of just an A100 is $15k (and you still need a server to drive the thing).
So for shops shoving these into a data center they still need to do a cost/benefit tradeoff of “how much faster is this for our shows, can anyone else make use of it, how much power do these draw...”. If anything, the note about more and more software using CUDA is probably as important as “ray tracing is now sufficiently faster” since the lack of reuse has held them back (similar things for video encoding historically: if you’ve got a lot of cpus around, it was historically hard to beat for $/transcode).
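Just using the list prices quoted above (rounded, no discounts, and ignoring power, cooling and the host node the GPU still needs), the "more threads from AMD" arithmetic looks roughly like this:

    cpu_box_price = 22_500      # midpoint of the $20-25k 2S Rome/Milan box above
    cpu_threads = 256
    a100_list = 15_000          # list price of a single A100, card only

    per_thread = cpu_box_price / cpu_threads
    print(f"${per_thread:.0f} per CPU thread")                             # ~$88/thread
    print(f"{10_000 / per_thread:.0f} threads for the ~$10k/card figure")  # ~114 threads

So a single card has to beat well over a hundred decent CPU threads before the raw economics tilt, which is why the reuse-across-software argument matters so much.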
Is any of that true?
Note, though, these are not single 1 TB textures; they're multiple sets of textures, plus all of the layers that constitute them. Some large robots in particular had 65k 4K textures if you count the layers.
If 8k gaming becomes a real thing you can expect work to be done towards a solution, but until then not so much.
OTOH, VFX rendering involves a varying scene with moving light sources, cameras, objects, textures, and physics: much more dynamic interaction. This is a gross simplification, but I hope it helps.
The reason is that this software needs to be sold to many customers, and a big share of studios do advertising and series work. GPU rendering is perfect for them, as they don't need / can't afford large scale render farms.
About your example: that's not honest. It's full of instances and is a perfect use case for a "wow" effect, but it's not a production shot. Doing a production shot requires complexity management over the long run, even for CPU rendering. On this front, the GPU is more "constrained" than the CPU, and management is even more complex.
I know it doesn't make sense to tell your clients that what they are doing is nonsense, but if I saw something like that going on, the first thing I would do is chase down why it happened. Massive waste like that is extremely problematic, while needing to make a sharper texture for some tiny piece that gets close to the camera is not a big deal.
Over texturing like this can be a good decision depending on the production. Asset creation often starts a long time before shots or cameras are locked down.
If you don’t know how an asset is to be used it makes sense to texture all of it upfront as if it will be full screen, 4K.
Taking an asset off final to ‘Upres’ it for a shot can be a pain in the ass and more costly than just detailing it up in the first place.
In isolation it’s an insane amount of detail, and given perfect production planning it is normally not needed, but until directors lock down the scripts and shots it can be the simplest option.
This was easier to rely on in the days before ray tracing, when texture filtering was consistent because everything was from the camera. Ray differentials from incoherent rays aren't quite as forgiving.
> If you don’t know how an asset is to be used it makes sense to texture all of it upfront as if it will be full screen, 4K.
4k textures for large parts of the asset in the UV layout can be an acceptable amount of overkill. That's not the same as putting 65,000 4k textures on something because each little part is given its own 4k texture. I know that you know this, but I'm not sure why you would conflate those two things.
> Taking an asset off final to ‘Upres’ it for a shot can be a pain in the ass and more costly than just detailing it up in the first place
It is very rare that specific textures need to be redone like that and it is not a big deal.
650GB of textures for one asset drags everything from iterations to final renders to disk usage to disk activity to network usage down for every shot in a completely unnecessary way. There isn't a fine line between these things, there is a giant gap between that much excessive texture resolution and needing to upres some piece because it gets close to the camera.
> Asset creation often starts a long time before shots or cameras are locked down.
This is actually fairly rare.
> In isolation it’s an insane amount of detail, and given perfect production planning it is normally not needed, but until directors lock down the scripts and shots it can be the simplest option.
That's rarely how the timeline fits together. It's irrelevant, though, because there is no world where 65,000 4k textures on a single asset makes sense. It's multiple orders of magnitude out of bounds of reality.
I am glad that you have that insane amount of scalability as a focus since you are making tools that people rely on heavily, and I wish way more people on the tools end thought like this. Still, it is about 1000x what would set off red flags in my mind.
I apologize on behalf of whoever told you that was necessary, because they need to learn how to work within reasonable resources (which is not difficult given modern computers), no matter what project or organization they are attached to.
Take for example, a large hero asset like King Kong.
Kong look development started many months before a script was locked down. Kong is 60ft tall, our leading lady is 5’2”.
We think we need shots where she’ll be standing in Kong’s hands, feet, be lifted up to his face, nose etc.
So we need fingerprints that will stand up at 4K renders, tear ducts, pores on the inside of the nose, etc., but we don't know. All of which will have to match shot plates in detail.
We could address each of these as the shots turn up and tell the director (who owns the company) that he needs to wait a few days for his new shot, or we can break Kong into 500 patches and create a texture for each of the diffuse, 3 spec, 3 subsurface, 4 bump, dirt, blood, dust, scratch, fur, flow, etc. inputs to our shaders.
Let’s say we have 500 UDIM patches for Kong so we can sit our leading lady on the fingertips, and 20 channels to drive our shaders and effects systems.
When working, the artist uses 6 paint layers for each channel (6 is a massive underestimate for most interesting texture work).
So we have 500 patches * 20 channels * 6 layers, which gives us 60k images. Not all of these will need to be at 4K, however.
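The arithmetic, plus a deliberately pessimistic storage bound assuming every image really were a single-channel, 16-bit 4K map (in practice many are smaller, so the real footprint lands well below this):

    patches = 500               # UDIM patches on the hero asset
    channels = 20               # diffuse, spec, subsurface, bump, dirt, ...
    layers = 6                  # paint layers per channel (an underestimate)

    images = patches * channels * layers
    print(images)               # 60,000 images

    bytes_per_4k_map = 4096 * 4096 * 2      # one channel, 16-bit
    print(f"{images * bytes_per_4k_map / 1e12:.1f} TB upper bound")   # ~2.0 TB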
For Kong, substitute any hero asset where shots will be placed more "in and on" the asset rather than "at" it: heli carriers, oil rigs, elven great halls, spaceships, giant robots... The line between asset and environment is blurred at that point, so maybe think "set" rather than "asset".
We both know that stuff isn't showing up on film and that the excess becomes a CYA case of the emperor's new clothes where no one wants to be the one to say it's ridiculous.
> When working the artist uses 6 paint layers for each channel ( 6 is a massive underestimate for most interesting texture work).
This is intermediary and not what is being talked about.
I would say it's the opposite. There is nothing necessary about 10,000 4k maps and definitely nothing typical. Workflows trade a certain amount of optimization for consistency, but not like this.
> patronizing who actually works in the industry.
I don't think I was patronizing. This person is valuable in that they are trying to make completely excessive situations work. Telling people (or demonstrating to them) they are being ridiculous is not his responsibility and is a tight rope to walk in his position.
> Also speaking as an authority that knows best
If I said that 2 + 2 = 4 would you ask about a math degree? This is an exercise in appeal to authority. This person and myself aren't even contradicting each other very much.
He is saying the extremes that he has seen, I'm saying that 10,000 pixels of texture data for each pixel in a frame is enormous excess.
The only contradiction is that he seems to think that because someone did it, it must be a necessity.
Instead of confronting what I'm actually saying, you are trying to rationalize why you don't need to.
Usually the way VFX works is that technology (R&D) is far removed from production. The artists' job is getting the shot done regardless of technology, and they have very short deadlines. They usually push the limits.
Digital artists are not very tech savvy in a lot of disciplines; it is not feasible to have a TD within the delivery deadlines of the shots for a show.
The person at Weta also told you how Weta actually worked on Kong, which is very typical: you don't know upfront what you need. And you dismissed it as something unnecessary; still, this is how every big VFX studio works. Do you feel that you know better and/or that everyone is doing something wrong and hasn't really thought about it? If that's the case, you might have a business opportunity for a more efficient VFX studio!
> how Weta actually worked on Kong, which is very typical
It is not typical to have 10,000 4k maps on a creature. What has been typical when rendering at 2k is a set of 2k maps for the face, torso, arms and legs. Maybe a single arm and leg can be painted and the UVs mirrored, though mostly texture painters will lay out the UVs separately and duplicate the texture themselves to leave room for variations.
> it is not feasible to have a TD in the delivery deadlines of the shots for a show.
Actually, most of the people working on shots are considered TDs. Specific asset work for a sequence with a hero asset is actually very common, which makes sense if you think about it from a story point of view: needing a visual change to communicate a change of circumstances.
4k rendering (was the 2017 King Kong rendered in 4k?) and all the closeups of King Kong mean that higher-resolution maps and more granular sections are understandable, but it doesn't add up to going from 16 2k maps to 10,000 4k maps. Maps like diffuse, specular and subsurface albedo are also just multiplicative, so there is no reason to have multiple maps unless they need to be rebalanced against each other per shot (such as for variations).
You still never actually explained a problem or inconsistency with anything I've said.
> Actually most of the people working on shots are considered TDs.
That's not true in the studios I've been at. TD is usually reserved for folks closer to the pipeline who aren't doing shot work (as in, delivering shots); they're supporting the folks who do.
For the record, I haven't downvoted you at all.
This was created with the requirement that the director be able to use it at will. Closeups, set replacements, destruction, the works.
You don’t have a shot breakdown or camera list.
You’ve got 6 months of pre-production to support 1000 shots. Once in production, you will be the only texture artist supporting 30 TDs.
How do you spend your 6 months to make sure production runs smoothly?
I’m kinda interested in your experience of this stuff, as the numbers you’re quoting for 2k work are, in my experience, waaaaaay off and closer to how a high-end games asset would currently be textured.
I don’t disagree with you that the numbers involved are crazy when taken in isolation, but it is (or at least was 5 years ago) a very common workflow at ILM, Weta, DNeg, DD, R&H, Framestore, etc. The quoted high numbers are the very upper end, but many thousands of assets on hundreds of productions have been textured at what I believe you would consider “insane” detail levels.
You clearly understand many of the issues involved, but you downplay the complexity of running high-end assets in a less-than-perfect production.
Unless the industry has changed dramatically in 5 years, shot changes, per-shot fixes, variants (clean, dirty, destroyed), and shader tweaks happen on every single show I’ve ever been part of.
Render time and storage is one factor, as is individual artist iteration, but the real productivity killer is inter-discipline iteration.
Going from a “blurry texture” note in comp to a TD fix to a texture “upres” is potentially a 5-person, 4-day turnaround. I would trade a whole bunch of CPU and storage to avoid that.
Computers are cheap, people are expensive, people coordinating even more so.