The splats are individual elongated 3D blobs (ellipsoids, not spheres) -- thousands to millions of them -- floating in a 3D coordinate space. Fuzzy 3D pixels, essentially. Each carries colour and radiance properties, so it can look different from different angles (e.g. environmental lighting, reflections, etc.)
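To make "each with radiance colour properties" concrete, here's a rough sketch of what one splat stores. The names and layout are my own for illustration, not from any particular implementation:

```python
from dataclasses import dataclass

# Illustrative parameter layout for a single 3D Gaussian splat.
# Field names are made up for readability; real implementations
# pack these into flat tensors.
@dataclass
class Splat:
    position: tuple  # (x, y, z) centre in world space        -> 3 values
    scale: tuple     # per-axis radii of the ellipsoid         -> 3 values
    rotation: tuple  # orientation as a quaternion             -> 4 values
    opacity: float   # how see-through the splat is            -> 1 value
    color: tuple     # base RGB; view-dependent appearance
                     # adds more coefficients on top           -> 3 values

# A single pink blob, Kirby-style:
blob = Splat(
    position=(0.0, 0.0, 0.0),
    scale=(0.5, 0.5, 0.3),          # squashed along one axis = elongated
    rotation=(1.0, 0.0, 0.0, 0.0),  # identity quaternion
    opacity=0.9,
    color=(1.0, 0.71, 0.76),
)

# 3 + 3 + 4 + 1 + 3 = 14 base values per splat; the view-dependent
# colour terms push the count higher.
```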
The magic, obviously, is figuring out how the pixels across a set of photographs correlate once translated into a 3D space filled with splats. Traditionally it took a load of pictures for the process to converge, so doing it with two pictures is pretty amazing.
Note: I've made some simplifying assumptions in the above explanation.
Now think to yourself: “Could I approximate, say, Kirby with 10 splats? And could I get a GPU to home in on the best splats to approximate Kirby, maybe using gradient descent?” Then ask yourself: “Could I get a 4K+, photographic-grade 3D scene, including transmissive and reflective behaviour, using this method?”
If your answer to the second is “obviously!” then you have a good head for this ML stuff. Somewhat surprisingly (shockingly?), the answer is ‘yes’. And also somewhat surprisingly, you can use ML pipelines and autograd-style tech to home in on what the 18 or so implied variables for each splat should be, and when you have millions of them, they produce AMAZING reconstructions of scenes. They're also incredibly quick to render.
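Here's a toy version of the "fit Kirby with a handful of splats" idea, shrunk to 2D so it stays short: render a small greyscale image from isotropic Gaussian splats, then recover the splat parameters by gradient descent on pixel error. Real pipelines use anisotropic 3D Gaussians and autograd; this sketch uses finite-difference gradients just to stay dependency-free beyond NumPy.

```python
import numpy as np

H = W = 32
ys, xs = np.mgrid[0:H, 0:W].astype(float)

def render(params):
    """params is a flat array of (cx, cy, sigma, amplitude) per splat."""
    img = np.zeros((H, W))
    for cx, cy, sigma, amp in params.reshape(-1, 4):
        img += amp * np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))
    return img

def loss(params, target):
    return float(np.mean((render(params) - target) ** 2))

def num_grad(params, target, eps=1e-3):
    # Central-difference gradient; a real pipeline would get this from autograd.
    g = np.zeros_like(params)
    for i in range(len(params)):
        d = np.zeros_like(params); d[i] = eps
        g[i] = (loss(params + d, target) - loss(params - d, target)) / (2 * eps)
    return g

# The "photograph": an image rendered from a known two-splat scene.
truth = np.array([10.0, 12.0, 3.0, 1.0,
                  22.0, 20.0, 4.0, 0.7])
target = render(truth)

# Start from a rough guess and descend toward the target image.
params = np.array([14.0, 14.0, 5.0, 0.5,
                   18.0, 18.0, 5.0, 0.5])
init_loss = loss(params, target)
lr = 20.0
for _ in range(200):
    candidate = params - lr * num_grad(params, target)
    if loss(candidate, target) < loss(params, target):
        params = candidate
    else:
        lr *= 0.5  # crude backtracking: shrink the step if it overshot

final_loss = loss(params, target)
print(f"loss: {init_loss:.5f} -> {final_loss:.5f}")
```

Scale this up to millions of anisotropic 3D splats, swap the pixel loss for a comparison against real photos from known camera poses, and let a GPU autograd framework do the gradients, and you have the basic shape of the training loop.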
Splats are pretty cool.