
229 points | modinfo | 1 comment
LarsDu88 No.40835167
If they're going to cram a small LLM in the browser, they might as well start cramming in a small image-generating diffusion model + a new image format to go along with it.

I believe we can start compressing down the amount of data going over the wire 100x this way...

squigz No.40835244
> I believe we can start compressing down the amount of data going over the wire 100x this way...

What?

LarsDu88 No.40835364
I should correct myself a bit here. I believe it's actually UNet-type models that can be used to very lossily "compress" images: you transmit the latents instead of the images themselves, and a locally running image-generation model expands the latents back into full images.
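
To make the idea concrete, here's a minimal sketch in PyTorch. Everything in it (the toy Encoder/Decoder modules, the tensor sizes) is hypothetical and only illustrates the shape of the scheme, not any real model: the server encodes the image to a small latent, only the latent crosses the network, and the client re-expands it with a decoder it already has locally.

    # Hypothetical illustration: ship a small latent instead of the full image,
    # then reconstruct it locally. The Encoder/Decoder below are toy modules,
    # not a real pretrained VAE or diffusion component.
    import torch
    import torch.nn as nn

    class Encoder(nn.Module):
        def __init__(self):
            super().__init__()
            # 3x256x256 image -> 4x32x32 latent (~48x fewer values)
            self.net = nn.Sequential(
                nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(64, 4, 4, stride=2, padding=1),
            )

        def forward(self, x):
            return self.net(x)

    class Decoder(nn.Module):
        def __init__(self):
            super().__init__()
            # 4x32x32 latent -> 3x256x256 lossy reconstruction
            self.net = nn.Sequential(
                nn.ConvTranspose2d(4, 64, 4, stride=2, padding=1), nn.ReLU(),
                nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
                nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
            )

        def forward(self, z):
            return self.net(z)

    encoder, decoder = Encoder(), Decoder()   # pretend both are pretrained

    image = torch.rand(1, 3, 256, 256)        # "server side": the original pixels
    latent = encoder(image)                   # this is what goes over the wire
    reconstruction = decoder(latent)          # "browser side": lossy expansion

    print(image.numel(), "values in the image")     # 196608
    print(latent.numel(), "values in the latent")   # 4096

With these made-up sizes the latent carries roughly 48x fewer values than the raw image; the real ratio (and the quality of what comes back) depends entirely on the architecture and on how the latent is quantized and entropy-coded before transmission.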

In fact, any kind of decoder model, including text models, can use the same principle to lossily compress data. Of course, hallucination will be a thing...

Diffusion models, depending on the full architecture, might not have smaller-dimension layers that could be used for compression.