adam_arthur No.37420461
Even a linear growth rate of average RAM capacity would obviate the need to run current SOTA LLMs remotely in short order.

Historically, average RAM has grown far faster than linearly, and there really hasn't been anything pressing manufacturers to push the envelope here in the past few years... until now.

It could be that LLM model sizes keep increasing such that we continue to require cloud consumption, but I suspect the sizes will not increase as quickly as hardware for inference.

Given how useful GPT-4 is already, maybe one more iteration would unlock the vast majority of practical use cases.

I think people will be surprised that consumers ultimately end up benefiting far more from LLMs than the providers do. There's not going to be much moat or differentiation to defend margins... more of a race to the bottom on pricing.
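
For a rough sense of what "local" actually requires, here's a minimal back-of-envelope sketch in Python. The parameter counts and the bytes-per-parameter rule of thumb are illustrative assumptions, and a real deployment also needs headroom for the KV cache and activations:

    # Rule of thumb (an assumption, not a measured figure): weight memory is
    # roughly parameter_count * bytes_per_parameter; the KV cache and
    # activations add overhead on top of this.

    def weights_gib(params_billions: float, bits_per_param: float) -> float:
        """Approximate RAM needed just to hold the weights, in GiB."""
        return params_billions * 1e9 * bits_per_param / 8 / 2**30

    # Hypothetical model sizes, showing why quantization matters for local use.
    for params in (7, 70, 180):
        for bits in (16, 8, 4):
            print(f"{params:>4}B params @ {bits:>2}-bit ~ "
                  f"{weights_gib(params, bits):7.1f} GiB")

At 4-bit quantization even fairly large models start to fit comfortably in RAM sizes that high-end consumer machines already ship with.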

gorbypark No.37421945
I can't wait for my phone to have something like 512GB-1TB of RAM to run some really interesting models locally :D
AnthonyMouse No.37426671
You can buy 768GB of DDR3 and an Ivy Bridge Xeon E5 to put it in for around $500 total, most of which is the memory. (The CPUs wouldn't be fast for a model that size, though.)
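
To put numbers on "wouldn't be fast": CPU inference at that scale is mostly memory-bandwidth-bound, since generating each token touches every weight. A minimal sketch, where both figures are rough assumptions rather than measurements:

    # Assumed quad-channel DDR3 bandwidth and an assumed very large quantized
    # model; adjust both for your actual hardware and model.
    MEM_BANDWIDTH_GB_S = 50.0
    MODEL_SIZE_GB = 400.0

    # Upper bound: one full pass over the weights per generated token.
    tokens_per_second = MEM_BANDWIDTH_GB_S / MODEL_SIZE_GB
    print(f"~{tokens_per_second:.2f} tokens/s upper bound")  # roughly 0.1 tokens/s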
astrange No.37427110
I'd be impressed if you fit that into a phone.
AnthonyMouse No.37430712
It'll make phone calls. Just put a VoIP app on it.

Obviously, what you can do in practice is put the interface on your phone. The model doesn't have to run on battery to run locally.
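
As a minimal sketch of that split, a thin client on the phone can send prompts to an inference server on a box at home. The address, port, and JSON fields below are hypothetical; adapt them to whatever self-hosted server you actually run:

    import json
    import urllib.request

    SERVER = "http://192.168.1.50:8080/completion"  # assumed home-server address

    def ask(prompt: str) -> str:
        # POST the prompt to the local inference server and return its reply.
        payload = json.dumps({"prompt": prompt, "n_predict": 128}).encode()
        req = urllib.request.Request(
            SERVER, data=payload, headers={"Content-Type": "application/json"}
        )
        with urllib.request.urlopen(req) as resp:
            return json.loads(resp.read()).get("content", "")

    if __name__ == "__main__":
        print(ask("Summarize why local inference is getting cheaper."))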