
623 points magicalhippo | 6 comments
1. blackoil ◴[] No.42620569[source]
Is there any effort toward local cloud computing? I can't justify $3000 for a fun device. But if all the devices in my home (6 phones, 2 iPads, a desktop, and 2 laptops) could leverage it for fast LLMs, gaming, and photo/video editing, it would make much more sense.
replies(5): >>42620586 #>>42620810 #>>42620877 #>>42621768 #>>42621901 #
2. KeplerBoy ◴[] No.42620586[source]
You can just set up local OpenAI-style API endpoints for your LLMs. Most devices and apps won't be able to use them, because consumers don't run self-hosted apps, but for a simple ChatGPT-style app this is totally viable. Today.
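For example, a minimal sketch using the official openai Python client pointed at a self-hosted server; the IP, port, and model name below are placeholders for whatever you actually run locally:

    from openai import OpenAI

    # Any OpenAI-compatible server works; local servers usually ignore the key.
    client = OpenAI(base_url="http://192.168.1.50:8000/v1", api_key="not-needed")

    resp = client.chat.completions.create(
        model="llama-3.1-70b-instruct",  # whatever model the server has loaded
        messages=[{"role": "user", "content": "Hello from my living room"}],
    )
    print(resp.choices[0].message.content)
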
3. papichulo2023 ◴[] No.42620810[source]
Most tools expose OpenAI-like APIs that you can easily integrate with.
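The wire format is just the /v1/chat/completions shape; the host, port, and model name here are hypothetical:

    import requests

    # Same request body the hosted OpenAI API expects, aimed at a local server.
    resp = requests.post(
        "http://192.168.1.50:8000/v1/chat/completions",
        json={
            "model": "llama-3.1-70b-instruct",
            "messages": [{"role": "user", "content": "Hi from another device"}],
        },
        timeout=60,
    )
    print(resp.json()["choices"][0]["message"]["content"])
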
4. TiredOfLife ◴[] No.42620877[source]
That is literally how it was announced: an AI cloud in a box that can also be used as a Linux desktop.
5. reissbaker ◴[] No.42621768[source]
Open WebUI, SillyTavern, and other frontends can access any OpenAI-compatible server, and on Nvidia cards you have a wealth of options that will run one of those servers for you: llama.cpp (or the Ollama wrapper), of course, but also the faster vLLM and SGLang inference engines. Buy one of these, slap SGLang or vLLM on it, and point your devices at your machine's local IP address.
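A rough sketch of what "slap vLLM on it" looks like, sanity-checked with vLLM's offline Python API first (the model name is just an example; it's the `vllm serve` CLI that actually exposes the OpenAI-compatible endpoint):

    # Sanity-check generation with vLLM's offline API before serving.
    # To serve the OpenAI-compatible endpoint to other devices instead:
    #   vllm serve meta-llama/Llama-3.1-8B-Instruct --host 0.0.0.0 --port 8000
    from vllm import LLM, SamplingParams

    llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")  # illustrative model
    params = SamplingParams(temperature=0.7, max_tokens=64)
    outputs = llm.generate(["Say hello to the living room."], params)
    print(outputs[0].outputs[0].text)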

I'm mildly skeptical about performance here: they aren't saying what the memory bandwidth is, and that'll have a major impact on tokens-per-second. If it's anywhere close to the 4090, or even the M2 Ultra, 128GB of Nvidia is a steal at $3k. Getting that much VRAM on anything non-Apple used to cost tens of thousands of dollars.
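Back-of-envelope for why bandwidth dominates: at decode time each generated token streams roughly the whole weight file through memory, so tokens-per-second tops out around bandwidth divided by model size. The sketch below uses the 4090's and M2 Ultra's published bandwidth figures and assumes a 70B at Q4 is ~40GB:

    # Bandwidth-bound estimate: every decoded token reads ~all the weights once.
    def rough_tokens_per_sec(bandwidth_gb_s: float, weights_gb: float) -> float:
        return bandwidth_gb_s / weights_gb

    WEIGHTS_GB = 40  # ~70B params at Q4, with some overhead
    for name, bw in [("RTX 4090 (~1008 GB/s)", 1008), ("M2 Ultra (~800 GB/s)", 800)]:
        print(f"{name}: ~{rough_tokens_per_sec(bw, WEIGHTS_GB):.0f} tok/s ceiling")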

(They're also mentioning running the large models at Q4, which will definitely hurt the model's intelligence vs FP8 or BF16. But most people running models on Macs run them at Q4, so I guess it's a valid comparison. You can at least run a 70B at FP8 on one of these, even with fairly large context size, which I think will be the sweet spot.)
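To make the 70B-at-FP8 claim concrete, here's a rough memory budget using the well-known Llama-70B shape (80 layers, 8 KV heads of dim 128); none of this is confirmed about this particular box:

    # Memory budget sketch: 70B model at FP8 on a 128GB machine.
    TOTAL_GB = 128
    weights_gb = 70  # 70B params * 1 byte each at FP8

    # KV cache per token = 2 (K and V) * layers * kv_heads * head_dim * bytes.
    layers, kv_heads, head_dim, kv_bytes = 80, 8, 128, 1  # FP8 KV cache too
    kv_per_token = 2 * layers * kv_heads * head_dim * kv_bytes  # ~160 KB/token

    spare_gb = TOTAL_GB - weights_gb  # ignores activations and runtime overhead
    print(f"room for ~{spare_gb * 1e9 / kv_per_token / 1e3:.0f}k tokens of KV cache")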

6. aa-jv ◴[] No.42621901[source]
For companies interested in integrating machine learning into their products, both software and hardware - essentially training models for a particular use case - this could be quite a useful tool to add to the kit.
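As a hedged sketch of that kind of use-case-specific training, here's the usual LoRA setup with Hugging Face's peft library (model choice and hyperparameters are illustrative, not a recommendation):

    from peft import LoraConfig, get_peft_model
    from transformers import AutoModelForCausalLM

    # Freeze the base model and train a small LoRA adapter on domain data;
    # the adapter is typically <1% of total params, so it fits easily in 128GB.
    base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B")
    lora = LoraConfig(r=16, lora_alpha=32,
                      target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
    model = get_peft_model(base, lora)
    model.print_trainable_parameters()
    # From here, plug `model` into your usual Trainer / training loop.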