
577 points by simonw | 7 comments
joelthelion No.44724227
Apart from using a Mac, what can you use for inference with reasonable performance? Is a Mac the only realistic option at the moment?
replies(6): >>44724398 #>>44724419 #>>44724553 #>>44724563 #>>44724959 #>>44727049 #
1. whimsicalism No.44724959
You are almost certainly better off renting GPUs, but I understand self-hosting is an HN touchstone.
replies(2): >>44725021 #>>44725699 #
2. qingcharles No.44725021
This. Especially if you just want to try a bunch of different things out. Renting is insanely cheap -- to the point where I don't understand how the providers renting out the hardware are making their money back, unless they stole the hardware and power.

It can really help you figure a ton of things out before you blow the cash on your own hardware.

replies(1): >>44725157 #
3. 4b11b4 No.44725157
Recommended sites to rent from?
replies(2): >>44725244 #>>44725337 #
4. doormatt No.44725244
runpod.io
5. whimsicalism No.44725337
RunPod, Vast, Hyperbolic, Prime Intellect. If all you're going to be doing is running LLMs, you can pay per token on OpenRouter or through some of the providers listed there.
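
To make the pay-per-token route concrete, here's a minimal sketch against OpenRouter's OpenAI-compatible chat completions endpoint. The model id and the OPENROUTER_API_KEY environment variable are placeholder assumptions, not recommendations; check openrouter.ai for current model names and pricing.

    # Minimal pay-per-token sketch against OpenRouter's OpenAI-compatible API.
    # Model id and env var name are placeholders, not recommendations.
    import os
    import requests

    resp = requests.post(
        "https://openrouter.ai/api/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
        json={
            "model": "qwen/qwen-2.5-72b-instruct",  # example open-weight model id
            "messages": [{"role": "user", "content": "Say hello in one sentence."}],
        },
        timeout=60,
    )
    resp.raise_for_status()
    print(resp.json()["choices"][0]["message"]["content"])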
6. mrinterweb No.44725699
I don't know about that. I've had my RTX 4090 for nearly 3 years now. Suppose I had a script that provisioned and deprovisioned a rented 4090 at $0.70/hr for an 8-hour work day, 20 work days per month, with 2 paid weeks off per year plus normal holidays, over 3 years:

0.7 * 8 * ((20 * 12) - 8 - 14) * 3 = $3662

I bought my RTX 4090 for about $2200. I also had the pleasure of being able to use it for gaming when I wasn't working. To be fair, the VRAM requirements for local models keep climbing, and my 4090 isn't able to run many of the latest LLMs. Also, I omitted the cost of electricity for my local LLM server; I have not been measuring the total watts consumed by just that machine.
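
If you want to rerun that comparison with your own assumptions, here is the same back-of-envelope math as a small script; the rate, schedule, and purchase price are just the figures above, not general claims.

    # Back-of-envelope rent-vs-buy comparison using the numbers above;
    # adjust the assumptions to match your own usage.
    rate_per_hour = 0.70                   # rented 4090, $/hr
    hours_per_day = 8
    work_days_per_year = 20 * 12 - 8 - 14  # minus holidays and paid time off
    years = 3

    rental_cost = rate_per_hour * hours_per_day * work_days_per_year * years
    purchase_price = 2200                  # RTX 4090 bought outright

    print(f"rental over {years} years: ${rental_cost:,.0f}")  # ~$3,662
    print(f"purchase price:           ${purchase_price:,.0f}")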

One nice thing about renting is that it gives you flexibility in terms of what you want to try.

If you're really looking for the best deals, look at third-party hosts serving open models with API-based pricing, or honestly a Claude subscription can easily be worth it if you use LLMs a fair bit.

replies(1): >>44725791 #
7. whimsicalism No.44725791
1. I agree - there are absolutely scenarios in which it can make sense to buy a GPU and run it yourself. If you are managing a software firm with multiple employees, you very well might break even in less than a few years. But I would wager this is not the case for 90%+ of people self-hosting these models, unless they have some other good reason (like gaming) to buy a GPU.

2. I basically agree with your caveats - excluding electricity is a pretty big exclusion, and I don't think you've had 3 years of really high-value self-hostable models; I'd really only say the last year, and I'm somewhat skeptical of how good the ones that can be hosted in 24GB of VRAM are. 4x4090 is a different story.