I’m curious how well local LLMs perform on ‘outdated’ hardware like the author’s 2060. I have a desktop with a 2070 Super that could be fun to turn into an “AI server” if I had the time…
Edit: I've loaded Llama 3.1 8B Instruct (GGUF) and got 12.61 tok/sec; Llama 3.2 3B runs at about 80 tok/sec.
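For anyone wanting to reproduce numbers like these, here's a minimal sketch of a tok/sec benchmark using llama-cpp-python (one common way to run GGUF models; the original comment doesn't say which runtime was used). The model path is hypothetical, and the install is assumed to be built with CUDA support so `n_gpu_layers` actually offloads to the GPU:

```python
# Minimal tokens-per-second benchmark sketch (assumes
# `pip install llama-cpp-python` built with CUDA; the
# model path below is hypothetical -- point it at your GGUF file).
import time
from llama_cpp import Llama

llm = Llama(
    model_path="llama-3.1-8b-instruct-q4_k_m.gguf",  # hypothetical path
    n_gpu_layers=-1,  # offload all layers to the GPU
    verbose=False,
)

prompt = "Explain in one paragraph why the sky is blue."
start = time.perf_counter()
out = llm(prompt, max_tokens=256)
elapsed = time.perf_counter() - start

n_tokens = out["usage"]["completion_tokens"]
print(f"{n_tokens} tokens in {elapsed:.2f}s -> {n_tokens / elapsed:.2f} tok/sec")
```

Note this measures end-to-end time (prompt processing plus generation), so for short prompts it roughly approximates generation speed; tools like llama.cpp's `llama-bench` report prompt and generation throughput separately.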