
221 points by whitefables | 1 comment
varun_ch No.41856480
I’m curious how good local LLM performance is on ‘outdated’ hardware like the author’s 2060. I have a desktop with a 2070 Super that it could be fun to turn into an “AI server” if I had the time…
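From what I gather, the usual shape of that is an inference server on the GPU box plus a thin HTTP client everywhere else. A minimal sketch, assuming Ollama (just one option, not something anyone here said they use) is already running on its default port:

    import requests  # assumes an Ollama server is running on the GPU box

    # The port (11434 is Ollama's default) and the model tag are assumptions;
    # swap in whatever your server actually exposes.
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama3.1:8b", "prompt": "Say hello.", "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    print(resp.json()["response"])

Point the base URL at the desktop’s LAN address instead of localhost and any machine on the network can use it.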
1. taosx No.41856521
The last time I tried a local LLM was about a year ago, on a 2070S and a 3950X. Performance was quite slow for anything beyond Phi 3.5, and the quality of the small models feels worse than what some providers offer for cheap or free, so it doesn't seem worth it with my current hardware.

Edit: I loaded Llama 3.1 8B Instruct (GGUF) and got 12.61 tok/sec, and 80 tok/sec for Llama 3.2 3B.
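For anyone who wants to reproduce a rough tok/sec figure like that, here's a minimal sketch with llama-cpp-python (one common GGUF runtime; the exact tool used above isn't stated, and the model path and settings below are placeholders):

    import time
    from llama_cpp import Llama  # pip install llama-cpp-python

    # Placeholder path; point it at whatever GGUF quant you downloaded.
    llm = Llama(
        model_path="./llama-3.1-8b-instruct-q4_k_m.gguf",
        n_gpu_layers=-1,   # offload as many layers as fit on the GPU
        n_ctx=4096,
        verbose=False,
    )

    prompt = "Explain the difference between a process and a thread."
    start = time.perf_counter()
    out = llm(prompt, max_tokens=256)
    elapsed = time.perf_counter() - start

    generated = out["usage"]["completion_tokens"]
    print(f"{generated} tokens in {elapsed:.2f}s -> {generated / elapsed:.1f} tok/s")

This lumps prompt processing in with generation, so it slightly understates pure generation speed, but it's fine for ballpark comparisons between models.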