261 points david927 | 3 comments

What are you working on? Any new ideas that you're thinking about?
AJRF No.43156818
I recently made a little tool that helps people interested in running local LLMs figure out whether their hardware can run an LLM in GPU memory.

https://canirunthisllm.com/
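
For anyone wondering what a check like this involves, here's a rough back-of-the-envelope sketch in Python - my own approximation, not the site's actual formula: weights take roughly parameters x bytes-per-weight, plus some headroom for the KV cache and runtime buffers. The function names and the 1.2x overhead constant are illustrative assumptions.

```python
def estimate_vram_gb(params_billion: float, bits_per_weight: int = 16,
                     overhead: float = 1.2) -> float:
    """Rough GPU-memory estimate for holding a model's weights.

    params_billion  -- model size in billions of parameters (7 for a 7B model)
    bits_per_weight -- 16 for fp16/bf16, 8 or 4 for common quantizations
    overhead        -- illustrative headroom factor for KV cache and runtime buffers
    """
    weight_gb = params_billion * (bits_per_weight / 8)  # billions of params x bytes/param ~= GB
    return weight_gb * overhead


def fits_in_vram(params_billion: float, vram_gb: float, bits_per_weight: int = 16) -> bool:
    """True if the estimated footprint fits in the given GPU memory."""
    return estimate_vram_gb(params_billion, bits_per_weight) <= vram_gb


if __name__ == "__main__":
    print(fits_in_vram(7, 24, bits_per_weight=16))   # True:  ~16.8 GB estimated on a 24 GB card
    print(fits_in_vram(70, 24, bits_per_weight=4))   # False: ~42 GB estimated
```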

1. niek_pas No.43158600
Feature request: I would like to know if I can run _any_ LLMs on my machine, and if so, which.
2. AJRF No.43160840
I've now had multiple people ask for this - I'll work on adding a new tab for this feature, as it's a little different from what the site was originally intended to do.

Generally speaking, models seem to be bucketed by param count (3b, 7b, 8b, 14b, 34b, 70b), so for a given VRAM bucket you end up being able to run thousands of models - so is it valuable to show thousands of models?

My bet is "No" - what's really valuable is something like the top 50 trending models on HuggingFace that would fit in your VRAM bucket. So I will try to build that.

Would love your thoughts on that though - does that sound like a good idea?
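
For what it's worth, a minimal sketch of the bucketing idea described above: parse the param count out of a model id and map a VRAM budget to the largest standard bucket that fits. The regex, the 4-bit default, and the ~20% headroom are assumptions of mine, not the site's logic.

```python
import re

# Common parameter-count buckets, in billions (from the comment above).
BUCKETS_B = [3, 7, 8, 14, 34, 70]


def param_count_from_name(model_id: str) -> float | None:
    """Pull a parameter count like '7b' or '13B' out of a model id, if present."""
    match = re.search(r"(\d+(?:\.\d+)?)\s*[bB]\b", model_id)
    return float(match.group(1)) if match else None


def largest_bucket_for_vram(vram_gb: float, bits_per_weight: int = 4) -> int | None:
    """Largest standard bucket whose weights (plus ~20% headroom) fit in vram_gb."""
    fitting = [b for b in BUCKETS_B
               if b * (bits_per_weight / 8) * 1.2 <= vram_gb]
    return max(fitting) if fitting else None


# Example: a 16 GB card at 4-bit quantization lands in the 14b bucket.
print(param_count_from_name("mistralai/Mistral-7B-Instruct-v0.3"))  # 7.0
print(largest_bucket_for_vram(16))                                  # 14
```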

3. niek_pas No.43289321
I see your point. I think the top-50-trending-models approach you mention is as good as anything I could come up with. Maybe the flow should be: select a GPU / device -> list all the runnable models, sorted by popularity descending. How you want to operationalize popularity is another question...
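
One way that flow could be operationalized, sketched with huggingface_hub - this is an assumption about how it might work, not how canirunthisllm.com actually does it: sort Hub models by downloads as a stand-in for popularity, parse a param count out of each model id, and keep the ones whose estimated footprint fits the selected card. The size heuristic and thresholds are illustrative.

```python
import re

from huggingface_hub import HfApi


def _params_b(model_id: str) -> float | None:
    """Parse a parameter count like '7b' or '13B' out of a model id, if present."""
    m = re.search(r"(\d+(?:\.\d+)?)\s*[bB]\b", model_id)
    return float(m.group(1)) if m else None


def runnable_models(vram_gb: float, top_n: int = 50, bits_per_weight: int = 4):
    """Popular Hub models whose name-derived size is estimated to fit in vram_gb GB.

    'Popularity' is operationalized here as all-time downloads, one of the sort
    keys the Hub API exposes; a trending-based sort would be another option.
    """
    api = HfApi()
    results = []
    # Over-fetch, since many popular models have no parseable size in their name.
    for model in api.list_models(sort="downloads", direction=-1, limit=500):
        params_b = _params_b(model.id)
        if params_b is None:
            continue
        est_gb = params_b * (bits_per_weight / 8) * 1.2  # rough weights + headroom
        if est_gb <= vram_gb:
            results.append((model.id, params_b, round(est_gb, 1)))
        if len(results) >= top_n:
            break
    return results


if __name__ == "__main__":
    # Example: popular models estimated to fit on a 12 GB card at 4-bit.
    for model_id, params_b, est_gb in runnable_models(12)[:5]:
        print(f"{model_id}: ~{params_b}B params, ~{est_gb} GB estimated")
```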