I recently made a little tool that lets people interested in running local LLMs figure out whether their hardware can fit a given model in GPU memory.
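For context, the basic idea is a back-of-the-envelope VRAM check: parameter count times bytes per weight, plus some allowance for KV cache and runtime overhead. The sketch below is my rough illustration of that kind of estimate; the quantization sizes, KV-cache allowance, and overhead numbers are assumptions, not the tool's actual implementation.

```python
# Rough sketch of a "does this model fit in VRAM?" check.
# The quantization sizes, KV-cache allowance, and overhead below are
# illustrative assumptions, not the actual logic of the tool.

BYTES_PER_PARAM = {"fp16": 2.0, "q8_0": 1.0, "q4_k_m": 0.6}  # approx bytes per weight

def estimate_vram_gb(params_b: float, quant: str = "q4_k_m",
                     ctx_len: int = 4096, overhead_gb: float = 1.0) -> float:
    """Very rough VRAM estimate (GB) for a dense transformer with params_b billion weights."""
    weights_gb = params_b * BYTES_PER_PARAM[quant]     # model weights
    kv_cache_gb = 0.125 * (ctx_len / 1024)             # crude allowance: ~128 MB per 1k tokens
    return weights_gb + kv_cache_gb + overhead_gb      # plus runtime/framework overhead

def fits_in_gpu(params_b: float, vram_gb: float, quant: str = "q4_k_m") -> bool:
    return estimate_vram_gb(params_b, quant) <= vram_gb

# e.g. an 8B model at Q4_K_M on a 12 GB card:
print(round(estimate_vram_gb(8), 1))   # ~6.3 GB
print(fits_in_gpu(8, vram_gb=12))      # True
```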
replies(10):
- Use natural language to describe the offloading requirements.
- Just showing the year the LLM was launched (from the HF URL) would help tell whether it's an outdated or cutting-edge model.
- Are VLMs/embedding models missing?
- Use natural language to describe the offloading requirements.

Do you mean remove the JSON output and just summarise the offloading requirements?
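For illustration, a minimal sketch of what such a plain-English summary could look like, built from a hypothetical offload estimate (the field names and values are made up, not the tool's real output):

```python
# Sketch of summarising an offload estimate in plain English.
# The dict below is a made-up example shape, not the tool's real JSON.
estimate = {
    "model": "llama-3-8b-q4_k_m",
    "total_layers": 32,
    "layers_on_gpu": 24,
    "vram_needed_gb": 9.8,
    "vram_available_gb": 8.0,
}

def summarise(e: dict) -> str:
    if e["layers_on_gpu"] >= e["total_layers"]:
        return (f"{e['model']} fits fully in VRAM "
                f"({e['vram_needed_gb']:.1f} of {e['vram_available_gb']:.1f} GB).")
    on_cpu = e["total_layers"] - e["layers_on_gpu"]
    return (f"{e['model']} needs partial offloading: {e['layers_on_gpu']} of "
            f"{e['total_layers']} layers fit on the GPU; the remaining {on_cpu} "
            f"would run on the CPU.")

print(summarise(estimate))
# -> "llama-3-8b-q4_k_m needs partial offloading: 24 of 32 layers fit on the GPU;
#     the remaining 8 would run on the CPU."
```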
- Just showing the year the LLM was launched (from the HF URL) would help tell whether it's an outdated or cutting-edge model.

Great idea, I'll try to add this tonight.

- Are VLMs/embedding models missing?
Yeah, I only have text-generation models at the moment, as that's by far where the most interest is. I'll look at adding other model types at some point, but it won't be until the weekend.