
684 points by prettyblocks | 1 comment

I mean anything in the 0.5B-3B range that's available on Ollama (for example). Have you built any cool tooling that uses these models as part of your workflow?
iamnotagenius:
No, but I use Llama 3.2 1B and Qwen2.5 1.5B as a bash one-liner generator, always running in the console.
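The commenter doesn't share their script, but a minimal sketch of such a generator, assuming Ollama is serving llama3.2:1b on its default port (11434) and jq is installed, might look like:

    # oneliner: ask a local 1B model for a bash one-liner (sketch only;
    # model name, endpoint, and prompt wording are assumptions)
    oneliner() {
      curl -s http://localhost:11434/api/generate \
        -d "{\"model\": \"llama3.2:1b\", \"stream\": false,
             \"prompt\": \"Reply with only a bash one-liner that does: $*\"}" \
        | jq -r '.response'
    }

    # usage (avoid quotes in the request, since it is spliced into JSON):
    oneliner find files larger than 100MB modified today

Keeping stream set to false makes the whole reply arrive as a single JSON object, which is easier to extract with jq than a token stream.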
XMasterrrr:
What's your workflow like? I use AI Chat. I load Qwen2.5-1.5B-Instruct with the llama.cpp server, fully offloaded to the CPU, and then I configure AI Chat to connect to the llama.cpp endpoint.
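A hedged sketch of that setup (the GGUF filename, quantization, and port are assumptions, not the commenter's actual values):

    # Serve the model CPU-only with llama.cpp's built-in server;
    # -ngl 0 keeps all layers on the CPU instead of a GPU
    llama-server -m Qwen2.5-1.5B-Instruct-Q4_K_M.gguf --port 8080 -ngl 0

    # Any OpenAI-compatible client (AI Chat included) can then be pointed
    # at http://localhost:8080/v1 -- a quick smoke test with curl:
    curl -s http://localhost:8080/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{"messages": [{"role": "user", "content": "say hello"}]}'

Since llama-server exposes an OpenAI-compatible API, any chat frontend that accepts a custom base URL can swap in the local model without code changes.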