684 points | prettyblocks | 1 comment | 21 Jan 25 19:39 UTC
I mean anything in the 0.5B-3B range that's available on Ollama (for example). Have you built any cool tooling that uses these models as part of your workflow?
Microsoft published a paper on their FLAME model (60M parameters) for Excel formula repair/completion, which outperformed much larger models (>100B parameters).
But I feel we're coming full circle. These small models aren't generalists, so they aren't really LLMs, at least in terms of objective. There's been a recent rise of "specialized" models that provide a lot of value, but that's not what we were sold on with LLMs.
Specialized models still work much better for most tasks. Really, we need an LLM to understand the input and then hand it off to a specialized model that actually produces good results.
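A minimal sketch of that hand-off pattern. The classifier here is a keyword stub standing in for a small-model intent call (in practice you'd hit something like Ollama's local HTTP API with a 0.5B-3B model); the handler names and routing labels are all hypothetical, just to show the shape of the dispatch:

```python
# Sketch of the "LLM as router" idea: a cheap classifier picks an intent,
# then the query is handed off to a specialized backend for that intent.

def classify(query: str) -> str:
    """Placeholder for a small-LLM intent classifier.

    A real version would prompt a local model (e.g. via Ollama) and
    parse its answer; this keyword stub keeps the example offline.
    """
    q = query.lower()
    if "=sum(" in q or "vlookup" in q or "formula" in q:
        return "excel_repair"
    if "traceback" in q or "def " in q:
        return "code_fix"
    return "general"

# Hypothetical specialized backends -- each would wrap its own model.
HANDLERS = {
    "excel_repair": lambda q: f"[formula model] handling: {q}",
    "code_fix":     lambda q: f"[code model] handling: {q}",
    "general":      lambda q: f"[generalist model] handling: {q}",
}

def route(query: str) -> str:
    """Classify the query, then dispatch to the matching specialist."""
    return HANDLERS[classify(query)](query)

print(route("fix this formula: =SUM(A1:A10"))
```

The point is that the generalist model only has to understand *what* is being asked; the quality of the answer comes from whichever specialized model sits behind the route.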