
684 points prettyblocks | 1 comment

I mean anything in the 0.5B-3B range that's available on Ollama (for example). Have you built any cool tooling that uses these models as part of your workflow?
azhenley ◴[] No.42785041[source]
Microsoft published a paper on their FLAME model (60M parameters) for Excel formula repair/completion which outperformed much larger models (>100B parameters).

https://arxiv.org/abs/2301.13779

replies(4): >>42785270 #>>42785415 #>>42785673 #>>42788633 #
3abiton ◴[] No.42785673[source]
But I feel we're coming full circle. These small models are not generalists, so they're not really LLMs, at least in terms of objective. Recently there has been a rise of "specialized" models that provide a lot of value, but that's not what we were sold on with LLMs.
replies(3): >>42785764 #>>42786287 #>>42786397 #
1. Suppafly ◴[] No.42786287[source]
Specialized models still work much better for most stuff. What we really need is an LLM that understands the input and then hands it off to a specialized model that actually provides good results.
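
To make the hand-off idea concrete, here's a rough sketch of what such a router might look like. Everything in it is hypothetical: `classify_intent` stands in for the LLM call that "understands" the input (it's just keyword matching here so the sketch runs offline), and `repair_formula`, `summarize`, and `general_fallback` are placeholders for whatever specialized models you'd actually plug in.

```python
from typing import Callable, Dict


# Hypothetical stand-ins for specialized models; none of these are real APIs.
def repair_formula(text: str) -> str:
    return f"[formula-model] {text}"


def summarize(text: str) -> str:
    return f"[summary-model] {text}"


def general_fallback(text: str) -> str:
    return f"[general-llm] {text}"


# Map each intent label to the specialist that should handle it.
SPECIALISTS: Dict[str, Callable[[str], str]] = {
    "formula": repair_formula,
    "summary": summarize,
}


def classify_intent(text: str) -> str:
    """Placeholder for the LLM that interprets the input.

    Assumption: a real system would call a model here; this version
    uses keyword matching purely for illustration.
    """
    lowered = text.lower()
    if lowered.startswith("=") or "formula" in lowered:
        return "formula"
    if "summarize" in lowered or "tl;dr" in lowered:
        return "summary"
    return "unknown"


def route(text: str) -> str:
    """Classify the input, then hand it to the matching specialist."""
    intent = classify_intent(text)
    handler = SPECIALISTS.get(intent, general_fallback)
    return handler(text)
```

Calling `route("=SUM(A1:A3")` would go to the formula specialist, while anything the classifier doesn't recognize falls through to the general model. The dictionary lookup keeps adding a new specialist down to one line, though a real router would also need to deal with ambiguous or multi-intent inputs.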