684 points | prettyblocks | 1 comment

I mean anything in the 0.5B-3B range that's available on Ollama (for example). Have you built any cool tooling that uses these models as part of your workflow?
linsomniac No.42788414
I have this idea that a tiny LM would be good at canonicalizing user-entered real estate addresses. We currently buy a dataset and software from Experian for this, but it feels like something a small LM might be very good at. There are lots of weirdnesses in address entry that regexes have a hard time with. We already know the bulk of the addresses a user might enter, unless it's a totally new property, so we should be able to train it on that set.
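Something like the sketch below could be a starting point, assuming a local Ollama server; the model name (llama3.2:1b) and the prompt are illustrative assumptions, not a tested recipe:

    # Toy sketch: ask a small local model, via Ollama's REST API, to
    # canonicalize a free-form address. Model name and prompt are
    # assumptions; a real setup would fine-tune on the known address set.
    import json
    import urllib.request

    def canonicalize(raw_address: str, model: str = "llama3.2:1b") -> str:
        prompt = ("Rewrite this street address in USPS canonical form. "
                  "Reply with the address only.\n\n" + raw_address)
        body = json.dumps({"model": model, "prompt": prompt, "stream": False})
        req = urllib.request.Request(
            "http://localhost:11434/api/generate",
            data=body.encode(),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.loads(resp.read())["response"].strip()

    print(canonicalize("1600 pennsylvania ave nw, washington dc"))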
replies(1): >>42798829 #
1. thesz No.42798829
From my experience (2018): run the LM output through beam search over different canonicalization choices for parts of the text. Even 3-gram models (yeah, 2018) fared better this way.
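For concreteness, a toy sketch of that idea: propose alternative canonicalizations for ambiguous tokens, then beam-search for the sequence the 3-gram model scores highest. The expansion table and counts here are made up; in practice the counts would be trained on the known address set:

    # Toy sketch of beam search over canonicalization choices, scored by
    # an add-one-smoothed 3-gram model. ALTERNATIVES and the counts are
    # made up for illustration.
    import math

    ALTERNATIVES = {  # hypothetical expansion table for ambiguous tokens
        "st": ["St", "Street", "Saint"],
        "dr": ["Dr", "Drive", "Doctor"],
    }

    def trigram_logprob(w1, w2, w3, counts, vocab_size=10_000):
        # add-one smoothing over trigram/bigram counts
        tri = counts.get((w1, w2, w3), 0)
        bi = counts.get((w1, w2), 0)
        return math.log((tri + 1) / (bi + vocab_size))

    def beam_canonicalize(tokens, counts, beam_width=4):
        beams = [(["<s>", "<s>"], 0.0)]  # (history, cumulative log-prob)
        for tok in tokens:
            options = ALTERNATIVES.get(tok.lower(), [tok])
            candidates = [
                (hist + [opt],
                 score + trigram_logprob(hist[-2], hist[-1], opt, counts))
                for hist, score in beams
                for opt in options
            ]
            candidates.sort(key=lambda c: c[1], reverse=True)
            beams = candidates[:beam_width]  # keep the best few hypotheses
        return " ".join(beams[0][0][2:])  # drop the <s> padding

    # Fake counts that prefer "Street" after "123 Main":
    counts = {("123", "Main"): 5, ("123", "Main", "Street"): 4}
    print(beam_canonicalize(["123", "Main", "st"], counts))  # 123 Main Street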