684 points prettyblocks | 5 comments

I mean anything in the 0.5B-3B range that's available on Ollama (for example). Have you built any cool tooling that uses these models as part of your workflow?
mettamage ◴[] No.42784724[source]
I simply use it to anonymize code before it goes to Claude, and to de-anonymize the answer that comes back

Maybe I should write a plugin for it (open source):

1. Put your work-related questions into the plugin; a local LLM turns each one into an abstracted question that you can preview before sending

2. Then get the answer back with all your original data restored

E.g. df["cookie_company_name"] becomes df["a"] and back. A rough sketch of the round trip is below.
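
A minimal sketch of that round trip using a deterministic placeholder mapping (the identifier list and helper names here are made up for illustration; in the commenter's workflow a local LLM handles the abstraction step):

    import re

    # Hypothetical identifiers that shouldn't leave your machine.
    SENSITIVE_NAMES = ["cookie_company_name", "acme_revenue_q3"]

    def anonymize(code: str) -> tuple[str, dict[str, str]]:
        """Swap sensitive identifiers for short placeholders ("a", "b", ...)."""
        mapping = {}
        for i, name in enumerate(SENSITIVE_NAMES):
            placeholder = chr(ord("a") + i)
            mapping[placeholder] = name
            code = re.sub(rf"\b{re.escape(name)}\b", placeholder, code)
        return code, mapping

    def deanonymize(text: str, mapping: dict[str, str]) -> str:
        """Restore the original identifiers in the model's answer."""
        for placeholder, name in mapping.items():
            text = re.sub(rf"\b{re.escape(placeholder)}\b", name, text)
        return text

    question, mapping = anonymize('df["cookie_company_name"].sum()')
    print(question)                      # df["a"].sum()
    answer = question                    # stand-in for the reply from Claude
    print(deanonymize(answer, mapping))  # df["cookie_company_name"].sum()

Single-letter placeholders can collide with real identifiers in the reply; a real plugin would want less ambiguous tokens.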

replies(4): >>42784789 #>>42785696 #>>42785808 #>>42788777 #
politelemon ◴[] No.42784789[source]
Could you recommend a tiny language model I could try out locally?
replies(1): >>42784953 #
1. mettamage ◴[] No.42784953[source]
The Llama 3.2 3B model has about 3.2b parameters. I have to admit, I use bigger ones like phi-4 (14.7b) and Llama 3.3 (70.6b), but I think Llama 3.2 could handle anonymization and de-anonymization of code
replies(2): >>42785057 #>>42785333 #
2. OxfordOutlander ◴[] No.42785057[source]
+1 to this idea. I do the same, locally via Ollama, also with the 3.2 3b model.
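
A minimal sketch of that call with the official ollama Python client, assuming the Ollama server is running locally and the model has been pulled first (ollama pull llama3.2:3b); the prompt is just an example:

    import ollama  # pip install ollama

    # Ask the local 3B model to do the abstraction step.
    response = ollama.chat(
        model="llama3.2:3b",
        messages=[{
            "role": "user",
            "content": "Rewrite this snippet with generic identifier names "
                       'and list the renaming you applied:\n\ndf["cookie_company_name"].sum()',
        }],
    )
    print(response["message"]["content"])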
3. RicoElectrico ◴[] No.42785333[source]
Llama 3.2 punches way above its weight. For general "language manipulation" tasks it's good enough - and it can be used on a CPU with acceptable speed.
replies(1): >>42785773 #
4. seunosewa ◴[] No.42785773[source]
How many tokens/s?
replies(1): >>42792310 #
5. iamnotagenius ◴[] No.42792310{3}[source]
10-15 t/s on an Intel i5-12400 with DDR5 memory
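
For reference, Ollama reports generation counts and timings in its response metadata, so you can measure this yourself. A sketch, assuming the eval_count / eval_duration fields the Ollama API returns (duration is in nanoseconds):

    import ollama

    resp = ollama.chat(
        model="llama3.2:3b",
        messages=[{"role": "user", "content": "Say hello."}],
    )
    # eval_count = tokens generated, eval_duration = generation time in ns
    print(f'{resp["eval_count"] / resp["eval_duration"] * 1e9:.1f} tok/s')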