684 points by prettyblocks | 20 comments

I mean anything in the 0.5B-3B range that's available on Ollama (for example). Have you built any cool tooling that uses these models as part of your workflow?
1. behohippy No.42785105
I have a mini PC with an N100 CPU connected to a small 7" monitor sitting on my desk, under the regular PC. I have Llama 3B (Q4) generating endless stories in different genres and styles. It's fun to glance over at it and read whatever it's in the middle of making. I gave llama.cpp one CPU core and it generates slowly enough to read at a normal pace, and the CPU fans don't go nuts. Totally not productive or really useful, but I like it.
replies(6): >>42785192 >>42785253 >>42785325 >>42786081 >>42786114 >>42787856
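
A minimal sketch of that setup, assuming llama.cpp's llama-cli binary and an arbitrary quantized 3B model file (the filename and genre list are placeholders, not from the thread); taskset pins the process to one core, -t 1 gives it one thread, and -n -1 keeps generating until the context fills:

    GENRES=(noir "space opera" fantasy western "ghost story")
    while true; do
        G=${GENRES[RANDOM % ${#GENRES[@]}]}   # pick a random genre
        taskset -c 0 ./llama-cli -m llama-3.2-3b-q4.gguf -t 1 -n -1 \
            -p "Write a short $G story."      # one thread, open-ended output
    done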
2. Dansvidania No.42785192
this sounds pretty cool, do you have any video/media of it?
replies(1): >>42792159
3. bithavoc No.42785253
this is so cool, any chance you post a video?
replies(1): >>42792165
4. Uehreka No.42785325
Do you find that it actually generates varied and diverse stories? Or does it just fall into the same 3 grooves?

Last week I tried to get an LLM (one of the recent Llama models running through Groq, it was 70B I believe) to produce randomly generated prompts in a variety of styles and it kept producing cyberpunk sci-fi stuff. When I told it to stop doing cyberpunk sci-fi stuff, it went completely Wild West.

replies(7): >>42785456 >>42786232 >>42788219 >>42789260 >>42792152 >>42794103 >>42796598
5. o11c No.42785456
You should not ever expect an LLM to actually do what you want without handholding, and randomness in particular is one of the places it fails badly. This is probably fundamental.

That said, this is also not helped by the fact that all of the default interfaces lack many essential features, so you have to build the interface yourself. Neither "clear the context on every attempt" nor "reuse the context repeatedly" will give good results, but having one context producing just one-line summaries, then fresh contexts expanding each one will do slightly less badly.

(If you actually want the LLM to do something useful, there are many more things that need to be added beyond this)

replies(1): >>42786158
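
A rough shell sketch of that two-stage approach, here using ollama run, where each invocation starts from a fresh context (the model name is a placeholder):

    # Stage 1: one context emits ten one-line story premises.
    ollama run llama3.2:3b "List 10 one-sentence story premises, one per line, no numbering." > premises.txt

    # Stage 2: each premise gets its own fresh context to expand it.
    while IFS= read -r premise; do
        ollama run llama3.2:3b "Write a short story from this premise: $premise"
    done < premises.txt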
6. keeganpoppen No.42786081
Oh wow, that is actually such a brilliant little use case -- it really cuts to the core of the real "magic" of AI: that it can just keep running continuously. It never gets tired, and never gets tired of thinking.
7. ipython No.42786114
That's neat. I just tried something similar:

    FORTUNE=$(fortune) && echo $FORTUNE && echo "Convert the following output of the Unix `fortune` command into a small screenplay in the style of Shakespeare: \n\n $FORTUNE" | ollama run phi4
replies(1): >>42790266
8. dotancohen No.42786158{3}
Sounds to me like you might want to reduce the Top P - that will prevent the really unlikely next tokens from ever being selected, while still providing nice randomness in the remaining next tokens so you continue to get diverse stories.
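
For instance, via Ollama's HTTP API (model name and values are illustrative only):

    curl -s http://localhost:11434/api/generate -d '{
      "model": "llama3.2:3b",
      "prompt": "Write a short story in a random genre.",
      "options": { "top_p": 0.7, "temperature": 1.0 },
      "stream": false
    }'

Lowering top_p trims the improbable tail of the token distribution; temperature (mentioned further down the thread) independently controls how flat the remaining distribution is.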
9. janalsncm No.42786232
Generate a list of 5000 possible topics you’d like it to talk about. Randomly pick one and inject that into your prompt.
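
Something like this sketch (assuming a pre-generated topics.txt with one topic per line, and an arbitrary model):

    TOPIC=$(shuf -n 1 topics.txt)   # sample one topic per run
    ollama run llama3.2:3b "Write a short story about: $TOPIC"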
10. droideqa No.42787856
That's awesome!
11. coder543 No.42788219
Someone mentioned generating millions of (very short) stories with an LLM a few weeks ago: https://news.ycombinator.com/item?id=42577644

They linked to an interactive explorer that nicely shows the diversity of the dataset, and the HF repo links to the GitHub repo that has the code that generated the stories: https://github.com/lennart-finke/simple_stories_generate

So, it seems there are ways to get varied stories.

replies(1): >>42841237
12. TMWNN No.42789260
> Do you find that it actually generates varied and diverse stories? Or does it just fall into the same 3 grooves?

> Last week I tried to get an LLM (one of the recent Llama models running through Groq, it was 70B I believe) to produce randomly generated prompts in a variety of styles and it kept producing cyberpunk sci-fi stuff.

100% relevant: "Someday" <https://en.wikipedia.org/wiki/Someday_(short_story)> by Isaac Asimov, 1956

13. watermelon0 No.42790266
Doesn't `fortune` inside double quotes execute the command in bash? You should use single quotes instead of backticks.
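
For reference, a corrected version of the snippet above: single-quote the word fortune (so it isn't executed a second time inside the double quotes), quote the variable expansions, and use printf, since plain echo won't interpret the \n escapes:

    FORTUNE=$(fortune) && echo "$FORTUNE" && printf '%s\n\n%s\n' \
        "Convert the following output of the Unix 'fortune' command into a small screenplay in the style of Shakespeare:" \
        "$FORTUNE" | ollama run phi4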
14. behohippy No.42792152
It's a 3B model, so the creativity is pretty limited. What helped for me was prompting for specific stories in specific styles. I have a Python script that randomizes the prompt and the writing style, including asking for specific author styles.
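
A shell approximation of that randomizer (the actual script is Python, and these lists are invented for illustration):

    GENRES=(heist noir western "hard sci-fi" fable)
    STYLES=("terse and punchy" "florid and ornate" "deadpan")
    AUTHORS=("Raymond Chandler" "Ursula K. Le Guin" "P.G. Wodehouse")
    G=${GENRES[RANDOM % ${#GENRES[@]}]}    # random genre
    S=${STYLES[RANDOM % ${#STYLES[@]}]}    # random writing style
    A=${AUTHORS[RANDOM % ${#AUTHORS[@]}]}  # random author voice
    ollama run llama3.2:3b "Write a $G story, $S, in the style of $A."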
15. behohippy No.42792159
I don't have a video but here's a pic of the output: https://imgur.com/ip8GWIh
replies(1): >>42799809
16. behohippy No.42792165
Just this pic: https://imgur.com/ip8GWIh
17. greenavocado No.42794103
Set temperature to 1.0
18. jaggs No.42796598
https://old.reddit.com/r/LocalLLaMA/comments/1i615u1/the_fir...
19. sky2224 No.42799809{3}
The next step is to format it so it looks like an endless Star Wars intro.
20. fi-le No.42841237{3}
I was wondering where the traffic came from, thanks for mentioning it!