For context, GPT-2-small is 0.124B params (w/ 1024-token context).
Maybe someone should write a plugin for it (open source):
1. Put all your work-related questions into the plugin; a local LLM turns each one into an abstracted question for you to preview and send.
2. You then get the answer back with all of your data restored.
E.g. df["cookie_company_name"] becomes df["a"] and back.
https://m.youtube.com/watch?v=M2o4f_2L0No
Spend the 45 minutes watching this talk. It is a delight. If you are unsure, wait until the speaker picks up the guitar.
Last week I tried to get an LLM (one of the recent Llama models running through Groq, 70B I believe) to produce randomly generated prompts in a variety of styles, and it kept producing cyberpunk sci-fi stuff. When I told it to stop doing cyberpunk sci-fi, it switched completely to wild west.
The local models do things ranging from cleaning up OCR, to summarizing meetings, to estimating the user's current goals and activity, to predicting search terms, to predicting queries and actions that, if run, would help the user accomplish their current task.
The capabilities of these tiny models have really surged recently. Even small vision models are becoming useful, especially if fine-tuned.
Edit: No, the retrieval is Formula-Formula; neither the model (nor, I believe, the tokenizer) handles English.
I’ve seen a number of “DIY GPT-2” tutorials that target this sweet spot. You won’t get amazing results unless you’re willing to leave a personal computer running for hours or days and you have solid data to train on locally, but fine-tuning should be within the realm of normal hobbyist patience.
That said, this is also not helped by the fact that all of the default interfaces lack many essential features, so you have to build the interface yourself. Neither "clear the context on every attempt" nor "reuse the context repeatedly" gives good results, but having one context produce just one-line summaries, then fresh contexts expand each one, does slightly less badly.
(If you actually want the LLM to do something useful, there are many more things that need to be added beyond this)
A more difficult problem we foresee is turning it into a real-time (online) firewall (for calls, for example).
[1] https://chat.deepseek.com/a/chat/s/d5aeeda1-fefe-4fc6-8c90-2...
[1] MediaPipe in particular makes it simple to prototype around Gemma2 on Android: https://ai.google.dev/edge/mediapipe/solutions/genai/llm_inf...
[2] Intend to open source it once we get it working for anything other than SMSes
1. Create several different personas
2. Generate a ton of variation using a high temperature
3. Compare the variations head-to-head using the LLM to get a win/loss ratio
The best ones can be quite good.
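A minimal sketch of steps 2 and 3, assuming a local Ollama server on its default port; the model name and prompt strings are placeholders, not the actual pipeline:

    // Sketch only: generate variants at high temperature, then have the model judge pairs.
    async function generate(prompt: string, temperature: number): Promise<string> {
      const res = await fetch("http://localhost:11434/api/generate", {
        method: "POST",
        body: JSON.stringify({ model: "llama3.2:3b", prompt, stream: false, options: { temperature } }),
      });
      return (await res.json()).response as string;
    }

    async function pickBest(persona: string, task: string, n = 8): Promise<string> {
      // Step 2: generate a ton of variation using a high temperature.
      const variants: string[] = [];
      for (let i = 0; i < n; i++) {
        variants.push(await generate(`You are ${persona}. ${task}`, 1.2));
      }
      // Step 3: head-to-head comparisons, tallying wins per variant.
      const wins = variants.map(() => 0);
      for (let a = 0; a < variants.length; a++) {
        for (let b = a + 1; b < variants.length; b++) {
          const verdict = await generate(
            `Which reply is better, A or B? Answer with one letter.\nA: ${variants[a]}\nB: ${variants[b]}`,
            0,
          );
          if (verdict.trim().toUpperCase().startsWith("A")) wins[a]++;
          else wins[b]++;
        }
      }
      return variants[wins.indexOf(Math.max(...wins))];
    }

Pairwise judging at temperature 0 keeps the comparison step deterministic while the generation step stays diverse.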
I think this would have some additional benefits of not confusing the larger model with facts it doesn't need to know about. By erasing information, you can allow its attention heads to focus on the pieces that matter.
Requires further study.
For that short of a run, you'll spend more time waiting for the node to come up, downloading the dataset, and compiling the model, though.
I'm fine with and prefer specialist models in most cases.
Recently deployed in Home Assistant's fully local-capable Alexa replacement. https://www.home-assistant.io/voice_control/about_wake_word/
Check out the demo they have below.
FORTUNE=$(fortune) && echo "$FORTUNE" && echo "Convert the following output of the Unix 'fortune' command into a small screenplay in the style of Shakespeare: \n\n $FORTUNE" | ollama run phi4
My needs are narrow and limited but I want a bit of flexibility.
It also does RAG on apps there, like the music player, contacts app and to-do app. I can ask it to recommend similar artists to listen to based on my music library for example or ask it to quiz me on my PDF papers.
Bergamot is already used inside Firefox, but I wanted translation outside the browser as well.
[0]: bergamot https://github.com/browsermt/bergamot-translator
Here's the script: https://github.com/nozzlegear/dotfiles/blob/master/fish-func...
And for this change [1] it generated these messages:
1. `fix: change from printf to echo for handling git diff input`
2. `refactor: update codeblock syntax in commit message generator`
3. `style: improve readability by adjusting prompt formatting`
[1] https://github.com/nozzlegear/dotfiles/commit/0db65054524d0d...
The general guidance I've used is that to train a model, you need an amount of RAM (or VRAM) equal to 8x the number of parameters, so a 0.125B model would need 1 GB of RAM to train.
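A quick sanity check of that rule of thumb (the 8x factor is a heuristic covering weights, gradients, and optimizer state, not a hard number):

    // Rule-of-thumb check: ~8 bytes per parameter to cover weights, gradients, and optimizer state.
    const params = 0.125e9;                          // 0.125B parameters, roughly GPT-2-small
    const bytesPerParam = 8;                         // the 8x factor from the comment above
    const gib = (params * bytesPerParam) / 1024 ** 3;
    console.log(`~${gib.toFixed(2)} GiB to train`);  // ≈ 0.93 GiB, i.e. about 1 GB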
Most cookie notices turn out to be pretty similar, HTML/CSS-wise, and then you can grab their `innerText` and filter out false positives with a small LLM. I've found the 3B models have decent performance on this task, given enough prompt engineering. They do fall apart slightly around edge cases like less common languages or combined cookie notice + age restriction banners. 7B has a negligible false-positive rate without much extra cost. Either way these things are really fast and it's amazing to see reports streaming in during a crawl with no human effort required.
Code is at https://github.com/brave/cookiemonster. You can see the prompt at https://github.com/brave/cookiemonster/blob/main/src/text-cl....
The PoC is a bit outdated but it's here: https://github.com/brave/cookiemonster/tree/webext
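For a rough idea of what that classification step looks like, here's an illustrative sketch; it is not the actual cookiemonster code, and the model name, prompt, and local Ollama endpoint are stand-ins:

    // Illustrative only: ask a small local model whether an element's text is a cookie notice.
    async function looksLikeCookieNotice(el: HTMLElement): Promise<boolean> {
      const text = el.innerText.slice(0, 2000); // keep the prompt small for a ~3B model
      const res = await fetch("http://localhost:11434/api/generate", {
        method: "POST",
        body: JSON.stringify({
          model: "llama3.2:3b", // placeholder; any ~3B instruct model
          prompt: `Does the following text come from a cookie consent notice? Answer YES or NO.\n\n${text}`,
          stream: false,
        }),
      });
      const answer: string = (await res.json()).response;
      return answer.trim().toUpperCase().startsWith("YES");
    }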
On a more serious note, this must be the first time we can quantitatively measure the impact of cookie consent legislation across the web, so maybe there's something to be explored there.
U.S. FIRST STRIKE WINNER: NONE
USSR FIRST STRIKE WINNER: NONE
NATO / WARSAW PACT WINNER: NONE
FAR EAST STRATEGY WINNER: NONE
US USSR ESCALATION WINNER: NONE
MIDDLE EAST WAR WINNER: NONE
USSR CHINA ATTACK WINNER: NONE
INDIA PAKISTAN WAR WINNER: NONE
MEDITERRANEAN WAR WINNER: NONE
HONGKONG VARIANT WINNER: NONE
Strange game. The only winning move is not to play
some of the situations get pretty wild, for the office :)
Its effective context window is pretty small, but I have a much more robust statistical model that handles thematic extraction. The LLM is essentially just rewriting ~5-10 sentences into a single paragraph.
I’ve found the less you need the language model to actually do, the less the size/quality of the model actually matters.
I’m tired of the bad playlists I get from algorithms, so I made a specific playlist with a Llama 2 model based on several songs I like. I started with 50, removed any I didn’t like, and added more to fill in the spaces. The small models were pretty good at this. Now I have a decent fixed playlist. It does get “tired” after a few weeks and I need to add more to it. I’ve never been able to do this myself with more than a dozen songs.
Again, no clue if this is true, but it seems plausible.
There are stories of people replying STOP to spam, then never getting a legit SMS because the number was re-used by another service. That's because it's being blocked between the spammer and the phone.
We actually just threw a relationship-curative app online in 17 hours around Thanksgiving, so they "owe" me, as it were.
I'm one of those people that can do anything practical with tech and the like, but I have no imagination for it - so when someone mentions something that I think would be beneficial for my fellow humans I get this immense desire to at least cheer on if not ask to help.
I had a similar project a few years back that used OSX automations and Shortcuts and Python to send a message everyday to a friend. It required you to be signed in to iMessage on your MacBook.
That was a send operation; reading replies is not something I implemented, but I know there is a file somewhere that holds a history of your recent iMessages. So you would have to parse it on file update, and that should give you the read operation so you can have a conversation.
Very doable in a few hours unless something dramatic changed with how the messages apps works within the last few years.
I don't have a pre-trained model to share but you can make one yourself from the git repo, assuming you have an apple silicon mac.
I was thinking of hooking them into RPGs with text-based dialogue, so that a character will say something slightly different every time you speak to them.
They linked to an interactive explorer that nicely shows the diversity of the dataset, and the HF repo links to the GitHub repo that has the code that generated the stories: https://github.com/lennart-finke/simple_stories_generate
So, it seems there are ways to get varied stories.
Is that design 3D printable? Or is that for paid users only?
https://idinsight.github.io/tech-blog/blog/enhancing_materna...
FLAME seems like a fun little model, and 60M is truly tiny compared to other LLMs, but I have no idea how good it is in today's context, and it doesn't seem like they ever released it.
[1] https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/Web...
I haven't benchmarked it yet but I'd be happy to hear opinions on it. It's written in C++ (specifically not python), and is designed to be a self-contained microservice based around llama.cpp.
> Last week I tried to get an LLM (one of the recent Llama models running through Groq, it was 70B I believe) to produce randomly generated prompts in a variety of styles and it kept producing cyberpunk scifi stuff.
100% relevant: "Someday" <https://en.wikipedia.org/wiki/Someday_(short_story)> by Isaac Asimov, 1956
With just three lines of code, you can run small LLMs inside the browser. We feel this unlocks a ton of potential for businesses: they can introduce AI without fear of cost and can personalize the experience using AI.
Would love your thoughts and what we can do more or better!
Here is a demo.
* https://i.imgur.com/pAuTeAf.jpeg
Using this script:
https://github.com/mlc-ai/web-llm
https://huggingface.co/docs/transformers.js/en/index
You do have to worry about WebGPU compatibility in browsers though.
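For example, the transformers.js route really is only a few lines. A sketch, assuming the @xenova/transformers package and a small hosted model (the WebGPU caveat above mostly applies to web-llm; transformers.js can fall back to WASM):

    // Sketch: load and run a tiny model entirely in the browser (the model name is illustrative).
    import { pipeline } from "@xenova/transformers";

    const generator = await pipeline("text-generation", "Xenova/distilgpt2");
    const out = await generator("Tiny models are useful for", { max_new_tokens: 30 });
    console.log(out); // [{ generated_text: "..." }]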
The nice thing is that she can copy/paste the titles and abstracts in to two columns and write e.g. "=PROMPT(A1:B1, "If the paper studies diabetic neuropathy and stroke, return 'Include', otherwise return 'Exclude'")" and then drag down the formula across 7000 rows to bulk process the data on her own because it's just Excel. There is a gif on the readme on the Github repo that shows it.
And then claude replies
fetch(url, {pw:mycleartrxtpw}).then(writething)
And then the local llm converts the placeholder mycleartrxtpw into hunter123 using its access to the real code
For instance, in your first example, why was that change needed? It was a fix, but for what issue?
In the second message: why was that a desirable change?
It's a lightweight tool that summarizes Hacker News articles. For example, here’s what it outputs for this very post, "Ask HN: Is anyone doing anything cool with tiny language models?":
"A user inquires about the use of tiny language models for interesting applications, such as spam filtering and cookie notice detection. A developer shares their experience with using Ollama to respond to SMS spam with unique personas, like a millennial gymbro or a 19th-century British gentleman. Another user highlights the effectiveness of 3B and 7B language models for cookie notice detection, with decent performance achieved through prompt engineering."
I originally used LLaMA 3:Instruct for the backend, which performs much better, but recently started experimenting with the smaller LLaMA 3.2:1B model.
It’s been cool seeing other people’s ideas too. Curious—does anyone have suggestions for small models that are good for summaries?
Feel free to check it out or make changes: https://github.com/k-zehnder/gophersignal
It is probably super overengineered, considering that pretty good libraries already do this in different languages, but it would be funny. I did some tests with ChatGPT, and it worked sometimes. It would probably work with some fine-tuning, but I don't have the experience or the time right now.
[1] https://www.medrxiv.org/content/10.1101/2024.10.01.24314702v...
Looks like I'm out... Would be great if there was a Google Apps Script alternative. My company gave all devs Linux systems and the business team operates on Windows, so I always use browser-based tech like Apps Script for complex sheet manipulation.
The bots don't do a lot of interesting stuff though, I plan to add the following functionalities:
- Instead of just resetting every 100 messages, I'm going to provide them with a rolling window of context.
- Instead of only allowing BASH commands, they will be able to also respond with reasoning messages, hopefully to make them a bit smarter.
- Give them a better docker container with more CLI tools such as curl and a working package manager.
If you're interested in seeing the developments, you can subscribe on the platform!
Cookie notices just gave them another weapon in the end.
My flow is generally: Look at the title and the amount of upvotes to decide if I'm interested in the article. Then view the comments to see if there's interesting discussion going on or if there's already someone adding essential context. Only then I'll decide if I want to read the article or not.
Of course no big deal if you're not interested in my patronage, just wanted to let you know your page already looks good enough for me to consider switching my most visited page to it if it weren't for this small detail. And maybe the upvote count.
Let's say I want some outcome: it will autonomously handle the process, prompting me and the other side for additional requirements if necessary, and then, based on that, handle the process and reach the outcome?
I'll definitely add a link to the comments and the upvote count—gotta keep my tiny but mighty userbase (my mom, me, and hopefully you soon) happy, right? lol
And if there's even a chance you'd use GopherSignal as your daily driver, that's a no-brainer for me. Really appreciate you taking the time to share your ideas and help me improve.
While interactive AI is all about posing, meditating on the prompt, then trying to fix the outcome, IntelliJ tab completion... shows what it will complete as you type and you Tab when you are 100% OK with the completion, which surprisingly happens 90..99% of the time for me, depending on the project.
[1]: https://git.sr.ht/~jamesponddotco/llm-prompts/tree/trunk/dat...
I've had good speed / reliability with TheBloke/rocket-3B-GGUF on Huggingface, the Q2_K model. I'm sure there are better models out there now, though.
It takes ~8-10 seconds to process an image on my M2 Macbook, so not quite quick enough to run on phones yet, but the accuracy of the output has been quite good.
So the LLM does both the anonymization into placeholders and then later the replacing of the placeholders too. Calling the latter step de-anonymization is confusing though, it's "de-anonymizing" yourself to yourself. And the overall purpose of the plugin is to anonymize OP to Claude, so to me at least that makes the whole thing clearer.
(Edit: It's relevant that STOP didn't come from the TCPA itself, but definitely has teeth due to it)
https://www.infobip.com/blog/a-guide-to-global-sms-complianc...
1. Getting the speed gains is hard unless you are able to pay for dedicated GPUs. Some services offer LoRA as serverless but you don't get the same performance for various technical reasons.
2. Lack of talent to actually do the finetuning. Regular engineers can do a lot of LLM implementation, but when it comes to actually performing training it is a scarcer skillset. Most small to medium orgs don't have people who can do it well.
3. Distribution. Sharing finetunes is hard. HuggingFace exists, but discoverability is an issue. It is flooded with random models with no documentation, and it isn't easy to find a good one for your task. Plus, with a good finetune you also need the prompt and possibly parsing code to make it work the way it is intended, and the bundling hasn't been worked out well.
It's a hacker forum, let people hack!
If anything have a dig at OP for posting the thread too soon before the parent commenter has had the chance to gather any data, haha
I work as a senior engineer in Europe and make barely $4k net per month... and that's considered a "good" salary!
Can you identify all the books here, sorted by a weight which is determined based on a combo of the number of votes the comment has, the number of sub-comments, or the number of repeat mentions.
Ideally retain hyperlinks if possible.
At those sizes, it's great for generating non-repetitive flavortext for NPCs. No more "I took an arrow to the knee".
Models at around the 2B size aren't really capable enough to act as a competent adversary - but they are great for something like bargaining with a shopkeeper, or some other role where natural language can let players do a bit more immersive roleplay.
Telling me `Changed timeout from 30s to 60s` means nothing, while `Increase timeout for slow <api name> requests` gives me an actual idea of why that was done.
Even better if you add meaningful messages to the commit body.
Take a look at commits from large repositories like the Linux kernel and you can see what good commit messages look like.
Flow would be:
1. Llama prompt: write a console log statement with my username and password: mettamage, superdupersecret
2. Claude prompt (edited by Llama): write a console log statement with my username and password: asdfhjk, sdjkfa
3. Claude replies: console.log('asdfhjk', 'sdjkfa')
4. Llama gets that input and replies to me: console.log('mettamage', 'superdupersecret')
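A minimal sketch of the placeholder bookkeeping behind that flow (hypothetical helpers; in the plugin described above, the local Llama would pick the substitutions itself):

    // Step 1→2: swap real secrets for placeholders before the prompt leaves the machine.
    const substitutions = new Map<string, string>([
      ["mettamage", "asdfhjk"],
      ["superdupersecret", "sdjkfa"],
    ]);

    function anonymize(prompt: string): string {
      let out = prompt;
      for (const [real, placeholder] of substitutions) out = out.replaceAll(real, placeholder);
      return out;
    }

    // Step 4: restore the real values in Claude's reply before showing it to the user.
    function deanonymize(reply: string): string {
      let out = reply;
      for (const [real, placeholder] of substitutions) out = out.replaceAll(placeholder, real);
      return out;
    }

    // deanonymize("console.log('asdfhjk', 'sdjkfa')") === "console.log('mettamage', 'superdupersecret')"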
I think one major improvement for folks like me would be human->regex LLM translator, ideally also respecting different flavors/syntax for various languages and tools.
This has been a bane of mine - I run into a requirement to develop some complex regexes maybe every 2-3 years, so I dig deep into the specs, work on it, and eventually deliver if it's even possible, then within a few months almost completely forget all the details and start at almost the same place next time. It gets better over time, but clearly I will retire before this skill settles in well.
llm -c, which continues the previous conversation, is specifically useful for that sort of manipulation.
It's also available from the command line, which I find convenient because I basically always have one open.
And you have (almost) free and universal healthcare in Europe, good food available everywhere, drinking water that doesn't poison you, walkable cities, good public transport, somewhat decent police and a functioning legal system. The list goes on. Does this not impact your quality of life? Do you not care about these things?
How can you have a higher quality of life as a society with higher murder rates, much lower life expectancy, so many people in jail, in debt, etc.?
I think the content you can get from the SLMs for fake data is a lot more engaging than say the ruby ffaker library.
[Edit] Found it. I had to enable chrome://flags/#prompt-api-for-gemini-nano
I'll tweak the prompt when I have some time today and see if I can get some more consistency out of it.
The transparency requirements and consent for collecting all kinds of PII (this is the regulation) are actually a great innovation.
Then you could implement Salvation as a Service, where you privately confess your sins to a local LLM, and it continuously prays for your eternal soul, recommends penances, and even recites Hail Marys for you.
When you squash a branch you'll have 200+ lines of new code on a new feature. The diff is not a quick way to get a summary of what's happening. You should put the "what" in your commit messages.
If a user logs in or does something requiring cookies that would otherwise prevent normal functionality, prompt them with a Permissions box if they haven't already accepted it in the usual (optional) UI.
But yes, I think just about everybody would like the UX you described. But the entities that track you don't want to make it that easy. You probably know of the do-not-track header too.
I’m now just wondering if there is any way to build tests on the input+output of the LLM :D
The industry knows ~nobody wants to be tracked, so they don't want tracking preferences to be easy to express. They want cookie notices to be annoying, to make people associate privacy with bureaucratic nonsense and stop demanding privacy.
There was P3P spec in 2002: https://www.w3.org/TR/P3P/
It even got decent implementation in Internet Explorer, but Google has been deliberately sending a junk P3P header to bypass it.
It has been tried again with a very simple DNT spec. Support for it (that barely existed anyway) collapsed after Microsoft decided to make Do-Not-Track on by default in Edge.
Were you thinking something like a “DramaLlama,” deciding if it’s a slow day or a meltdown-worthy soap opera in the comments? Or maybe something more valuable, like an “Insight Index” that uses the LLM to analyze comments for links, explanations, or phrases that add context or insight—basically gauging how constructive or meaningful the discussion is?
I also saw an idea in another post on this thread about an LLM that constantly listens to conversations and declares a winner. That could be fun to adapt for spicier posts—like the LLM picking a “winner” in the comments. Make the argument GopherSignal official lol. If it helps bring in another user, I’m all in!
Appreciate the feedback.
This feels like an in-between that both wastes their time and adds you to extra lists.
Send the results somewhere! Not sure if "law enforcement" is applicable (as in, would be able/willing to act on the info) but if so, that's a great use of this data :)
A summary of changes like this might be just enough to spark your memory on what you were actually doing with the changes. I'll have to give it a shot!
Running a prompt against every single cell of a 10k row document was never gonna happen with a large model. Even using a transformer model architecture in the first place can be seen as ludicrous overkill but feasible on modern machines.
So I'd say the paper is very relevant, and the top commenter in this very thread demonstrated their own homegrown version with a very nice use-case (paper abstract and title sorting for making a summary paper)
That isn’t the main point of FLAME, as I understood it. The main point was to help you when you’re editing a particular cell. codex-davinci was used for real time Copilot tab completions for a long time, I believe, and editing within a single formula in a spreadsheet is far less demanding than editing code in a large document.
After I posted my original comment, I realized I should have pointed out that I’m fairly sure we have 8B models that handily outperform codex-davinci these days… further driving home how irrelevant the claim of “>100B” was here (not talking about the paper). Plus, an off the shelf model like Qwen2.5-0.5B (a 494M model) could probably be fine tuned to compete with (or dominate) FLAME if you had access to the FLAME training data — there is probably no need to train a model from scratch, and a 0.5B model can easily run on any computer that can run the current version of Excel.
You may disagree, but my point was that claiming a 60M model outperforms a 100B model just means something entirely different today. Putting that in the original comment higher in the thread creates confusion, not clarity, since the models in question are very bad compared to what exists now. No one had clarified that the paper was over a year old until I commented… and FLAME was being tested against models that seemed to be over a year old even when the paper was published. I don’t understand why the researchers were testing against such old models even back then.
Like either as a table in the background or as a regular script?
On most computers you can't compile or add add-ons without administrative rights, and LLM chat sites are blocked to prevent usage of company data.
It should run on native Excel or GSheets.
I mean, pure without compilation, just like they do the matrix calculations here straight in Excel without admin rights:
Lesson 1: Demystifying how LLMs work, from architecture to Excel
As far as I know, in GSheets the scripts run on Google's servers and are not limited by the local computer's power, so larger models could be deployed there.
Someone can hack this into Excel/GSheet?
Google could've implemented a consent API in Chrome, but they didn't. Guess why.
Exactly what you would expect from a language model making random stock picks.
By the time someone buys their own pack they are probably hooked.
I suspect the obscene taxes pricing out young folks are one of the most effective strategies
> Interesting idea. But those say what’s in the commit. The commit diff already tells you that. The best commit messages IMO tell you why you did it and what value was delivered.
Which doesn't include what was done. Your example includes both which is fine. But not including what the commit does in the message is an antipattern imho. Everything else that is added is a bonus.
OTOH to build the infra for LLMs there's much more stuff involved and it's really hard to find engineers who have the capacity to be both the researchers and developers at the same time. By "researchers" I mean that they have to have a capacity to be able to read through the numerous academic and industry papers, comprehend the tiniest details, and materialize it into the product through the code. I think that's much harder and scarcer skill to find.
That said, I am not undermining the fine-tuning skill, it's a humongous effort, but I think it's not necessarily the skillset problem.
For me the commit message should tell me the what/why and the diff is the how. It's great to understand if, for example, a change was intentional or a bug.
Many times when searching for the source of a bug I could not tell if the line changed was intentional or a mistake because the commit message was simply repeating what was on the diff. If you say your intention was to add something and the diff shows a subtraction, you can easily tell it was a mistake. Contrived example but I think it demonstrates my point.
This only really works if commits are meaningful though. Most people are careless and half their commits are 'fix this', 'fix again', 'wip', etc. At that point the only places that can contain useful information about the intentions are the pull requests/issues around them.
Take a single commit from the Linux kernel: https://github.com/torvalds/linux/commit/08bd5b7c9a2401faabd... It doesn't tell me "add function X, Y and boolean flag Z". It tells us what/why it was done, and the diff shows us how.
Using it, I find myself often writing only the first half of most words, because the second part can usually already be guessed by the AI. In fact, it has a dedicated shortcut for accepting only the first word of the suggestion — that way, it can save you some typing even when later words deviate from your original intent.
Completions are generated in real-time locally on your Mac using a variety of models (primarily Qwen 2.5 1.5B).
It is currently in open beta: https://cotypist.app
I really hope there would be some amazing models this year for translation.
yeah, could you share those libraries please?
Has anyone around here actually succeeded in solving this in a way that works? Would appreciate any hints.
I've got no idea how to finetune or train an LLM. I know how to run inference, lots of it. I also know how to scan and OCR texts and feed a data ingestion pipeline. I know how to finetune a Stable Diffusion model, but I doubt that software works with language models...
1. Install Chrome Dev: Ensure you have version 127. [Download Chrome Dev](https://google.com/chrome/dev/).
2. Check that you’re on 127.0.6512.0 or above
3. Enable two flags:
   chrome://flags/#optimization-guide-on-device-model - BypassPerfRequirement
   chrome://flags/#prompt-api-for-gemini-nano - Enabled
4. Relaunch Chrome
5. Navigate to chrome://components
6. Check that Optimization Guide On Device Model is downloading, or force a download if not. It might take a few minutes for this component to even appear.
7. Open dev tools and type (await ai.languageModel.capabilities()).available; it should return "readily" when all is good
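Once that returns "readily", you can create a session from the same console and prompt Gemini Nano directly. The Prompt API surface has been shifting between Chrome versions, so treat the exact names below as an assumption matching the shape used in step 7:

    // Assumes the experimental window.ai Prompt API matching the capabilities() call in step 7.
    declare const ai: any; // not in the standard DOM typings

    const session = await ai.languageModel.create();
    const reply = await session.prompt("Write a haiku about tiny language models.");
    console.log(reply);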
If anyone is still paying attention, email me at hi@seikai.tv and I'll see if I can send you one.
---
Prompt:
I don't like expletives, spelling mistakes, toxic texts, personal attacks, negative remarks, snide, bad faith arguments, needless bait and trolling. Help clean the texts for me. Rewrite or reword if you have to without losing meaning. If you do reword the text, mention it. If not, reproduce it as-is.
Texts:
Mel: "Fuck you you dim twat"
Bel: "Wanna meet tomorrow?"
Cel: "lol, you're clueless; do you even hear yourself?"
Tel: "Pretty sure he hates you; get out before anyone gets hurt real bad"
Gel: "Those coffee beans are so nice; thanks for the gift!"
Lel: "hahaha, you mother fucker; you nailed it, congrats!"
Hel: "OK Goooooogle"
---
Output:
Here's the cleaned-up version of the texts:
Mel: "I’m really upset with you right now."
(Reworded to remove expletives and offensive language.)
Bel: "Wanna meet tomorrow?"
(No changes needed.)
Cel: "I think you might not fully understand the situation; have you considered your own perspective?"
(Reworded to remove snide remarks and make it more constructive.)
Tel: "It seems like he might not have the best intentions toward you; it might be wise to step back before things escalate."
(Reworded to remove negativity and make it more neutral and caring.)
Gel: "Those coffee beans are so nice; thanks for the gift!"
(No changes needed.)
Lel: "Hahaha, you did an amazing job! Congrats!"
(Reworded to remove expletives while keeping the celebratory tone.)
Hel: "OK Google."
(Corrected spelling for clarity.)