
422 points simedw | 6 comments
bubblyworld ◴[] No.44433602[source]
Classic that the first example is for parsing the goddamn recipe from the goddamn recipe site. Instant thumbs up from me haha, looks like a neat little project.
replies(3): >>44435722 #>>44436466 #>>44438277 #
andrepd ◴[] No.44435722[source]
Which it apparently does by completely changing the recipe in random places including ingredients and amounts thereof. It is _indeed_ a very good microcosm of what LLMs are, just not in the way these comments think.
replies(3): >>44435998 #>>44436175 #>>44436268 #
1. simedw ◴[] No.44436175[source]
It was actually a bit worse than that: the LLM never got the full recipe due to some truncation logic I had added. So it regurgitated the recipe from training, and apparently it couldn't do both that and convert units at the same time with the lite model (it worked with flash).

I should have caught that, and there are probably other bugs waiting to be found too. That said, it's still a great recipe.
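For anyone curious what that failure mode looks like, here's a minimal sketch (not the actual Spegel code; the function names and the 4,000-character budget are assumptions) of how a silent truncation step hides the tail of a page from the model, and how flagging the cut makes it visible:

    # Hypothetical sketch, not the project's real code.
    MAX_CHARS = 4_000  # assumed context budget for the lite model

    def truncate_silently(page_text: str) -> str:
        # The failure mode: the tail of the recipe is dropped and nothing
        # downstream knows, so the model fills the gap from its training data.
        return page_text[:MAX_CHARS]

    def truncate_with_flag(page_text: str) -> tuple[str, bool]:
        # Safer variant: report whether anything was cut so the caller can
        # warn the user or fall back to a larger-context model.
        truncated = len(page_text) > MAX_CHARS
        return page_text[:MAX_CHARS], truncated

    if __name__ == "__main__":
        recipe = "Ingredients: 500 g flour, 300 ml water...\n" * 200
        text, was_cut = truncate_with_flag(recipe)
        if was_cut:
            print("Warning: page truncated before reaching the model")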

replies(1): >>44437152 #
2. ◴[] No.44438851[source]
3. 0x696C6961 ◴[] No.44438859[source]
What is the point?
replies(2): >>44439718 #>>44444323 #
4. plonq ◴[] No.44439718{3}[source]
I’m someone else, but for me the point is that a serious bug resulted in _incorrect data_, making it impossible to trust the output.
replies(1): >>44440736 #
5. bubblyworld ◴[] No.44440736{4}[source]
Assuming you are responding in good faith - the author politely acknowledged the bug (despite the snark in the comment they responded to), explained what happened and fixed it. I'm not sure what more I could expect here? Bugs are inevitable, I think it's how they are handled that drives trust for me.
6. andrepd ◴[] No.44444323{3}[source]
The point is LLMs are fundamentally unreliable algorithms for generating plausible text, and as such entirely unsuitable for this task. "But the recipe is probably delicious anyway" is beside the point, when it completely corrupted the meaning of the original. Which is annoying when it's a recipe but potentially very damaging when it's something else.

Techies seem to pretend this doesn't happen, and the general public who doesn't understand will trust the aforementioned techies. So what we see is these tools being used en masse and uncritically for purposes to which they are unsuited. I don't think this is good.