
422 points by simedw | 8 comments
1. mossTechnician ◴[] No.44435170[source]
Changes Spegel made to the linked recipe's ingredients:

- Pounds of lamb become kilograms (more than doubling the quantity of meat)
- A medium onion turns large
- One celery stalk becomes two
- Six cloves of garlic turn into four
- Tomato paste vanishes
- We lose nearly half a cup of wine
- Beef stock gets an extra ¾ cup
- Rosemary is replaced with oregano

replies(4): >>44435185 #>>44435234 #>>44435464 #>>44435844 #
2. achierius ◴[] No.44435185[source]
Did you actually observe this, or is it just meant to be illustrative of what could happen?
replies(1): >>44435223 #
3. mossTechnician ◴[] No.44435223[source]
This is what actually happened in the linked article. The recipe is around the text that says

> Sometimes you don't want to read through someone's life story just to get to a recipe... That said, this is a great recipe

I compared the list of ingredients to the screenshot, did a couple of unit conversions, and these are the discrepancies I saw.
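
In case anyone wants to redo the pound/kilogram check, the arithmetic is just this (the amount below is a placeholder, not the recipe's exact figure):

    # Sanity check: keeping the same number but swapping lb for kg
    # gives you roughly 2.2x as much meat.
    LB_TO_KG = 0.45359237               # 1 pound in kilograms
    lbs = 2                             # placeholder amount, not the recipe's exact figure
    kg_as_written = lbs                 # what the summary printed: same number, unit swapped
    kg_intended = lbs * LB_TO_KG        # what the original recipe actually meant
    print(kg_as_written / kg_intended)  # ~2.2046, i.e. more than double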

4. orliesaurus ◴[] No.44435234[source]
oh damn...
5. jugglinmike ◴[] No.44435464[source]
Great catch. I was getting ready to mention the theoretical risk of asking an LLM to be your arbiter of truth; it didn't even occur to me to check the chosen example for correctness. In a way, this blog post is a useful illustration not just of the hazards of LLMs, but also of our collective tendency to eschew verity for novelty.
replies(2): >>44435877 #>>44443768 #
6. simedw ◴[] No.44435844[source]
Fantastic catch! It led me down a rabbit hole, and I finally found the root cause.

The recipe site was so long that it got truncated before being sent to the LLM. Then, based only on the first 8000 characters, Gemini hallucinated the rest of the recipe; it was definitely in its training set.

I have fixed it and pushed a new version of the project. Thanks again; it really highlights how we can never fully trust models.
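
For anyone curious, the failure mode looked roughly like this (a minimal, simplified sketch, not the actual Spegel code; the function and prompt wording are just for illustration):

    # Illustrative sketch of the old behaviour, not Spegel's real implementation.
    MAX_CHARS = 8000  # hard cap applied before the page was sent to the model

    def summarize(page_text: str, llm_call) -> str:
        # The page was silently sliced to a prefix. If the ingredient list sat
        # past the cap, the model never saw it and "completed" the recipe from
        # memory instead of from the page.
        truncated = page_text[:MAX_CHARS]
        return llm_call(f"Rewrite this page as a concise recipe:\n\n{truncated}")

    # Stand-in "model" that just reports how much text it received:
    fake_llm = lambda prompt: f"(model saw {len(prompt)} characters)"
    print(summarize("x" * 20_000, fake_llm))  # the last ~12,000 characters never reach the model

The actual fix is in the pushed version; the sketch above only shows why a silent prefix slice is exactly the kind of gap the model will happily paper over.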

7. andrepd ◴[] No.44435877[source]
> Great catch. I was getting ready to mention the theoretical risk of asking an LLM to be your arbiter of truth; it didn't even occur to me to check the chosen example for correctness.

It's beyond parody at this point. Shit just doesn't work, but this fundamental flaw of LLMs is just waved away or simply not acknowledged at all!

You have an algorithm that rewrites textA to textB (so nice), where textB potentially has no relation to textA (oh no). Were it anything else, this would mean "you don't have an algorithm to rewrite textA to textB", but for gen AI? Apparently this is not a fatal flaw; it's not even a flaw at all!

I should also note that there is no indication that this fundamental flaw can be corrected.

8. throwawayoldie ◴[] No.44443768[source]
> the theoretical risk of asking an LLM to be your arbiter of truth

"Theoretical"? I think you misspelled "ubiquitous".