422 points simedw | 34 comments
1. bubblyworld ◴[] No.44433602[source]
Classic that the first example is for parsing the goddamn recipe from the goddamn recipe site. Instant thumbs up from me haha, looks like a neat little project.
replies(3): >>44435722 #>>44436466 #>>44438277 #
2. andrepd ◴[] No.44435722[source]
Which it apparently does by completely changing the recipe in random places including ingredients and amounts thereof. It is _indeed_ a very good microcosm of what LLMs are, just not in the way these comments think.
replies(3): >>44435998 #>>44436175 #>>44436268 #
3. throwawayoldie ◴[] No.44435998[source]
The output was then posted to the Internet for everyone to see, without the minimal amount of proofreading that would be necessary to catch that, which gives us a good microcosm of how LLMs are used.

On a more pleasant topic the original recipe sounds delicious, I may give it a try when the weather cools off a little.

4. simedw ◴[] No.44436175[source]
It was actually a bit worse than that: the LLM never got the full recipe due to some truncation logic I had added. So it regurgitated the recipe from training, and apparently it couldn't both do that and convert units at the same time with the lite model (it worked with just flash).

I should have caught that, and there are probably other bugs too waiting to be found. That said, it's still a great recipe.
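
For anyone curious how that class of bug arises, a hypothetical sketch (not the project's actual code): recipe pages front-load a lot of story and ads, so a naive length cap can drop the recipe before the model ever sees it.

    # Hypothetical sketch of the failure mode, not the project's code.
    MAX_CHARS = 4000  # illustrative limit

    def truncate_for_llm(page_text: str) -> str:
        # Everything past the cap is silently dropped. If the recipe
        # starts at character 12,000, the model never sees it and can
        # only "recall" a similar recipe from its training data.
        return page_text[:MAX_CHARS]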

replies(1): >>44437152 #
5. bubblyworld ◴[] No.44436268[source]
What do you mean? The recipes in the screenshot look more or less the same; the formatting has just changed in the Spegel one (which is what was asked for, so no surprises there).

Edit: just saw the author's comment, I think I'm looking at the fixed page

6. IncreasePosts ◴[] No.44436466[source]
There are extensions that do that for you, in a deterministic way and without relying on LLMs. For example, Recipe Filter for Chrome: it just shows a pop-up over the page when it loads if it detects a recipe.
replies(1): >>44437154 #
7. bubblyworld ◴[] No.44437154[source]
Thanks, I actually already use that plugin; I just found the problem amusingly familiar. Recipe sites are the original AI slop =P
8. lpribis ◴[] No.44438277[source]
Another great example of the LLM hype train re-inventing something that already existed [1] (and was actually thought out), but making it worse and non-deterministic in the worst ways possible.

https://schema.org/Recipe
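
For concreteness, here is roughly what deterministic extraction against that schema looks like: a minimal stdlib-only Python sketch (names and details illustrative; real pages vary in how they embed the JSON-LD).

    # Minimal sketch: pull a schema.org/Recipe out of the JSON-LD block
    # that many recipe sites embed. Stdlib only; no LLM involved, so the
    # same page always yields the same result.
    import json
    from html.parser import HTMLParser

    class LdJsonCollector(HTMLParser):
        """Collects the bodies of <script type="application/ld+json"> tags."""
        def __init__(self):
            super().__init__()
            self.in_ldjson = False
            self.blocks = []

        def handle_starttag(self, tag, attrs):
            # attrs is a list of (name, value) pairs; an exact-match
            # check is good enough for a sketch.
            if tag == "script" and ("type", "application/ld+json") in attrs:
                self.in_ldjson = True

        def handle_data(self, data):
            if self.in_ldjson:
                self.blocks.append(data)

        def handle_endtag(self, tag):
            if tag == "script":
                self.in_ldjson = False

    def find_recipe(html: str):
        parser = LdJsonCollector()
        parser.feed(html)
        for block in parser.blocks:
            try:
                data = json.loads(block)
            except json.JSONDecodeError:
                continue
            # Sites may wrap the recipe in a list or an @graph array.
            nodes = data if isinstance(data, list) else data.get("@graph", [data])
            for node in nodes:
                if not isinstance(node, dict):
                    continue
                t = node.get("@type")
                if t == "Recipe" or (isinstance(t, list) and "Recipe" in t):
                    # Ingredients live in node["recipeIngredient"],
                    # steps in node["recipeInstructions"].
                    return node
        return None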

replies(6): >>44438799 #>>44439573 #>>44440529 #>>44440626 #>>44440664 #>>44440708 #
9. ◴[] No.44438799[source]
10. ◴[] No.44438851{4}[source]
11. 0x696C6961 ◴[] No.44438859{4}[source]
What is the point?
replies(2): >>44439718 #>>44444323 #
12. ◴[] No.44439573[source]
13. plonq ◴[] No.44439718{5}[source]
I'm someone else, but for me the point is that a serious bug resulted in _incorrect data_, making it impossible to trust the output.
replies(1): >>44440736 #
14. soap- ◴[] No.44440529[source]
And that would be great, if anyone used it.

LLMs are specifically good at a task like this because they can extract content from any webpage, regardless of whether it supports whatever standard that no one implements.
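
For contrast with the schema-based approach mentioned upthread, a minimal sketch of that kind of schema-free extraction; call_llm is a hypothetical stand-in for a real model API, and the prompt and keys are illustrative:

    import json

    def extract_recipe_with_llm(page_text: str, call_llm) -> dict:
        # Works on any page layout, schema or no schema. The catch:
        # the output is only as faithful as the model, which is exactly
        # where the mangled-ingredients failure mode upthread lives.
        prompt = (
            "Extract the recipe from the page text below. Respond with "
            'JSON only, using the keys "title", "ingredients", "steps". '
            "Copy ingredients and amounts verbatim; do not convert or "
            "invent anything.\n\n" + page_text
        )
        return json.loads(call_llm(prompt))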

15. komali2 ◴[] No.44440626[source]
That's a cool schema, but the LLM solution is necessary because recipe website makers will never use the schema because they want you to have to read through garbage, with some misguided belief that this helps their SEO or something. Or maybe they get more money if you scroll through more ads?
replies(2): >>44440719 #>>44442528 #
16. VMG ◴[] No.44440664[source]
The LLM thing actually works. Who cares if it's deterministic. Maybe the same people who come up with arcane schemas that nobody ever uses?
17. bubblyworld ◴[] No.44440708[source]
Can we stop with the unprovoked dissing of anyone using LLMs for anything? Or at least start your own thread for it. It's an unpleasant, incredibly boring/predictable standard for discourse (more so than the LLMs themselves lol).
replies(2): >>44440863 #>>44442494 #
18. bubblyworld ◴[] No.44440719{3}[source]
I'm genuinely a bit confused by the recipe blog business model. Like there's got to be one, right? People don't usually spew the same story about their grandma hundreds of times on a real blog.

Just hitting keywords for search? Many of them don't even have ads so I feel like that can't be it. Maybe referrals?

replies(1): >>44440918 #
19. bubblyworld ◴[] No.44440736{6}[source]
Assuming you are responding in good faith - the author politely acknowledged the bug (despite the snark in the comment they responded to), explained what happened and fixed it. I'm not sure what more I could expect here? Bugs are inevitable, I think it's how they are handled that drives trust for me.
20. alt187 ◴[] No.44440863{3}[source]
It's in fact very provoked. The LLM just changes the instructions of the recipe and creates new ones. That's an unpleasant standard of user experience.
replies(1): >>44441952 #
21. Revisional_Sin ◴[] No.44440918{4}[source]
SEO. Longer articles get ranked higher.
replies(1): >>44441956 #
22. bubblyworld ◴[] No.44441952{4}[source]
That is a terrible reason to be a dick to someone. Especially someone who has created free software that you have no obligation to use.
23. bubblyworld ◴[] No.44441956{5}[source]
Makes sense, thanks, but how do you actually make money from that without tons of ads? I realise this is a super naive question haha
replies(2): >>44442193 #>>44442355 #
24. gpm ◴[] No.44442355{6}[source]
> without tons of ads

This is a requirement? I literally only browse the web with an ad blocker but I always assumed those sites had tons of ads.

replies(1): >>44443725 #
25. throwaway290 ◴[] No.44442494{3}[source]
When they stop training these LLMs on stolen content?
replies(1): >>44442789 #
26. throwaway290 ◴[] No.44442528{3}[source]
How do they make money, then? Or do you think they are doing some sort of public service and you are entitled to it?
replies(1): >>44445346 #
27. bubblyworld ◴[] No.44442789{4}[source]
Then go join a thread where someone is actually talking about that issue, which is very valid, and make a meaningful contribution to the conversation. Jumping on people for no reason other than they have touched a language model is just rude.
replies(1): >>44442964 #
28. throwaway290 ◴[] No.44442964{5}[source]
No, you go. It is allowed to diss a thing in the thread about that thing if we think that thing is bad. People here dissed Dropbox when it was launched... and this is no Dropbox.

There are things that are not allowed. But here someone made a good point without any personal attacks. You silencing people is probably the least appropriate thing in this thread.

> make a meaningful contribution

What was not a meaningful contribution? Mentioning the relevant schema? Saying that using the LLM is bad, for example because it is trained on our content without permission or payment? Or that this steals from people who provide you content for free and make money from ads?

replies(1): >>44443040 #
29. bubblyworld ◴[] No.44443040{6}[source]
I'm not interested in continuing this conversation.
replies(1): >>44443092 #
30. throwaway290 ◴[] No.44443092{7}[source]
I wish you hadn't started it... I see it's you who posted the top-level comment, but that doesn't mean you "own" the thread.
31. bubblyworld ◴[] No.44443725{7}[source]
Lol, that's funny - good point, I completely forgot I had an ad blocker running 24/7. I don't think I've browsed the raw internet in more than a decade...
32. andrepd ◴[] No.44444323{5}[source]
The point is LLMs are fundamentally unreliable algorithms for generating plausible text, and as such entirely unsuitable for this task. "But the recipe is probably delicious anyway" is beside the point, when it completely corrupted the meaning of the original. Which is annoying when it's a recipe but potentially very damaging when it's something else.

Techies seem to pretend this doesn't happen, and the general public who doesn't understand will trust the aforementioned techies. So what we see is these tools being used en masse and uncritically for purposes to which they are unsuited. I don't think this is good.

33. anExcitedBeast ◴[] No.44445346{4}[source]
They make long articles to maximize ad exposure and SEO. It's in good faith (they're doing what they have to do to make money within the underlying tech ecosystem), but it's not a good outcome.

LLMs are shifting that ecosystem (at least temporarily) and new revenue models will emerge. It'll take time to figure out. But we shouldn't artificially support a bad system just because it's the existing system.

Transitions are always awkward. In the meantime, I'm inclined to give people rope to experiment.

replies(1): >>44451552 #
34. throwaway290 ◴[] No.44451552{5}[source]
It seems that your answer is a long euphemism for "I don't give a damn how they make money, they better find a new way I guess".