422 points simedw | 34 comments
1. bubblyworld ◴[] No.44433602[source]
Classic that the first example is for parsing the goddamn recipe from the goddamn recipe site. Instant thumbs up from me haha, looks like a neat little project.
replies(3): >>44435722 #>>44436466 #>>44438277 #
2. andrepd ◴[] No.44435722[source]
Which it apparently does by completely changing the recipe in random places including ingredients and amounts thereof. It is _indeed_ a very good microcosm of what LLMs are, just not in the way these comments think.
replies(3): >>44435998 #>>44436175 #>>44436268 #
3. throwawayoldie ◴[] No.44435998[source]
The output was then posted to the Internet for everyone to see, without the minimal amount of proofreading that would be necessary to catch that, which gives us a good microcosm of how LLMs are used.

On a more pleasant topic the original recipe sounds delicious, I may give it a try when the weather cools off a little.

4. simedw ◴[] No.44436175[source]
It was actually a bit worse than that: the LLM never got the full recipe due to some truncation logic I had added. So it regurgitated the recipe from training, and apparently it couldn't both do that and convert units at the same time with the lite model (it worked with just flash).

I should have caught that, and there are probably other bugs too waiting to be found. That said, it's still a great recipe.
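
For anyone curious how that class of bug arises, a hypothetical sketch (not the project's actual code): recipe pages front-load a lot of story and ads, so a naive length cap can drop the recipe before the model ever sees it.

    # Hypothetical sketch of the failure mode, not the project's code.
    MAX_CHARS = 4000  # illustrative limit

    def truncate_for_llm(page_text: str) -> str:
        # Everything past the cap is silently dropped. If the recipe
        # starts at character 12,000, the model never sees it and can
        # only "recall" a similar recipe from its training data.
        return page_text[:MAX_CHARS]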

replies(1): >>44437152 #
5. bubblyworld ◴[] No.44436268[source]
What do you mean? The recipes in the screenshot look more or less the same; the formatting has just changed in the Spegel one (which is what was asked for, so no surprises there).

Edit: just saw the author's comment, I think I'm looking at the fixed page

6. IncreasePosts ◴[] No.44436466[source]
There are extensions that do that for you, in a deterministic way and without relying on LLMs. For example, Recipe Filter for Chrome: it just shows a pop-up over the page when it loads if it detects a recipe.
replies(1): >>44437154 #
7. bubblyworld ◴[] No.44437154[source]
Thanks, I actually already use that plugin; I just found the problem amusingly familiar. Recipe sites are the original AI slop =P
8. lpribis ◴[] No.44438277[source]
Another great example of the LLM hype train re-inventing something that already existed [1] (and was actually thought out), but making it worse and non-deterministic in the worst ways possible.

https://schema.org/Recipe
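
For concreteness, here is roughly what deterministic extraction against that schema looks like: a minimal stdlib-only Python sketch (names and details illustrative; real pages vary in how they embed the JSON-LD).

    # Minimal sketch: pull a schema.org/Recipe out of the JSON-LD block
    # that many recipe sites embed. Stdlib only; no LLM involved, so the
    # same page always yields the same result.
    import json
    from html.parser import HTMLParser

    class LdJsonCollector(HTMLParser):
        """Collects the bodies of <script type="application/ld+json"> tags."""
        def __init__(self):
            super().__init__()
            self.in_ldjson = False
            self.blocks = []

        def handle_starttag(self, tag, attrs):
            # attrs is a list of (name, value) pairs; an exact-match
            # check is good enough for a sketch.
            if tag == "script" and ("type", "application/ld+json") in attrs:
                self.in_ldjson = True

        def handle_data(self, data):
            if self.in_ldjson:
                self.blocks.append(data)

        def handle_endtag(self, tag):
            if tag == "script":
                self.in_ldjson = False

    def find_recipe(html: str):
        parser = LdJsonCollector()
        parser.feed(html)
        for block in parser.blocks:
            try:
                data = json.loads(block)
            except json.JSONDecodeError:
                continue
            # Sites may wrap the recipe in a list or an @graph array.
            nodes = data if isinstance(data, list) else data.get("@graph", [data])
            for node in nodes:
                if not isinstance(node, dict):
                    continue
                t = node.get("@type")
                if t == "Recipe" or (isinstance(t, list) and "Recipe" in t):
                    # Ingredients live in node["recipeIngredient"],
                    # steps in node["recipeInstructions"].
                    return node
        return None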

replies(6): >>44438799 #>>44439573 #>>44440529 #>>44440626 #>>44440664 #>>44440708 #
9. ◴[] No.44438799[source]
10. ◴[] No.44438851{4}[source]
11. 0x696C6961 ◴[] No.44438859{4}[source]
What is the point?
replies(2): >>44439718 #>>44444323 #
12. ◴[] No.44439573[source]
13. plonq ◴[] No.44439718{5}[source]
I'm someone else, but for me the point is that a serious bug resulted in _incorrect data_, making it impossible to trust the output.
replies(1): >>44440736 #
14. soap- ◴[] No.44440529[source]
And that would be great, if anyone used it.

LLMs are specifically good at a task like this because they can extract content from any webpage, regardless of whether it supports whatever standard that no one implements.
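
For contrast with the schema-based approach mentioned upthread, a minimal sketch of that kind of schema-free extraction; call_llm is a hypothetical stand-in for a real model API, and the prompt and keys are illustrative:

    import json

    def extract_recipe_with_llm(page_text: str, call_llm) -> dict:
        # Works on any page layout, schema or no schema. The catch:
        # the output is only as faithful as the model, which is exactly
        # where the mangled-ingredients failure mode upthread lives.
        prompt = (
            "Extract the recipe from the page text below. Respond with "
            'JSON only, using the keys "title", "ingredients", "steps". '
            "Copy ingredients and amounts verbatim; do not convert or "
            "invent anything.\n\n" + page_text
        )
        return json.loads(call_llm(prompt))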

15. komali2 ◴[] No.44440626[source]
That's a cool schema, but the LLM solution is necessary because recipe website makers will never use the schema because they want you to have to read through garbage, with some misguided belief that this helps their SEO or something. Or maybe they get more money if you scroll through more ads?
replies(2): >>44440719 #>>44442528 #
16. VMG ◴[] No.44440664[source]
The LLM thing actually works. Who cares if it's deterministic. Maybe the same people who come up with arcane schemas that nobody ever uses?
17. bubblyworld ◴[] No.44440708[source]
Can we stop with the unprovoked dissing of anyone using LLMs for anything? Or at least start your own thread for it. It's an unpleasant, incredibly boring/predictable standard for discourse (more so than the LLMs themselves lol).
replies(2): >>44440863 #>>44442494 #
18. bubblyworld ◴[] No.44440719{3}[source]
I'm genuinely a bit confused by the recipe blog business model. Like there's got to be one, right? People don't usually spew the same story about their grandma hundreds of times on a real blog.

Just hitting keywords for search? Many of them don't even have ads so I feel like that can't be it. Maybe referrals?

replies(1): >>44440918 #
19. bubblyworld ◴[] No.44440736{6}[source]
Assuming you are responding in good faith - the author politely acknowledged the bug (despite the snark in the comment they responded to), explained what happened and fixed it. I'm not sure what more I could expect here? Bugs are inevitable, I think it's how they are handled that drives trust for me.
20. alt187 ◴[] No.44440863{3}[source]
It's in fact very provoked. The LLM just changes the instructions of the recipe and creates new ones. That's an unpleasant standard of user experience.
replies(1): >>44441952 #
21. Revisional_Sin ◴[] No.44440918{4}[source]
SEO. Longer articles get ranked higher.
replies(1): >>44441956 #
22. bubblyworld ◴[] No.44441952{4}[source]
That is a terrible reason to be a dick to someone. Especially someone who has created free software that you have no obligation to use.
23. bubblyworld ◴[] No.44441956{5}[source]
Makes sense, thanks, but how do you actually make money from that without tons of ads? I realise this is a super naive question haha
replies(2): >>44442193 #>>44442355 #
24. gpm ◴[] No.44442355{6}[source]
> without tons of ads

This is a requirement? I literally only browse the web with an ad blocker but I always assumed those sites had tons of ads.

replies(1): >>44443725 #
25. throwaway290 ◴[] No.44442494{3}[source]
When they stop training these LLMs on stolen content?
replies(1): >>44442789 #
26. throwaway290 ◴[] No.44442528{3}[source]
How do they make money, then? Or do you think they are doing some sort of public service and you are entitled to it?
replies(1): >>44445346 #
27. bubblyworld ◴[] No.44442789{4}[source]
Then go join a thread where someone is actually talking about that issue, which is very valid, and make a meaningful contribution to the conversation. Jumping on people for no reason other than they have touched a language model is just rude.
replies(1): >>44442964 #
28. throwaway290 ◴[] No.44442964{5}[source]
No, you go. It is allowed to diss a thing in the thread about that thing if we think that thing is bad. People here dissed Dropbox when it was launched... and this is no Dropbox.

There are things that are not allowed. But here someone made a good point without any personal attacks. You silencing people is probably the least appropriate thing in this thread.

> make a meaningful contribution

What was not a meaningful contribution? Mentioning the relevant schema? Saying that using the LLM is bad, for example because it is trained on our content without permission or payment? Or that this steals from people who provide you content for free and make money from ads?

replies(1): >>44443040 #
29. bubblyworld ◴[] No.44443040{6}[source]
I'm not interested in continuing this conversation.
replies(1): >>44443092 #
30. throwaway290 ◴[] No.44443092{7}[source]
I wish you hadn't started it... I see it's you who posted the top-level comment, but that doesn't mean you "own" the thread.
31. bubblyworld ◴[] No.44443725{7}[source]
Lol, that's funny - good point, I completely forgot I had an ad blocker running 24/7. I don't think I've browsed the raw internet in more than a decade...
32. andrepd ◴[] No.44444323{5}[source]
The point is LLMs are fundamentally unreliable algorithms for generating plausible text, and as such entirely unsuitable for this task. "But the recipe is probably delicious anyway" is beside the point, when it completely corrupted the meaning of the original. Which is annoying when it's a recipe but potentially very damaging when it's something else.

Techies seem to pretend this doesn't happen, and the general public who doesn't understand will trust the aforementioned techies. So what we see is these tools being used en masse and uncritically for purposes to which they are unsuited. I don't think this is good.

33. anExcitedBeast ◴[] No.44445346{4}[source]
They make long articles to maximize ad exposure and SEO. It's in good faith (they're doing what they have to do to make money within the underlying tech ecosystem), but it's not a good outcome.

LLMs are shifting that ecosystem (at least temporarily) and new revenue models will emerge. It'll take time to figure out. But we shouldn't artificially support a bad system just because it's the existing system.

Transitions are always awkward. In the meantime, I'm inclined to give people rope to experiment.

replies(1): >>44451552 #
34. throwaway290 ◴[] No.44451552{5}[source]
It seems that your answer is a long euphemism for "I don't give a damn how they make money, they better find a new way I guess".