The man who killed Google Search?

(www.wheresyoured.at)

1. gregw134 ◴[23 Apr 24 20:15 UTC] No.40136741[source]▶

>>40133976 (OP) #

Ex-Google search engineer here (2019-2023). I know a lot of the veteran engineers were upset when Ben Gomes got shunted off. Probably the bigger change, from what I've heard, was losing Amit Singhal who led Search until 2016. Amit fought against creeping complexity. There is a semi-famous internal document he wrote where he argued against the other search leads that Google should use less machine-learning, or at least contain it as much as possible, so that ranking stays debuggable and understandable by human search engineers. My impression is that since he left complexity exploded, with every team launching as many deep learning projects as they can (just like every other large tech company has).

The problem, though, is that the older systems had obvious problems, while the newer systems have hidden bugs and conceptual issues which often don't show up in the metrics, and which compound over time as more complexity is layered on. For example: I found an off-by-one error deep in a formula from an old launch that has been reordering top results for 15% of queries since 2015. I handed it off when I left but have no idea whether anyone actually fixed it or not.

I wrote up all of the search bugs I was aware of in an internal document called "second page navboost", so if anyone working on search at Google reads this and needs a launch go check it out.

replies(11): >>40136833 #>>40136879 #>>40137570 #>>40137898 #>>40137957 #>>40138051 #>>40140388 #>>40140614 #>>40141596 #>>40146159 #>>40166064 #

2. JohnFen ◴[23 Apr 24 20:24 UTC] No.40136833[source]▶

>>40136741 (TP) #

> where he argued against the other search leads that Google should use less machine-learning

This better echoes my personal experience with the decline of Google search than TFA: it seems to be connected to the increasing use of ML in that the more of it Google put in, the worse the results I got were.

replies(3): >>40137620 #>>40137737 #>>40137885 #

3. AlbertCory ◴[23 Apr 24 20:29 UTC] No.40136879[source]▶

>>40136741 (TP) #

Amit was definitely against ML, long before "AI" had become a buzzword.

replies(1): >>40138025 #

4. zem ◴[23 Apr 24 21:33 UTC] No.40137570[source]▶

>>40136741 (TP) #

i worked on ranking during singhal's tenure, and it was definitely refreshing to see a "no black box ML ranking" stance.

5. potatolicious ◴[23 Apr 24 21:37 UTC] No.40137620[source]▶

>>40136833 #

It's also a good lesson for the new AI cycle we're in now. Often inserting ML subsystems into your broader system just makes it go from "deterministically but fixably bad" to "mysteriously and unfixably bad".

replies(5): >>40137968 #>>40138119 #>>40138995 #>>40139020 #>>40147693 #

6. fuzztester ◴[23 Apr 24 21:48 UTC] No.40137737[source]▶

>>40136833 #

Same here with YouTube, assuming they use ML, which is likely.

They routinely give me brain-dead suggestions such as to watch a video I just watched today or yesterday, among other absurdities.

replies(5): >>40138204 #>>40138215 #>>40138255 #>>40139304 #>>40139333 #

7. jokoon ◴[23 Apr 24 22:03 UTC] No.40137885[source]▶

>>40136833 #

that's not something ML people would like to hear

replies(2): >>40137911 #>>40144802 #

8. banish-m4 ◴[23 Apr 24 22:04 UTC] No.40137898[source]▶

>>40136741 (TP) #

Thanks for writing this insightful piece.

These are the pathologies of big companies that fail to break themselves up into smaller, non-siloed entities the way Virgin Group does. Maintaining the ways of a successful, growing startup and fighting against politics, bureaucracy, fiefdoms, and burgeoning codebases is difficult, but it is a better way than chasing short-term profits and putting up with massive codebases, institutional inertia, and corporate bullshit that gets in the way of the customer experience and pushes out solid technical ICs and leaders.

I'm surprised there aren't more people on here who decide "F-it, MAANG megacorps are too risky, backwards, and not representative of their roots" and form worker-owned co-ops to do what MAANGs are doing, only better, and with long-term business sustainability, long tenure, employee perks like the startup days, and positive civil culture as their central mission.

replies(5): >>40138159 #>>40138551 #>>40139151 #>>40140147 #>>40140217 #

9. oblio ◴[23 Apr 24 22:05 UTC] No.40137911{3}[source]▶

>>40137885 #

Is ML the new SOAP? Looks like a silver bullet and 5 years later you're drowning in complexity for no discernible reason?

replies(5): >>40137975 #>>40137976 #>>40138686 #>>40139546 #>>40141708 #

10. __loam ◴[23 Apr 24 22:11 UTC] No.40137968{3}[source]▶

>>40137620 #

This is why hallucinations will never be fixed in language models. That's just how they work.

11. __loam ◴[23 Apr 24 22:12 UTC] No.40137975{4}[source]▶

>>40137911 #

Don't forget about that expensive GPU infrastructure you invested in.

replies(1): >>40138296 #

12. ChrisMarshallNY ◴[23 Apr 24 22:12 UTC] No.40137976{4}[source]▶

>>40137911 #

> SOAP

Argh. My PTSD from writing ONVIF drivers just kicked in.

replies(1): >>40138096 #

13. mike_hearn ◴[23 Apr 24 22:17 UTC] No.40138025[source]▶

>>40136879 #

He wasn't the only one. I built a couple of systems there integrated into the accounts system and "no ML" was an explicit upfront design decision. It was never regretted and although I'm sure they put ML in it these days, last I heard as of a few years ago was that at the core were still pages and pages of hand written logic.

I got nothing against ML in principle, but if the model doesn't do the right thing then you can just end up stuck. Also, it often burns a lot of resources to learn something that was obvious to human domain experts anyway. Plus the understandability issues.

14. jokoon ◴[23 Apr 24 22:21 UTC] No.40138051[source]▶

>>40136741 (TP) #

simplicity is always the recipe for success, unfortunately, most engineers are drawn to complexity like moth to fire

if they were unable to do some A/B testing between an ML search and a non-ML search, they deserve their failure 100%

there are not enough engineers blowing the whistle against ML

replies(3): >>40139005 #>>40139537 #>>40147562 #

15. eschneider ◴[23 Apr 24 22:28 UTC] No.40138096{5}[source]▶

>>40137976 #

Been there, Done that. Slides over a bottle of single malt.

replies(1): >>40138793 #

16. munk-a ◴[23 Apr 24 22:32 UTC] No.40138119{3}[source]▶

>>40137620 #

I think - I hope, rather - that technically minded people who are advocating for the use of ML understand the shortcomings and hallucinations... but we need to be frank about the fact that the business layer above us (with a few rare exceptions) absolutely does not understand the limitations of AI and views it as a magic box where they type in "Write me a story about a bunny" and get twelve paragraphs of text out. As someone working in a healthcare-adjacent field I've seen the glint in executives' eyes when talking about AI, and it can provide real benefits in data summarization and annotation assistance... but there are limits to what you should trust it with, and if it's something big-i Important then you'll always want to have a human vetting step.

replies(4): >>40138577 #>>40138723 #>>40138897 #>>40139084 #

17. barfbagginus ◴[23 Apr 24 22:37 UTC] No.40138159[source]▶

>>40137898 #

I formed a worker co-op - but it's just me! And I do CAD reverse engineering, nothing really life-giving.

I would love to join a co-op producing real human survival values in an open source way. Where would you suggest that I look for leads on that kind of organization?

replies(2): >>40138739 #>>40138977 #

18. 998244353 ◴[23 Apr 24 22:43 UTC] No.40138204{3}[source]▶

>>40137737 #

For what it's worth, I do not remember a time when YouTube's suggestions or search results were good. Absurdities like that happened 10 and 15 years ago as well.

These days my biggest gripe is that they put unrelated ragebait or clickbait videos in search results that I very clearly did not search for - often about American politics.

replies(5): >>40138761 #>>40139567 #>>40139761 #>>40141227 #>>40143825 #

19. layer8 ◴[23 Apr 24 22:44 UTC] No.40138215{3}[source]▶

>>40137737 #

This is happening to me too, but from the kind of videos it's suggested for I suspect that people actually do tend to rewatch those particular videos, hence the recommendation.

20. gverrilla ◴[23 Apr 24 22:49 UTC] No.40138255{3}[source]▶

>>40137737 #

YT Shorts recommendations are a joke. I'm an atheist and very rarely watch anything related to religion, and even so Shorts put me in 3 or 4 live prayers/scams (not sure) the last few months.

replies(6): >>40138312 #>>40138566 #>>40138595 #>>40138673 #>>40139142 #>>40141197 #

21. jokoon ◴[23 Apr 24 22:55 UTC] No.40138296{5}[source]▶

>>40137975 #

and the power bill

and how difficult it is to program those GPU to do ML

22. epcoa ◴[23 Apr 24 22:57 UTC] No.40138312{4}[source]▶

>>40138255 #

Prayers for the unbelievers makes some sense.

But I associate YouTube promotions with garbage anyhow. The few things I might buy, like Tide laundry detergent, are entirely despite occasional YouTube promotion.

replies(1): >>40138772 #

23. delfinom ◴[23 Apr 24 23:23 UTC] No.40138551[source]▶

>>40137898 #

Problem is, worker owned co-ops would still require money to do anything even remotely competitive to existing businesses.

So... people go walk up for handouts from VCs....and the story begins lol.

24. delfinom ◴[23 Apr 24 23:26 UTC] No.40138566{4}[source]▶

>>40138255 #

I imagine my blocked channels list is stress testing YouTube at this point from the amount of shit Shorts results it's fed me after 2 years. Lol

Besides the religious crap, I'll randomly get shit from India in hindu, having not watched anything Indian or even remotely Indian.

replies(2): >>40138752 #>>40139285 #

25. acdha ◴[23 Apr 24 23:26 UTC] No.40138577{4}[source]▶

>>40138119 #

I’m not optimistic on that point: the executive class is very openly salivating at the prospect of mass layoffs, and that means a lot of technical staff aren’t quick to inject some reality – if Gartner is saying it’s rainbows and unicorns, saying they’re exaggerating can be taken as volunteering to be laid off first even if you’re right.

replies(1): >>40163488 #

26. dekhn ◴[23 Apr 24 23:29 UTC] No.40138595{4}[source]▶

>>40138255 #

Similarly, Google News. The "For You" section shows me articles about astrology because I'm interested in astronomy. I get suggestions for articles about I-80 because I search for I-80 traffic cams to get traffic cam info for Tahoe, but it shows me I-80 news all the way across the country, suggestions about Mountain View because I worked there (for Google!) over 3 years ago, commanders being fired from the Navy (because I read a couple articles once), it goes on and on. From what I can tell, there are no News Quality people actually paying attention to their recommendations (and "Show Fewer" doesn't actually work. I filed a bug and was told that while the desktop version of the site shows Show Fewer for Google News, it doesn't actually have an effect).

replies(1): >>40140310 #

27. bitwize ◴[23 Apr 24 23:41 UTC] No.40138686{4}[source]▶

>>40137911 #

ML is somewhere between the new SOAP and the new cryptocurrency.

replies(2): >>40138735 #>>40143062 #

28. bitwize ◴[23 Apr 24 23:45 UTC] No.40138714{5}[source]▶

>>40138673 #

That's a feature, not a bug.

29. munificent ◴[23 Apr 24 23:46 UTC] No.40138723{4}[source]▶

>>40138119 #

> I hope, rather - that technically minded people who are advocating for the use of ML understand the shortcomings and hallucinations.

The people I see who are most excited about ML are business types who just see it as a black box that makes stock valuation go vroom.

The people that deeply love building things, really enjoy the process of making itself, are profoundly sceptical.

I look at generative AI as sort of like an army of free interns. If your idea of a fun way to make a thing is to dictate orders to a horde of well-meaning but untrained, highly caffeinated interns, then using generative AI to make your thing is probably thrilling. You get to feel like an executive producer who can make a lot of stuff happen by simply prompting someone/something to do your bidding.

But if you actually care about the grit and texture of actual creation, then that workflow isn't exactly appealing.

replies(2): >>40138898 #>>40139496 #

30. peoplenotbots ◴[23 Apr 24 23:48 UTC] No.40138735{5}[source]▶

>>40138686 #

Well, that's grim

31. hsbauauvhabzb ◴[23 Apr 24 23:48 UTC] No.40138739{3}[source]▶

>>40138159 #

I would imagine GitHub and technology social media

32. gverrilla ◴[23 Apr 24 23:50 UTC] No.40138752{5}[source]▶

>>40138566 #

I only get those when it's new content with <20 likes and they are testing it out. Doesn't bother me, I like to receive some untested content - even though 99% of it is pure crap (like some random nonsense film with trendy music on top).

33. peoplenotbots ◴[23 Apr 24 23:51 UTC] No.40138761{4}[source]▶

>>40138204 #

A long, long time ago, YouTube "staff" would manually put certain videos at the top of the front page when they started. I'm sure there were biases and prioritization of marketing dollars, but at least there was a human recommending it, compared to poorly recorded early Family Guy clips. I don't know when they stopped manually adding "editors/staff" choice videos, but I recall some of my favorite early YouTubers like CGP Grey claiming that recommendation built their career.

replies(1): >>40139212 #

34. gverrilla ◴[23 Apr 24 23:52 UTC] No.40138772{5}[source]▶

>>40138312 #

Lmao. I'm very positive that the conversion rate for placing an atheist in a live mass out of the blue is very very very low. Because I never stayed for more than 3 seconds, I'm not sure if it's real religious content or a scam, though - and they don't even let me report live shorts :(

replies(1): >>40140316 #

35. cgh ◴[23 Apr 24 23:56 UTC] No.40138793{6}[source]▶

>>40138096 #

Horrifying memories of Microsoft Biztalk

36. jorblumesea ◴[24 Apr 24 00:11 UTC] No.40138897{4}[source]▶

>>40138119 #

> technically minded people who are advocating for the use of ML understand the shortcomings and hallucinations

Really, my impression is the opposite. They are driven by doing cool tech things and building fresh product, while getting rid of "antiquated, old" product. Very little thought is given to the long-term impact of their work. Criticism of the use cases is often hand-waved away because you are messing with their bread and butter.

37. spacemadness ◴[24 Apr 24 00:11 UTC] No.40138898{5}[source]▶

>>40138723 #

They wouldn’t think this way if stock investors weren’t so often such naive lemmings ready to jump off yet another cliff with each other.

38. atif089 ◴[24 Apr 24 00:21 UTC] No.40138977{3}[source]▶

>>40138159 #

Let's start with replacing Google. Count me in.

While DDG, Brave, Kagi, etc. are working generously to replace Google search, the other areas that I think get less attention and need to be targeted to successfully dismantle them and their predatory practices are Google Maps and Google Docs.

Maps is hard because it requires a lot of resources and money and whatever, but replacing Docs should be relatively easier.

replies(4): >>40139109 #>>40140291 #>>40148691 #>>40162092 #

39. ytdytvhxgydvhh ◴[24 Apr 24 00:24 UTC] No.40138995{3}[source]▶

>>40137620 #

I think that’ll define the industry for the coming decades. I used to work in machine translation and it was the same. The older rules-based engines that were carefully crafted by humans worked well on the test suite and if a new case was found, a human could fix it. When machine learning came on the scene, more “impressive” models that were built quicker came out - but when a translation was bad no one knew how to fix it other than retraining and crossing one’s fingers.

replies(6): >>40139153 #>>40139716 #>>40141022 #>>40141626 #>>40142531 #>>40142534 #

40. ants_everywhere ◴[24 Apr 24 00:25 UTC] No.40139005[source]▶

>>40138051 #

I definitely think the ML search results are much worse. But complexity or not, strategically it's an advantage for the company to use ML in production over a long period of time so they can develop organizational expertise in it.

It would have been a worse outcome for Google if they had stuck to their no ML stance and then had Bing take over search because they were a generation behind in technology.

41. chrisweekly ◴[24 Apr 24 00:27 UTC] No.40139020{3}[source]▶

>>40137620 #

I've heard AI described as the payday loan (or "high-interest credit card") of technical debt.

42. godelski ◴[24 Apr 24 00:35 UTC] No.40139084{4}[source]▶

>>40138119 #

> but we need to be frank about the fact that the business layer above us (with a few rare exceptions) absolutely does not understand the limitations of AI and views it as a magic box where they type in

I think we also need to be aware that this business layer above us often sees __computers__ as a magic box where they type in. There's definitely a large spectrum of how magical this seems to that layer, but the issue remains that there are subtleties that are often important but difficult to explain without detailed technical knowledge. I think there's a lot of good ML can do (being an ML researcher myself), but I often find it ham-fisted into projects simply to say that the project has ML. I think the clearest flag to any engineer that this layer above them has limited domain knowledge is how much importance they place on KPIs/metrics. Are they targets or are they guides? Because I can assure you, all metrics are flawed -- but some metrics are less flawed than others (and benchmark hacking is unfortunately the norm in ML research[0]).

[0] There's just too much happening so fast and too many papers to reasonably review in a timely manner. It's a competitive environment, where gatekeepers are competitors, and where everyone is absolutely crunched for time and pressured to feel like they need to move even faster. You bet reviews get lazy. The problems aren't "posting preprints on twitter" or "LLMs giving summaries", it's that the traditional peer review system (especially in conference settings) poorly scales and is significantly affected by hype. Unfortunately I think this ends up railroading us in research directions and makes it significantly challenging for graduate students to publish without being connected to big labs (aka, requiring big compute) (tuning is another common way to escape compute constraints, but that falls under "railroading"). There's still some pretty big and fundamental questions that need to be chipped away at but are difficult to publish given the environment. /rant

43. disqard ◴[24 Apr 24 00:38 UTC] No.40139109{4}[source]▶

>>40138977 #

(paid user of Kagi here)

FWIW, Kagi is built on top of Google search, so yes it's "replacing" (for you and me) a dependence on Google search, but it is categorically not a from-the-ground-up replacement for Google search.

replies(1): >>40140077 #

44. AlexCoventry ◴[24 Apr 24 00:44 UTC] No.40139142{4}[source]▶

>>40138255 #

YT Shorts itself is kind of a mystery to me. It's an objective degradation of the interface; why on earth would I want to use it? It doesn't even allow adjustment of the playback speed or scrubbing!

replies(4): >>40139268 #>>40139586 #>>40139877 #>>40140765 #

45. godelski ◴[24 Apr 24 00:45 UTC] No.40139151[source]▶

>>40137898 #

What's odd to me is how everything is so metricized. Clearly, over-metricization is the downfall of any system that looks meritocratic, due to the limitations of metrics and how they are often far easier to game than to reach through the intended means.

An example of this I see is how new leaders come in and hit hard to cut costs. But the previous leader did this (and the one before them), so the system/group/company is fairly lean already. So to get anywhere near similar reductions or cost savings typically means cutting more than fat. And it's clear that many big corps are not running with enough fat in the first place (you want some fat! You just don't want to be obese!). This seems to create a pattern that ends up being indistinguishable from "That worked! Let's not do that anymore."

replies(2): >>40140242 #>>40163330 #

46. space_fountain ◴[24 Apr 24 00:45 UTC] No.40139153{4}[source]▶

>>40138995 #

Yes, but I think the other lesson might be that those black box machine translations have ended up being more valuable? It sucks when things don't always work, but that is also kind of life and if the AI version worked more often that is usually ok (as long as the occasional failures aren't so catastrophic as to ruin everything)

replies(2): >>40139189 #>>40139532 #

47. ytdytvhxgydvhh ◴[24 Apr 24 00:51 UTC] No.40139189{5}[source]▶

>>40139153 #

Can’t help but read that and think of Tesla’s Autopilot and “Full Self Driving”. For some comparisons they claim to be safer per mile than human drivers … just don’t think too much about the error modes where the occasional stationary object isn’t detected and you plow into it at highway speed.

replies(4): >>40139224 #>>40139253 #>>40139730 #>>40141021 #

48. superluserdo ◴[24 Apr 24 00:55 UTC] No.40139212{5}[source]▶

>>40138761 #

See this >15-year-old video "How to get featured on YouTube" - https://www.youtube.com/watch?v=-uzXeP4g_qA, which I remember as being originally uploaded to the official YouTube channel, but it looks like it's been removed now; this reupload is from October 2008.

49. space_fountain ◴[24 Apr 24 00:57 UTC] No.40139224{6}[source]▶

>>40139189 #

Well, Tesla might be the single worst actor in the entire AI space, but I do somewhat understand your point. The lack of predictable failures is a huge problem with AI; I'm not sure that understandability is, by itself. I will never understand the brain of an Uber driver, for example.

50. Terr_ ◴[24 Apr 24 01:03 UTC] No.40139253{6}[source]▶

>>40139189 #

Or in some cases, the Tesla slows down, then changes its mind and starts accelerating again to run over child-like obstructions.

Ex: https://www.youtube.com/watch?v=URpTJ1Xpjuk&t=293s

replies(1): >>40154704 #

51. fuzztester ◴[24 Apr 24 01:07 UTC] No.40139268{5}[source]▶

>>40139142 #

Solid point. Not to mention that Shorts content is mainly linkbait and/or garbage.

52. fuzztester ◴[24 Apr 24 01:09 UTC] No.40139285{5}[source]▶

>>40138566 #

>in hindu

Hindi is the word for the language, bro.

replies(1): >>40139422 #

53. makeitdouble ◴[24 Apr 24 01:12 UTC] No.40139304{3}[source]▶

>>40137737 #

I think it's probably pushing a pattern it sees in other users.

There are videos I'll watch multiple times. Music videos are the obvious kind, but for some others I'm just not watching/understanding it the first time and will go back and rewatch later.

But I guess YouTube has no way to understand which ones I'll rewatch and which others I don't want to see ever again, and if my behavior is used as training data for other users like you, they're probably screwed.

replies(1): >>40139364 #

54. sakesun ◴[24 Apr 24 01:15 UTC] No.40139333{3}[source]▶

>>40137737 #

Install the "Unhook" Chrome extension. That changed my life.

55. godshatter ◴[24 Apr 24 01:21 UTC] No.40139364{4}[source]▶

>>40139304 #

A simple "rewatch?" line along the top would make this problem not so brain dead bad, imho. Without it you just think the algorithm is bad (although maybe it is? I don't know).

56. etc-hosts ◴[24 Apr 24 01:30 UTC] No.40139422{6}[source]▶

>>40139285 #

I knew I could count on you.

replies(1): >>40140297 #

57. fragmede ◴[24 Apr 24 01:43 UTC] No.40139496{5}[source]▶

>>40138723 #

We get it, you're skeptical of the current hype bubble. But that's one helluva no true Scotsman you've got going on there. Because a true builder, one that deeply loves building things, wouldn't want to use text to create an image. Anyone who does is a business type or an executive producer. A true builder wouldn't think about what they want to do in such a nasty thing as words. Creation comes from the soul, which we all know machines, and business people, don't have.

Using English, instead of C, to get a computer to do something doesn't turn you into a bureaucrat any more than using Python or Javascript instead does.

Only a person that truly loves building things, far deeper than you'll ever know, someone that's never programmed in a compiled language, would get that.

replies(4): >>40139565 #>>40139626 #>>40140078 #>>40140255 #

58. ethbr1 ◴[24 Apr 24 01:48 UTC] No.40139532{5}[source]▶

>>40139153 #

> Yes, but I think the other lesson might be that those black box machine translations have ended up being more valuable?

The key difference is how tolerant the specific use case is of a probably-correct answer.

The things recent-AI excels at now (generative, translation, etc.) are very tolerant of "usually correct." If a model can do more, and is right most of the time, then it's more valuable.

There are many other types of use cases, though.

replies(1): >>40140528 #

59. 1024core ◴[24 Apr 24 01:49 UTC] No.40139537[source]▶

>>40138051 #

> most engineers are drawn to complexity like moth to fire

Unfortunately, Google evaluates employees by the complexity of their work. "Demonstrates complexity" is a checkbox on promo packets, from what I've heard.

Naturally, every engineer will try to over-complicate things just so they can get the raises and promos. You get what you value.

replies(1): >>40147483 #

60. ajross ◴[24 Apr 24 01:49 UTC] No.40139546{4}[source]▶

>>40137911 #

So... obviously SOAP was dumb[1], and lots of people saw that at the time. But SOAP was dumb in obvious ways, and it failed for obvious reasons, and really no one was surprised at all.

ML isn't like that. It's new. It's different. It may not succeed in the ways we expect; it may even look dumb in hindsight. But it absolutely represents a genuinely new paradigm for computing and is worth studying and understanding on that basis. We look back to SOAP and see something that might as well be forgotten. We'll never look back to the dawn of AI and forget what it was about.

[1] For anyone who missed that particular long-sunken boat, SOAP was an RPC protocol like any other. Yes, that's really all it was. It did nothing special, or well, or that you couldn't do via trivially accessible alternative means. All it had was the right adjective ("XML" in this case) for the moment. It's otherwise forgettable, and forgotten.

replies(3): >>40140801 #>>40144845 #>>40147759 #

61. ethbr1 ◴[24 Apr 24 01:53 UTC] No.40139565{6}[source]▶

>>40139496 #

> Using English, instead of C, to get a computer to do something doesn't turn you into a bureaucrat any more than using Python or Javascript instead does.

If one uses English in as precise a way as one crafts code, sure.

Most people do not (cannot?) use English that precisely.

There's little technical difference between using English and using code to create...

... but there is a huge difference on the other side of the keyboard, as lots of people know English, including people who aren't used to fully thinking through a problem and tackling all the corner cases.

replies(1): >>40140179 #

62. FullstakBlogger ◴[24 Apr 24 01:54 UTC] No.40139567{4}[source]▶

>>40138204 #

15 years ago, I used to keep many tabs of youtube videos open just because the "related" section was full of interesting videos. Then each of those videos had interesting relations. There was so much to explore before hitting a dead-end and starting somewhere else.

Now the "related" section is gone in favor of "recommended" samey clickbait garbage. The relations between human interests are too esoteric for current ML classifiers to understand. The old Markov-chain style works with the human, and lets them recognize what kind of space they've gotten themselves into, and make intelligent decisions, which ultimately benefit the system.

If you judge the system by the presence of negative outliers, rather than positive, then I can understand seeing no difference.
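
As a rough illustration of the "Markov-chain style" relation mentioned above (a toy sketch, not how YouTube actually built its "related" list; the session data, video names, and function names are all made up for the example), you can rank videos by how often they were watched right after the current one:

```python
from collections import Counter, defaultdict


def build_transitions(sessions):
    """Count how often each video is watched directly after another one."""
    transitions = defaultdict(Counter)
    for session in sessions:
        # Pair every video with the one watched immediately after it.
        for current, following in zip(session, session[1:]):
            transitions[current][following] += 1
    return transitions


def related(transitions, video, k=3):
    """Return up to k videos most often watched right after `video`."""
    counts = transitions.get(video, Counter())
    return [name for name, _ in counts.most_common(k)]


# Hypothetical watch sessions, each an ordered list of video ids.
sessions = [
    ["lathe_basics", "knurling", "thread_cutting"],
    ["lathe_basics", "knurling", "tool_grinding"],
    ["lathe_basics", "thread_cutting"],
]

transitions = build_transitions(sessions)
print(related(transitions, "lathe_basics"))  # ['knurling', 'thread_cutting']
```

The point of the sketch is that every suggestion can be traced back to a plain count of "people who watched this then watched that", which is what keeps the list legible to the person browsing it.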

replies(2): >>40140561 #>>40141829 #

63. barnabyjones ◴[24 Apr 24 01:55 UTC] No.40139586{5}[source]▶

>>40139142 #

I think there is a large demo of people now who actually prefer to watch videos in portrait.

replies(2): >>40140605 #>>40141429 #

64. pbar ◴[24 Apr 24 02:02 UTC] No.40139626{6}[source]▶

>>40139496 #

Was it intentional to reply with another no true Scotsman in turn here?

replies(2): >>40139742 #>>40140352 #

65. satvikpendem ◴[24 Apr 24 02:15 UTC] No.40139716{4}[source]▶

>>40138995 #

As someone who worked in rules-based ML before the recent transformers (and unsupervised learning in general) hype, I can say rules-based approaches were laughably bad. Only now are nondeterministic approaches to ML surpassing human-level performance on tasks, something which would not have been feasible, perhaps not even possible in a finite amount of human development time, via human-created rules.

replies(1): >>40140927 #

66. someguydave ◴[24 Apr 24 02:16 UTC] No.40139730{6}[source]▶

>>40139189 #

relevant to the grandparent’s point: I am demoing FSD in my Tesla and what I find really annoying is that the old Autopilot allowed you to select a maximum speed that the car will drive. Well, on “FSD” apparently you have no choice but to hand full longitudinal control over to the model.

I am probably the 0.01% of Tesla drivers who have the computer chime when I exceed the speed limit by some offset. Very regularly, even when FSD is in “chill” mode, the model will speed by +7-9 mph on most roads. (I gotta think that the young 20 somethings who make up Tesla's audience also contributed their poor driving habits to Tesla's training data set) This results in constant beeps, even as the FSD software violates my own criteria for speed warning.

So somehow the FSD feature becomes "more capable" while becoming much less legible to the human controller. I think this is a bad thing generally but it seems to be the fad today.

replies(1): >>40141213 #

67. satvikpendem ◴[24 Apr 24 02:18 UTC] No.40139742{7}[source]▶

>>40139626 #

Yeah, I was also reading their response and was confused. "Creation comes from the soul, which we all know machines, and business people, don't have" ... "far deeper than you'll ever know", I mean, come on.

68. Narishma ◴[24 Apr 24 02:21 UTC] No.40139761{4}[source]▶

>>40138204 #

I do remember when Youtube would show more than 2 search results per page on my 23" display.

Or when they would show more than 3 results before spamming irrelevant videos.

Or when they didn't show 3 unskippable ads in a 5 minute video.

Or when they had a dislike button so you would know to avoid wasting time on low quality videos.

replies(2): >>40140281 #>>40141257 #

69. minetest2048 ◴[24 Apr 24 02:40 UTC] No.40139877{5}[source]▶

>>40139142 #

You can scrub on the mobile player, which is what makes it so frustrating that you can't do that on desktop

replies(1): >>40140267 #

70. ninjaa ◴[24 Apr 24 03:11 UTC] No.40140077{5}[source]▶

>>40139109 #

Oh that's pretty smart

71. xarope ◴[24 Apr 24 03:12 UTC] No.40140078{6}[source]▶

>>40139496 #

using English has been tried many times in the history of computing; Cobol, SQL, just to name a very few.

Still needed domain experts back then, and, IMHO, in years/decades to come

replies(1): >>40140262 #

72. dbingham ◴[24 Apr 24 03:22 UTC] No.40140147[source]▶

>>40137898 #

The hard part about starting worker owned co-ops is financing. We need good financing systems for them. People/firms who are willing to give loans for a reasonable interest, but on the scale of equity investment in tech start ups.

replies(1): >>40142744 #

73. dragonwriter ◴[24 Apr 24 03:25 UTC] No.40140179{7}[source]▶

>>40139565 #

> Most people do not (cannot?) use English that precisely.

No one can, which is why any place human interaction needs anything anywhere close to the determinacy of code, normal natural language is abandoned for domain-specific constructed languages built from pieces of natural language with meanings crafted especially for the particular domain as the interface language between the people (and often formalized domain-specific human-to-human communication protocols with specs as detailed as you’d see from the IETF).

replies(1): >>40142085 #

74. jaynate ◴[24 Apr 24 03:30 UTC] No.40140217[source]▶

>>40137898 #

I guess it depends on how much equity you own as to what is better (to your first paragraph), and how large the paycheck is (to the 2nd paragraph).

75. jaynate ◴[24 Apr 24 03:35 UTC] No.40140242{3}[source]▶

>>40139151 #

Agree you have to mix qualitative with the quantitative, but the best metrics systems don't just measure one quantity metric. They should be paired with a quality metric.

Example: User Growth & Customer Engagement

Have to have user growth and retention. If you looked at just one or the other, you'd be missing half the equation.

replies(2): >>40143399 #>>40163425 #

76. WWLink ◴[24 Apr 24 03:36 UTC] No.40140255{6}[source]▶

>>40139496 #

Getting drunk off that AI kool-aid aren't ya

replies(1): >>40140366 #

77. WWLink ◴[24 Apr 24 03:37 UTC] No.40140262{7}[source]▶

>>40140078 #

Or you can draw pretty pictures in LabVIEW lol

78. fuzztester ◴[24 Apr 24 03:37 UTC] No.40140267{6}[source]▶

>>40139877 #

What does scrubbing mean in this context? Blocking the Shorts?

replies(2): >>40140405 #>>40140435 #

79. WWLink ◴[24 Apr 24 03:40 UTC] No.40140281{5}[source]▶

>>40139761 #

> I do remember when Youtube would show more than 2 search results per page on my 23" display.

Wait what?! You "Consume Content" on a COMPUTER? What are you some kinda grandpa? Why aren't you consuming content from your phone like everyone else? Or casting it from your phone to your SMART TV! Great way to CONSUME CONTENT!

CONSUME CONTENT CONSUME CONTENT CONSUME CONTENT CONSUME CONTENT CONSUME CONTENT CONSUME CONTENT CONSUME CONTENT CONSUME CONTENT CONSUME CONTENT

replies(2): >>40140566 #>>40140568 #

80. jaynate ◴[24 Apr 24 03:42 UTC] No.40140291{4}[source]▶

>>40138977 #

Using OSS to commoditize complements plays a big role in breaking up big advantages.

There is a big tech open source consortium working on maps now to commoditize it: https://siliconangle.com/2022/12/15/aws-microsoft-meta-tomto...

Not sure it'll work. I think half the advantage comes from the integration across all these tools (maps, search, etc). Have you ever tried to use duckduckgo? It surprised me what I take for granted in Google's user experience.

replies(1): >>40160923 #

81. fuzztester ◴[24 Apr 24 03:43 UTC] No.40140297{7}[source]▶

>>40139422 #

You bet. Think nought of it. We gave the world zero, after all. Even computers owe us. ;)

https://en.m.wikipedia.org/wiki/0

82. WWLink ◴[24 Apr 24 03:45 UTC] No.40140310{5}[source]▶

>>40138595 #

Part of the reason I switched from google to duckduckgo for searching was I didn't WANT "personalization" I want my search results to be deterministic. If I am in Seattle and search for "ducks" I want the exact fucking same search results as if I travel to Rio de Janeiro and search for "ducks".

Honestly, I'd prefer my voice assistant (siri mostly) to be like that as well. It was at first, and I think everyone hated that lol.

83. BuyMyBitcoins ◴[24 Apr 24 03:46 UTC] No.40140316{6}[source]▶

>>40138772 #

“Conversion rate”. I’m not sure if you intended that pun but it’s pretty good.

84. fragmede ◴[24 Apr 24 03:52 UTC] No.40140352{7}[source]▶

>>40139626 #

If you have to ask, then you missed it

85. fragmede ◴[24 Apr 24 03:54 UTC] No.40140366{7}[source]▶

>>40140255 #

the othering of creators because they use a different paintbrush was bothering me.

replies(2): >>40140526 #>>40143041 #

86. barbariangrunge ◴[24 Apr 24 03:58 UTC] No.40140388[source]▶

>>40136741 (TP) #

Machine learning or not, seo spam sort of killed search. It’s more or less impossible to find real sites by interesting humans these days. Almost all results are Reddit, YouTube, content marketing, or seo spam. And google’s failure here killed the old school blogosphere (medium and substack only slightly count), personal websites, and forums

Same is happening to YouTube as well. Feels like it’s nothing but promoters pushing content to gain followers to sell ads or other stuff because nobody else’s videos ever surface. Just a million people gaming the algorithm and the only winners are the people who devote the most time to it. And by the way, would I like to sign up for their patreon and maybe one of their online courses?

replies(16): >>40140491 #>>40140498 #>>40140642 #>>40140643 #>>40140674 #>>40141129 #>>40141155 #>>40141191 #>>40141598 #>>40141729 #>>40141971 #>>40142421 #>>40143040 #>>40143790 #>>40146457 #>>40241886 #

87. mondobe ◴[24 Apr 24 04:00 UTC] No.40140405{7}[source]▶

>>40140267 #

Seeking to a certain part of the video. On mobile, you can do it by dragging the progress bar at the bottom of the screen.

88. nevster ◴[24 Apr 24 04:05 UTC] No.40140435{7}[source]▶

>>40140267 #

Scrubbing means quickly moving the current playback position back and forward

89. baryphonic ◴[24 Apr 24 04:13 UTC] No.40140491[source]▶

>>40140388 #

What I don't understand about this explanation is that Google's results are abysmal compared to e.g. DuckDuckGo or even Brave search. (I haven't tried Kagi, but people here rave about it as well.) Sure, all the SEO is targeting googlebot, but Google has by far more resources to mitigate SEO spam than just about anyone else. If this is the full explanation, couldn't Google just copy the strategies the (much) smaller rivals are using?

replies(3): >>40140751 #>>40141579 #>>40141604 #

90. codegladiator ◴[24 Apr 24 04:14 UTC] No.40140498[source]▶

>>40140388 #

Spam didn't kill search. Google's willingness to promote spam for ads killed Google. Google is not search.

91. stavros ◴[24 Apr 24 04:18 UTC] No.40140526{8}[source]▶

>>40140366 #

I can relate, AI is a tool, and if I want to write my code by LEGOing a bunch of AI-generated functions together, I should be able to.

92. nojs ◴[24 Apr 24 04:18 UTC] No.40140528{6}[source]▶

>>40139532 #

A case in point is the ubiquity of Pleco in the Chinese/English space. It’s a dictionary, not a translator, and pretty much every non-native speaker who learns or needs to speak Chinese uses it. It has no ML features and hasn’t changed much in the past decade (or even two). People love it because it does one specific task extremely well.

On the other hand ML has absolutely revolutionised translation (of longer text), where having a model containing prior knowledge about the world is essential.

93. Aerroon ◴[24 Apr 24 04:24 UTC] No.40140561{5}[source]▶

>>40139567 #

>The relations between human interests are too esoteric for current ML classifiers to understand.

I would go further and say that it is impossible. Human interests are contextual and change over time, sometimes in the span of minutes.

Imagine that all the videos on the internet would be on one big video website. You would watch car videos, movie trailers, listen to music, and watch porn in one place. Could the algorithm correctly predict when you're in the mood for porn and when you aren't? No, it couldn't.

The website might know what kind of cars, what kind of music, and what kind of porn you like, but it wouldn't be able to tell which of these categories you would currently be interested in.

I think current YouTube (and other recommendation-heavy services) have this problem. Sometimes I want to watch videos about programming, but sometimes I don't. But the algorithm doesn't know that. It can't know that without being able to track me outside of the website.

replies(3): >>40141005 #>>40141819 #>>40192374 #

94. skydhash ◴[24 Apr 24 04:26 UTC] No.40140566{6}[source]▶

>>40140281 #

Lol, Youtube on Apple TV is great. Mostly because I either need to find something fast or I switch it off because the remote is not conducive to skipping. But the only time I watch Youtube on my computer is for a specific video. The waste of space is horrendous. Same with Twitter (rarely visited), just a 3/4 inches wide column of posts on my 24 inch screen.

95. Aerroon ◴[24 Apr 24 04:26 UTC] No.40140568{6}[source]▶

>>40140281 #

I'm not consuming the content on my phone, because the user experience of using these services on my phone sucks. Just the app vs website difference with urls is a difference in behavior I hate let alone all the UI differences that make the mobile experience awkward.

I don't know about the TV though.

96. skydhash ◴[24 Apr 24 04:32 UTC] No.40140605{6}[source]▶

>>40139586 #

If you’re watching a single subject of interest video on your phone (TikTok type of content), it’s great. But landscape videos are more pleasant, and there’s a reason we moved on from 4:3 for media. That assumes actually watching the videos, though, and what I see is a lot of skipping.

replies(1): >>40192455 #

97. baryphonic ◴[24 Apr 24 04:33 UTC] No.40140614[source]▶

>>40136741 (TP) #

I'm glad you shared this.

My priors before reading this article were that an uncritical over-reliance on ML was responsible for the enshittification of Google search (and Google as a whole). Google seemed to give ML models carte blanche, rather than using the 80-20 rule to handle the boring common cases, while leaving the hard stuff to the humans.

I now think it's possible both explanations are true. After all, what better way to mask a product's descent into garbage than more and more of the core algorithm being a black box? Managers can easily take credit for its successes and blame the opacity for failures. After all, the "code yellow" was called in the first place because search growth was apparently stagnant. Why was that? Were the analysts manufacturing a crisis, or had search already declined to some extent?

98. madcoderme ◴[24 Apr 24 04:39 UTC] No.40140642[source]▶

>>40140388 #

It's like "Do some SEO magic and Tada!"

And who forgot the recent Reddit story.

replies(1): >>40140953 #

99. re5i5tor ◴[24 Apr 24 04:39 UTC] No.40140643[source]▶

>>40140388 #

Hard disagree. As another reply mentions, just compare the alternatives such as Kagi that aren’t breaking search by pursuing ad growth.

replies(1): >>40142527 #

100. freetinker ◴[24 Apr 24 04:44 UTC] No.40140674[source]▶

>>40140388 #

A bit chicken-and-egg. Another perspective: Google’s system incentivizes SEO spam.

Search for a while hasn’t been about searching the web as much as it has been about commerce. It taps commercial intent and serves ads. It is now an ad engine; no longer a search engine.

replies(2): >>40141000 #>>40141660 #

101. yannickt ◴[24 Apr 24 04:59 UTC] No.40140751{3}[source]▶

>>40140491 #

I've been using Kagi for a while, and I find that it delivers better results in a cleaner presentation.

102. kmeisthax ◴[24 Apr 24 05:01 UTC] No.40140765{5}[source]▶

>>40139142 #

So, there's a few ways to explain it. From a business strategy level, TikTok exists, and is a threat to YouTube, so we need to compete with it.

From a user perspective, Shorts highlights a specific format of YouTube that happened to have been around for a lot longer than people realize. TikTok isn't anything new; Vine was doing exactly the same thing a decade prior. It was shut down for what I can only assume were really dumb reasons. A lot of Viners moved to YouTube, but they had to change their creative process to fit what the YouTube algorithm valued at the time: longer videos.

Pre-Shorts, there really wasn't a good place on YouTube for short videos. Animators were getting screwed by the algorithm because you really can't do daily uploads of animation[0] and whatever you upload is going to be a few minutes max. A video essayist can rack up hundreds of thousands of hours of watch time while you get maybe a thousand.

(Fun fact: YouTube Shorts status was applied retroactively to old short videos, so there's actually Shorts that are decades old. AFAIK, some of the Petscop creator's old videos are Shorts now.)

But that's why users or creators would want to use Shorts. A lot of the UX problems with Shorts boil down to YouTube building TikTok inside of YouTube out of sheer corporate envy. To be clear, they could have used the existing player and added short-video features on top (e.g. swipe-to-skip). In fact, any Short can be opened in the standard player by just changing the URL! There's literally no difference other than a worse UI because SOMEONE wanted "launched a new YouTube vertical" on their promo packet!

FWIW the Shorts player is gradually getting its missing features back but it's still got several pain points for me. One in particular that I think exemplifies Shorts: if I watch Shorts on a portrait 1080p monitor - i.e. the perfect thing to watch vertical video on - you can't see comments. When you open the comments drawer it doesn't move over enough and the comments get cut off. The desktop experience is also really bad; occasionally scrolling just stops working, or it skips two videos per mousewheel event, or one video will just never play no matter how much I scroll back and forth.

[0] Vtubers don't count

replies(1): >>40269296 #

103. tensor ◴[24 Apr 24 05:07 UTC] No.40140801{5}[source]▶

>>40139546 #

ML has already succeeded to the point that it is ubiquitous and taken for granted. OCR, voice recognition, spam filters, and many other now boring technologies are all based on ML.

Anyone claiming it’s some sort of snake oil shouldn’t be taken seriously. Certainly the current hype around it has given rise to many inappropriate applications, but it’s a wildly successful and ubiquitous technology class that has no replacement.

replies(3): >>40141824 #>>40142095 #>>40144116 #

104. Anotheroneagain ◴[24 Apr 24 05:28 UTC] No.40140927{5}[source]▶

>>40139716 #

The thing is that AI is completely unpredictable without human curated results. Stable diffusion made me relent and admit that AI is here now for real, but I no longer think so. It's more like artificial schizophrenia. It does have some results, often plausible seeming results, but it's not real.

105. bergen ◴[24 Apr 24 05:35 UTC] No.40140953{3}[source]▶

>>40140642 #

Could you link it please? I have unfortunately no idea what you are referencing

106. dazc ◴[24 Apr 24 05:42 UTC] No.40141000{3}[source]▶

>>40140674 #

Best exercise bike articles, and such, are what lots of people actually search for. There is no incentive to provide quality work which answers these queries, hence the abundance of spam and ads.

If you want to purchase consumer products at your own expense and offer an impartial opinion on each of them then you will have no problem getting ranked highly on google. You will lose a lot of money doing so, however, and will also be plagiarized to death in a month. The sites you want to be rid of will outrank you for your own content, I have been there and have the t-shirt.

replies(1): >>40178205 #

107. FullstakBlogger ◴[24 Apr 24 05:43 UTC] No.40141005{6}[source]▶

>>40140561 #

>I would go further and say that it is impossible. Human interests are contextual and change over time, sometimes in the span of minutes.

There's a general problem in the tech world where people seem to inexplicably disregard the issue of non-reducibility. The point about the algorithm lacking access to necessary external information is good.

A dictionary app obviously can't predict what word I want to look up without simulating my mind-state. A set of probabilistic state transitions is at least a tangible shadow of typical human mind-states who make those transitions.

108. beefnugs ◴[24 Apr 24 05:46 UTC] No.40141022{4}[source]▶

>>40138995 #

yes, who exactly looked at the 70% accuracy of "live automatic closed captioning" and decided Great! ship it boys!

replies(1): >>40141225 #

109. simion314 ◴[24 Apr 24 05:46 UTC] No.40141021{6}[source]▶

>>40139189 #

> For some comparisons they claim to be safer per mile than human drivers

They are lying with statistics: in the more challenging locations and conditions the AI will give up and let the human take over, or the human notices something bad and takes over. So Tesla miles are cherry picked, and their data is not open, so a third party can't make real statistics and compare apples to apples.
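
A rough way to see the selection effect being described, with numbers invented purely for illustration (they are not Tesla's or anyone else's real figures). Assume the automated system only logs miles in easy conditions, while the human baseline includes hard conditions too:

```python
def crashes_per_million_miles(miles_by_condition, rate_by_condition):
    """Blend per-condition crash rates into one headline rate, weighted by miles driven."""
    total_miles = sum(miles_by_condition.values())
    crashes = sum(
        miles * rate_by_condition[condition]
        for condition, miles in miles_by_condition.items()
    )
    return crashes / total_miles

# All figures below are hypothetical.
# Miles are in millions; rates are crashes per million miles.
human_miles = {"easy": 80, "hard": 20}
human_rate = {"easy": 1.0, "hard": 5.0}
system_miles = {"easy": 100}   # the system hands hard miles back to the human
system_rate = {"easy": 1.2}

human_overall = crashes_per_million_miles(human_miles, human_rate)
system_overall = crashes_per_million_miles(system_miles, system_rate)

# Headline comparison: the system looks safer overall...
print(system_overall < human_overall)
# ...while being worse than humans on the only roads it actually drives.
print(system_rate["easy"] > human_rate["easy"])
```

Both lines print `True`: the blended human rate is dragged up by the hard miles the system never drives, so the headline number can favour the system even when it is worse like-for-like.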

110. choppaface ◴[24 Apr 24 06:03 UTC] No.40141155[source]▶

>>40140388 #

SEO Spam didn't kill search so much as Google failed to retain Matt Cutts or replicate his community involvement https://www.searchenginejournal.com/matt-cutts-resigns-googl...

replies(1): >>40141732 #

111. haspok ◴[24 Apr 24 06:10 UTC] No.40141191[source]▶

>>40140388 #

I don't know, but Youtube seems to have a more solid algorithm. I'm typically not subscribed to any channel, yet the content I want to watch does find me reasonably well. Of course, heavily promoted material also, but I just click "not interested in channel" and it disappears for a while. And I still get some meaningful recommendations if I watch a video in a certain topic. Youtube has its problems, of course, but in the end I can't complain.

replies(1): >>40141909 #

112. alovelace ◴[24 Apr 24 06:11 UTC] No.40141197{4}[source]▶

>>40138255 #

Just because you're an atheist doesn't mean you won't engage with religious content though. YT rewards all kinds of engagement not just positive ones. I.e. if you leave a snide remark or just a dislike on a religious short that still counts as engagement.

replies(1): >>40143636 #

113. throwaway2037 ◴[24 Apr 24 06:14 UTC] No.40141213{7}[source]▶

>>40139730 #

I have no experience with Tesla and their self-driving features. When you wrote "chill" mode, I assume it means the lowest level of aggressiveness. Did you contact Tesla to complain the car is still too aggressive? There should be a mode that tries to drive exactly the speed limit, where reasonable -- not over or under.

replies(1): >>40143611 #

114. throwaway2037 ◴[24 Apr 24 06:15 UTC] No.40141225{5}[source]▶

>>40141022 #

My guess: They are hoping user feedback will help them to fix the bugs later -- iterate to 99%. Plus, they are probably under unrealistic deadlines to deliver _something_.

115. alovelace ◴[24 Apr 24 06:16 UTC] No.40141227{4}[source]▶

>>40138204 #

It all depends on your use case but a lot of people seem to be in agreement it fell off in the mid to late 10s and the suggestions became noticeably worse.

116. throwaway2037 ◴[24 Apr 24 06:20 UTC] No.40141257{5}[source]▶

>>40139761 #

    > Or when they didn't show 3 unskippable ads in a 5 minute video.

On desktop Chrome, a modern ad-blocking browser extension will block 100% of YouTube adverts. I haven't watched one, literally, in years. I don't watch YouTube from a mobile phone, but I think the situation is different. (Can anyone else comment about the mobile experience?)

replies(1): >>40141695 #

117. watwut ◴[24 Apr 24 06:49 UTC] No.40141429{6}[source]▶

>>40139586 #

I don't mind portrait. I mind the inability to jump forward in the video.

118. freeone3000 ◴[24 Apr 24 07:17 UTC] No.40141579{3}[source]▶

>>40140491 #

When a large search engine deranks spam websites, the spam websites complain! Loudly! With Google they have a big juicy target with lots of competing ventures for an antitrust case; no such luck for Kagi or DDG.

replies(1): >>40146100 #

119. mrkeen ◴[24 Apr 24 07:20 UTC] No.40141596[source]▶

>>40136741 (TP) #

> There is a semi-famous internal document he wrote where he argued against the other search leads that Google should use less machine-learning, or at least contain it as much as possible, so that ranking stays debuggable and understandable by human search engineers.

There's a lot of ML hate here, and I simply don't see the alternative.

To rank documents, you need to score them. Google uses hundreds of scoring factors (I've seen the number 200 thrown about, but it doesn't really matter if it's 5 or 1000.) The point is you need to sum these weights up into a single number to find out if a result should be above or below another result.

So, if:

  - document A is 2Kb long, has 14 misspellings, matches 2 of your keywords exactly, matches a synonym of another of your keywords, and was published 18 months ago, and

  - document B is 3Kb long, has 7 misspellings, matches 1 of your keywords exactly, matches two more keywords by synonym, and was published 5 months ago

Are there any humans out there who want to write a traditional forward-algorithm to tell me which result is better?
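
For concreteness, here is roughly what a hand-written "forward" scorer for those two documents could look like. The feature names and every weight are invented for this sketch; nothing here reflects how Google actually scores anything.

```python
from dataclasses import dataclass

@dataclass
class Doc:
    name: str
    length_kb: float
    misspellings: int
    exact_matches: int
    synonym_matches: int
    age_months: int

# Hypothetical hand-picked weights: a human has to choose and defend each one.
WEIGHTS = {
    "exact_matches": 3.0,
    "synonym_matches": 1.5,
    "misspellings": -0.2,
    "age_months": -0.05,
    "length_kb": 0.1,
}

def score(doc, weights):
    """Weighted sum of the document's features."""
    return sum(weight * getattr(doc, feature) for feature, weight in weights.items())

# The two documents from the example above.
doc_a = Doc("A", length_kb=2, misspellings=14, exact_matches=2, synonym_matches=1, age_months=18)
doc_b = Doc("B", length_kb=3, misspellings=7, exact_matches=1, synonym_matches=2, age_months=5)

ranked = sorted([doc_a, doc_b], key=lambda d: score(d, WEIGHTS), reverse=True)
print([d.name for d in ranked])

# Softening only the misspelling penalty is enough to flip the order.
tweaked = dict(WEIGHTS, misspellings=-0.05)
reranked = sorted([doc_a, doc_b], key=lambda d: score(d, tweaked), reverse=True)
print([d.name for d in reranked])
```

With the first set of weights B ranks above A; with the misspelling penalty changed from -0.2 to -0.05, A ranks above B. The code is trivially debuggable, but the hard part is justifying the numbers, and that only gets harder with hundreds of factors.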

replies(4): >>40141644 #>>40141688 #>>40144593 #>>40165827 #

120. underdeserver ◴[24 Apr 24 07:21 UTC] No.40141598[source]▶

>>40140388 #

I've heard this argument again and again, but I never see any explanation as to why SEO is suddenly in the lead in this cat-and-mouse game. They were trying ever since Google got 90%+ market share.

I think it's more likely that Google stopped really caring.

replies(3): >>40141654 #>>40154580 #>>40156768 #

121. raincole ◴[24 Apr 24 07:21 UTC] No.40141604{3}[source]▶

>>40140491 #

Have you read the article this thread is about?

To summarize it: Google reverted an algorithm that detected SEO spam in 2019.

(Note that I have never worked for Google and I don't know whether it's true or not. It's just what this article says.)

replies(1): >>40142939 #

122. raincole ◴[24 Apr 24 07:24 UTC] No.40141626{4}[source]▶

>>40138995 #

But rule-based machine translation, from what I've seen, is just so bad. ChatGPT (and other LLMs) are miles ahead. After seeing what ChatGPT does, I can't even call rule-based machine translation "translation".

*Disclaimer: as someone who's not an AI researcher but did quite some human translation works before.

123. ◴[24 Apr 24 07:27 UTC] No.40141644[source]▶

>>40141596 #

124. rob74 ◴[24 Apr 24 07:29 UTC] No.40141654{3}[source]▶

>>40141598 #

Well yeah, it's in the article - at some point, they switched completely to metrics (i.e. revenue) driven management and forgot that it's the quality of results that actually made Google what it is. And, with a largely captive audience (Google being the default-search-engine-that-most-people-don't-bother-or-don't-know-how-to-change in Chrome, Android, on Chromebooks etc.), they arguably don't have to care anymore...

125. somenameforme ◴[24 Apr 24 07:31 UTC] No.40141660{3}[source]▶

>>40140674 #

Absolutely this. I don't think many people consider how odd it is that the largest internet advertising company in the world and the largest search engine company in the world are one and the same, and just how overt a conflict of interest that is, so far as providing quality service goes. It would be akin to if the largest telephone service company in the world was also the largest phone maker in the world. Oh wait, that did happen [1] - and we broke them up because it's obviously extremely detrimental to the functioning of a healthy market.

[1] - https://en.wikipedia.org/wiki/Breakup_of_the_Bell_System

126. datadeft ◴[24 Apr 24 07:34 UTC] No.40141688[source]▶

>>40141596 #

You do not need to. Counting how many links are pointing to each document is sufficient if you know how long each link has existed (spammers' link creation time distribution is widely different from natural link creation times, and there are many other details that you can use to filter out spammers).
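
A minimal sketch of that idea, assuming you have a creation date for every inbound link. The function names, the 180-day minimum age, and the 7-day burst window are arbitrary choices made up for illustration, not anything a real search engine is known to use:

```python
from datetime import date

def link_score(link_dates, today, min_age_days=180):
    """Count only inbound links that have existed for at least min_age_days."""
    return sum(1 for created in link_dates if (today - created).days >= min_age_days)

def looks_bursty(link_dates, window_days=7, max_share=0.5):
    """Flag a page when more than max_share of its links appeared within one short window."""
    if not link_dates:
        return False
    ordered = sorted(link_dates)
    best = 0
    start = 0
    # Sliding window over the sorted creation dates.
    for end in range(len(ordered)):
        while (ordered[end] - ordered[start]).days > window_days:
            start += 1
        best = max(best, end - start + 1)
    return best / len(ordered) > max_share

today = date(2024, 4, 24)
# Hypothetical link histories: one accumulated over years, one created in a burst.
organic = [date(2021, 1, 5), date(2022, 3, 9), date(2023, 6, 1), date(2023, 11, 20)]
spammy = [date(2022, 1, 1), date(2024, 4, 1), date(2024, 4, 2), date(2024, 4, 2), date(2024, 4, 3)]

print(link_score(organic, today), looks_bursty(organic))
print(link_score(spammy, today), looks_bursty(spammy))
```

The thresholds are the weak point: each one is a tunable number that spammers can probe, for example by creating their links slowly.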

replies(2): >>40141733 #>>40142033 #

127. snickerer ◴[24 Apr 24 07:35 UTC] No.40141695{6}[source]▶

>>40141257 #

On Android devices I use the app PipePipe to avoid the YouTube ad hell. I recommend it.

I also use Firefox for Android, which has add-on support. uBlock Origin works on the phone and disables a lot of the ad horror.

replies(1): >>40150822 #

128. raincole ◴[24 Apr 24 07:37 UTC] No.40141708{4}[source]▶

>>40137911 #

ML is quite a well adopted technology. iPhones have had ML built in since about 2017. It has been more than 5 years.

129. willvarfar ◴[24 Apr 24 07:41 UTC] No.40141729[source]▶

>>40140388 #

I think a case can be made that the spam problem can be traced all the way back to Google buying Doubleclick.

It's really easy to spot the crap websites that are scraping content-creating websites ... because they monetize by adding ads.

If Google was _only_ selling ads on the search results page, then it could promote websites that are sans ads.

Instead, it is incentivised to push users to websites that contain ads, because it also makes money there.

And that means scraping other sites to slap your ads onto them can be very profitable for the scammers.

replies(2): >>40141810 #>>40143563 #

130. arromatic ◴[24 Apr 24 07:41 UTC] No.40141732{3}[source]▶

>>40141155 #

What did he use to do? Your comment seems contradictory: Cutts seems to have been on the anti-spam side, but your comment implies SEO did not kill search. Is SEO not part of spam?

replies(2): >>40143509 #>>40146962 #

131. raincole ◴[24 Apr 24 07:41 UTC] No.40141733{3}[source]▶

>>40141688 #

> spammers link creation time distribution is widely different to natural link creation times

Yes, this is a statistical method. Guess what machine learning is and what it actually excels at?

132. rvba ◴[24 Apr 24 07:58 UTC] No.40141810{3}[source]▶

>>40141729 #

They hired people who introduce Jack Welch methods.

This is like in that Steve Jobs video about product people being kicked out and replaced by ones who don't care about the product:

https://m.youtube.com/watch?v=P4VBqTViEx4

They will not make good search. That is not their priority.

replies(1): >>40144986 #

133. nox101 ◴[24 Apr 24 08:00 UTC] No.40141819{6}[source]▶

>>40140561 #

I think there are things they could do and that ML could maybe help?

* They could let me directly enter my interests instead of guessing

* They could classify videos by expertise (tags or ML) and stop recommending beginner videos to someone who expresses an interest in expert videos.

* They could let me opt out of recommending videos I've already watched

* They could separate sites into larger categories and stop recommending things not in that category. For me personally, when I go to youtube.com I don't want music, but 30-70% of the recommendations are for music. If they split into 2 categories (videos.youtube.com - no music) and (music.youtube.com - only music) they'd end up recommending far more to me that I'm actually interested in at the time. They could add other broad categories (gaming.youtube.com, documentaries.youtube.com, science.youtube.com, cooking.youtube.com, ..., as deep as they want). Classifying a video could be ML or creator decided. If you're only allowed one category there would be an incentive to not mis-classify. If they need more incentive they could dis-recommend your videos if you mis-classify too many/too often.

* They could let me mark videos as watched and actually track that the same as read/unread email. As it is, if you click "not interested -> already watched" they don't mark the video as visibly watched (the red bar under the video). Further, if you start watching again you lose the red bar (it gets reset to your current position). I get that tracking where you are in a video is something that's different for email vs video, but at the same time, if I made it 90% of the way through then for me at least, that's "watched" - same as "read" for email - and I'd like it "archived" (don't recommend this to me again) even if I start watching it again (same as reading an email marked as "read").
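
A sketch of how the last two points might fit together, assuming the site stores the furthest position ever reached per video rather than the current playback position. All names and the 90% threshold are hypothetical and only illustrate the suggestion:

```python
from dataclasses import dataclass

@dataclass
class Video:
    video_id: str
    category: str      # exactly one broad category per video
    duration_s: int

WATCHED_THRESHOLD = 0.9  # "90% of the way through counts as watched"

def is_watched(video, furthest_position_s):
    """Watched status depends on the furthest point ever reached, so rewatching never resets it."""
    return furthest_position_s >= WATCHED_THRESHOLD * video.duration_s

def recommend(candidates, furthest_positions, wanted_category):
    """Keep only unwatched videos in the category the user asked for."""
    return [
        video for video in candidates
        if video.category == wanted_category
        and not is_watched(video, furthest_positions.get(video.video_id, 0))
    ]

catalog = [
    Video("v1", "science", 600),
    Video("v2", "music", 240),
    Video("v3", "science", 1200),
]
# Furthest positions in seconds: v1 was watched to 95%, v3 only to 10%.
furthest = {"v1": 570, "v3": 120}

print([v.video_id for v in recommend(catalog, furthest, "science")])
```

Here only `v3` is recommended: `v2` is in the wrong category and `v1` stays "watched" no matter where a later rewatch happens to stop.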

replies(1): >>40142586 #

134. yen223 ◴[24 Apr 24 08:01 UTC] No.40141824{6}[source]▶

>>40140801 #

Thank you for this.

Reading these comments I thought I stepped into some alternate timeline when we don't already have widespread ML all over the place.

Like, nobody has done rules-based image recognition for a decade now!

135. rvba ◴[24 Apr 24 08:02 UTC] No.40141829{5}[source]▶

>>40139567 #

They probably optimize your engagement NOW - with clickbaity videos. So their KPIs show big increases. But in the long term you realize that what you watch is garbage and stop watching altogether.

Someone probably changed the engine that shows videos for you - exactly as with search.

replies(1): >>40149937 #

136. jajko ◴[24 Apr 24 08:21 UTC] No.40141909{3}[source]▶

>>40141191 #

I don't think youtube is trying that hard to desperately sell stuff to you via the home screen recommendation algorithm. And I agree it's bearable and what you describe works more or less well, although I am still trying to get rid of anything related to Jordan Peterson, whom I liked before and detest now after his drug addiction / mental breakdown; it just keeps popping back up from various sources, literal whack-a-mole.

I wish there was some way to tell "please ignore all videos that contain these strings, and I don't mean only for next 2 weeks".

Youtube gets their ads revenue from before/during video, so they can be nicer to users.

137. Ambolia ◴[24 Apr 24 08:33 UTC] No.40141971[source]▶

>>40140388 #

For me what killed search was 2016, after that year if some search term is "hot news" it becomes impossible to learn anything about it that wasn't published in the last week and you just get the same headline repeated 20 times in slightly different wording about it.

After that I only use search for technical problems, and word of mouth or specific authors for everything else.

replies(1): >>40148553 #

138. mrkeen ◴[24 Apr 24 08:44 UTC] No.40142033{3}[source]▶

>>40141688 #

> You do not need to.

Ranking means deciding which document (A or B) is better to return to the user when queried.

Not writing a traditional forward-algorithm to rank these documents implies one of the following:

- You write a "backward" algorithm (ML, regression, statistics, whatever you want to call it).

- You don't use algorithms to solve it. An army of humans chooses the rankings in real time.

- You don't rank documents at all.

> Counting how many links are pointing to each document is sufficient if you know how long that link existed

- Link-counting (e.g. PageRank) is query-independent evidence. If that's sufficient for you, you'll always return the same set of documents to each user, regardless of what they typed into the search box.

At best you've just added two more ranking factors to the mix:

  - document A
    qie:
        length: 2Kb
        misspellings: 14
        age: 18 months
      + in-links: 4
      + in-link-spamminess: 2.31E4
    qde:
        matches 2 of your keywords exactly
        matches a synonym of another of your keywords

  - document B
    qie:
        length: 3Kb
        misspellings: 7
        age: 5 months
      + in-links: 2
      + in-link-spamminess: 2.54E3
    qde:
        matches 1 of your keywords exactly
        matches 2 keywords by synonym

So I ask again:

- Which document matches your query better, A or B?

- How did you decide that, such that not only can you program a non-ML algorithm to perform the scoring, but you're certain enough of your decision that you can fix the algorithm when it disagrees with you ( >> debuggable and understandable by human search engineers )
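
To illustrate the query-independence point, here is a small sketch. The in-link counts are the ones from the listing above; the page texts, the term-overlap signal, and both weights are invented for illustration:

```python
# In-link counts taken from the example above; the page texts are hypothetical.
IN_LINKS = {"A": 4, "B": 2}
PAGE_TEXT = {
    "A": "cheap exercise bike reviews and deals",
    "B": "how to repair a bicycle chain at home",
}

def rank_by_links_only(query, docs):
    """Query-independent ranking: the query argument is never consulted."""
    return sorted(docs, key=lambda d: IN_LINKS[d], reverse=True)

def rank_combined(query, docs, w_match=2.0, w_links=0.5):
    """Adds one query-dependent signal: how many query terms appear on the page."""
    terms = set(query.lower().split())

    def score(d):
        overlap = len(terms & set(PAGE_TEXT[d].split()))
        return w_match * overlap + w_links * IN_LINKS[d]

    return sorted(docs, key=score, reverse=True)

for query in ("exercise bike", "repair bicycle chain"):
    print(query, rank_by_links_only(query, ["A", "B"]), rank_combined(query, ["A", "B"]))
```

Links alone return A before B for both queries. Adding the match signal puts B first for "repair bicycle chain", but only because someone picked `w_match` and `w_links`, which is exactly the weighting decision the question is about.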

replies(3): >>40142262 #>>40146216 #>>40155577 #

139. cultofmetatron ◴[24 Apr 24 08:53 UTC] No.40142085{8}[source]▶

>>40140179 #

I gotta say, I love how you use english to perfectly demonstrate how imprecise english is without pre-understood context to disambiguate meaning.

140. oblio ◴[24 Apr 24 08:55 UTC] No.40142095{6}[source]▶

>>40140801 #

That ML I have no problem with.

This new ML that's supposed to be the basis for an entire new economic wave, that I mostly dislike.

But I guess that's how we build new things... We explore and throw away 80% of what we've built.

141. datadeft ◴[24 Apr 24 09:24 UTC] No.40142262{4}[source]▶

>>40142033 #

Statistical methods are debuggable. Is PageRank not debuggable? I am not sure where ML starts and statistics end.

142. seospamsuck ◴[24 Apr 24 09:44 UTC] No.40142365{3}[source]▶

>>40141129 #

This is the correct insight. Google has enough machine learning prowess that they could absolutely offload, with minimal manhours, the creation of a list ranking a bunch of blogspam sites and give them a reverse score by how much they both spam articles or how much they spread the content over the page. Then apply that score to their search result weights.

And I know they could because someone did make that list and posted it here last year.
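
A rough sketch of what "apply that score to their search result weights" could look like, assuming each domain carries a spam score between 0 and 1 and that the score simply scales relevance down. The `rerank` function, the domains, and the numbers are all hypothetical.

```python
def rerank(results, spam_score):
    """Demote results by a per-domain spam score.

    results: list of (domain, relevance) pairs.
    spam_score: dict mapping domain -> value in [0, 1]; unknown domains count as 0.
    """
    adjusted = [
        (domain, relevance * (1.0 - spam_score.get(domain, 0.0)))
        for domain, relevance in results
    ]
    return sorted(adjusted, key=lambda item: item[1], reverse=True)

# Hypothetical input: the spammy domain starts out with the higher relevance.
RESULTS = [("blogspam.example", 0.9), ("forum.example", 0.6)]
SPAM = {"blogspam.example": 0.8}

RERANKED = rerank(RESULTS, SPAM)
print([domain for domain, _ in RERANKED])
```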

replies(1): >>40143600 #

143. raxxorraxor ◴[24 Apr 24 09:52 UTC] No.40142421[source]▶

>>40140388 #

Machine learning is probably as much or even more susceptible to SEO spam.

Problem is that the rules of search engines created the dubious field of SEO in the first place. They are not entirely the innocent victim here.

Arcane and intransparent measures get you ahead. So arcane that you instantly see that it does not correspond with quality content at all, which evidently leads to a poor result.

I wish there was an option to hide every commercial news or entertainment outlet completely. Those are of course in on SEO for financial reasons.

replies(1): >>40142503 #

144. faeriechangling ◴[24 Apr 24 10:05 UTC] No.40142503{3}[source]▶

>>40142421 #

>I wish there was an option to hide every commercial news or entertainment outlet completely.

There are always plugins, or you can subscribe to Kagi, although I don't think there's any blocklist preconfigured for "all commercial news websites".

145. faeriechangling ◴[24 Apr 24 10:08 UTC] No.40142527{3}[source]▶

>>40140643 #

Kagi isn't amazing, it's just not bad and it really makes plain how badly Google has degraded into an ad engine. All it takes to beat Google is giving okay quality search results.

146. otikik ◴[24 Apr 24 10:08 UTC] No.40142531{4}[source]▶

>>40138995 #

Perhaps using a ML to craft the deterministic rules and then have a human go over them is the sweet spot.

replies(1): >>40149981 #

147. rugina ◴[24 Apr 24 10:08 UTC] No.40142534{4}[source]▶

>>40138995 #

I think NM translation was broken all along. Not in the neural network part but in choosing the right answer. https://aclanthology.org/2020.coling-main.398.pdf

replies(2): >>40149976 #>>40167342 #

148. fuzztester ◴[24 Apr 24 10:16 UTC] No.40142586{7}[source]▶

>>40141819 #

Those are some good suggestions, particularly the first one:

>let me directly enter my interests

replies(1): >>40192451 #

149. reverius42 ◴[24 Apr 24 10:41 UTC] No.40142744{3}[source]▶

>>40140147 #

The problem is risk: most new businesses will go under. Who’s going to take on that unreasonable risk without commensurate reward (high interest loan rate, if any, or equity)?

Co-ops could go the angel/VC route for funding if they don’t give up a controlling share.

150. baryphonic ◴[24 Apr 24 11:12 UTC] No.40142939{4}[source]▶

>>40141604 #

I wasn't responding to the article; I was responding to the claim that Google's results are bad because of all the SEO. It's a claim I've heard from Google apologists including some people I know at Google. I think it's nonsense both for the reasons I stated and for the reasons enumerated in the article.

replies(1): >>40146914 #

151. deepGem ◴[24 Apr 24 11:27 UTC] No.40143040[source]▶

>>40140388 #

This explodes for search terms dealing with questions related to bugs or issues or how to dos. Almost all top results are YT videos, each of which will follow the same pattern. First 10 secs garbage followed by request for subscribe and/or sponsorship content then followed by what you want.

152. karma_pharmer ◴[24 Apr 24 11:27 UTC] No.40143041{8}[source]▶

>>40140366 #

please go other yourself somewhere else

replies(1): >>40143823 #

153. karma_pharmer ◴[24 Apr 24 11:30 UTC] No.40143062{5}[source]▶

>>40138686 #

Dear sir, may I interest you in the initial coin airdrop of WSDLCoin? It is going straight to the moon.

154. DanielHB ◴[24 Apr 24 12:08 UTC] No.40143399{4}[source]▶

>>40140242 #

I think that a good portion of the problem is that there are three groups involved in metrics:

1) People setting the metrics

2) People implementing/calculating the metrics

3) People working on improving the metrics (ie product work)

2 is especially complicated for a lot of software products because it can sometimes be really hard to measure and can be tweaked/manipulated. For example, the MAU twitter figures from the buyout that Musk keeps complaining about, or Blizzard constantly switching their MAU definition.
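
As a toy illustration of how much the definition matters, the sketch below counts monthly active users over the same event log under three different notions of "active". The event log, the action names, and the `mau` helper are made up; no real company's definition is implied.

```python
from datetime import date

# Hypothetical event log: (user, day, action).
EVENTS = [
    ("ann", date(2024, 4, 2), "login"),
    ("ann", date(2024, 4, 9), "post"),
    ("bob", date(2024, 4, 5), "login"),
    ("cat", date(2024, 4, 20), "push_notification_opened"),
    ("dan", date(2024, 3, 30), "post"),
]

def mau(events, year, month, counted_actions):
    """Count distinct users with at least one counted action in the given month."""
    return len({
        user
        for user, day, action in events
        if (day.year, day.month) == (year, month) and action in counted_actions
    })

STRICT = mau(EVENTS, 2024, 4, {"post"})
MEDIUM = mau(EVENTS, 2024, 4, {"post", "login"})
LOOSE = mau(EVENTS, 2024, 4, {"post", "login", "push_notification_opened"})
print(STRICT, MEDIUM, LOOSE)
```

The same data yields three different MAU figures, so whoever picks `counted_actions` effectively picks the number.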

Often 2 and 3 are the same people and 1 is almost always upper management. I argue that 1 and 2 should be a single group of people (that doesn't work on the product at all) and not directly subject to upper management and not tracked by the same metrics they implement (or tracked by any metrics at all).

155. ◴[24 Apr 24 12:21 UTC] No.40143509{4}[source]▶

>>40141732 #

156. octopusRex ◴[24 Apr 24 12:28 UTC] No.40143563{3}[source]▶

>>40141729 #

We need a Reverse Google search that will weed out the garbage.

replies(2): >>40147864 #>>40182732 #

157. octopusRex ◴[24 Apr 24 12:33 UTC] No.40143600{4}[source]▶

>>40142365 #

I'm waiting for folks to implement a Reverse Google Search.

158. someguydave ◴[24 Apr 24 12:35 UTC] No.40143611{8}[source]▶

>>40141213 #

Yes there is a “chill” mode that refers to maximum allowed acceleration and “chill mode” that refers to the level of aggressiveness with autopilot. With both turned on the car still exceeds the speed limit by quite a bit. I am sure Tesla is aware.

159. gverrilla ◴[24 Apr 24 12:37 UTC] No.40143636{5}[source]▶

>>40141197 #

Yes I know, not the case, and before you ask, I also don't engage with atheist videos. But that's only one example: the recommendations are really bad in a lot of ways for me.

160. jesuslop ◴[24 Apr 24 12:55 UTC] No.40143790[source]▶

>>40140388 #

Much agreed, and this is prompting me to experiment with other search engines to see if they also cut off the interesting human sites. With today's Google I feel herded.

161. fragmede ◴[24 Apr 24 12:58 UTC] No.40143823{9}[source]▶

>>40143041 #

Hit a nerve, it seems. Apologies.

162. pfannkuchen ◴[24 Apr 24 12:59 UTC] No.40143825{4}[source]▶

>>40138204 #

YouTube seems to treat popular videos as their own interest category and it’s very aggressive about recommending them if you show any interest at all. If you watch even one or two popular videos (like in the millions of views), suddenly the quality of the recommendations drops off a cliff, since it is suggesting things that aren’t relevant to your interest categories, it’s just suggesting popular things.

If I entirely avoid watching any popular videos, the recommendations are quite good and don’t seem to include anything like what you are seeing. If I don’t entirely avoid them, then I do get what you are seeing (among other nonsense).

163. Nullabillity ◴[24 Apr 24 13:25 UTC] No.40144116{6}[source]▶

>>40140801 #

Call me back when you have voice recognition that doesn't constantly fail spectacularly.

replies(1): >>40144538 #

164. tensor ◴[24 Apr 24 14:03 UTC] No.40144538{7}[source]▶

>>40144116 #

Voice recognition will never be rule based.

165. JohnFen ◴[24 Apr 24 14:28 UTC] No.40144802{3}[source]▶

>>40137885 #

Well, it depends on the ML person. I work on industrial ML and DL systems every day and I'm the one who made that comment.

166. ◴[24 Apr 24 14:31 UTC] No.40144845{5}[source]▶

>>40139546 #

167. ◴[24 Apr 24 14:44 UTC] No.40144986{4}[source]▶

>>40141810 #

168. baryphonic ◴[24 Apr 24 16:10 UTC] No.40146100{4}[source]▶

>>40141579 #

This is an interesting theory. Is there evidence that it's happening? Is Big SEO unreasonably effective at lobbying the Justice Department?

replies(2): >>40147345 #>>40151821 #

169. skyfallsin ◴[24 Apr 24 16:14 UTC] No.40146159[source]▶

>>40136741 (TP) #

@gregw134 Thank you for sharing! I've never worked at Google, but really curious what the engineering context is when you say "needs a launch" in the last line.

replies(1): >>40152275 #

170. srean ◴[24 Apr 24 16:18 UTC] No.40146216{4}[source]▶

>>40142033 #

A few minor nitpicks. PageRank is not just link counting; who is linking to the page matters. Among the linking pages, those that are ranked higher matter more, and how does one figure out their rank? By PageRank. It may sound a bit like chicken and egg, but it's fine: it's the fixed point of the self-referential definition.
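
The fixed point can be approximated by repeated substitution (power iteration). Below is a minimal sketch of the textbook formulation, assuming a damping factor of 0.85 and an even spread of rank from pages with no out-links; the toy graph and the `pagerank` function are illustrative and not Google's production algorithm.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    links: dict mapping each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its current rank evenly to the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page (no out-links): spread its rank over all pages.
                for t in pages:
                    new_rank[t] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical graph: "x" and "y" each have exactly one in-link,
# but x's comes from the well-linked "hub" and y's from the unlinked "c".
GRAPH = {"a": ["hub"], "b": ["hub"], "hub": ["x"], "c": ["y"]}
RANKS = pagerank(GRAPH)
print(sorted(RANKS.items(), key=lambda item: item[1], reverse=True))
```

Pure link counting would score `x` and `y` the same (one in-link each); here `x` ends up higher because its single link comes from a higher-ranked page.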

PageRank-based ranking will not return the same set of pages. It's true that the ranking is global in the vanilla version of PageRank, but what gets returned in rank order is the set of qualifying pages. The set of qualifying pages is very much query-sensitive. PageRank also depends on a seed set of initial pages; these may also be set in a query-dependent way.
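
A small sketch of that split, assuming the simplest possible notion of "qualifying" (the page contains every query term) and a precomputed global score. The documents, scores, and `search` function are hypothetical.

```python
def search(query, documents, global_rank):
    """Return qualifying pages for the query, ordered by a query-independent score.

    documents: dict mapping page id -> page text.
    global_rank: dict mapping page id -> global score (e.g. a PageRank value).
    """
    terms = query.lower().split()
    qualifying = [
        page_id
        for page_id, text in documents.items()
        if all(term in text.lower().split() for term in terms)
    ]
    return sorted(qualifying, key=lambda page_id: global_rank[page_id], reverse=True)

# Hypothetical corpus and global scores.
DOCS = {
    "p1": "cheap flights to oslo",
    "p2": "oslo weather forecast",
    "p3": "cheap hotels in rome",
}
GLOBAL_RANK = {"p1": 0.2, "p2": 0.5, "p3": 0.3}

print(search("oslo", DOCS, GLOBAL_RANK))
print(search("cheap", DOCS, GLOBAL_RANK))
```

The global scores never change, yet the two queries return different sets of pages because the filtering step depends on the query.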

All this is a little moot now, because PageRank, even defined in this way, stopped being useful a long time ago.

171. eitland ◴[24 Apr 24 16:32 UTC] No.40146457[source]▶

>>40140388 #

Most of the problems I complain about are not related to SEO spam but to Google including sites that do not contain my search terms anywhere despite my use of doublequotes and the verbatim operator.

As for SEO spam a huge chunk of it would have disappeared I think if Google had created the much requested personal blacklist that we used to ask them for.
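
For illustration only, the core of a personal blocklist is small; the sketch below drops any result whose host is a blocked domain or one of its subdomains. The `is_blocked` and `filter_results` helpers and the example URLs are invented here, and this says nothing about what it would take to run such a feature at Google's scale.

```python
from urllib.parse import urlparse

def is_blocked(url, blocklist):
    """True if the URL's host is a blocked domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == domain or host.endswith("." + domain) for domain in blocklist)

def filter_results(urls, blocklist):
    """Keep only results that are not on the user's personal blocklist."""
    return [url for url in urls if not is_blocked(url, blocklist)]

# Hypothetical result list and blocklist.
URLS = [
    "https://spam.example/best-headphones",
    "https://www.spam.example/top-10",
    "https://notspam.example/review",
    "https://good.example/forum/thread",
]
BLOCKLIST = {"spam.example"}

KEPT = filter_results(URLS, BLOCKLIST)
print(KEPT)
```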

It was always "actually much harder than anyone of you who don't work here can imagine for reasons we cannot tell or you cannot understand" or something like that problem, but bootstrapped Kagi managed to do it - and their results are so much better that I don't usually need it.

172. eitland ◴[24 Apr 24 16:58 UTC] No.40146914{5}[source]▶

>>40142939 #

You are totally correct I think.

This isn't about what is possible.

It is about Google not wanting to say goodbye to the sweet dollars from spammy sites.

Otherwise making the probably number one requested feature, a personal block list, wouldn't have been impossible for a company with so many bright minds.

I mean: little bootstrapped Kagi had it either from the beginning or at least since shortly after they launched.

People always think they lost against SEO spam. But my main reason for quitting as soon as an alternative showed up was because they started to overrule my searches and search for what they thought I wanted to search for.

For a while I kept it at bay by using doublequotes and verbatim but none of those have worked reliably for a decade now.

That isn't SEO spam. That is poor engineering or "we know better than you" attitude.

replies(1): >>40151201 #

173. eitland ◴[24 Apr 24 17:02 UTC] No.40146962{4}[source]▶

>>40141732 #

Even when matt_cutts used to be here it was still impossible to get him (or anyone else) to care about search results including lots of results I never asked for.

Not low quality pages that spammed high ranking words but pages that simply weren't related to the query at all, as evidenced by the fact that they didn't contain the keywords I searched for at all!

174. freeone3000 ◴[24 Apr 24 17:34 UTC] No.40147345{5}[source]▶

>>40146100 #

It’s definitely a concern where I work (not Google). Deranking anybody who happens to share a vertical we’re in is colorable as an anticompetitive action[0], and due to our dominance in another sector (not search), effectively any anticompetitive action anywhere is a no-go. And since we don’t have time to review whether a particular competitor also competes in one of our verticals and run everything by legal, nothing gets de-ranked manually.

0: for context, the US DOJ does not take antitrust action against companies simply for market dominance; it requires market dominance plus an anticompetitive action. However, they don’t like monopolies, so effectively any pretext can be used; see the Apple lawsuit or the 90s MS lawsuits for how little it takes.

175. Terr_ ◴[24 Apr 24 17:46 UTC] No.40147483{3}[source]▶

>>40139537 #

I've heard a similar critique for Google launching new products and then letting them die, where it's really driven by their policies and practice around what gets someone a promotion and what doesn't.

replies(1): >>40152654 #

176. nothercastle ◴[24 Apr 24 17:52 UTC] No.40147562[source]▶

>>40138051 #

Engineers love simplicity but management hates it and won’t promote people that strive towards it. A simple system is the most complex system to design.

177. x0x0 ◴[24 Apr 24 18:07 UTC] No.40147693{3}[source]▶

>>40137620 #

mysteriously with a helping of random too!

178. x0x0 ◴[24 Apr 24 18:14 UTC] No.40147759{5}[source]▶

>>40139546 #

Yeah, I'm staring at my use of chatgpt to write a 50 line python program that connected to a local sqlite db and ran a query; for each element returned, made an api call or ran a query against a remote postgres db; depending on the results of that api call, made another api call; saved the results to a file; and presented results in a table.

Chatgpt generated the entirety of the above w/ me tweaking one line of code and putting creds in. I could have written all of the above, but it probably would have taken 20-30 minutes. With chatgpt I banged it out in under a minute, helped a colleague out, and went on my way.

Chatgpt absolutely is a real advancement. Before they released gpt4, there was no tech in the world that could do what it did.

179. KetoManx64 ◴[24 Apr 24 18:23 UTC] No.40147864{4}[source]▶

>>40143563 #

https://kagi.com/ de-prioritizes SEO ad sites and also lets you blacklist sites from your search results. Never going back to Google after trying it.

replies(3): >>40148500 #>>40150866 #>>40156719 #

180. chrisallenlane ◴[24 Apr 24 19:27 UTC] No.40148500{5}[source]▶

>>40147864 #

I've also been using (and paying for) Kagi for a few months now. It's fantastic.

replies(1): >>40152976 #

181. verzali ◴[24 Apr 24 19:31 UTC] No.40148553{3}[source]▶

>>40141971 #

Yes, this is a thing I find really frustrating about Google. Especially as I often search for old news stories to find out what people were saying on a topic a few years ago in order to give some context to more recent stories.

182. verzali ◴[24 Apr 24 19:43 UTC] No.40148691{4}[source]▶

>>40138977 #

OpenStreetMaps is pretty decent, and I find it better than Google Maps in most cases.

183. astrange ◴[24 Apr 24 21:30 UTC] No.40149937{6}[source]▶

>>40141829 #

I have to say, all my YouTube recommendations are good and they're rarely clickbait. If you sign out they're pretty bad though.

184. astrange ◴[24 Apr 24 21:34 UTC] No.40149976{5}[source]▶

>>40142534 #

Since LLMs are loosely based on NM models, it seems research on newer sampling methods like Mirostat might help here.

185. astrange ◴[24 Apr 24 21:34 UTC] No.40149981{5}[source]▶

>>40142531 #

Rules could never work for translation unless the incoming text was formatted in a specific way. Eg, you just couldn't translate a conversation transcript in a pro-drop language like Japanese into English sentence-by-sentence, because the original text just wouldn't have sentences in it. So you need some "intelligence" to know who is saying what.

186. fuzztester ◴[24 Apr 24 22:49 UTC] No.40150822{7}[source]▶

>>40141695 #

>PipePipe

It feels a bit funny asking this, since we're talking about Google (i.e. YouTube), but did you mean ;) PipeTube? I know there is a PeerTube too.

replies(1): >>40156732 #

187. interstice ◴[24 Apr 24 22:53 UTC] No.40150866{5}[source]▶

>>40147864 #

I’ve been toggling between Kagi and Perplexity, can honestly say I don’t miss google search (still use maps though)

188. kelseydh ◴[24 Apr 24 23:26 UTC] No.40151201{6}[source]▶

>>40146914 #

Google's search results are just bad. For example, search: "Does Quebec have an NHL team?"

The results suggest that Quebec does not have an NHL team, because it confuses the province of Quebec with Quebec City. Montreal, in Quebec, has the Montreal Canadiens and this isn't mentioned in the search results at all.

189. deanishe ◴[25 Apr 24 00:44 UTC] No.40151821{5}[source]▶

>>40146100 #

The EU fined Google for prioritising Google Shopping results after complaints by other shopping/price-comparison websites.

https://en.m.wikipedia.org/wiki/Antitrust_cases_against_Goog...

190. snewman ◴[25 Apr 24 01:34 UTC] No.40152275[source]▶

>>40146159 #

Guessing: perhaps this means, if someone needs credit for shepherding an improvement to search quality into production, here is a set of known improvements waiting for someone to take ownership.

replies(1): >>40207628 #

191. broknbottle ◴[25 Apr 24 02:28 UTC] No.40152654{4}[source]▶

>>40147483 #

Yep, promo doc bs that will be immediately abandoned as soon as the promo goes through in X quarter.

192. krick ◴[25 Apr 24 03:14 UTC] No.40152976{6}[source]▶

>>40148500 #

Feels a bit silly to ask such an anecdotal question to somebody I don't know, but is it really better than Google? If you don't consider all the privacy yadda-yadda issues. I mean more like the size of the index, how quickly it updates things, how good is it at actual searching (like finding an almost exact quote which happens to exist on only one obscure site on the internet), stuff like that. I could also mention stuff like blacklisting doorways, but honestly it's less interesting, and I totally believe that it does it better than Google.

Personally, I use DDG on a daily basis, and it's mostly ok, but very-very far from perfect. More so, at least once in several days I have to switch to Google, because it is seriously better at updating the index, and DDG often fails to find something on some obscure forum, even if I know it's there (because I was a part of discussion myself!) and try to assist it with finding it as much as I can. Also, Google is immensely better at knowing local shops and finding products.

Also, Google search, bad as it is, is still the only thing I find usable on mobile. First off, it's faster, it is integrated nicely into Pixel UI, and it's somewhat good at all these "more than just a search" type of things, like converting a timezone for me, showing wikipedia summary, flight schedule, etc. Also, integration with Google Maps, working hours and venue locations, it is actually far more reliable than, say, Tripadvisor.

Still, I feel reluctant to vendor-locking myself into a paid service unless it's actually far better than everything else and can replace DDG and Google completely.

replies(3): >>40153145 #>>40154196 #>>40160368 #

193. barbariangrunge ◴[25 Apr 24 03:39 UTC] No.40153145{7}[source]▶

>>40152976 #

> Privacy yadda yadda

194. friendzis ◴[25 Apr 24 06:36 UTC] No.40154196{7}[source]▶

>>40152976 #

> Also, Google is immensely better at knowing local shops and finding products.

Tangential, but this is precisely the "problem" with Google search. Whatever the internal decision-making process was, Google search at some point embraced race to the bottom incentivizing outspending others, either by paying for ads or showing ads. This race is ultimately won by content scrapers/generators slapping ads on top and businesses selling stuff.

Anecdotally, there is a pet supply store near me. It's nearly impossible to find on Google maps. If I zoom over the shopping mall this particular store does not appear, if I search for "pet store" it does not appear. Only if I do search for "petstore inc." it appears in results and map. So Google knows about the store, but actively tries to hide it, presumably because Google does not make money off it.

> I have to switch to Google, because it is seriously better at updating the index

On one hand yes, Google is in some cases really quick at updating the index with new entries. However, at the same time it is equally good at updating the index with removals making old content very hard to find.

195. friendzis ◴[25 Apr 24 07:34 UTC] No.40154580{3}[source]▶

>>40141598 #

Well, it's in the name. SEO is a fancy name for trying to game whatever heuristics Google employs to form their SERPs. It's just that at some point those heuristics shifted from rewarding "quality content" as defined by the disgruntled towards enshittification.

There are various kinds of SEO - internal: technical, on-page and external. A long time ago Google had an epiphany that instead of trying to make sense out of sites themselves they could offload that effort to website administrators, and started ranking pages by how well they implement technical elements helping Google index the web. For a very long time that was synonymous with white-hat SEO. Since Google search was in part based on web-of-links, various shady tactics emerged to inflate the number of indexed backlinks and boost rankings. That was black-hat SEO.

These days Google search puts tremendous focus on on-page SEO. So much that as long as the internal structure of a site is indexable (no dead links, internal backlinks, meta info) it is typically better to hire copywriters spitting out LLM-like robotic mumblings than to try and optimize further.

196. friendzis ◴[25 Apr 24 07:55 UTC] No.40154704{7}[source]▶

>>40139253 #

Tesla's driver assist, from the very beginning until now, seems not to possess object/decision permanence.

Here you can see it detected an obstacle (as evidenced by info on screen), made a decision to stop, however it failed to detect existence of the object right in front of the car, promptly forgot about the object and decision to stop and happily accelerated over the obstacle. When tackling a more complex intersection it can happily change its mind with regards to exit lane multiple times, e.g. it will plan to exit on one side of a divider, replan to exit onto oncoming traffic, replan again.

197. hongsy ◴[25 Apr 24 10:14 UTC] No.40155577{4}[source]▶

>>40142033 #

What's qie and qde?

198. halo18 ◴[25 Apr 24 12:33 UTC] No.40156719{5}[source]▶

>>40147864 #

Doesn't seem to be doing great? The example search I got on their home page was 'best headphones' which pretty immediately surfaces http://www.quietheadphones.com/ - which is openly for sale, and also covered in affiliate links.

A bit farther down the page is a 'best headphones for 2020' article.

And this is the example result set they push on the home page to a potential buyer.

You guys pay for this thing?

replies(2): >>40164194 #>>40166589 #

199. snickerer ◴[25 Apr 24 12:35 UTC] No.40156732{8}[source]▶

>>40150822 #

I don't know PipeTube. I meant PipePipe, which works well for me: https://github.com/InfinityLoop1308/PipePipe

200. halo18 ◴[25 Apr 24 12:39 UTC] No.40156768{3}[source]▶

>>40141598 #

Massive media companies finally caught on and started churning out utter shit because it's wildly profitable.

When the 'trusted websites' caught on and embraced the game, Google was apparently helpless to stop it.

201. beej71 ◴[25 Apr 24 17:18 UTC] No.40160368{7}[source]▶

>>40152976 #

I'm a paying subscriber.

It's not "that much" better for some definitions of "that much".

But they're working on making the best search engine for their customers, and it does have a lot of features for helping make your search better and less ad-driven.

I was trying to find the age of an obscure local lava flow. Google was useless for it. Kagi had it on the third hit. So sometimes it's brilliantly better.

But what I like the most is that their incentives are aligned with mine (because I'm paying them to be).

Google is going to maximize revenue which means making it as shitty as possible without you leaving. How many ads can I cram down their throats before they split? Kagi is also maximizing revenue, but they want to make it as great as possible so you don't leave.

Are the results worth it? It's up to you, really. Try it for free--if you don't miss it after you run out of free searches, then it's not for you.

202. atif089 ◴[25 Apr 24 18:00 UTC] No.40160923{5}[source]▶

>>40140291 #

I wholeheartedly agree with you. The GMaps experience is vastly superior. Additionally, when I'm referring to Gmaps, I think one of the critical features that I would love to replace with Open Source is Places. With due respect, I find both Google and Yelp a*holes in this area. While OpenStreetMap is really good for mapping, I'm still looking to find (or create) something that can supplement OSM with Places/Business data.

203. barfbagginus ◴[25 Apr 24 19:43 UTC] No.40162092{4}[source]▶

>>40138977 #

What does a zero cost / zero IP / cooperative model of a Google killer look like?

It can't have ads, and it can't hide any knowledge that exists which could help the user.. even if the knowledge is proprietary.

It must repeal copyright laws by force. It must drain all silos and know all things. And it must utilize the entirety of Library Genesis.

204. banish-m4 ◴[25 Apr 24 21:34 UTC] No.40163330{3}[source]▶

>>40139151 #

Oh god. The blind faith in reductive, objectivist, rationalist meritocracy that somehow "everything can be measured perfectly" and "whatever happens is completely unbiased as prescribed by a black-and-white, mechanical formula". No, sorry, that's insufficiently holistic in accounting for intangibles and supportive effort, and more of a throwback ideology that should've died in the 1920's. Some degree of discretion is needed because there is no shortcut to "measuring" performance.

205. banish-m4 ◴[25 Apr 24 21:43 UTC] No.40163425{4}[source]▶

>>40140242 #

Absurdity, unfairness, and failure often result from selective blindness to reality, whether willful or unintentional. Hyperlogical people sometimes lack empathy or an ability to conceive of, to understand, or prefer to trivialize ambiguous situations, politics, biases, human factors, or nonfunctional requirements. Always keep looking for one's own and organizational blind spots.

206. nebula8804 ◴[25 Apr 24 21:48 UTC] No.40163488{5}[source]▶

>>40138577 #

Yeah but what comes after the mass layoffs? Getting hired to clean up the mess that AI eventually creates? Depending on the business it could end up becoming more expensive than if they had never adopted GenAI at all. Think about how many companies hopped on the Big Data Bandwagon when they had nothing even coming close to what "Big Data" actually meant. That wasn't as catastrophic as what AI would do but it still was throwing money in the wrong direction.

replies(1): >>40165169 #

207. ◴[25 Apr 24 23:10 UTC] No.40164194{6}[source]▶

>>40156719 #

208. acdha ◴[26 Apr 24 02:00 UTC] No.40165169{6}[source]▶

>>40163488 #

I’m sure we’re going to see plenty of that but from the perspective of a person who isn’t rich enough to laugh off unemployment, how does that help? If speaking up got you fired, you won’t get your old job back or compensation for the stress of looking in a bad market. If you stick around, you’re under more pressure to bail out the business from the added stress of those bad calls and you’re far more likely to see retribution than thanks for having disagreed with your CEO: it takes a very rare person to appreciate criticism and the people who don’t aren’t going to get in the situation of making such a huge bet on a fad to begin with – they’d have been more careful to find something it’s actually good for.

209. etc-hosts ◴[26 Apr 24 03:55 UTC] No.40165827[source]▶

>>40141596 #

For a few months last year, every time I searched for information about a package related to software available in homebrew, the first few results were to a site that clearly had just crawled all of the links in homebrew and templated out a site of links corresponding to each package name, and that's about it. It would have been nice if the generated pages contained any useful information, but alas they did not.

There's got to be a better way.

210. eitally ◴[26 Apr 24 04:56 UTC] No.40166064[source]▶

>>40136741 (TP) #

I was there from 2015-2023 and, although I didn't work in Search, I remember a lot of the bigger initiatives aimed at improving Search for users, like the project to add cards for the top 500 most commonly searched medical terms/conditions, using content from Mayo and custom contracted digital art (for an example, here's a sample link: https://www.google.com/search?q=acl+tear ). There were a lot of things like this going on at any point in time, and it was terrific to see. Then I discovered the manually curated internal knowledge graph, that even included many-language i18n. And then that it was possible for any googler to suggest updates/changes/additions.

Point being, there's a lot of amazing stuff that folks on the outside never would have seen, and it would be a shame for beancounters to ruin it all with decisions actively not "respecting the user".

replies(1): >>40207620 #

211. _xivi ◴[26 Apr 24 06:55 UTC] No.40166589{6}[source]▶

>>40156719 #

What are you comparing it against? Do you actually have a better alternative or just having a bad day?

The fact that you tried to pick on 2 of the results for such a generic keyword shows that it's miles ahead of mainstream search engines, which are filled with SEO spam.

I tried that same search on Google, duckduckgo, bing, brave, yandex, even yahoo and needless to say the results were pretty much all SEO spam, list-style keywords farming from generic websites such as NYTimes (how tf is NYTimes an authoritative source on purchasing headphones?). Whereas in Kagi you get a wide range of helpful results focused around reviews/enthusiasts/forums, here are some of the results: youtube video reviews, reddit discussion, discussions on sound design forums, a Quora question, the headphones page on best buy, amazon, walmart, etc.

And as the other comment said, Kagi also has life-saving features that empower the user to have control over the search results [0]. As far as I know the only weak point in Kagi (at the moment) is doing more local-focused searches.

Regardless of the quality of results (which mind you, are already quite superior), it'd be still worth paying for if only to support its ad-less search model and help nurture it. Prove that it's a viable model for the sake of the web. For everyone's sake. It's a great effort for that alone. Combine both the model and high-quality results and it's the best in class with no one even close.

[0] https://help.kagi.com/kagi/features/website-info-personalize...

replies(2): >>40173323 #>>40189152 #

212. halo18 ◴[26 Apr 24 19:44 UTC] No.40173323{7}[source]▶

>>40166589 #

Google, with blacklisted domains. I wish an actual better alt existed.

I didn't 'try to pick on' - I pointed out two garbage results in a query that they literally push you to from the home page as examples for potential customers. If those results aren't doing what people claim (not highlighting seo spam) then I'm not really left with any faith that the queries they don't elevate to their home page will be better.

213. eastbound ◴[27 Apr 24 08:01 UTC] No.40178205{4}[source]▶

>>40141000 #

> Best exercise bike articles, and such, are what lots of people actually search for

Google doesn’t have to return the SEO-optimized page. Google has other options:

- Return 10 results of the 10 top products,

- Derank any site that seems SEO-optimized,

- Derank any commercial site,

- Derank any site with a cookie banner (implying the user is tracked and the writer is trying to write what the user wants to read) or the infamous mailing list popup,

- Prioritize comparisons from brick-and-mortar journals, or give credentials to other vectors of trust,

- Act as a paid directory, where only paid answers appear,

- Return individual positive and negative comments about products, extracted from review pages, maybe even in a graph (“Good for USB-C according to 95% of the reviews, provides an electric shock according to 7% of non-affiliated comments”).

There WERE many options. Google CHOSE to rank awful sites that provide decreased value, and worse than that, it chose that all other sites won’t be viable, killing them. Google chose the face of the internet today.

214. anticensor ◴[27 Apr 24 19:32 UTC] No.40182732{4}[source]▶

>>40143563 #

Reverse of Google Search is also Google Search, due to how the ranking works.

215. crdrost ◴[28 Apr 24 15:07 UTC] No.40189152{7}[source]▶

>>40166589 #

> how tf is NYTimes an authoritative source on purchasing headphones?

Acqui-hire. So what happened was, in around 2010 or so, a voice-over artist named Lauren Dragan, who I think was already dabbling in professional tech journalism, wanted to write about headphones and microphones since she was getting really opinionated about them in her VO work.

So she contributed an article to “The Wirecutter,” which was trying to be like Tom’s and Engadget (I think they then dropped “the” from their name? Which makes one want to abbreviate as WC which is just tragic). I think it was just a freelance article on “audiophile headphones”...?

Well, the audiophile community online was growing etc. and this proved to be remarkably successful because it gave the audiophiles some professional validation, right? “I work in audio booths, I have to listen super closely, I know what I am talking about.” So it made money for The Wirecutter and they pitched her on “if we just bought you dozens of headphones online would you take notes and make a rec” and she's been doing stuff like that for them ever since.

Wirecutter broadened its focus to a lot of other topics, usually not with the same reliability—it really depends on the reviewer’s biases and such, and Lauren’s VO/audiophile bias of “I want my headphones to have a very flat EQ to match what's on the track, it's more important that they don't croak at higher volumes...” was something she could communicate well about in terms of sibilant highs or feeling too much or too little bass. Vs “we looked at air purifiers and, uh, they purify air!” ...

Meanwhile NYT was trying to grow their online presence as newspaper sales die... So they bought up Wirecutter, as a sort of “new journalism,” a “we wanted to get into this anyway, and it's easier if we don't try to build up the network effects ourselves but just take the traffic of a site that is already successful.” So yeah, they acqui-hired Wirecutter and put all their stuff on their domain and it kinda sucks now, but some of those were trends that were already beginning before they were acquired and there's still usually some decent data hiding in the “the competition” section of every “WC” article.

216. immibis ◴[28 Apr 24 21:59 UTC] No.40192374{6}[source]▶

>>40140561 #

you can click one of the ML-selected categories at the top of your homepage to tell it what you'd like to see today

217. immibis ◴[28 Apr 24 22:12 UTC] No.40192451{8}[source]▶

>>40142586 #

YouTube has this feature

replies(1): >>40302975 #

218. immibis ◴[28 Apr 24 22:14 UTC] No.40192455{7}[source]▶

>>40140605 #

Landscape videos were more pleasant on landscape screens, which are rarely used now, so they aren't more pleasant now.

219. darrenoc ◴[30 Apr 24 05:49 UTC] No.40207620[source]▶

>>40166064 #

That amazing internal knowledge graph you're talking about folks on the outside never seeing? That is very ironic because that knowledge graph used to be Freebase.com and a lot of the data came from the open data community who volunteered their efforts and expertise. Then Google bought Metaweb and shut down Freebase.

220. darrenoc ◴[30 Apr 24 05:50 UTC] No.40207628{3}[source]▶

>>40152275 #

Exactly. The main way to get promoted at Google is to claim that you launched something important. Results in a lot of busywork and misaligned incentives.

221. winternett ◴[02 May 24 22:04 UTC] No.40241886[source]▶

>>40140388 #

These search companies should have hired moderators to manually browse results and tag them based on keywords instead of leaving tagging up to content and info creators. The entire results game became so fixated on trending topics and SEO spam that it became a game of insider trick trading; that's what makes results everywhere so terrible now.

In a bid for attention, only the fraudsters are winning. Well, the platforms are winning lots of money from selling advertising; I guess that's why they've been perfectly fine with not fixing results and ranking for many years now. I'm not sure there is a way back to real relevance now: there's no incentive for these large companies to fix things, and the public has already become too used to the gamified system to go back to behaving themselves.

222. AlexCoventry ◴[05 May 24 22:45 UTC] No.40269296{6}[source]▶

>>40140765 #

Thanks, that was helpful.

223. fuzztester ◴[08 May 24 21:41 UTC] No.40302975{9}[source]▶

>>40192451 #

Where in the menu is it? I admit I have not checked out YouTube menus or features much.
