
1743 points | caspii
ilamont ◴[] No.27428272[source]
Same story for various WordPress plugins and widgety things that live in site footers.

Google has turned into a cesspool. Half the time I find myself having to do ridiculous search contortions to get somewhat useful results - appending site:.edu or site:.gov to queries, filtering by time period to eliminate new "articles" that have been SEOed to the hilt, or excluding Yelp and other chronic abusers that hijack local business results.
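For example, queries roughly like these (site:, -site:, and the before:/after: date operators are all real; the specific queries are just placeholders):

    best dentist in seattle -site:yelp.com
    retaining wall drainage site:edu
    wordpress footer widget before:2019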

replies(19): >>27428410 #>>27428439 #>>27428441 #>>27428466 #>>27428594 #>>27428652 #>>27428717 #>>27428807 #>>27429076 #>>27429483 #>>27429797 #>>27429818 #>>27429843 #>>27429859 #>>27430023 #>>27430207 #>>27430285 #>>27430707 #>>27430783 #
elchupanebre ◴[] No.27430207[source]
The reason for that is actually rational: when Amit Singhal was in charge, the search rules were written by hand. Once he was fired, the Search Quality team switched to machine learning. The ML was better in many ways: it produced higher-quality results with a lot less effort. It just had one possibly fatal flaw: if a result was wrong, there was no recourse. And that's what you are observing now: search quality is good or excellent most of the time, but sometimes it's very bad and G can't fix it.
replies(5): >>27430295 #>>27430301 #>>27430306 #>>27430308 #>>27430753 #
cookiengineer ◴[] No.27430753[source]
> G can't fix it.

Yes, they can. They should simply stop measuring only positives and start measuring negatives as well - e.g. people who press the back button of their browser, or who click the second, third, or fourth result afterwards... which should hint to the ML classifiers that the first result was total crap in the first place.
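To make that concrete, here's a minimal sketch of how such negative signals could be derived from a click log (the interfaces, field names, and thresholds are made up for illustration, not anything Google actually uses):

    // Sketch: turn implicit feedback from a results page into negative labels.
    // Assumed input: per query session, the ranked results that were shown,
    // which ones were clicked, and how long the user stayed before coming back.

    interface Impression {
      url: string;
      position: number;      // 1-based rank on the results page
      clicked: boolean;
      dwellSeconds?: number; // time until the user hit the back button, if clicked
    }

    interface LabeledResult {
      url: string;
      label: "positive" | "negative" | "unknown";
    }

    const SHORT_DWELL_SECONDS = 10; // arbitrary cutoff for a "pogo-stick" bounce

    function labelSession(impressions: Impression[]): LabeledResult[] {
      // Deepest position the user clicked; results skipped above it look bad.
      const lowestClickedPosition = Math.max(
        0,
        ...impressions.filter(i => i.clicked).map(i => i.position)
      );

      return impressions.map((i): LabeledResult => {
        if (i.clicked) {
          // Clicked but bounced straight back: count it as a negative vote.
          if (i.dwellSeconds !== undefined && i.dwellSeconds < SHORT_DWELL_SECONDS) {
            return { url: i.url, label: "negative" };
          }
          return { url: i.url, label: "positive" };
        }
        // Shown above something the user did click, yet skipped over: negative.
        if (i.position < lowestClickedPosition) {
          return { url: i.url, label: "negative" };
        }
        return { url: i.url, label: "unknown" };
      });
    }

Labels like these could then feed the ranking model as (soft) negative training examples, which is exactly the "measure the negatives" idea.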

But I guess this is exactly what happens when your business model is built on leads to sites that run your ads: the ethics get weird, because the company profits more from those scammers than from legit websites.

From an ML point of view, Google's search results are a perfect example of overfitting. Kinda ironic that they lead the data science research field and teach about this flaw everywhere, yet don't recognize it in their own product.

replies(1): >>27430831 #
quantumofalpha ◴[] No.27430831[source]
They have already been doing this for a loooong time; it's low-hanging fruit.

Take a look sometime at the wealth of data the Google SERP sends back about your interactions with it.

replies(2): >>27430846 #>>27430908 #
cookiengineer ◴[] No.27430846[source]
Please provide proof for this theory that Google also measures this.
replies(1): >>27430862 #
quantumofalpha ◴[] No.27430862[source]
I worked in ranking for two major search engines. They all measure this; it's really low-hanging fruit. How long did it take you to come up with this idea? Why do you think so little of the people who have put decades of their lives into these systems that you assume they didn't think of it?

Technically: just open a Google SERP with developer tools on the Network tab, enable the preserve/persist logs option, and watch the requests flowing back - all your clicks and back navigations are reported for analysis. Same on other search engines. Only DDG doesn't collect your clicks/dwell time, but that's a distinguishing feature of their brand; they stripped themselves of this valuable data on purpose.
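For a concrete picture of the mechanism (a generic sketch - the endpoint, parameters, and CSS selector here are made up, not Google's actual ones), click reporting from a results page boils down to a beacon fired when you click a result:

    // Generic sketch of SERP click reporting, for illustration only.
    function reportClick(resultUrl: string, position: number, queryId: string): void {
      const payload = JSON.stringify({
        queryId,        // ties the click back to the search that produced it
        url: resultUrl,
        position,
        ts: Date.now(), // lets the backend compute dwell time between events
      });
      // sendBeacon survives the navigation that immediately follows the click.
      navigator.sendBeacon("/log/click", payload);
    }

    // Attach to every result link; a back navigation shows up server-side as
    // another SERP event shortly after the click, i.e. a short dwell time.
    document.querySelectorAll<HTMLAnchorElement>("a.result").forEach((a, i) => {
      a.addEventListener("click", () => reportClick(a.href, i + 1, "q123"));
    });

That's the kind of traffic you'll see in the Network tab, though the real requests are of course more opaque.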

replies(2): >>27430985 #>>27431337 #
skinkestek ◴[] No.27430985{3}[source]
So they do collect it, they just ignore it - like the 10-30 (or more) clicks I've spent on the tiny, tiny [x] in the top corner of the scammy-looking-dating-site-slash-mail-order-bride ads that they served me for a decade?