283 points rrampage | 2 comments
RA_Fisher ◴[] No.42192651[source]
BM25 is an ancient algo that grew out of probabilistic retrieval work from the 1970s. It’s basically a crappy statistical model, and statisticians can do far better today. Search is strictly dominated by learning (which, yes, can use search as an input). Not many folks realize that yet, and/or they are incentivized to keep the old tech going as long as possible, but market pressures will change that.
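
For readers who haven't seen the model being argued about, here is a minimal sketch of BM25 scoring. It is not from the thread. It assumes pre-tokenized documents, the common defaults `k1=1.2` and `b=0.75`, and the non-negative IDF variant used by Lucene-style engines; the function name `bm25_scores` is made up for illustration.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against the tokenized `query`.

    Illustrative sketch only: k1 and b are assumed defaults, and the IDF
    is the smoothed, non-negative variant (an assumption, not the only one).
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)  # term frequency within this document
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: long documents get less credit per match.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

docs = [["bm25", "search"], ["vector", "search"], ["cats"]]
scores = bm25_scores(["bm25"], docs)
```

Only the first document contains the query term, so it is the only one with a non-zero score. The whole model is term frequency, inverse document frequency, and a length correction, with two hand-set parameters.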
replies(4): >>42192735 #>>42192805 #>>42192828 #>>42194229 #
simplecto ◴[] No.42192805[source]
Those are some really spicy opinions. It would seem that many search experts might not agree.

David Tippett (formerly at OpenSearch, now at GitHub)

A great podcast with David Tippett and Nicolay Gerold entitled:

"BM25 is the workhorse of search; vectors are its visionary cousin"

https://www.youtube.com/watch?v=ENFW1uHsrLM

replies(2): >>42192855 #>>42193450 #
RA_Fisher ◴[] No.42193450[source]
I’m sure search experts would disagree, because they’d be admitting that their technology is inferior to another. BM25 is the workhorse, no doubt, but it’s also not the best anymore. Vectors are a step toward learning models, but only a small, mid-range step vs. an explicit model.

Search is a useful approach for computing learning models, but there’s a difference between the computational means and the model. For example, MIPS (maximum inner product search) is a very useful search algo for computing learning models (but first the learning model has to be formulated).
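
To make the "computational means vs. model" distinction concrete, here is a minimal brute-force MIPS sketch. It is not from the thread. It assumes the item and query vectors already come from some learned model, and the function name `mips` is hypothetical. Real systems would use an approximate index rather than scanning every item.

```python
def mips(query, items, k=1):
    """Return the indices of the k items with the largest inner product
    against `query`, best first. Exhaustive scan, for illustration only."""
    scored = [
        (sum(q * x for q, x in zip(query, item)), index)
        for index, item in enumerate(items)
    ]
    # Sort by inner product, highest first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [index for _, index in scored[:k]]

# Hypothetical embeddings; in practice a trained model would produce these.
items = [[1.0, 0.0], [0.0, 1.0], [3.0, 3.0]]
top = mips([1.0, 0.0], items, k=2)
```

The search step only ranks whatever vectors it is handed. Everything about what those vectors mean comes from the model that produced them.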

replies(3): >>42193880 #>>42194290 #>>42197352 #
softwaredoug ◴[] No.42194290[source]
I don't know a lot of search practitioners who don't want to use the "new sexy" thing. Most of us do a fair amount of "resume driven development" so we can claim to be "AI Engineers" :)
replies(1): >>42195479 #
RA_Fisher ◴[] No.42195479[source]
I don’t think it’s realistic to think that software engineers can pick up advanced statistical modeling on the job, unless they’re pairing with statisticians. There’s just too much background involved.
replies(2): >>42196352 #>>42197148 #
1. binarymax ◴[] No.42196352[source]
Your overall condescending attitude in this thread is really disgusting.
replies(1): >>42196528 #
2. RA_Fisher ◴[] No.42196528[source]
Statisticians are famously disliked, especially by engineers (there are open-minded folks, of course! maybe they’ve taken some econometrics or statistics, are exceptionally humble, etc). There are some interesting motives and incentives around that. Sometimes I think it’s partly because many people would prefer that their existing beliefs be upheld rather than challenged, even if those beliefs are not well-supported (and likely to lead to bad decisions and outcomes). Sticking with outdated technology is one example.