279 points freediver | 1 comments
eduction ◴[] No.45951334[source]
I completely agree with the insight that full text search has been complexified. People seem to want to jump straight to clustering or other enterprise-level things.

I also appreciate the moxie of getting in there and building it yourself.

Myself, I reach for Lucene. Then you don’t need to build all this yourself if you don’t want. It lives in a dir on disk. True, it’s a separate database, but one optimized for this problem.

replies(1): >>45951365 #
aorloff ◴[] No.45951365[source]
This was the solution I was thinking about, but I thought, well, that's the way someone would have done it 20 years ago.
replies(1): >>45951846 #
shevy-java ◴[] No.45951846[source]
Alright but why do we not have more search engines that are actually good?

I'd love to cut myself off from Google, including Google Search, but all the alternatives manage to be even worse. Consistently so. It's as if Google won the war by being just permanently slightly better, while everyone is actually really crap. That wasn't the case, say, 10 years ago or so.

replies(5): >>45952022 #>>45952024 #>>45952350 #>>45955823 #>>45958345 #
1. jillesvangurp ◴[] No.45952350[source]
Because it's not a simple problem space. Lucene has gone through over 25 years of optimization, feature development, and performance tuning. A lot of brain power goes into that.

Google bootstrapped the AI revolution as a side effect of figuring out how to do search better. They started by hiring a lot of expert researchers, who then got busy iterating on interesting search-engine-adjacent problems (like figuring out synonyms, translations, etc.). In the process they got into running neural networks at scale, figuring out how to leverage GPUs, and eventually building their own TPUs.

The Acquired podcast recently did a great job of outlining the history of Google & Alphabet.

Doing search properly at scale mainly requires a lot of infrastructure, and that's Google's real moat. They get to pay for all of it with an advertising money-printing machine, which, BTW, itself leverages a lot of search algorithms: matching advertisements to content is a search problem, and Google just got really good at it. That's what finances all the innovation in this space, from deep learning to TPUs. Being able to throw a few hundred million at running some experiments is what makes the difference here.
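
To make the "matching advertisements to content is a search problem" point a bit more concrete, here is a toy sketch, and definitely not how Google actually does it. It treats the page text as the query and a handful of ads as the documents, then ranks them with a BM25-style score (the same family of ranking function Lucene uses by default). The function names, the tokenizer, and the k1/b values are all my own assumptions for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumerics (a deliberately naive tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_ads(page_text, ads, k1=1.2, b=0.75):
    """Rank ads against page_text with a BM25-style score.

    ads is a dict of ad_id -> ad text (hypothetical data shape).
    Returns a list of (ad_id, score) pairs, best match first.
    """
    # Term frequencies and lengths for each ad, i.e. each "document".
    term_freqs = {ad_id: Counter(tokenize(text)) for ad_id, text in ads.items()}
    lengths = {ad_id: sum(tf.values()) for ad_id, tf in term_freqs.items()}
    n = len(term_freqs)
    avg_len = sum(lengths.values()) / n if n else 0.0

    # Document frequency: in how many ads does each term appear?
    doc_freq = Counter()
    for tf in term_freqs.values():
        doc_freq.update(tf.keys())

    # The page content plays the role of the search query.
    query_terms = set(tokenize(page_text))

    scores = {}
    for ad_id, tf in term_freqs.items():
        score = 0.0
        for term in query_terms:
            f = tf.get(term, 0)
            if f == 0:
                continue
            # Rare terms count for more than terms that appear in most ads.
            idf = math.log(1 + (n - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Length normalization: long ads do not win just by being long.
            norm = f + k1 * (1 - b + b * lengths[ad_id] / avg_len)
            score += idf * f * (k1 + 1) / norm
        scores[ad_id] = score

    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    sample_ads = {
        "shoes": "running shoes for marathon training",
        "laptop": "lightweight laptop for programmers",
        "coffee": "fresh roasted coffee beans",
    }
    page = "Tips for training for your first marathon and choosing running shoes"
    for ad_id, score in rank_ads(page, sample_ads):
        print(ad_id, round(score, 3))
```

Even this toy version shows why the infrastructure matters: the scoring itself is a few lines, but doing it over billions of pages and ads, with fresh indexes and low latency, is where the real cost is.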