Given the recent advances in vector-based semantic search, what's the SOTA search stack that people are using for hybrid keyword + semantic search these days?
replies(7):
Author of txtai [1] here. txtai implements a performant BM25 index in Python [2], using Python's array module and storing the term frequency vectors in SQLite.
With txtai, the hybrid index approach [3] supports both convex combination when BM25 scores are normalized and reciprocal rank fusion (RRF) when they aren't [4].
[1] https://github.com/neuml/txtai
[2] https://neuml.hashnode.dev/building-an-efficient-sparse-keyw...
[3] https://neuml.hashnode.dev/benefits-of-hybrid-search
[4] https://github.com/neuml/txtai/blob/master/src/python/txtai/...
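For anyone unfamiliar with the two fusion strategies mentioned above, here is a minimal sketch in plain Python. It is not txtai's code: the function names, the min-max normalization step, the 0.5 weight and the k=60 constant are all illustrative assumptions.

```python
def minmax(scores):
    """Scale a {doc_id: score} mapping to [0, 1] (one assumed way to normalize BM25)."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def convex_combination(sparse, dense, weight=0.5):
    """Fuse scores as weight * dense + (1 - weight) * sparse.

    Both inputs are {doc_id: score} and must already be on a comparable
    scale. A document missing from one side contributes 0 for that side.
    """
    docs = set(sparse) | set(dense)
    fused = {
        doc: weight * dense.get(doc, 0.0) + (1 - weight) * sparse.get(doc, 0.0)
        for doc in docs
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of doc ids using only rank positions.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. Raw scores are never compared, so this works
    when the sparse scores are unbounded. k=60 is a commonly used default,
    assumed here.
    """
    fused = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            fused[doc] = fused.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Made-up scores for illustration only.
bm25 = {"a": 12.0, "b": 7.0, "c": 2.0}   # unbounded keyword scores
dense = {"b": 0.9, "c": 0.8, "d": 0.1}   # similarity scores in [0, 1]

by_score = convex_combination(minmax(bm25), dense)
by_rank = reciprocal_rank_fusion([["a", "b", "c"], ["b", "c", "d"]])
```

The convex combination keeps score magnitudes, so a very strong match on one side can dominate. RRF discards magnitudes and only rewards documents that rank well in several lists, which is why it is the fallback when the scores can't be compared directly.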