
149 points | fzliu | 1 comment
gdiamos ◴[] No.45069705[source]
Their idea is that the representational capacity of even 4096-dimensional vectors limits retrieval performance.

Sparse models like BM25 have a huge dimensionality (effectively the vocabulary size) and thus don’t suffer from this limit, but they don’t capture semantics and can’t follow instructions.

It seems like the holy grail is a sparse semantic model. I wonder how SPLADE would do?
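
For intuition, here’s a minimal sketch (plain Python; the dimensions and weights are toy values, not from any real model) of why the sparse side has so much more room:

    # Toy contrast: dense vs. sparse document representations (illustrative only).
    import numpy as np

    VOCAB_SIZE = 30522   # BERT WordPiece vocabulary, the usual SPLADE dimensionality
    DENSE_DIM = 4096     # a typical "large" dense embedding width

    # Dense: every document is squeezed into DENSE_DIM floats.
    doc_dense = np.random.randn(DENSE_DIM)
    query_dense = np.random.randn(DENSE_DIM)
    dense_score = doc_dense @ query_dense

    # Sparse: a document activates only a handful of the VOCAB_SIZE dimensions,
    # so capacity grows with vocabulary size, not embedding width.
    doc_sparse = {1042: 2.1, 20760: 1.4, 5038: 0.7}   # token_id -> weight (made up)
    query_sparse = {1042: 1.8, 7592: 0.9}
    sparse_score = sum(w * doc_sparse.get(t, 0.0) for t, w in query_sparse.items())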

replies(3): >>45070552 #>>45070624 #>>45088848 #
1. faxipay349 ◴[] No.45088848[source]
I just came across an evaluation of state-of-the-art SPLADE models. Yeah, they use BERT's vocabulary size as their sparse vector dimensionality and do capture semantics. As expected, they significantly outperform all dense models on this benchmark: https://github.com/frinkleko/LIMIT-Sparse-Embedding

The OpenSearch team also seems to have been working on inference-free versions of these models. Similar to BM25, they only encode documents offline, so no model inference is needed at query time. So now we have sparse models that are small and efficient while being much better than dense ones, at least on LIMIT.
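
For anyone unfamiliar with the inference-free setup, a rough sketch of the pattern (names and weights are hypothetical, not the actual OpenSearch models): documents are encoded offline into vocabulary-indexed weights, and the query side needs only a tokenizer, no model forward pass, just like BM25:

    from collections import Counter

    # Offline: a learned sparse encoder (e.g. a SPLADE-style model) maps each
    # document to token -> weight, stored in an inverted index. Toy values here.
    doc_index = {
        "doc1": {"retrieval": 1.9, "sparse": 1.2, "vector": 0.8},
        "doc2": {"dense": 1.5, "embedding": 1.1, "vector": 0.6},
    }

    def score(query: str) -> dict:
        # Online: no model inference, just tokenize and count, as with BM25.
        q_weights = Counter(query.lower().split())
        return {
            doc_id: sum(w * weights.get(tok, 0.0) for tok, w in q_weights.items())
            for doc_id, weights in doc_index.items()
        }

    print(score("sparse vector retrieval"))  # doc1 should outrank doc2

All the learned-model cost is paid at indexing time, which is why these can be as cheap to serve as BM25.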