
147 points by fzliu | 2 comments
gdiamos No.45069705
Their idea is that the capacity of even 4096-dimensional vectors limits retrieval performance.

Sparse representations like BM25 have vocabulary-sized dimensionality and thus don't suffer from this limit, but they don't capture semantics and can't follow instructions.

It seems like the holy grail is a sparse semantic model. I wonder how SPLADE would do?
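
For reference, a minimal sketch of how SPLADE produces sparse-but-semantic vectors, assuming PyTorch, the Hugging Face transformers library, and the public naver/splade-cocondenser-ensembledistil checkpoint; a toy illustration, not anyone's production setup:

    # Minimal SPLADE encoding sketch (assumes: pip install torch transformers).
    # The checkpoint is a public SPLADE model from Naver; swap in your own.
    import torch
    from transformers import AutoModelForMaskedLM, AutoTokenizer

    model_name = "naver/splade-cocondenser-ensembledistil"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForMaskedLM.from_pretrained(model_name)

    def splade_encode(text: str) -> torch.Tensor:
        """Return a |vocab|-dimensional vector: most entries are zero, but
        the nonzero terms are learned (semantic), unlike BM25's exact tokens."""
        tokens = tokenizer(text, return_tensors="pt")
        with torch.no_grad():
            logits = model(**tokens).logits  # (1, seq_len, vocab_size)
        # SPLADE activation: log-saturated ReLU, max-pooled over token positions,
        # masked so padding positions contribute nothing.
        weights = torch.log1p(torch.relu(logits)) * tokens["attention_mask"].unsqueeze(-1)
        return weights.max(dim=1).values.squeeze(0)  # (vocab_size,)

    vec = splade_encode("capacity limits of dense embeddings")
    print((vec > 0).sum().item(), "active dimensions out of", vec.shape[0])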

tkfoss No.45070624
Wouldn't the holy grail then be parallel channels for candidate generation:

  euclidean embedding
  hyperbolic embedding
  sparse BM25 / SPLADE lexical search
  optional multi-vector signatures

  ↓ merge & deduplicate candidates
followed by weighted scoring, graph expansion, and LLM reranking?
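
A toy sketch of that merge-and-deduplicate stage; every *_search result below is a hypothetical placeholder for whatever its channel returns, and the doc IDs and scores are made up:

    # Sketch of the fan-out / merge-dedup stage. Each channel is assumed to
    # return a list of (doc_id, score) hits; scores are NOT comparable across
    # channels yet, so we only union and remember provenance here.
    from collections import defaultdict

    def merge_candidates(
        channel_results: dict[str, list[tuple[str, float]]],
    ) -> dict[str, dict[str, float]]:
        """Union per-channel candidate lists, deduplicating by doc_id and
        remembering which channel scored each doc (for later weighted fusion)."""
        merged: dict[str, dict[str, float]] = defaultdict(dict)
        for channel, hits in channel_results.items():
            for doc_id, score in hits:
                # Keep the best score per (doc, channel); dedup via the dict key.
                best = merged[doc_id].get(channel, float("-inf"))
                merged[doc_id][channel] = max(score, best)
        return merged

    candidates = merge_candidates({
        "euclidean":  [("d1", 0.91), ("d2", 0.80)],
        "hyperbolic": [("d2", 0.75), ("d3", 0.70)],
        "splade":     [("d1", 12.3), ("d4", 9.1)],  # sparse scores use a different scale
    })
    # candidates maps doc_id -> {channel: raw_score}; scoring and rerank come next.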
jdthedisciple No.45073701
That is pretty much exactly what we do for our company-internal knowledge retrieval:

    embedding search (0.4)
    lexical/keyword search (0.4)
    fuzzy search (0.2)
This might indeed achieve the best of all worlds; the numbers in parentheses are our fusion weights.
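
A sketch of that weighted fusion using the 0.4/0.4/0.2 weights from the comment; the per-channel min-max normalization is an assumption (embedding, lexical, and fuzzy scores don't share a scale), and the input scores are made up:

    # Weighted score fusion sketch. Weights follow the comment above; the
    # min-max normalization per channel is an assumed choice, not a given.
    WEIGHTS = {"embedding": 0.4, "lexical": 0.4, "fuzzy": 0.2}

    def fuse(per_channel: dict[str, dict[str, float]]) -> list[tuple[str, float]]:
        """per_channel: channel -> {doc_id: raw_score}.
        Returns docs sorted by fused score, highest first."""
        fused: dict[str, float] = {}
        for channel, scores in per_channel.items():
            if not scores:
                continue
            lo, hi = min(scores.values()), max(scores.values())
            span = (hi - lo) or 1.0  # avoid divide-by-zero on uniform scores
            for doc_id, s in scores.items():
                fused[doc_id] = fused.get(doc_id, 0.0) + WEIGHTS[channel] * (s - lo) / span
        return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

    print(fuse({
        "embedding": {"d1": 0.91, "d2": 0.80},
        "lexical":   {"d1": 12.3, "d3": 9.1},
        "fuzzy":     {"d2": 0.66},
    }))

One note on the design: normalizing before mixing means a channel that happens to produce large raw scores (like BM25) can't silently dominate the blend; rank-based alternatives such as reciprocal rank fusion avoid score scales entirely.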