283 points rrampage | 1 comment
hubraumhugo No.42193073
Given the recent advances in vector-based semantic search, what's the SOTA search stack that people are using for hybrid keyword + semantic search these days?
1. d4rkp4ttern No.42193787
In the Langroid[1] LLM library we have a clean, extensible RAG implementation in the DocChatAgent[2]. It combines several retrieval techniques: lexical (BM25, fuzzy search) and semantic (embeddings) retrieval, re-ranking (cross-encoder, reciprocal rank fusion), and a further re-ranking pass for diversity and lost-in-the-middle mitigation.

[1] Langroid - a multi-agent LLM framework from CMU/UW-Madison researchers https://github.com/langroid/langroid

[2] DocChatAgent Implementation - https://github.com/langroid/langroid/blob/main/langroid/agen...

Start with the answer_from_docs method and follow the trail.
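
For anyone unfamiliar with reciprocal rank fusion, here is a minimal generic sketch of how it can merge a lexical ranking with a semantic one. This is not Langroid's code: the function name `reciprocal_rank_fusion`, the doc ids, and the two input rankings are made up for illustration, and k=60 is just a commonly used default.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc ids into one list, best first.

    Hypothetical helper for illustration; not part of any library.
    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    where rank is its 1-based position in that list.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Made-up results from a lexical (e.g. BM25) and a semantic (embedding) retriever.
lexical = ["doc_a", "doc_b", "doc_c"]
semantic = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([lexical, semantic])
print(fused)
```

Documents that both retrievers rank highly (doc_b and doc_a here) end up ahead of those found by only one of them, and no score normalization across retrievers is needed because only ranks are used.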

Incidentally, I see you're the founder of Kadoa. Kadoa-snack is one of my favorite daily tools for finding LLM-related HN discussions!