This, combined with a subsequent reranker, basically eliminated any of our issues on search.
This, combined with a subsequent reranker, basically eliminated any of our issues on search.
Disclosure: I work at MS and help maintain our most popular open-source RAG template, so I follow the best practices closely: https://github.com/Azure-Samples/azure-search-openai-demo/
So few developers realize that you need more than just vector search, so I still spend many of my talks emphasizing the FULL retrieval stack for RAG. It's also possible to do it on top of other DBs like Postgres, but takes more effort.
AI Search team's been working with the Sharepoint team to offer more options, so that devs can get best of both worlds. Might have some stuff ready for Ignite (mid November).
That's why I write blog posts like https://blog.pamelafox.org/2024/06/vector-search-is-not-enou...
Moreover I am curious why you guys use bm25 over SPLADE?
That query generation approach does not extract structured data. I do maintain another RAG template for PostgreSQL that uses function calling to turn the query into a structured query, such that I can construct SQL filters dynamically. Docs here: https://github.com/Azure-Samples/rag-postgres-openai-python/...
I'll ask the search about SPLADE, not sure.
Shameless plug: plpgsql_bm25: BM25 search implemented in PL/pgSQL (The Unlicense / PUBLIC DOMAIN)
https://github.com/jankovicsandras/plpgsql_bm25
There's an example Postgres_hybrid_search_RRF.ipynb in the repo which shows hybrid search with Reciprocal Rank Fusion ( plpgsql_bm25 + pgvector ).
Of course, agentic retrieval is just better quality-wise for a broader set of scenarios, usual quality-latency trade-off.
We don't do SPLADE today. We've explored it and may get back to it at some point, but we ended up investing more on reranking to boost precision, we've found we have fewer challenges on the recall side.