I have a RAG setup that doesn't run on documents but on other data points we use for generation (the original data is call recordings, but they're heavily processed down to just a few text chunks).
Instead of a reranker model, we do vector search and then simply ask GPT-5 in an extra call which of the results is most relevant to the input question. Is there an advantage to actual reranker models over a generic LLM?
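Concretely, the extra call looks roughly like this (a minimal sketch, not our actual code; `pick_most_relevant` and `call_llm` are made-up names, and `call_llm` stands in for the GPT-5 request):

```python
import re

def build_rerank_prompt(question, chunks):
    """Number the retrieved chunks and ask the model for the best one by index."""
    numbered = "\n".join(f"[{i}] {c}" for i, c in enumerate(chunks))
    return (
        f"Question: {question}\n\n"
        f"Candidate chunks:\n{numbered}\n\n"
        "Reply with only the index of the most relevant chunk."
    )

def parse_choice(reply, n_chunks):
    """Extract the first in-range integer; fall back to 0 if the model rambles."""
    m = re.search(r"\d+", reply)
    if m and int(m.group()) < n_chunks:
        return int(m.group())
    return 0

def pick_most_relevant(question, chunks, call_llm):
    """call_llm(prompt) -> str is a stand-in for the extra GPT-5 call."""
    reply = call_llm(build_rerank_prompt(question, chunks))
    return chunks[parse_choice(reply, len(chunks))]
```

So it's one extra round trip per query, and the prompt has to fit all candidates, whereas a cross-encoder reranker would score each (question, chunk) pair independently.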
replies(2):