
548 points tifa2up | 1 comment
pietz No.45654114
My biggest RAG learning is to use agentic RAG. (Sorry for buzzword dropping)

- Classic RAG: `User -> Search -> LLM -> User`

- Agentic RAG: `User <-> LLM <-> Search`

Essentially, instead of a fixed pipeline, you expose search as a tool to the LLM, which enables three things:

- The LLM can search multiple times

- The LLM can adjust the search query

- The LLM can use multiple tools

The combination of these three things solves the majority of classic RAG problems: the LLM improves user queries, maps abbreviations, and corrects bad results on its own. You can also let it list directories and load files directly.
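As a rough illustration of the loop described above, here is a minimal sketch of agentic RAG. Everything in it is an assumption for demonstration purposes: `DOCS`, the tool functions, and `call_llm` (a hard-coded stub) stand in for a real document store and a real chat API with tool-calling. The point is the control flow: the model, not a fixed pipeline, decides when to search again, when to read a file, and when to answer.

```python
# Hypothetical in-memory "knowledge base" (assumption, for illustration).
DOCS = {
    "adr-001.md": "ADR 001: We chose PostgreSQL for persistence.",
    "adr-002.md": "ADR 002: We expose the API via gRPC.",
}

def search(query: str) -> list[str]:
    """Tool: return names of documents matching the query."""
    q = query.lower()
    return [name for name, text in DOCS.items() if q in text.lower()]

def read_file(name: str) -> str:
    """Tool: load a whole document into context."""
    return DOCS.get(name, "")

def call_llm(messages: list[dict]) -> dict:
    """Stub model (assumption): first asks to search, then to read,
    then answers. A real implementation would send `messages` plus
    tool schemas to a chat API and parse its tool calls."""
    last = messages[-1]
    if last["role"] == "user":
        return {"tool": "search", "args": {"query": "postgresql"}}
    if last["role"] == "tool" and last["name"] == "search":
        return {"tool": "read_file", "args": {"name": last["content"][0]}}
    return {"answer": last["content"]}

def agentic_rag(question: str, max_steps: int = 5) -> str:
    """User <-> LLM <-> Search: the model drives the loop."""
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):  # the LLM may call tools repeatedly
        action = call_llm(messages)
        if "answer" in action:
            return action["answer"]
        tool = {"search": search, "read_file": read_file}[action["tool"]]
        result = tool(**action["args"])
        messages.append(
            {"role": "tool", "name": action["tool"], "content": result}
        )
    return "step budget exhausted"
```

Because the model sees each tool result before choosing the next action, it can reformulate a failed query or search again, which is exactly what a fixed `User -> Search -> LLM` pipeline cannot do.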

replies(2): >>45656209 >>45656944
1. googamooga No.45656209
I fully support this approach! When I first started experimenting—rather naively—with using tool-enabled LLMs to generate documents (such as reports or ADRs) from the extensive knowledge base in Confluence, I built a few tools to help the LLM search Confluence using CQL (Confluence Query Language) and store the retrieved pages in a dedicated folder. The LLM could then search within that folder with simple filesystem tools and pull entire files into its context as needed. The results were quite good, as long as the context didn't become overloaded. However, when I later tried switching to a 'Classic RAG' setup, the output quality dropped significantly, so I went back to the agentic approach.
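A hedged sketch of the setup described above: one helper builds a CQL query, another forms the URL for Confluence's content-search REST endpoint, and a third caches retrieved pages to a local folder that the LLM can then browse with plain filesystem tools. The endpoint path follows Confluence's REST API, but the base URL, the exact CQL clause, and the cache layout are all illustrative assumptions; authentication and the HTTP call itself are omitted.

```python
import pathlib
import urllib.parse

# Local folder the LLM's filesystem tools operate on (assumption).
CACHE = pathlib.Path("confluence_cache")

def build_cql(text: str) -> str:
    """Turn a free-text query into a CQL full-text match on pages.
    The exact clause is an illustrative choice, not the only option."""
    escaped = text.replace('"', '\\"')
    return f'type = page and text ~ "{escaped}"'

def search_url(base_url: str, cql: str, limit: int = 10) -> str:
    """URL for Confluence's CQL content-search endpoint.
    `base_url` (e.g. your site's /wiki root) is a placeholder."""
    query = urllib.parse.urlencode({"cql": cql, "limit": limit})
    return f"{base_url}/rest/api/content/search?{query}"

def cache_page(title: str, body: str) -> pathlib.Path:
    """Store a retrieved page so simple filesystem tools can
    list, grep, and read it later."""
    CACHE.mkdir(exist_ok=True)
    path = CACHE / f"{title}.txt"
    path.write_text(body, encoding="utf-8")
    return path
```

The two-stage design matters: the first tool narrows the corpus with a server-side CQL search, and the second stage lets the model pull whole files into context on demand, instead of being force-fed fixed-size chunks.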