manishsharan
Thanks for sharing. TIL about rerankers.

Chunking strategy is a big issue. I got acceptable results by shoving large texts into Gemini Flash and having it summarize and extract chunks, instead of using any of the text splitters I tried. I use the contextual retrieval method published by Anthropic (https://www.anthropic.com/engineering/contextual-retrieval), i.e. each chunk is embedded together with a summary that situates it in the full document.
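
A minimal sketch of what that contextualizing step can look like in Clojure (not exact code; it assumes clj-http and cheshire, and the model name and prompt wording are illustrative, so check them against Gemini's current REST docs):

    (ns rag.contextual-chunks
      (:require [clj-http.client :as http]
                [cheshire.core :as json]))

    (def gemini-url ;; model name illustrative; verify against current docs
      "https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:generateContent")

    (defn call-gemini
      "POST a prompt to the Gemini REST API; return the first candidate's text."
      [api-key prompt]
      (-> (http/post (str gemini-url "?key=" api-key)
                     {:content-type :json
                      :body (json/generate-string
                             {:contents [{:parts [{:text prompt}]}]})})
          :body
          (json/parse-string true)
          (get-in [:candidates 0 :content :parts 0 :text])))

    (defn contextualize-chunk
      "Ask the model to situate `chunk` within `full-doc`, then prepend that
       context to the chunk text before it is embedded."
      [api-key full-doc chunk]
      (let [ctx (call-gemini
                 api-key
                 (str "<document>\n" full-doc "\n</document>\n"
                      "<chunk>\n" chunk "\n</chunk>\n"
                      "Write a short context that situates this chunk within "
                      "the overall document, to improve search retrieval."))]
        (str ctx "\n\n" chunk)))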

I also created a tool that lets the LLM run vector searches on its own.
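
In essence that is a function-calling wrapper: the model gets a vector-search tool definition, and the chat loop executes its calls against the vector store. A rough sketch, with the tool shape modeled on Anthropic's Messages API (other providers use a similar JSON-schema description); `embed` and `knn-search` are placeholders for your embedding call and store lookup:

    (def vector-search-tool
      ;; Tool definition in the shape Anthropic's Messages API expects.
      {:name "vector_search"
       :description "Search the document store for passages relevant to a query."
       :input_schema {:type "object"
                      :properties {:query {:type "string"}
                                   :top_k {:type "integer"}}
                      :required ["query"]}})

    (defn handle-tool-use
      "Execute a tool_use block emitted by the model. `embed` and `knn-search`
       are placeholders for the embedding call and vector-store lookup."
      [embed knn-search {:keys [name input]}]
      (when (= name "vector_search")
        (let [{:keys [query top_k] :or {top_k 5}} input]
          (knn-search (embed query) top_k))))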

I do not use LangChain or Python; I use Clojure and the LLMs' REST APIs.

esafak
Have you measured your latency, and how sensitive are you to it?

manishsharan
>> Have you measured your latency, and how sensitive are you to it?

Not sensitive to latency at all. My users would rather have well-researched answers than poor ones.

Also, I use batch-mode APIs for the chunking step; it is so much cheaper.
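
As one example of what a batch submission can look like (this uses Anthropic's Message Batches API purely for illustration; the request shape follows their docs, and the model name is a placeholder), queue one summarize-and-chunk request per document in a single POST and fetch the results later, trading latency for a lower price:

    ;; Reuses clj-http and cheshire from the sketch above.
    (require '[clj-http.client :as http]
             '[cheshire.core :as json])

    (defn submit-chunking-batch
      "Queue one summarize-and-chunk request per document via Anthropic's
       Message Batches API. Results are fetched later."
      [api-key docs]
      (http/post "https://api.anthropic.com/v1/messages/batches"
                 {:headers {"x-api-key" api-key
                            "anthropic-version" "2023-06-01"}
                  :content-type :json
                  :body (json/generate-string
                         {:requests
                          (map-indexed
                           (fn [i doc]
                             {:custom_id (str "doc-" i)
                              :params {:model "claude-3-5-haiku-latest" ;; illustrative
                                       :max_tokens 2048
                                       :messages [{:role "user"
                                                   :content (str "Summarize this document and "
                                                                 "split it into retrieval chunks:\n"
                                                                 doc)}]}})
                           docs)})}))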