
1303 points serjester | 1 comment
1. lyjackal No.42958155
If the end goal is just RAG or search over the PDFs, ColPali-based embedding search seems like a good alternative here. Don’t process the PDFs at all; instead, search over their page-image embeddings directly. From what I understand, you also get a sort of attention map showing which parts of the image are activated by the search.
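
To make the scoring idea concrete, here is a rough sketch of ColBERT-style late interaction (MaxSim), which is my understanding of how ColPali compares a query to a page image. It is a toy in plain Python, not the ColPali library's API: the embeddings are assumed to be already computed by some model, and `maxsim_score` and `rank_pages` are hypothetical names I made up for illustration.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(
    query_tokens: List[Vector], page_patches: List[Vector]
) -> Tuple[float, List[int]]:
    """Late-interaction score between one query and one page image.

    For each query token embedding, find the most similar image patch
    embedding, then sum those maxima. Also returns, per query token,
    the index of the patch that matched best, which is the rough
    "which part of the image lit up" signal.
    """
    total = 0.0
    best_patches: List[int] = []
    for q in query_tokens:
        sims = [dot(q, p) for p in page_patches]
        best = max(range(len(sims)), key=sims.__getitem__)
        total += sims[best]
        best_patches.append(best)
    return total, best_patches


def rank_pages(
    query_tokens: List[Vector], pages: Dict[str, List[Vector]]
) -> List[str]:
    """Return page ids sorted from best to worst MaxSim score."""
    return sorted(
        pages,
        key=lambda page_id: maxsim_score(query_tokens, pages[page_id])[0],
        reverse=True,
    )
```

You would call `rank_pages(query_embeddings, {"page_id": patch_embeddings, ...})` and then look at the second return value of `maxsim_score` for the top page to see which patches drove the match. A real setup would use the model's actual multi-vector outputs and a vector index rather than a brute-force loop.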