There’s no guarantee this didn’t base the results on just 1/3 of the contents of your library though, right? How can it be accurate if it’s not comprehensive, given the widely noted issues with long context (distraction, confusion, etc.)?
This is a gap I see often, and I wonder how people are solving it. I’ve seen strategies like using a “file” tool to keep a checklist of items and looping LLM calls over it, but I haven’t applied anything like that personally.
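A minimal sketch of that checklist pattern, under my own assumptions (the `llm_call` parameter is a hypothetical stand-in for whatever model call you use): every item gets its own bounded-context call, and progress is checkpointed to a file so a restarted run resumes instead of silently skipping items.

```python
import json
from pathlib import Path

def process_library(items, checklist_path, llm_call):
    """Process every item exactly once, persisting a checklist of
    completed items so coverage is guaranteed across restarts.

    items          -- dict mapping item id -> item text
    checklist_path -- JSON file tracking {item_id: result}
    llm_call       -- callable taking item text, returning a result
    """
    path = Path(checklist_path)
    done = json.loads(path.read_text()) if path.exists() else {}
    for item_id, text in items.items():
        if item_id in done:
            continue  # already handled in a previous run
        done[item_id] = llm_call(text)      # one small-context call per item
        path.write_text(json.dumps(done))   # checkpoint after each call
    return done

# Demo with a stub in place of a real model call:
items = {"a": "first doc", "b": "second doc"}
results = process_library(items, "checklist.json", lambda t: t.upper())
```

The point is that comprehensiveness comes from the loop and the persisted checklist, not from trusting one long-context pass to attend to everything.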