612 points meetpateltech | 2 comments
leonidasv ◴[] No.42952286[source]
That 1M-token context window alone is going to kill a lot of RAG use cases. Crazy to see how we went from 4K-token context windows (GPT-3.5, early 2023) to 1M in under two years.
1. torginus ◴[] No.42954277[source]
That's not really my experience. Error rates go up the more you cram into the context, and processing gets both slower and more expensive as the input token count grows.

I'd say it makes sense to do RAG even if your stuff fits into context comfortably.
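The point above is that retrieval pays off even when everything would fit in context: you send fewer tokens and keep the prompt focused. A minimal sketch of that retrieve-then-prompt idea, using a toy bag-of-words cosine scorer in place of a real embedding model (the documents and query here are made up for illustration):

```python
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two bag-of-words count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Rank documents by similarity to the query and keep only the top-k,
    # so the prompt carries relevant text instead of the whole corpus.
    q = Counter(query.lower().split())
    ranked = sorted(docs,
                    key=lambda d: cosine(q, Counter(d.lower().split())),
                    reverse=True)
    return ranked[:k]

docs = [
    "Postgres supports partial indexes for filtered queries.",
    "The office coffee machine is on the third floor.",
    "Use EXPLAIN ANALYZE to inspect Postgres query plans.",
]
context = retrieve("how do I debug a slow postgres query", docs)
prompt = "Answer using only this context:\n" + "\n".join(context)
```

Even in this toy version, the irrelevant document never reaches the model, which is the cost and accuracy argument for RAG regardless of window size.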

2. lamuswawir ◴[] No.42954311[source]
Try exp-1206. That model actually holds up on large contexts.