That 1M-token context window alone is going to kill a lot of RAG use cases. Crazy to see how we went from 4K-token context windows (GPT-3.5, 2023) to 1M in under two years.
replies(6):
I have a high-level understanding of LLMs and am a generalist software engineer.
Can you elaborate on how exactly these insanely large (and now cheap) context windows will kill a lot of RAG use cases?
With a million tokens you can shove several short books into the prompt and just skip the whole retrieval pipeline. That’s an entire small-ish codebase.
A colleague took an HTML dump of every config and config policy from a Windows network, pasted it into Gemini and started asking questions. It’s just that easy now!
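To make that concrete, here’s a minimal sketch of the “no-RAG” approach: concatenate a whole small codebase into one prompt and ask a question, using the google-genai SDK (`pip install google-genai`). The model name, directory path, and question are all placeholder assumptions.

```python
import os
import pathlib

from google import genai

# Assumes a GEMINI_API_KEY environment variable is set.
client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

# No chunking, no embeddings, no vector store: just dump every source
# file into one big context blob. "./my_project" is a placeholder path.
corpus = []
for path in sorted(pathlib.Path("./my_project").rglob("*.py")):
    corpus.append(f"--- {path} ---\n{path.read_text(errors='ignore')}")
context = "\n\n".join(corpus)

response = client.models.generate_content(
    model="gemini-2.5-pro",  # assumed: a model with a ~1M-token context window
    contents=f"{context}\n\nQuestion: where in this codebase is the auth token validated?",
)
print(response.text)
```

The trade-off versus RAG is cost and latency per query (you pay for the whole corpus every time, unless the provider caches the context), in exchange for skipping the entire indexing and retrieval pipeline.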