3 points by anuarsh | 9 comments
2. anuarsh ◴[] No.45058298[source]
Hi everyone. Any comments or questions are appreciated.
3. attogram ◴[] No.45058682[source]
"~20 min for the first token" might turn off some people. But it is totally worth it to get such a large context size on puny systems!
replies(1): >>45058870 #
4. anuarsh ◴[] No.45058870[source]
Absolutely. There are tons of cases where an interactive experience is not required, but the ability to process a large context and extract insights is.
replies(1): >>45061478 #
5. Haeuserschlucht ◴[] No.45060882[source]
20 minutes is a huge turnoff, unless you have it run overnight... only to get the hint in the morning that you should practice self-care, when all you wanted was to present a legal paper and have the AI check it for flaws.
replies(1): >>45067676 #
6. Haeuserschlucht ◴[] No.45060903[source]
It's better to have software erase all private details from the text, have it checked by a cloud AI, and then have all the placeholders replaced back on your hard drive.
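Something like this round trip, sketched in Python (regex email matching stands in for a real PII detector, and the cloud call is elided):

    import re

    def redact(text):
        # Swap each private detail for a placeholder before the text
        # leaves the machine; remember the mapping locally.
        mapping = {}
        def _swap(match):
            key = f"<PII_{len(mapping)}>"
            mapping[key] = match.group(0)
            return key
        # Emails as a stand-in for real PII detection (names, IDs, ...).
        redacted = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", _swap, text)
        return redacted, mapping

    def restore(text, mapping):
        # Put the original details back into the cloud AI's reply.
        for key, original in mapping.items():
            text = text.replace(key, original)
        return text

    safe, mapping = redact("Contact jane.doe@example.com about the case.")
    reply = safe  # ... send `safe` to the cloud AI, get `reply` back ...
    print(restore(reply, mapping))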
7. attogram ◴[] No.45061478{3}[source]
It would be interesting to see some benchmarks of this vs., for example, Ollama running locally with no timeout.
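A rough way to measure time-to-first-token against a local Ollama server (assuming the default port 11434 and a model you've already pulled):

    import json, time, requests

    start = time.time()
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama3", "prompt": "Summarize this document.",
              "stream": True},
        stream=True,
    )
    # Ollama streams one JSON object per line; the first non-empty
    # "response" field is the first generated token.
    for line in resp.iter_lines():
        if line and json.loads(line).get("response"):
            print(f"first token after {time.time() - start:.1f}s")
            break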
8. anuarsh ◴[] No.45067676[source]
We are talking about a 100k context here. 20k would be much faster, but you wouldn't need KVCache offloading for it.
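For a sense of scale, a back-of-the-envelope estimate assuming a Llama-3-8B-style model (32 layers, 8 KV heads via GQA, head dim 128, fp16):

    # KV cache bytes per token: K and V, per layer, per KV head.
    layers, kv_heads, head_dim, bytes_per = 32, 8, 128, 2

    def kv_cache_gb(tokens):
        return 2 * layers * kv_heads * head_dim * bytes_per * tokens / 1e9

    print(f"100k tokens: {kv_cache_gb(100_000):.1f} GB")  # ~13.1 GB
    print(f" 20k tokens: {kv_cache_gb(20_000):.1f} GB")   # ~2.6 GB

Roughly 13 GB of KV cache on top of the weights is why 100k needs offloading, while the ~2.6 GB at 20k usually still fits in VRAM.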