
3 points anuarsh | 3 comments
1. attogram No.45058682
"~20 min for the first token" might turn off some people. But it is totally worth it to get such a large context size on puny systems!
replies(1): >>45058870 #
2. anuarsh No.45058870
Absolutely, there are tons of cases where an interactive experience is not required, but the ability to process a large context to get insights is.
replies(1): >>45061478 #
3. attogram No.45061478
It would be interesting to see some benchmarks of this vs., for example, Ollama running locally with no timeout.