
426 points benchmarkist | 2 comments
aurareturn No.42180200
Normally, I don't think 1000 tokens/s is that much more useful than 50 tokens/s.

However, given that chain-of-thought (CoT) makes models a lot smarter, I think Cerebras chips will be in huge demand from now on. With inference 20x faster, you can fit a lot more CoT runs into the same amount of time.
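
To put rough numbers on that, here is a back-of-the-envelope sketch. The 500 tokens per CoT run and the 60-second budget are made-up assumptions for illustration, not measurements, and the function name is hypothetical; only the 50 and 1000 tokens/s figures come from the comparison above.

```python
def cot_runs_in_budget(tokens_per_second: int, tokens_per_run: int, budget_seconds: int) -> int:
    """Count how many complete, sequential CoT runs fit in a time budget."""
    # Total tokens that can be generated in the budget, divided by tokens per run.
    return (tokens_per_second * budget_seconds) // tokens_per_run


# Assumed values (illustrative only): 500 tokens per run, 60 second budget.
TOKENS_PER_RUN = 500
BUDGET_SECONDS = 60

slow = cot_runs_in_budget(50, TOKENS_PER_RUN, BUDGET_SECONDS)
fast = cot_runs_in_budget(1000, TOKENS_PER_RUN, BUDGET_SECONDS)

print(f"50 tokens/s:   {slow} runs")
print(f"1000 tokens/s: {fast} runs")
print(f"ratio:         {fast // slow}x")
```

Under those assumptions the ratio of runs just tracks the ratio of token throughput, which is where the 20x comes from. It ignores prompt processing time and any batching.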

Also, I assume financial applications such as hedge funds would be buying these things in bulk now.

replies(1): >>42180756 #
1. deadmutex No.42180756
> Also, I assume financial applications such as hedge funds would be buying these things in bulk now.

Please elaborate... why?

replies(1): >>42182264 #
2. aurareturn No.42182264
I'm assuming hedge funds are using LLMs to extract information from company news and SEC reports as soon as it comes out, then make trading decisions based on it. Having faster inference would be a huge advantage.