Llama 3.1 405B now runs at 969 tokens/s on Cerebras Inference
(cerebras.ai)
426 points | benchmarkist | 1 comment | 19 Nov 24 00:15 UTC
1. arthurcolle | 19 Nov 24 06:28 UTC | No. 42180550 | >>42178761 (OP)
Damn that's a big model and that's really fast inference.