
507 points martinald | 1 comments
ekelsen ◴[] No.45052850[source]
The math on the input tokens is definitely wrong. It claims each instance (8 GPUs) can handle 1.44 million tokens/sec of input. Let's check that out.

1.44e6 tokens/sec * 37e9 bytes/token / 3.3e12 bytes/sec/GPU = ~16,000 GPUs

And that's assuming a more likely 1 byte per parameter.

So the article is only off by a factor of at least 1,000. I didn't check any of the rest of the math, but that probably has some impact on their conclusions...
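
As a sanity check, the arithmetic above can be reproduced in a few lines of Python. This is only a sketch of the estimate as stated: it assumes every input token requires reading all 37e9 active parameters from memory once, at 1 byte per parameter, with 3.3e12 bytes/sec of memory bandwidth per GPU. The function and variable names are made up for illustration.

```python
# Figures are taken from the comment above. The premise that each token
# costs one full read of the active weights is the comment's assumption,
# not a measured result.
TOKENS_PER_SEC = 1.44e6        # claimed input throughput per instance
BYTES_PER_TOKEN = 37e9         # 37e9 active parameters at an assumed 1 byte each
GPU_BANDWIDTH = 3.3e12         # bytes/sec of memory bandwidth per GPU
GPUS_PER_INSTANCE = 8          # instance size stated in the comment


def gpus_required(tokens_per_sec: float, bytes_per_token: float, bandwidth: float) -> float:
    """GPUs needed if throughput is limited purely by memory bandwidth."""
    return tokens_per_sec * bytes_per_token / bandwidth


gpus = gpus_required(TOKENS_PER_SEC, BYTES_PER_TOKEN, GPU_BANDWIDTH)
ratio = gpus / GPUS_PER_INSTANCE  # how far the 8-GPU claim is off under this premise

print(f"{gpus:,.0f} GPUs needed, {ratio:,.0f}x the {GPUS_PER_INSTANCE}-GPU instance")
```

Under those assumptions the result is about 16,000 GPUs, roughly 2,000 times the 8 GPUs in one instance, which is consistent with the "at least 1,000" figure.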

1. Lionga ◴[] No.45052936[source]
Well, he probably asked some AI to do the math for him.