
167 points galeos | 1 comments
wwwtyro ◴[] No.41880073[source]
Can anyone help me understand how this works without special bitnet precision-specific hardware? Is special hardware unnecessary? Maybe it just doesn't reach the full bitnet potential without it? Or maybe it does, with some fancy tricks? Thanks!
replies(3): >>41880204 #>>41880283 #>>41881707 #
summerlight ◴[] No.41881707[source]
The major benefit is the large reduction in memory consumption rather than in compute itself. The main bottleneck of current LLM infra is typically memory bandwidth, which is why the chip industry is going crazy over HBM. Compute optimization surely helps, but this is useful even without any hardware changes.
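To put rough numbers on the memory argument, here is a back-of-the-envelope sketch. Everything in it is an assumption for illustration, not a measurement: a hypothetical 7B-parameter model, ternary weights stored packed at 2 bits each, a hypothetical 100 GB/s of memory bandwidth, and a simplified decode model in which every weight is read once per generated token and nothing else (KV cache, activations) counts.

```python
# Back-of-the-envelope estimate: weight footprint and a bandwidth-bound
# ceiling on decode speed. All constants below are illustrative assumptions.

def weight_bytes(n_params: int, bits_per_weight: float) -> float:
    """Bytes needed to store n_params weights at the given bit width."""
    return n_params * bits_per_weight / 8


def max_tokens_per_second(n_params: int, bits_per_weight: float,
                          bandwidth_bytes_per_s: float) -> float:
    """Upper bound on tokens/s if each token requires reading every weight
    once and memory bandwidth is the only limit (a simplification)."""
    return bandwidth_bytes_per_s / weight_bytes(n_params, bits_per_weight)


N_PARAMS = 7_000_000_000   # hypothetical 7B-parameter model
BANDWIDTH = 100e9          # hypothetical 100 GB/s memory bandwidth

for label, bits in [("fp16", 16), ("ternary, packed at 2 bits", 2)]:
    gigabytes = weight_bytes(N_PARAMS, bits) / 1e9
    ceiling = max_tokens_per_second(N_PARAMS, bits, BANDWIDTH)
    print(f"{label}: {gigabytes:.2f} GB of weights, "
          f"at most ~{ceiling:.1f} tokens/s")
```

Under these assumptions the weights shrink by 8x, and so does the amount of data that has to cross the memory bus per token, which is where the speedup on unchanged hardware would come from. Real numbers depend on the packing format, the kernels, and everything this sketch ignores.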
replies(1): >>41882331 #
1. az226 ◴[] No.41882331[source]
Inference speeds go brrrr as well.