
167 points by galeos | 1 comment
newfocogi No.41879780
I'm enthusiastic about BitNet and the potential of low-bit LLMs - the papers show impressive perplexity scores matching full-precision models while drastically reducing compute and memory requirements. What's puzzling is we're not seeing any major providers announce plans to leverage this for their flagship models, despite the clear efficiency gains that could theoretically enable much larger architectures. I suspect there might be some hidden engineering challenges around specialized hardware requirements or training stability that aren't fully captured in the academic results, but would love insights from anyone closer to production deployment of these techniques.
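
For anyone who hasn't read the papers: the efficiency claim comes from constraining weights to {-1, 0, +1} with one scale per tensor. A minimal sketch of the absmean ternary quantization described in the BitNet b1.58 paper (the function name is my own):

    import torch

    def absmean_ternary(w: torch.Tensor, eps: float = 1e-5):
        # Per-tensor scale = mean absolute weight, then round each
        # weight to the nearest of {-1, 0, +1} (absmean scheme).
        gamma = w.abs().mean()
        w_q = (w / (gamma + eps)).round().clamp(-1, 1)
        return w_q, gamma

    w = torch.randn(4096, 4096)
    w_q, gamma = absmean_ternary(w)
    print(w_q.unique())  # tensor([-1., 0., 1.])

With ternary weights, the matmul reduces to additions and subtractions plus a single rescale by gamma, which is where the compute and memory savings come from.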
1. swfsql No.41880200
I think that since training must happen on a non-BitNet architecture, tuning towards BitNet is always a downgrade of the model's capabilities, so they're not really interested in it. But maybe they could be if they offered cheaper plans, since its efficiency is relatively good.
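
(For concreteness, "tuning towards BitNet" here would be quantization-aware training: the latent weights stay full precision and only the forward pass is ternarized. A rough PyTorch sketch under that assumption — the class and the absmean-style rounding are illustrative, not any provider's actual recipe:)

    import torch

    class TernaryLinear(torch.nn.Linear):
        # Latent weights stay full precision; the forward pass uses a
        # ternarized copy. The w + (w_q - w).detach() trick is a
        # straight-through estimator: gradients bypass the rounding
        # and update the latent full-precision weights directly.
        def forward(self, x):
            gamma = self.weight.abs().mean()
            w_q = (self.weight / (gamma + 1e-5)).round().clamp(-1, 1) * gamma
            w_ste = self.weight + (w_q - self.weight).detach()
            return torch.nn.functional.linear(x, w_ste, self.bias)

The full-precision weights remain the real model and the ternary view is a constraint on it, which is why converging under that constraint can cost capability relative to the unconstrained model.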

I think the real market for this is for local inference.