BitNet b1.58 2B4T Technical Report (arxiv.org)
111 points | galeos | 2 comments | 17 Apr 25 07:27 UTC
Havoc | 17 Apr 25 11:46 UTC | No. 43715393 | >>43714004 (OP)
Is there a reason why the 1.58-bit models are always quite small? I think I've seen an 8B, but that's about it.
Is there a technical reason for it, or is it just research convenience?
replies(2): >>43715453, >>43717231
1. yieldcrv | 17 Apr 25 14:18 UTC | No. 43717231 | >>43715393
They aren't; there is a 1.58-bit version of DeepSeek that's around 200 GB instead of 700 GB.
replies(1): >>43719355
2. logicchains | 17 Apr 25 16:48 UTC | No. 43719355 | >>43717231 (TP)
That's not a real BitNet; it's just post-training quantisation, and its performance suffers compared with a model trained from scratch at 1.58 bits.
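To make the distinction concrete, here is a rough sketch of what post-training rounding to ternary weights might look like. It assumes an absmean-style scale (divide by the mean absolute weight, round, clip to -1, 0, 1). The function names and the sample weights are made up for illustration, and this is not the code behind any particular model. It also shows where "1.58" comes from: three states per weight carry log2(3) bits.

```python
import math

def ternary_quantize(weights, eps=1e-8):
    """Round float weights to {-1, 0, 1} using a mean-absolute-value scale."""
    scale = sum(abs(w) for w in weights) / len(weights)
    # Divide by the scale, round to the nearest integer, then clip to [-1, 1].
    quantized = [max(-1, min(1, round(w / (scale + eps)))) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Map ternary values back to approximate float weights."""
    return [q * scale for q in quantized]

# Three possible states per weight carry log2(3) bits of information.
BITS_PER_WEIGHT = math.log2(3)

# Hypothetical weights standing in for an already-trained layer.
weights = [0.9, -0.05, 0.4, -1.2, 0.02, -0.6]
q, scale = ternary_quantize(weights)
approx = dequantize(q, scale)

# Mean squared rounding error introduced by quantising after training.
error = sum((w - a) ** 2 for w, a in zip(weights, approx)) / len(weights)
print(q, scale, BITS_PER_WEIGHT, error)
```

Applied after training, this rounding error lands on weights that were never optimised to tolerate it. A model trained from scratch at 1.58 bits would apply the same kind of rounding in the forward pass during training, so the remaining weights can adapt to it.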