
111 points galeos | 7 comments
1. nopelynopington ◴[] No.43716144[source]
I built it at home this morning and tried it. Perhaps my expectations were too high, but I wasn't terribly impressed. I asked it for a list of ten types of data I might show on a home info display panel. It gave me three. When I clarified that I wanted ten, it gave me six. Every request after that just returned the same six things.

I know it's not ChatGPT-4, but I've tried other very small models that run on CPU only and had better results.

replies(2): >>43716331 #>>43720674 #
2. ashirviskas ◴[] No.43716331[source]
> I've tried other very small models that run on CPU only and had better results

Maybe you can share some comparative examples?

replies(1): >>43716768 #
3. nopelynopington ◴[] No.43716768[source]
Sure, here's my conversation with BitNet b1.58 2B4T:

https://pastebin.com/ZZ1tADvp

Here's the same prompt given to smollm2:135m:

https://pastebin.com/SZCL5WkC

The quality of the second results isn't fantastic. The data isn't public, and it repeats itself, mentioning income a few times. I don't think I'd use either of these models for accurate data, but I was surprised by the truncated results from BitNet.

Smollm2:360M returned better-quality results with no repetition, but it did suggest things that didn't fit the brief exactly (public data, given location only):

https://pastebin.com/PRFqnqVF

Edit:

I tried the same query on the live demo site and got much better results. Maybe something went wrong on my end?

replies(1): >>43717694 #
4. sroussey ◴[] No.43717694{3}[source]
You were using bitnet.cpp?
replies(1): >>43717900 #
5. nopelynopington ◴[] No.43717900{4}[source]
Yes
6. Me1000 ◴[] No.43720674[source]
This is a technology demo, not a model you'd want to use. Because BitNet models average only 1.58 bits per weight, you'd expect to need a much larger parameter count than your fp8/fp16 counterparts to match their quality. Plus, this is only a 2-billion-parameter model in the first place, and even fp16 2B-parameter models generally perform pretty poorly.
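For anyone curious where the 1.58 figure comes from: with only three possible weight values {-1, 0, +1}, each weight carries log2(3) ≈ 1.58 bits. Here's a rough, illustrative sketch (not the bitnet.cpp implementation) of the absmean-style ternary quantization the BitNet b1.58 paper describes: scale by the mean absolute weight, round, clamp.

```python
import math

def absmean_quantize(weights):
    """Ternary (1.58-bit) quantization sketch: scale each weight by the
    mean |w| of the tensor, then round to the nearest of {-1, 0, +1}."""
    gamma = sum(abs(w) for w in weights) / len(weights) or 1.0
    ternary = [max(-1, min(1, round(w / gamma))) for w in weights]
    return ternary, gamma

# Toy weights; mean |w| = 0.6375, so 0.9 and 0.4 round to +1,
# -0.05 rounds to 0, and -1.2 rounds past -1 and gets clamped.
ternary, gamma = absmean_quantize([0.9, -0.05, 0.4, -1.2])
print(ternary)        # [1, 0, 1, -1]
print(math.log2(3))   # ≈ 1.585 bits of information per weight
```

So relative to fp16 you're spending roughly a tenth of the bits per parameter, which is why you'd expect to need a much bigger parameter count to get comparable quality out of a BitNet model.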
replies(1): >>43720945 #
7. nopelynopington ◴[] No.43720945[source]
Ok, that's fair. I still think something was up with my build, though; the online demo worked far better than my local build.