
352 points ferriswil | 1 comment | source
kayo_20211030 ◴[] No.41890110[source]
Extraordinary claims require extraordinary evidence. Maybe it's possible, but consider that some really smart people, in many different groups, have been working diligently in this space for quite a while; so a claim of 95% savings on energy costs _with equivalent performance_ is in the extraordinary category. Of course, we'll see when the tide goes out.
replies(6): >>41890280 #>>41890322 #>>41890352 #>>41890379 #>>41890428 #>>41890702 #
Randor ◴[] No.41890352[source]
The energy claims up to ~70% can be verified. The inference implementation is here:

https://github.com/microsoft/BitNet
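[Editor's note: a minimal sketch of where the savings come from, not the BitNet implementation itself. With ternary weights in {-1, 0, +1} (the "1.58-bit" format), a matrix-vector product needs only additions and subtractions in the inner loop, plus one scalar multiply per row for the scale. Function names here are illustrative.]

```python
import numpy as np

def ternary_quantize(w, eps=1e-8):
    # Quantize weights to {-1, 0, +1} with a single absmean scale,
    # loosely in the spirit of BitNet b1.58 (illustrative only).
    scale = np.abs(w).mean() + eps
    q = np.clip(np.round(w / scale), -1, 1).astype(np.int8)
    return q, scale

def ternary_matvec(q, scale, x):
    # Matrix-vector product with ternary weights: the inner loop uses
    # only additions and subtractions, no multiplications.
    out = np.zeros(q.shape[0], dtype=np.float64)
    for i in range(q.shape[0]):
        acc = 0.0
        for j in range(q.shape[1]):
            if q[i, j] == 1:
                acc += x[j]
            elif q[i, j] == -1:
                acc -= x[j]
        out[i] = acc
    return out * scale  # one scalar multiply per output element

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))
x = rng.normal(size=8)
q, s = ternary_quantize(W)
approx = ternary_matvec(q, s, x)   # mult-free approximation of W @ x
exact = W @ x
```

Integer adds cost far less energy in silicon than floating-point multiplies, which is the hardware intuition behind the claimed savings.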

replies(2): >>41890548 #>>41891150 #
kayo_20211030 ◴[] No.41890548[source]
I'm not an AI person in any technical sense. The savings being claimed, and I assume verified, are on ARM and x86 chips. The piece doesn't mention swapping multiplications for additions, and a 1-bit LLM is, well, a 1-bit LLM.

Also,

> Additionally, it reduces energy consumption by 55.4% to 70.0%

With humility, I don't know what that means. It seems like some dubious math with percentages.
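[Editor's note: one plausible reading of the quoted range is that per-benchmark energy reductions varied between 55.4% and 70.0%, i.e. the quantized model drew between 44.6% and 30.0% of the baseline energy. A tiny arithmetic sketch of that reading:]

```python
def remaining_fraction(reduction_pct):
    # A reduction of r% means the new cost is (100 - r)% of baseline.
    return 1.0 - reduction_pct / 100.0

# Reading "55.4% to 70.0%" as a range of per-benchmark reductions:
low = remaining_fraction(55.4)    # best case for the baseline: ~0.446x
high = remaining_fraction(70.0)   # best case for the quantized model: 0.30x
```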

replies(2): >>41890656 #>>41891000 #
Randor ◴[] No.41890656[source]
> I don't know what that means. It seems like some dubious math with percentages.

I would start by downloading a 1.58 model such as: https://huggingface.co/HF1BitLLM/Llama3-8B-1.58-100B-tokens

Run the non-quantized version of the model on your 3090/4090 GPU and observe the power draw. Then load the 1.58-bit model and observe the power usage again. Sure, the numbers will have a wide range, because there are many GPUs/NPUs on which to make the comparison.
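[Editor's note: a rough sketch of the measurement the comment describes, for NVIDIA GPUs. It polls `nvidia-smi`'s standard `power.draw` query while inference runs and integrates the samples into joules; the helper names are hypothetical and instantaneous-power polling is only an approximation of true energy use.]

```python
import subprocess

def read_power_watts():
    # Sample instantaneous board power via nvidia-smi (NVIDIA GPUs only).
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=power.draw",
         "--format=csv,noheader,nounits"],
        text=True)
    return float(out.strip().splitlines()[0])

def energy_joules(samples_watts, interval_s):
    # Trapezoidal integration of evenly spaced power samples into joules.
    e = 0.0
    for a, b in zip(samples_watts, samples_watts[1:]):
        e += 0.5 * (a + b) * interval_s
    return e
```

To compare fairly, poll `read_power_watts()` at a fixed interval while each model generates the same prompts, then compare joules per generated token rather than raw wattage, since the two models may run at different speeds.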

replies(1): >>41890880 #
kayo_20211030 ◴[] No.41890880{3}[source]
Good one!