AI engineers claim new algorithm reduces AI power consumption by 95%

1. robomartin ◴[19 Oct 24 19:26 UTC] No.41890056[source]▶

I posted this about a week ago:

https://news.ycombinator.com/item?id=41816598

This has been done for decades in digital circuits, FPGAs, digital signal processing, etc. Floating point is both resource- and power-intensive, and using FP without dedicated FP hardware is something designers have avoided for decades unless absolutely necessary.

replies(3): >>41890331 #>>41890498 #>>41890812 #

2. ujikoluk ◴[19 Oct 24 20:02 UTC] No.41890331[source]▶

>>41890056 (TP) #

Explain more for the uninitiated please.

replies(1): >>41978807 #

3. ausbah ◴[19 Oct 24 20:24 UTC] No.41890498[source]▶

>>41890056 (TP) #

a lot of things in the ML research space are rebranding an old concept w a new name as “novel”

4. fidotron ◴[19 Oct 24 21:09 UTC] No.41890812[source]▶

>>41890056 (TP) #

Right, the ML people are learning, slowly, about the importance of optimizing for silicon simplicity, not just reduction of symbols in linear algebra.

Their rediscovery of fixed point was bad enough, but the “omg if we represent poses as quaternions everything works better” makes any game engine dev of the last 30 years explode.

5. robomartin ◴[29 Oct 24 02:35 UTC] No.41978807[source]▶

>>41890331 #

Not sure there's much to explain. Using integers for math in digital circuits is far more resource and computationally efficient than floating-point math. It has been decades since I did the math on the difference. I'll just guess that it could easily be an order of magnitude better across both metrics.

At a basic level it is very simple: a 10-bit bus gives you the ability to represent numbers between 0 and 1 with a resolution of approximately 0.001 (1/1024). 12 bits would be four times finer. Integer circuits can do the math in one clock cycle. Hardware multipliers do the same. To rescale the numbers after multiplication you just take the N high bits, where N is your bus width, which is a zero clock-cycle operation. Etc.
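For anyone who wants to see the arithmetic, here is a rough software sketch of the idea. It assumes unsigned fractions in [0, 1) stored as N-bit integers; the names `to_fixed`, `to_float` and `fixed_mul` are made up for illustration, and in hardware the shift is just a choice of which wires you keep.

```python
N = 10                # assumed bus width in bits
SCALE = 1 << N        # 1024 steps between 0 and 1


def to_fixed(x: float) -> int:
    """Quantize a real number in [0, 1) to an unsigned N-bit integer."""
    q = int(round(x * SCALE))
    return max(0, min(SCALE - 1, q))  # clamp to what the bus can hold


def to_float(q: int) -> float:
    """Interpret an N-bit integer as a fraction of full scale."""
    return q / SCALE


def fixed_mul(a: int, b: int) -> int:
    """Multiply two N-bit fractions and keep the N high bits of the 2N-bit product."""
    return (a * b) >> N


a = to_fixed(0.5)     # 512
b = to_fixed(0.25)    # 256
print(to_float(fixed_mul(a, b)))  # 0.125
print(1 / SCALE)                  # 0.0009765625, the resolution of a 10-bit bus
```

The truncation in `fixed_mul` drops the low bits rather than rounding, which is the cheap option described above; a real design may add a rounding step.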

In training a neural network, the back propagation math can be implemented using almost the same logic used for a polyphase FIR filter.
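The shared logic is the multiply-accumulate: a FIR output and a weighted sum inside a network are both sums of products. The sketch below is a plain direct-form FIR, not a polyphase one and not a backprop implementation; `fir_fixed` is a hypothetical name, and inputs and coefficients are assumed to be unsigned `n_bits` fractions.

```python
def fir_fixed(samples, coeffs, n_bits=10):
    """Direct-form FIR on unsigned fixed-point fractions using integer MACs."""
    out = []
    for i in range(len(samples)):
        acc = 0  # wide accumulator, as a hardware MAC would keep
        for k, c in enumerate(coeffs):
            if i - k >= 0:                 # taps before the first sample count as zero
                acc += samples[i - k] * c  # integer multiply-accumulate
        out.append(acc >> n_bits)          # rescale by keeping the high bits
    return out


# Two taps of 0.25 applied to a constant input of 0.5 (values scaled by 1024)
print(fir_fixed([512, 512, 512], [256, 256]))  # [128, 256, 256]
```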