←back to thread

183 points leotravis10 | 1 comments | | HN request time: 0.237s | source
Show context
dev_john15[dead post] ◴[] No.43544457[source]
[dead]
Aardwolf ◴[] No.43544547[source]
I would love to know how multiplication and division work in modern chips to have such low cycle count compared to addition, since in theory the addition complexity is linear in the amount of bits but multiplication and division are quadratic, or loglinear for large inputs. Part of that is solved by surface area rather than time I guess, but that's also true for the adders already with the carry logic
replies(2): >>43549294 #>>43549620 #
1. kens ◴[] No.43549620[source]
I'm working on the multiplication circuit in the Pentium; I've done a partial writeup: https://www.righto.com/2025/03/pentium-multiplier-adder-reve... The short answer is that multiplication uses a large tree of adders so it can add up all the long-division terms at once. It also uses base-8 for the multiplier to reduce the number of terms. The adders are 4:2 carry-save compressors that take four numbers as inputs and produce two numbers as outputs.

I also wrote about the Pentium's division circuitry and the infamous FDIV bug: https://www.righto.com/2024/12/this-die-photo-of-pentium-sho... The short answer is that the Pentium used base-4 SRT division, similar to long division but generating two bits of result per cycle. It used a lookup table to determine the two quotient bits; an error in this table resulted in the bug.