426 points benchmarkist | 2 comments
easeout ◴[] No.42180121[source]
How does binning work when your chip is the entire wafer?
replies(1): >>42180161 #
shrubble ◴[] No.42180161[source]
They expect some of the cores on the wafer to fail, so they build redundant links throughout the chip. That lets them seal off any cores that fail and still have enough working cores to do useful work.
replies(1): >>42180731 #
1. why_only_15 ◴[] No.42180731[source]
My understanding is that they mask off or otherwise disable a whole row and column of cores when one dies.
replies(1): >>42180846 #
2. wtallis ◴[] No.42180846[source]
That's way too wasteful.

Take a look at https://fuse.wikichip.org/news/3010/a-look-at-cerebras-wafer... and specifically the diagram https://fuse.wikichip.org/wp-content/uploads/2019/11/hc31-ce...

The fabric can effectively route signals diagonally to work around an individual defective core: the cores in the same row, from the defect over to the nearest spare core, are each displaced by one position. That's how they get away with a claimed "1–1.5%" of spare cores.
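
To make the shift concrete, here is a toy Python sketch of that kind of row-wise remapping. It is only an illustration under my own assumptions, not Cerebras's actual scheme: I assume the spare cores sit at the end of each row, and the names `build_core_map` and `spare_fraction` and all the grid sizes are made up for the example.

```python
def build_core_map(rows, logical_cols, spares_per_row, defects):
    """Map each logical (row, col) to a working physical (row, col).

    Hypothetical model: every row has `spares_per_row` extra physical
    cores at its end. A defective core is skipped, so every logical core
    at or after the defect in that row shifts one position toward the spare.
    `defects` is a set of physical (row, col) positions.
    """
    physical_cols = logical_cols + spares_per_row
    mapping = {}
    for r in range(rows):
        bad = {c for (dr, c) in defects if dr == r}
        good = [c for c in range(physical_cols) if c not in bad]
        if len(good) < logical_cols:
            raise ValueError(f"row {r} has more defects than spares")
        # Assign logical columns to the working physical columns in order.
        for logical_c, physical_c in enumerate(good[:logical_cols]):
            mapping[(r, logical_c)] = (r, physical_c)
    return mapping


def spare_fraction(logical_cols, spares_per_row):
    """Fraction of the physical cores in a row that are spares."""
    return spares_per_row / (logical_cols + spares_per_row)


# Toy 3x4 logical grid, one spare per row, one defect at physical (1, 2).
core_map = build_core_map(rows=3, logical_cols=4, spares_per_row=1,
                          defects={(1, 2)})
print(core_map[(0, 2)])  # row 0 has no defect, so the core stays in place
print(core_map[(1, 2)])  # row 1 shifts past the defect
```

In this model the logical neighbors (0, 2) and (1, 2) end up one physical column apart, which is why the link between them has to run diagonally. It also shows why a whole row and column need not be disabled: one defect costs one spare in its own row and nothing else.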