200 points rbanffy | 11 comments
1. lorenzohess ◴[] No.45655889[source]
Summary:

> Rather than allowing heat to build up, what if we could spread it out right from the start, inside the chip?... To do that, we’d have to introduce a highly thermally conductive material inside the IC, mere nanometers from the transistors, without messing up any of their very precise and sensitive properties. Enter an unexpected material—diamond.

> ... my research group at Stanford University has managed what seemed impossible. We can now grow a form of diamond suitable for spreading heat, directly atop semiconductor devices at low enough temperatures that even the most delicate interconnects inside advanced chips will survive... Our diamonds are a polycrystalline coating no more than a couple of micrometers thick.

> The potential benefits could be huge. In some of our earliest gallium-nitride radio-frequency transistors, the addition of diamond dropped the device temperature by more than 50 °C.

replies(1): >>45656776 #
2. kulahan ◴[] No.45656776[source]
Fifty Celsius is an insane drop.

It sounds like the most important part of the article (and another cool quote) is this:

>Until recently we knew how to grow it only at circuit-slagging temperatures in excess of 1,000 °C.

So basically, the big breakthrough was low-temp growth of a diamond lattice. Very cool they can do it at such a low temperature. It must be a crazy low temp - probably under 100C?

replies(2): >>45657044 #>>45657067 #
3. beautifulfreak ◴[] No.45657044[source]
The article says 400C
4. yorwba ◴[] No.45657067[source]
From the article:

"we were able to find a formula that produced coatings of large-grained polycrystalline diamond all around devices at 400 °C, which is a survivable temperature for CMOS circuits and other devices."

replies(2): >>45657194 #>>45659290 #
5. kulahan ◴[] No.45657194{3}[source]
Thanks, not sure how I missed that. Still, a 60% drop in required temp! These gems are truly, truly outrageous.
replies(1): >>45658380 #
6. zeristor ◴[] No.45658380{4}[source]
~50%. It helps to do these calculations using the Kelvin scale.

Learnt that in Physics lab.
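
A quick sketch of that arithmetic, assuming the two growth temperatures quoted in the thread (1,000 °C and 400 °C). The function names are illustrative only:

```python
def celsius_to_kelvin(t_c: float) -> float:
    """Convert a temperature from degrees Celsius to kelvin."""
    return t_c + 273.15


def fractional_drop(old: float, new: float) -> float:
    """Fraction by which `new` is lower than `old`."""
    return (old - new) / old


# Growth temperatures mentioned in the thread, in degrees Celsius.
old_c, new_c = 1000.0, 400.0

# Ratio taken on the Celsius scale, whose zero point is arbitrary.
drop_c = fractional_drop(old_c, new_c)

# Ratio taken on the Kelvin scale, which is measured from absolute zero.
drop_k = fractional_drop(celsius_to_kelvin(old_c), celsius_to_kelvin(new_c))

print(f"Celsius: {drop_c:.1%}")
print(f"Kelvin:  {drop_k:.1%}")
```

On the Celsius scale the drop works out to 60%, but measured from absolute zero it is about 47%, which is where the ~50% comes from.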

replies(1): >>45660688 #
7. FaradayRotation ◴[] No.45659290{3}[source]
It is genuinely impressive to grow thin-film polycrystalline diamond at 400C, but my understanding is that this temperature is basically at the ceiling of what the circuits will tolerate during manufacturing while still yielding a good-quality device at end of line. Stress tests, anneals, and wafer bakes are usually limited to about 400C, unless the point is to deliberately degrade the chip.

Not to say that it can't be done, only that the process window is not very large and the propensity for deleterious carbon soot is very high. Likely this will generate some very fun, highly integrated problem statements before we see this available for sale.

Getting heat out of the chip is such a painful and important struggle. I hope this works on a real process line. Too many benefits on the table to ignore.

Edit: Grammar, clarity

replies(2): >>45661975 #>>45682876 #
8. kulahan ◴[] No.45660688{5}[source]
That makes sense. A direct scale instead of degrees of representation. Thanks for the correction.
9. hnuser123456 ◴[] No.45661975{4}[source]
I wonder whether, in situations like the Raptor Lake fiasco or other "overclocked a little too far" scenarios, where the circuit degrades to the point that the frequency must be reduced to maintain expected stability, some very small spots on the chip approached that temperature while the temp sensor read 100C or below (so thermal throttling did not kick in when it should have)?
replies(1): >>45662258 #
10. FaradayRotation ◴[] No.45662258{5}[source]
Caveats: My understanding of the Raptor Lake mess is pretty limited, mostly because Intel has been fairly tight-lipped about what specific issue caused it. My personal suspicion is that it was a Pareto plot's worth of issues. Also, while I do know a few things about this particular topic, I am far from the final authority on it.

My understanding is that point/local resistive heating problems out in the wild tend to drive different failure modes vs the global heating techniques used on the manufacturing line, mostly because the CPU is in active operation, which changes the defect physics. Put another way, likely any particular structure in the CPU would not need to reach 400C to fail - even the small voltages used in these chips coupled with elevated temperature can drive a lot of difficult-to-catch, slow-to-manifest failure modes. Copper metal migration is the classic example of this type of problem, where copper ions slowly migrate under voltage+temperature, causing/propagating voids until finally an open circuit is made. Surprise! Your chip no longer works after seeming perfectly fine! Manufacturers try to catch such problems with simulated aging through aggressive temperature and voltage experiments. Intel must have discovered a big gap in their visibility, and then discovered their CPU specs were incompatible with the stated product lifetime without a major re-spec of already-sold product. Ouch.
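
To give a feel for why elevated temperature is used for simulated aging, here is a minimal sketch of an Arrhenius acceleration factor, which is the temperature term in Black's equation, a commonly used electromigration lifetime model. The activation energy and both temperatures are illustrative assumptions and not Intel data, the function name is hypothetical, and the current-density term of the model is left out entirely:

```python
import math

# Boltzmann constant in eV/K.
BOLTZMANN_EV_PER_K = 8.617333262e-5


def acceleration_factor(ea_ev: float, t_use_c: float, t_stress_c: float) -> float:
    """Arrhenius acceleration factor between a use and a stress temperature.

    ea_ev      -- activation energy in eV (assumed value, not measured)
    t_use_c    -- normal operating temperature in degrees Celsius
    t_stress_c -- elevated test temperature in degrees Celsius
    """
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    exponent = (ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use_k - 1.0 / t_stress_k)
    return math.exp(exponent)


# Hypothetical numbers: 0.9 eV activation energy, 85C use, 125C stress bake.
af = acceleration_factor(ea_ev=0.9, t_use_c=85.0, t_stress_c=125.0)
print(f"Aging runs roughly {af:.0f}x faster at the stress temperature")
```

The exponential dependence is the point: a modest rise in local temperature shortens the expected lifetime by a large factor, so a hot spot well below 400C can still matter.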

The chip manufacturer also has some capability to make repairs and adjustments ahead of end of line, which should encompass managing some of the issues you refer to. Some big customers might have their own repair capabilities.

Edit: Clarity, trying to better address the question

11. xeonmc ◴[] No.45682876{4}[source]
If growing diamonds is the thermal bottleneck of manufacturing processes, one could imagine a sci-fi future where rather than silicon wafers serving as base matrix material to grow ancillary structures upon, it would instead be diamond wafers that are used to subtractively etch structural scaffoldings, around which silicon-based structures are grown, the diamond scaffolding serving simultaneously as bone and blood vessels for thermal and power conduction as well as mechanical support.