> Nvidia B200s ... offer 2-3x the performance of H100s
For ML, not for HPC. ML and HPC are two completely different, only loosely related fields.
ML tasks do great with low precision: 16- and 8-bit precision is fine, and arguably good results can be achieved even with 4-bit precision [0][1]. That won't do for HPC tasks like predicting global weather, computational biology, etc. -- those need 64-bit (or even 128-bit) precision.
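To make the precision point concrete, here's a minimal NumPy sketch (my own toy example, not from the cited papers) of why naive low-precision accumulation is unusable for numerically sensitive work:

```python
import numpy as np

# Sum 10,000 copies of 1e-4. The exact answer is 1.0.
vals = np.full(10_000, 1e-4)

acc16 = np.float16(0.0)
for v in vals.astype(np.float16):
    acc16 += v           # naive fp16 accumulation: once the running sum
                         # is big enough, each tiny addend rounds to nothing

print(acc16)             # stalls around 0.25 -- wildly wrong
print(vals.sum())        # default fp64 accumulation: ~1.0, as expected
```

Real HPC codes use tricks like compensated summation to soften this, but the underlying point stands: below a certain precision the answer isn't just a bit noisy, it's garbage.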
Nvidia needs to decide how to divide the billions of transistors on their new silicon. Greatly oversimplifying, they can choose to make one of the following:
* Card A with *n* FP64 cores, or
* Card B with *2n* FP32 cores, or
* Card C with *4n* FP16 cores, or
* Card D with *8n* FP8 cores, or
* Card E with *16n* FP4 cores (and FP4 is indeed a thing: Blackwell's tensor cores support it).
Card A would give the HPC guys *n* usable cores, and it would give the ML guys *n* usable cores. On the other end, Card E would give the ML guys *16n* usable cores (and the HPC guys zero usable cores). It's no wonder the HPC crowd wants Nvidia to produce Card A, while the ML crowd wants Card E. Given that all the hype and money are currently with the ML guys (and $NVDA reflects that), Nvidia will ship a mix of cores that is much, much closer to Card E than to Card A.
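Here's the same back-of-the-envelope arithmetic as a toy script (assumptions are mine and deliberately oversimplified, as above: an FP64 core costs 2x the silicon of an FP32 core, 4x an FP16 core, and so on, and HPC needs at least 64-bit precision while ML tolerates anything):

```python
# Toy model of the transistor-budget trade-off: one fixed silicon
# budget, spent on cores of a single precision per card.
cards = {"A": ("FP64", 1), "B": ("FP32", 2), "C": ("FP16", 4),
         "D": ("FP8", 8), "E": ("FP4", 16)}

for name, (fmt, cores) in cards.items():
    bits = int(fmt[2:])
    hpc = cores if bits >= 64 else 0  # below 64-bit, HPC can't use it
    ml = cores                        # ML runs fine at any of these
    print(f"Card {name} ({fmt}): HPC gets {hpc}n cores, ML gets {ml}n cores")
```

Card A is the only one with nonzero HPC throughput, while every step toward Card E doubles ML throughput, so the revealed preference of a market dominated by ML buyers is obvious.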
Their new offerings are arguably worse than their older ones for HPC tasks, and the feeling among the HPC crowd is that "Nvidia and AMD are in the process of abandoning this market".
[0] https://papers.nips.cc/paper/2020/file/13b919438259814cd5be8...
[1] https://arxiv.org/abs/2212.09720