NVIDIA Hopper H200 GPUs Supercharged With World’s Fastest HBM3e Memory, Grace Hopper Superchips Power Jupiter Supercomputer

Hassan Mujtaba

NVIDIA has announced its brand new H200 Hopper GPU which now comes equipped with the world's fastest HBM3e memory from Micron. In addition to the new AI platforms, NVIDIA also announced a major supercomputer win with its Grace Hopper Superchips that now power the Exaflop Jupiter supercomputer.

NVIDIA Continues To Build AI Momentum With Upgraded Hopper GPUs, Grace Hopper Superchips & Supercomputer Wins

NVIDIA's H100 GPUs are the most in-demand AI chips in the industry so far, but the green team wants to offer even more performance to its customers. Enter the HGX H200, the latest HPC and AI computing platform, powered by H200 Tensor Core GPUs. These GPUs feature the latest Hopper optimizations on both the hardware and software fronts while delivering the world's fastest memory solution to date.


The NVIDIA H200 GPUs are equipped with Micron's HBM3e solution, offering memory capacities of up to 141 GB and up to 4.8 TB/s of bandwidth, which is 2.4x the bandwidth and nearly double the capacity of the NVIDIA A100. This new memory solution allows NVIDIA to nearly double AI inference performance versus its H100 GPUs in applications such as Llama 2, a 70-billion-parameter LLM. Recent advancements in the TensorRT-LLM suite have also resulted in huge performance gains across a vast number of AI applications.
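As a rough sanity check on those ratios, the figures below assume the commonly cited A100 80 GB SXM specs (2.0 TB/s bandwidth, 80 GB capacity), which the article does not state explicitly:

```python
# Quoted H200 memory specs (from the article)
h200_bandwidth_tbs = 4.8   # TB/s
h200_capacity_gb = 141     # GB

# Assumed A100 80 GB SXM reference figures (not stated in the article)
a100_bandwidth_tbs = 2.0   # TB/s
a100_capacity_gb = 80      # GB

print(round(h200_bandwidth_tbs / a100_bandwidth_tbs, 1))  # 2.4x bandwidth
print(round(h200_capacity_gb / a100_capacity_gb, 2))      # 1.76x, i.e. nearly double the capacity
```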

In terms of solutions, the NVIDIA H200 GPUs will be available in a wide range of HGX H200 servers with 4- and 8-way GPU configurations. An 8-way configuration of H200 GPUs in an HGX system will provide up to 32 PetaFLOPs of FP8 compute performance and 1.1 TB of total memory capacity.
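The 8-way totals can be checked against the per-GPU numbers. This quick sketch assumes the per-GPU FP8 rate of 3,958 TFLOPs (with sparsity) listed in the spec table further down, and the 141 GB per-GPU capacity quoted above:

```python
gpus = 8
fp8_tflops_per_gpu = 3958  # assumed per-GPU FP8 rate with sparsity (from the spec table below)
memory_gb_per_gpu = 141    # per-GPU HBM3e capacity quoted above

total_fp8_pflops = gpus * fp8_tflops_per_gpu / 1000
total_memory_tb = gpus * memory_gb_per_gpu / 1024

print(round(total_fp8_pflops, 1))  # ~31.7, in line with the quoted "up to 32 PetaFLOPs"
print(round(total_memory_tb, 2))   # ~1.1 TB total memory
```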

NVIDIA H200 GPU: Supercharged With HBM3e Memory, Available In Q2 2024

The GPUs will also be compatible with the existing HGX H100 systems, making it easier for customers to upgrade their platforms. NVIDIA partners such as ASUS, ASRock Rack, Dell, Eviden, GIGABYTE, Hewlett Packard Enterprise, Ingrasys, Lenovo, QCT, Wiwynn, Supermicro, and Wistron, will offer updated solutions when the H200 GPUs become available in the 2nd quarter of 2024.

NVIDIA Grace Hopper Superchips Power 1-Exaflop Jupiter Supercomputer

In addition to the H200 GPU announcement, NVIDIA has also announced a major supercomputer win powered by its Grace Hopper Superchips (GH200). The supercomputer, known as Jupiter, will be located at the Forschungszentrum Jülich facility in Germany as part of the EuroHPC Joint Undertaking, with the contract awarded to Eviden and ParTec. It will be used for materials science, climate research, drug discovery, and more. This is also the second supercomputer NVIDIA has announced in November, the previous one being Isambard-AI, which offers up to 21 exaflops of AI performance.

In terms of configuration, the Jupiter supercomputer is based on Eviden's BullSequana XH3000, which makes use of a fully liquid-cooled architecture. It boasts a total of 24,000 NVIDIA GH200 Grace Hopper Superchips interconnected using the company's Quantum-2 InfiniBand. With each Grace CPU packing 72 Arm Neoverse V2 cores, that works out to almost 1.73 million Arm cores on the CPU side alone for Jupiter (1,728,000 to be exact).

Performance targets include 90 exaflops of AI training performance and 1 exaflop of high-performance compute. The supercomputer is expected to be installed in 2024. Overall, these are some major updates from NVIDIA as it continues to lead the charge in the AI world with its powerful hardware and software technologies.

NVIDIA HPC / AI GPUs

| Graphics Card | NVIDIA B200 | NVIDIA H200 (SXM5) | NVIDIA H100 (SXM5) | NVIDIA H100 (PCIe) | NVIDIA A100 (SXM4) | NVIDIA A100 (PCIe4) | Tesla V100S (PCIe) | Tesla V100 (SXM2) | Tesla P100 (SXM2) | Tesla P100 (PCIe) | Tesla M40 (PCIe) | Tesla K40 (PCIe) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GPU | B200 | H200 (Hopper) | H100 (Hopper) | H100 (Hopper) | A100 (Ampere) | A100 (Ampere) | GV100 (Volta) | GV100 (Volta) | GP100 (Pascal) | GP100 (Pascal) | GM200 (Maxwell) | GK110 (Kepler) |
| Process Node | 4nm | 4nm | 4nm | 4nm | 7nm | 7nm | 12nm | 12nm | 16nm | 16nm | 28nm | 28nm |
| Transistors | 208 Billion | 80 Billion | 80 Billion | 80 Billion | 54.2 Billion | 54.2 Billion | 21.1 Billion | 21.1 Billion | 15.3 Billion | 15.3 Billion | 8 Billion | 7.1 Billion |
| GPU Die Size | TBD | 814mm² | 814mm² | 814mm² | 826mm² | 826mm² | 815mm² | 815mm² | 610mm² | 610mm² | 601mm² | 551mm² |
| SMs | 160 | 132 | 132 | 114 | 108 | 108 | 80 | 80 | 56 | 56 | 24 | 15 |
| TPCs | 80 | 66 | 66 | 57 | 54 | 54 | 40 | 40 | 28 | 28 | 24 | 15 |
| L2 Cache Size | TBD | 51200 KB | 51200 KB | 51200 KB | 40960 KB | 40960 KB | 6144 KB | 6144 KB | 4096 KB | 4096 KB | 3072 KB | 1536 KB |
| FP32 CUDA Cores Per SM | TBD | 128 | 128 | 128 | 64 | 64 | 64 | 64 | 64 | 64 | 128 | 192 |
| FP64 CUDA Cores Per SM | TBD | 128 | 128 | 128 | 32 | 32 | 32 | 32 | 32 | 32 | 4 | 64 |
| FP32 CUDA Cores | TBD | 16896 | 16896 | 14592 | 6912 | 6912 | 5120 | 5120 | 3584 | 3584 | 3072 | 2880 |
| FP64 CUDA Cores | TBD | 16896 | 16896 | 14592 | 3456 | 3456 | 2560 | 2560 | 1792 | 1792 | 96 | 960 |
| Tensor Cores | TBD | 528 | 528 | 456 | 432 | 432 | 640 | 640 | N/A | N/A | N/A | N/A |
| Texture Units | TBD | 528 | 528 | 456 | 432 | 432 | 320 | 320 | 224 | 224 | 192 | 240 |
| Boost Clock | TBD | ~1850 MHz | ~1850 MHz | ~1650 MHz | 1410 MHz | 1410 MHz | 1601 MHz | 1530 MHz | 1480 MHz | 1329 MHz | 1114 MHz | 875 MHz |
| TOPs (DNN/AI) | 20,000 TOPs | 3958 TOPs | 3958 TOPs | 3200 TOPs | 2496 TOPs | 2496 TOPs | 130 TOPs | 125 TOPs | N/A | N/A | N/A | N/A |
| FP16 Compute | 10,000 TFLOPs | 1979 TFLOPs | 1979 TFLOPs | 1600 TFLOPs | 624 TFLOPs | 624 TFLOPs | 32.8 TFLOPs | 30.4 TFLOPs | 21.2 TFLOPs | 18.7 TFLOPs | N/A | N/A |
| FP32 Compute | 90 TFLOPs | 67 TFLOPs | 67 TFLOPs | 800 TFLOPs | 156 TFLOPs (19.5 TFLOPs standard) | 156 TFLOPs (19.5 TFLOPs standard) | 16.4 TFLOPs | 15.7 TFLOPs | 10.6 TFLOPs | 10.0 TFLOPs | 6.8 TFLOPs | 5.04 TFLOPs |
| FP64 Compute | 45 TFLOPs | 34 TFLOPs | 34 TFLOPs | 48 TFLOPs | 19.5 TFLOPs (9.7 TFLOPs standard) | 19.5 TFLOPs (9.7 TFLOPs standard) | 8.2 TFLOPs | 7.80 TFLOPs | 5.30 TFLOPs | 4.7 TFLOPs | 0.2 TFLOPs | 1.68 TFLOPs |
| Memory Interface | 8192-bit HBM3e | 5120-bit HBM3e | 5120-bit HBM3 | 5120-bit HBM2e | 6144-bit HBM2e | 6144-bit HBM2e | 4096-bit HBM2 | 4096-bit HBM2 | 4096-bit HBM2 | 4096-bit HBM2 | 384-bit GDDR5 | 384-bit GDDR5 |
| Memory Size | Up To 192 GB HBM3e @ 8.0 Gbps | Up To 141 GB HBM3e @ 6.5 Gbps | Up To 80 GB HBM3 @ 5.2 Gbps | Up To 94 GB HBM2e @ 5.1 Gbps | Up To 40 GB HBM2 @ 1.6 TB/s / Up To 80 GB HBM2 @ 2.0 TB/s | Up To 40 GB HBM2 @ 1.6 TB/s / Up To 80 GB HBM2 @ 1.6 TB/s | 16 GB HBM2 @ 1134 GB/s | 16 GB HBM2 @ 900 GB/s | 16 GB HBM2 @ 732 GB/s | 16 GB HBM2 @ 732 GB/s / 12 GB HBM2 @ 549 GB/s | 24 GB GDDR5 @ 288 GB/s | 12 GB GDDR5 @ 288 GB/s |
| TDP | 700W | 700W | 700W | 350W | 400W | 250W | 250W | 300W | 300W | 250W | 250W | 235W |
