Date: 2020-11-16

NVIDIA Announces DGX Station A100 With Upgraded 80 GB A100 Tensor Core GPUs, Up To 320 GB Memory & 2.5 Petaflops of AI Horsepower

NVIDIA has just announced its 2nd Generation DGX Station AI server, based on the Ampere A100 Tensor Core GPUs. The DGX Station A100 comes in two configurations and features the updated A100 Tensor Core GPUs, which pack double the memory and give the system multi-Petaflops of AI horsepower.

NVIDIA Unveils 2nd Generation DGX Station A100 AI Server - Now Packs Updated 80 GB A100 Tensor Core GPUs & Multi-Petaflops of Performance

The NVIDIA DGX Station A100 is aimed at the AI market, accelerating machine learning and data science performance for corporate offices, research facilities, labs, or home offices everywhere. According to NVIDIA, the DGX Station A100 is designed to be the fastest server in a box dedicated to AI research.

DGX Station Powers AI Innovation

Organizations around the world have adopted DGX Station to power AI and data science across industries such as education, financial services, government, healthcare, and retail. These AI leaders include:

  • BMW Group Production is using NVIDIA DGX Stations to explore insights faster as they develop and deploy AI models that improve operations.
  • DFKI, the German Research Center for Artificial Intelligence, is using DGX Station to build models that tackle critical challenges for society and industry, including computer vision systems that help emergency services respond rapidly to natural disasters.
  • Lockheed Martin is using DGX Station to develop AI models that use sensor data and service logs to predict the need for maintenance to improve manufacturing uptime, increase safety for workers, and reduce operational costs.
  • NTT Docomo, Japan's leading mobile operator with over 79 million subscribers, uses DGX Station to develop innovative AI-driven services such as its image recognition solution.
  • Pacific Northwest National Laboratory is using NVIDIA DGX Stations to conduct federally funded research in support of national security. Focused on technological innovation in energy resiliency and national security, PNNL is a leading U.S. HPC center for scientific discovery, energy resilience, chemistry, Earth science, and data analytics.

NVIDIA DGX Station A100 System Specifications

Coming to the specifications, the NVIDIA DGX Station A100 is powered by a total of four A100 Tensor Core GPUs. These aren't just any A100 GPUs, as NVIDIA has updated the original specs to accommodate twice the memory.

The NVIDIA A100 Tensor Core GPUs in the DGX Station A100 come packed with 80 GB of HBM2e memory each, which is twice the memory size of the original A100. This gives the DGX Station a total of 320 GB of GPU memory while fully supporting MIG (Multi-Instance GPU) and 3rd Gen NVLink, which offers 200 GB/s of bidirectional bandwidth between any GPU pair and interconnect speeds 3 times faster than PCIe Gen 4. The rest of the specs for the A100 Tensor Core GPUs remain the same.
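The memory figures above follow from simple multiplication. The short Python sketch below only restates that arithmetic; the function and constant names are illustrative and not part of any NVIDIA tooling, and the 40 GB figure for the original A100 is derived from the "twice the memory" statement.

```python
def total_gpu_memory_gb(gpu_count: int, per_gpu_gb: int) -> int:
    """Return the combined memory of identical GPUs, in GB."""
    return gpu_count * per_gpu_gb


GPUS_IN_STATION = 4                       # four A100 GPUs per DGX Station A100
UPDATED_A100_GB = 80                      # updated A100 capacity per GPU
ORIGINAL_A100_GB = UPDATED_A100_GB // 2   # "twice the memory" implies 40 GB originally

station_total = total_gpu_memory_gb(GPUS_IN_STATION, UPDATED_A100_GB)
print(f"{GPUS_IN_STATION} x {UPDATED_A100_GB} GB = {station_total} GB of GPU memory")
```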

The system itself houses an AMD EPYC Rome 64 Core CPU with full PCIe Gen 4 support, up to 512 GB of dedicated system memory, 1.92 TB of NVMe M.2 SSD storage for the OS, and up to 7.68 TB of NVMe U.2 SSD storage for data cache. For connectivity, the system carries 2x 10 GbE LAN controllers and a single 1 GbE LAN port for remote management. Display output is provided through a discrete DGX Display Adapter card which offers 4 DisplayPort outputs with up to 4K resolution support. The add-in card (AIC) features its own active cooling solution.

Talking about the cooling solution, the DGX Station A100 houses the A100 GPUs on the rear side of the chassis. All four GPUs and the CPU are cooled by a refrigerant cooling system which is whisper quiet and maintenance free. The compressor for the cooler is located within the DGX chassis.

NVIDIA Ampere GA100 GPU Based Tesla A100 Specs:

| NVIDIA Tesla Graphics Card | Tesla K40 (PCI-Express) | Tesla M40 (PCI-Express) | Tesla P100 (PCI-Express) | Tesla P100 (SXM2) | Tesla V100 (SXM2) | Tesla V100S (PCIe) | NVIDIA A100 (SXM4) | NVIDIA A100 (PCIe4) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GPU | GK110 (Kepler) | GM200 (Maxwell) | GP100 (Pascal) | GP100 (Pascal) | GV100 (Volta) | GV100 (Volta) | GA100 (Ampere) | GA100 (Ampere) |
| Process Node | 28nm | 28nm | 16nm | 16nm | 12nm | 12nm | 7nm | 7nm |
| Transistors | 7.1 Billion | 8 Billion | 15.3 Billion | 15.3 Billion | 21.1 Billion | 21.1 Billion | 54.2 Billion | 54.2 Billion |
| GPU Die Size | 551 mm² | 601 mm² | 610 mm² | 610 mm² | 815 mm² | 815 mm² | 826 mm² | 826 mm² |
| SMs | 15 | 24 | 56 | 56 | 80 | 80 | 108 | 108 |
| TPCs | 15 | 24 | 28 | 28 | 40 | 40 | 54 | 54 |
| FP32 CUDA Cores Per SM | 192 | 128 | 64 | 64 | 64 | 64 | 64 | 64 |
| FP64 CUDA Cores / SM | 64 | 4 | 32 | 32 | 32 | 32 | 32 | 32 |
| FP32 CUDA Cores | 2880 | 3072 | 3584 | 3584 | 5120 | 5120 | 6912 | 6912 |
| FP64 CUDA Cores | 960 | 96 | 1792 | 1792 | 2560 | 2560 | 3456 | 3456 |
| Tensor Cores | N/A | N/A | N/A | N/A | 640 | 640 | 432 | 432 |
| Texture Units | 240 | 192 | 224 | 224 | 320 | 320 | 432 | 432 |
| Boost Clock | 875 MHz | 1114 MHz | 1329 MHz | 1480 MHz | 1530 MHz | 1601 MHz | 1410 MHz | 1410 MHz |
| TOPs (DNN/AI) | N/A | N/A | N/A | N/A | 125 TOPs | 130 TOPs | 1248 TOPs (2496 TOPs with Sparsity) | 1248 TOPs (2496 TOPs with Sparsity) |
| FP16 Compute | N/A | N/A | 18.7 TFLOPs | 21.2 TFLOPs | 30.4 TFLOPs | 32.8 TFLOPs | 312 TFLOPs (624 TFLOPs with Sparsity) | 312 TFLOPs (624 TFLOPs with Sparsity) |
| FP32 Compute | 5.04 TFLOPs | 6.8 TFLOPs | 10.0 TFLOPs | 10.6 TFLOPs | 15.7 TFLOPs | 16.4 TFLOPs | 156 TFLOPs (19.5 TFLOPs standard) | 156 TFLOPs (19.5 TFLOPs standard) |
| FP64 Compute | 1.68 TFLOPs | 0.2 TFLOPs | 4.7 TFLOPs | 5.30 TFLOPs | 7.80 TFLOPs | 8.2 TFLOPs | 19.5 TFLOPs (9.7 TFLOPs standard) | 19.5 TFLOPs (9.7 TFLOPs standard) |
| Memory Interface | 384-bit GDDR5 | 384-bit GDDR5 | 4096-bit HBM2 | 4096-bit HBM2 | 4096-bit HBM2 | 4096-bit HBM2 | 6144-bit HBM2e | 6144-bit HBM2e |
| Memory Size | 12 GB GDDR5 @ 288 GB/s | 24 GB GDDR5 @ 288 GB/s | 16 GB HBM2 @ 732 GB/s or 12 GB HBM2 @ 549 GB/s | 16 GB HBM2 @ 732 GB/s | 16 GB HBM2 @ 900 GB/s | 16 GB HBM2 @ 1134 GB/s | 40 GB HBM2 @ 1.6 TB/s | Up To 80 GB HBM2 @ 1.6 TB/s |
| L2 Cache Size | 1536 KB | 3072 KB | 4096 KB | 4096 KB | 6144 KB | 6144 KB | 40960 KB | 40960 KB |
| TDP | 235W | 250W | 250W | 300W | 300W | 250W | 400W | 250W |
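
The core counts in the table are internally consistent: for each GPU, the total FP32 and FP64 CUDA core figures equal the SM count multiplied by the per-SM figures. The Python sketch below checks that relationship using values copied from the table; the dictionary layout and function name are purely illustrative.

```python
# (SMs, FP32 cores per SM, FP64 cores per SM), copied from the spec table.
SPECS = {
    "Tesla K40": (15, 192, 64),
    "Tesla M40": (24, 128, 4),
    "Tesla P100": (56, 64, 32),
    "Tesla V100": (80, 64, 32),
    "NVIDIA A100": (108, 64, 32),
}


def core_counts(sms: int, fp32_per_sm: int, fp64_per_sm: int) -> tuple[int, int]:
    """Return (total FP32 cores, total FP64 cores) for a GPU."""
    return sms * fp32_per_sm, sms * fp64_per_sm


for name, spec in SPECS.items():
    fp32_total, fp64_total = core_counts(*spec)
    print(f"{name}: {fp32_total} FP32 cores, {fp64_total} FP64 cores")
```

For the A100, this gives 108 x 64 = 6912 FP32 cores and 108 x 32 = 3456 FP64 cores, matching the table.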

NVIDIA DGX Station A100 System Performance

As for performance, the DGX Station A100 delivers 2.5 Petaflops of AI training power and 5 PetaOPS of INT8 inferencing horsepower. The DGX Station A100 is also the only workstation of its kind to support MIG (Multi-Instance GPU), which lets users slice individual GPUs into separate instances so that simultaneous workloads can be executed faster and more efficiently.
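
The headline figures line up with four times the per-GPU numbers in the spec table, although the article does not state how NVIDIA derived them. The sketch below assumes that the 2.5 Petaflops figure corresponds to the FP16 rate with sparsity (624 TFLOPs per GPU) and the 5 PetaOPS figure to the 1248 TOPs per-GPU rate; that pairing is an assumption made for illustration, and the names are hypothetical.

```python
GPU_COUNT = 4                      # A100 GPUs in the DGX Station A100
FP16_SPARSE_TFLOPS_PER_GPU = 624   # from the spec table (with sparsity)
TOPS_PER_GPU = 1248                # from the spec table


def aggregate_peta(per_gpu_tera: float, gpu_count: int = GPU_COUNT) -> float:
    """Scale a per-GPU tera-rate to a system-wide peta-rate."""
    return per_gpu_tera * gpu_count / 1000


training_pflops = aggregate_peta(FP16_SPARSE_TFLOPS_PER_GPU)
inference_pops = aggregate_peta(TOPS_PER_GPU)

print(f"Training: about {round(training_pflops, 1)} Petaflops")
print(f"Inference: about {round(inference_pops, 1)} PetaOPS")
```

Under that assumption the totals come to 2496 TFLOPs and 4992 TOPs, which round to the quoted 2.5 Petaflops and 5 PetaOPS.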

Over the original DGX Station, the new version offers a 3.17x increase in training performance, a 4.35x increase in inference performance, and a 1.85x increase in HPC-oriented workloads. NVIDIA has also updated its DGX A100 system to feature the 80 GB A100 Tensor Core GPUs. According to NVIDIA, these deliver 3 times faster training performance than the standard 320 GB DGX A100 system, 25% faster inference performance, and two times faster data analytics performance.

NVIDIA DGX Station A100 System Availability

NVIDIA has announced that the DGX Station A100 and NVIDIA DGX A100 640 GB systems will be available this quarter through NVIDIA's partner network resellers worldwide. The company will also offer DGX A100 320 GB system owners an option to upgrade to the 640 GB DGX variant featuring eight 80 GB A100 Tensor Core GPUs. NVIDIA has not provided any information on the pricing of the systems yet.

