NVIDIA DGX-1 Pascal Based Supercomputer Announced – Features 8 Tesla P100 GPUs With 170 TFLOPs Compute

Apr 5, 2016
At GTC 2016, NVIDIA announced its behemoth DGX-1 supercomputer, which features up to 170 TFLOPs of compute performance. The DGX-1 is an all-in-one supercomputing solution that houses eight of the Tesla P100 graphics boards that were launched today. Based on the Pascal GPU architecture, the Tesla P100 delivers an insane boost in compute performance and enables high-performance supercomputing for deep learning.

NVIDIA DGX-1 Is A 16nm Pascal Based Supercomputing Solution With 170 TeraFlops of Compute Performance

The NVIDIA DGX-1 is a complete supercomputing solution that houses NVIDIA’s latest hardware and software innovations, ranging from Pascal to the NVIDIA SDK suite. The DGX-1 has performance throughput equivalent to 250 x86 servers. This insane amount of performance lets users get their own supercomputer for HPC and AI specific workloads.

“Artificial intelligence is the most far-reaching technological advancement in our lifetime,” said Jen-Hsun Huang, CEO and co-founder of NVIDIA. “It changes every industry, every company, everything. It will open up markets to benefit everyone. Data scientists and AI researchers today spend far too much time on home-brewed high performance computing solutions. The DGX-1 is easy to deploy and was created for one purpose: to unlock the powers of superhuman capabilities and apply them to problems that were once unsolvable.” via NVIDIA

Some of the key specifications of NVIDIA’s DGX-1 Supercomputer include:

  • Up to 170 teraflops of half-precision (FP16) peak performance
  • Eight Tesla P100 GPU accelerators, 16GB memory per GPU
  • NVLink Hybrid Cube Mesh
  • 7TB SSD DL Cache
  • Dual 10GbE, Quad InfiniBand 100Gb networking
  • 3U – 3200W
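The headline figures in this list can be broken down per GPU with some quick arithmetic. The following is a minimal sketch using only the numbers quoted above; the function names are our own, and the per-watt figure assumes the full 3200W rating is drawn at peak FP16 throughput, which is a simplification.

```python
# Figures taken from the specification list above.
TOTAL_FP16_TFLOPS = 170    # peak half-precision throughput of the system
GPU_COUNT = 8              # Tesla P100 accelerators per DGX-1
MEMORY_PER_GPU_GB = 16     # memory per GPU
SYSTEM_POWER_W = 3200      # rated system power


def per_gpu_fp16_tflops(total_tflops, gpu_count):
    """Peak FP16 throughput contributed by each GPU."""
    return total_tflops / gpu_count


def total_gpu_memory_gb(gpu_count, memory_per_gpu_gb):
    """Combined GPU memory across all accelerators."""
    return gpu_count * memory_per_gpu_gb


def fp16_gflops_per_watt(total_tflops, power_w):
    """Peak FP16 GFLOPs per watt (assumes rated power at peak throughput)."""
    return total_tflops * 1000 / power_w


if __name__ == "__main__":
    print("FP16 per GPU (TFLOPs):", per_gpu_fp16_tflops(TOTAL_FP16_TFLOPS, GPU_COUNT))
    print("Total GPU memory (GB):", total_gpu_memory_gb(GPU_COUNT, MEMORY_PER_GPU_GB))
    print("FP16 GFLOPs per watt:", fp16_gflops_per_watt(TOTAL_FP16_TFLOPS, SYSTEM_POWER_W))
```

That works out to roughly 21 TFLOPs of FP16 per Tesla P100 and 128 GB of GPU memory across the system.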

Supercomputing Performance Comes at a Super Insane Price – $129,000 US For NVIDIA’s DGX-1

It’s obvious that the NVIDIA DGX-1 isn’t built for individual users but is aimed at big organizations such as universities and research institutes. Just like the Pascal based Tesla P100, which ships in 2016 with availability in June in the US and Q3 for the rest of the world, the DGX-1 can be ordered starting today but will be available at a later date, probably by the end of this year. The NVIDIA DGX-1 comes with 8 Pascal based Tesla P100 graphics boards, dual Intel Xeon processors and 7 TB of SSD storage. The whole platform achieves an aggregate bandwidth of 768 GB/s.

Comprehensive Deep Learning Software Suite

The NVIDIA DGX-1 system includes a complete suite of optimized deep learning software that allows researchers and data scientists to quickly and easily train deep neural networks.

The DGX-1 software includes the NVIDIA Deep Learning GPU Training System (DIGITS), a complete, interactive system for designing deep neural networks (DNNs). It also includes the newly released NVIDIA CUDA Deep Neural Network library (cuDNN) version 5, a GPU-accelerated library of primitives for designing DNNs.
On top of that, the suite bundles optimized versions of several widely used deep learning frameworks: Caffe, Theano and Torch. The DGX-1 additionally provides access to cloud management tools, software updates and a repository for containerized applications.

The Tesla P100 Housed Inside the DGX-1 Is a Monster Graphics Card

The Tesla P100 is the heart of the DGX-1 platform. Featuring the latest 5th generation Pascal architecture with 3584 CUDA cores, 224 texture mapping units, clock speeds up to 1480 MHz and 16 GB of HBM2 VRAM (720 GB/s stream bandwidth), each Tesla P100 in the DGX-1 is all prepped for the most intensive workloads thrown at it. We have already covered the architecture in extensive detail in our article here, so you definitely want to give that a read. The NVIDIA presentation was definitely enjoyable for folks who love high performance computing products, but we are very sure that NVIDIA will have consumer products following it in a very short time.

NVIDIA Pascal Tesla P100 Specs:

| NVIDIA Tesla Graphics Card | Tesla K40 | Tesla M40 | Tesla P100 | Tesla P100 | Tesla P100 (Mezzanine) |
|---|---|---|---|---|---|
| GPU | GK110 (Kepler) | GM200 (Maxwell) | GP100 (Pascal) | GP100 (Pascal) | GP100 (Pascal) |
| Process Node | 28nm | 28nm | 16nm | 16nm | 16nm |
| Transistors | 7.1 Billion | 8 Billion | 15.3 Billion | 15.3 Billion | 15.3 Billion |
| GPU Die Size | 551 mm² | 601 mm² | 610 mm² | 610 mm² | 610 mm² |
| CUDA Cores Per SM | 192 | 128 | 64 | 64 | 64 |
| CUDA Cores (Total) | 2880 | 3072 | 3584 | 3584 | 3584 |
| FP64 CUDA Cores / SM | 64 | 4 | 32 | 32 | 32 |
| FP64 CUDA Cores / GPU | 960 | 96 | 1792 | 1792 | 1792 |
| Base Clock | 745 MHz | 948 MHz | TBD | TBD | 1328 MHz |
| Boost Clock | 875 MHz | 1114 MHz | 1300 MHz | 1300 MHz | 1480 MHz |
| FP64 Compute | 1.68 TFLOPs | 0.2 TFLOPs | 4.7 TFLOPs | 4.7 TFLOPs | 5.30 TFLOPs |
| Texture Units | 240 | 192 | 224 | 224 | 224 |
| Memory Interface | 384-bit GDDR5 | 384-bit GDDR5 | 4096-bit HBM2 | 4096-bit HBM2 | 4096-bit HBM2 |
| Memory Size | 12 GB GDDR5 | 24 GB GDDR5 | 12 GB HBM2 | 16 GB HBM2 | 16 GB HBM2 |
| L2 Cache Size | 1536 KB | 3072 KB | 4096 KB | 4096 KB | 4096 KB |
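
The FP64 compute figures in the table can be sanity-checked from the other rows. The sketch below assumes that peak throughput is the number of FP64 CUDA cores multiplied by the boost clock, with each core completing one fused multiply-add (counted as 2 floating point operations) per cycle. That formula is our assumption about how the quoted numbers are derived, not something NVIDIA states here, and the function name is our own.

```python
def peak_fp64_tflops(fp64_cores, boost_clock_mhz):
    """Estimate peak FP64 throughput in TFLOPs.

    Assumes one fused multiply-add (2 FLOPs) per FP64 core per cycle
    at the boost clock.
    """
    flops_per_core_per_cycle = 2
    flops = fp64_cores * flops_per_core_per_cycle * boost_clock_mhz * 1e6
    return flops / 1e12


# (FP64 cores per GPU, boost clock in MHz, FP64 TFLOPs quoted in the table)
cards = {
    "Tesla K40": (960, 875, 1.68),
    "Tesla M40": (96, 1114, 0.2),
    "Tesla P100": (1792, 1300, 4.7),
    "Tesla P100 (Mezzanine)": (1792, 1480, 5.30),
}

if __name__ == "__main__":
    for name, (cores, clock_mhz, quoted) in cards.items():
        estimate = peak_fp64_tflops(cores, clock_mhz)
        print(f"{name}: estimated {estimate:.2f} TFLOPs, table lists {quoted}")
```

Under that assumption the estimates line up with the table once rounded, which suggests the quoted FP64 numbers are boost-clock peaks.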
