NVIDIA Unveils DGX SATURNV – World’s Most Efficient SuperComputer Powered by Pascal GP100, Delivers 9.46 Gigaflops/Watt

Hassan Mujtaba • Nov 15, 2016 at 06:56am EST

NVIDIA has announced the DGX SATURNV, its latest supercomputer, which the company uses to build smarter cars and next-generation GPUs. NVIDIA bills the Pascal-powered DGX SATURNV as the most efficient supercomputer on the Top500 list.

NVIDIA's DGX SATURNV SuperComputer Is The World's Most Efficient - Utilizes Tesla P100 GPUs

The DGX SATURNV is ranked 28th on the Top500 list of supercomputers and is also the most efficient of them all. The supercomputer houses several DGX-1 units, NVIDIA's custom-designed server built around its Tesla P100 graphics chips. Until now, the most efficient machine on the Top500 list was rated at 6.67 GigaFlops/Watt. The NVIDIA-designed DGX SATURNV delivers 9.46 GigaFlops/Watt, a 42% improvement.
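The 42% figure follows directly from the two efficiency numbers quoted above. A minimal Python sketch of the arithmetic (the function name is illustrative and does not come from NVIDIA or the Top500 list):

```python
def efficiency_gain_percent(new_gflops_per_watt, old_gflops_per_watt):
    """Relative improvement of one efficiency figure over another, in percent."""
    return (new_gflops_per_watt / old_gflops_per_watt - 1.0) * 100.0

# Figures quoted in the article: 9.46 vs 6.67 GigaFlops/Watt.
gain = efficiency_gain_percent(9.46, 6.67)
print(f"{gain:.1f}% improvement")
```

The result is just under 42%, which the article rounds to a whole number.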

That efficiency is key to building machines capable of reaching exascale speeds — that’s 1 quintillion, or 1 billion billion, floating-point operations per second. Such a machine could help design efficient new combustion engines, model clean-burning fusion reactors, and achieve new breakthroughs in medical research. via NVIDIA
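To see why efficiency matters at exascale, here is a rough Python sketch of the power an exaflop machine would draw at the two efficiency levels mentioned in this article. It assumes, unrealistically, that efficiency stays constant as a system scales up, so treat the output as a back-of-the-envelope estimate rather than a prediction.

```python
EXAFLOP = 1e18  # floating-point operations per second

def power_megawatts(flops, gflops_per_watt):
    """Power needed to sustain `flops` at a given efficiency (assumed constant)."""
    watts = flops / (gflops_per_watt * 1e9)
    return watts / 1e6

saturnv_mw = power_megawatts(EXAFLOP, 9.46)   # DGX SATURNV efficiency
previous_mw = power_megawatts(EXAFLOP, 6.67)  # previous leader's efficiency

print(f"At 9.46 GFLOPs/W: {saturnv_mw:.1f} MW")
print(f"At 6.67 GFLOPs/W: {previous_mw:.1f} MW")
```

Under that assumption, the gap between the two efficiency levels is worth tens of megawatts at exascale.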

What Powers The DGX SATURNV?

Powering the NVIDIA DGX SATURNV are 124 DGX-1 units. The NVIDIA DGX-1 is a supercomputer in a box, delivering large amounts of performance in a small package.

The NVIDIA DGX-1 is a complete supercomputing solution that houses NVIDIA’s latest hardware and software innovations, from the Pascal architecture to the NVIDIA SDK suite. The DGX-1 has a performance throughput equivalent to 250 x86 servers, which lets users run their own supercomputer for HPC and AI workloads.

Assembled by a team of a dozen engineers using 124 DGX-1s — the AI supercomputer in a box we unveiled in April — SATURNV helps us build the autonomous driving software that’s a key part of our NVIDIA DRIVE PX 2 self-driving vehicle platform. via NVIDIA

Some of the key specifications of NVIDIA’s DGX-1 Unit include:

Up to 170 teraflops of half-precision (FP16) peak performance
Eight Tesla P100 GPU accelerators, 16GB memory per GPU
NVLink Hybrid Cube Mesh
20 Core Broadwell-E "Xeon E5-2698 v4" CPU (2.2 GHz)
7TB SSD DL Cache
Dual 10GbE, Quad InfiniBand 100Gb networking
3U – 3200W
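A few derived figures fall out of the list above. The Python sketch below uses only the listed values and treats the 3200W rating as the power draw of the whole unit, which is an assumption; actual draw depends on the workload.

```python
GPUS_PER_NODE = 8
FP16_TFLOPS_NODE = 170.0   # "up to" peak, half precision
MEMORY_GB_PER_GPU = 16
NODE_POWER_WATTS = 3200    # assumed to be the full system rating

fp16_per_gpu = FP16_TFLOPS_NODE / GPUS_PER_NODE          # TFLOPs per GPU
total_gpu_memory_gb = GPUS_PER_NODE * MEMORY_GB_PER_GPU  # GB across all GPUs
fp16_gflops_per_watt = FP16_TFLOPS_NODE * 1000 / NODE_POWER_WATTS

print(f"FP16 per GPU: {fp16_per_gpu:.2f} TFLOPs")
print(f"Total GPU memory: {total_gpu_memory_gb} GB")
print(f"Peak FP16 efficiency: {fp16_gflops_per_watt:.1f} GFLOPs/W")
```

Note that this is peak half-precision throughput, so it is not comparable to the 9.46 GigaFlops/Watt system figure.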

DGX-1 is an appliance that integrates deep learning software, development tools and eight of our Tesla P100 GPUs — based on our new Pascal architecture — to pack computing power equal to 250 x86 servers into a device about the size of a stove top. via NVIDIA

The Tesla P100 is the heart of the DGX-1 platform. Featuring the latest 5th generation Pascal architecture with 3584 CUDA cores, 224 texture mapping units, clock speeds up to 1480 MHz and 16 GB of HBM2 VRAM (720 GB/s stream bandwidth), the chip is prepped for the most intensive workloads pitted against it. It delivers 5.3 TFLOPs of FP64, 10.6 TFLOPs of FP32 and 21.2 TFLOPs of FP16 compute performance. It comes in a 300W package but delivers up to 17.7 GFLOPs/Watt at double-precision compute.
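The peak throughput numbers can be reproduced from the core count and boost clock. The Python sketch below assumes each CUDA core retires one fused multiply-add (counted as two floating-point operations) per clock, that FP64 runs at half the FP32 rate (consistent with the 1792 FP64 cores listed in the table further down), and that FP16 runs at twice the FP32 rate. These are theoretical peaks, not measured results.

```python
CUDA_CORES = 3584
BOOST_CLOCK_GHZ = 1.480
TDP_WATTS = 300
FLOPS_PER_CORE_PER_CLOCK = 2  # one fused multiply-add = 2 operations

# cores * ops/clock * GHz gives GFLOPs; divide by 1000 for TFLOPs
fp32_tflops = CUDA_CORES * FLOPS_PER_CORE_PER_CLOCK * BOOST_CLOCK_GHZ / 1000
fp64_tflops = fp32_tflops / 2  # assumed 1:2 FP64 rate
fp16_tflops = fp32_tflops * 2  # assumed 2:1 FP16 rate
fp64_gflops_per_watt = fp64_tflops * 1000 / TDP_WATTS

print(f"FP32: {fp32_tflops:.1f} TFLOPs")
print(f"FP64: {fp64_tflops:.1f} TFLOPs")
print(f"FP16: {fp16_tflops:.1f} TFLOPs")
print(f"FP64 efficiency: {fp64_gflops_per_watt:.1f} GFLOPs/W")
```

The FP64 result works out to about 5.3 TFLOPs, which matches both the specification table and the 17.7 GFLOPs/Watt figure at 300W.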

“This system is internally at Nvidia for our self-driving car initiatives,” says Buck. “We are also using it for chip and wafer defect analysis and for our own sales and marketing analytics. We are also taking the framework we are using on this system and using it as the starting point for the CANDLE framework for cancer research. You only need 36 of these nodes to reach one petaflops, and it really speaks to our strategy of building strong nodes. The small number of nodes makes it really tractable for us to build a system like Saturn V.” via NextPlatform
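The "36 nodes per petaflops" remark can be put next to the peak numbers. The Python sketch below assumes the petaflops in the quote refers to sustained double-precision performance, which the quote does not state, and uses 5.3 TFLOPs of peak FP64 per Tesla P100 (SXM2) from the specification table. The implied fraction is therefore only a rough estimate.

```python
GPUS_PER_NODE = 8
FP64_TFLOPS_PER_GPU = 5.3  # peak, Tesla P100 (SXM2)
NODES_QUOTED = 36          # nodes said to reach one petaflops
TOTAL_NODES = 124          # DGX-1 units in SATURNV

peak_per_node = GPUS_PER_NODE * FP64_TFLOPS_PER_GPU  # TFLOPs, GPUs only
sustained_per_node = 1000.0 / NODES_QUOTED           # TFLOPs per node for 1 PFLOPs
implied_fraction = sustained_per_node / peak_per_node
total_gpus = TOTAL_NODES * GPUS_PER_NODE
system_peak_pflops = TOTAL_NODES * peak_per_node / 1000

print(f"Peak FP64 per node: {peak_per_node:.1f} TFLOPs")
print(f"Needed per node for 1 PFLOPs on 36 nodes: {sustained_per_node:.1f} TFLOPs")
print(f"Implied fraction of peak: {implied_fraction:.0%}")
print(f"Total GPUs: {total_gpus}, GPU peak FP64: {system_peak_pflops:.2f} PFLOPs")
```

The calculation ignores the CPUs' contribution and any clock differences in the deployed system.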

The DGX SATURNV shows that NVIDIA's Pascal GP100 was designed for the AI and datacenter market, offering far better power efficiency and higher performance than previous-generation graphics processing units.

NVIDIA Volta Tesla V100S Specs:

NVIDIA Tesla Graphics Card	Tesla K40 (PCI-Express)	Tesla M40 (PCI-Express)	Tesla P100 (PCI-Express)	Tesla P100 (SXM2)	Tesla V100 (PCI-Express)	Tesla V100 (SXM2)	Tesla V100S (PCIe)
GPU	GK110 (Kepler)	GM200 (Maxwell)	GP100 (Pascal)	GP100 (Pascal)	GV100 (Volta)	GV100 (Volta)	GV100 (Volta)
Process Node	28nm	28nm	16nm	16nm	12nm	12nm	12nm
Transistors	7.1 Billion	8 Billion	15.3 Billion	15.3 Billion	21.1 Billion	21.1 Billion	21.1 Billion
GPU Die Size	551 mm2	601 mm2	610 mm2	610 mm2	815 mm2	815 mm2	815 mm2
SMs	15	24	56	56	80	80	80
TPCs	15	24	28	28	40	40	40
CUDA Cores Per SM	192	128	64	64	64	64	64
CUDA Cores (Total)	2880	3072	3584	3584	5120	5120	5120
Texture Units	240	192	224	224	320	320	320
FP64 CUDA Cores / SM	64	4	32	32	32	32	32
FP64 CUDA Cores / GPU	960	96	1792	1792	2560	2560	2560
Base Clock	745 MHz	948 MHz	1190 MHz	1328 MHz	1230 MHz	1297 MHz	TBD
Boost Clock	875 MHz	1114 MHz	1329 MHz	1480 MHz	1380 MHz	1530 MHz	1601 MHz
FP16 Compute	N/A	N/A	18.7 TFLOPs	21.2 TFLOPs	28.0 TFLOPs	30.4 TFLOPs	32.8 TFLOPs
FP32 Compute	5.04 TFLOPs	6.8 TFLOPs	10.0 TFLOPs	10.6 TFLOPs	14.0 TFLOPs	15.7 TFLOPs	16.4 TFLOPs
FP64 Compute	1.68 TFLOPs	0.2 TFLOPs	4.7 TFLOPs	5.30 TFLOPs	7.0 TFLOPs	7.80 TFLOPs	8.2 TFLOPs
Memory Interface	384-bit GDDR5	384-bit GDDR5	4096-bit HBM2	4096-bit HBM2	4096-bit HBM2	4096-bit HBM2	4096-bit HBM2
Memory Size	12 GB GDDR5 @ 288 GB/s	24 GB GDDR5 @ 288 GB/s	16 GB HBM2 @ 732 GB/s / 12 GB HBM2 @ 549 GB/s	16 GB HBM2 @ 732 GB/s	16 GB HBM2 @ 900 GB/s	16 GB HBM2 @ 900 GB/s	16 GB HBM2 @ 1134 GB/s
L2 Cache Size	1536 KB	3072 KB	4096 KB	4096 KB	6144 KB	6144 KB	6144 KB
TDP	235W	250W	250W	300W	250W	300W	250W
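The table makes it easy to compare double-precision efficiency across generations. The Python sketch below divides each card's peak FP64 throughput by its TDP, both taken from the table. TDP is an upper bound on power rather than a measured draw, so these are approximate peak figures and the dictionary layout is purely illustrative.

```python
# (peak FP64 TFLOPs, TDP in watts), copied from the table above
cards = {
    "Tesla K40 (PCIe)": (1.68, 235),
    "Tesla M40 (PCIe)": (0.2, 250),
    "Tesla P100 (PCIe)": (4.7, 250),
    "Tesla P100 (SXM2)": (5.3, 300),
    "Tesla V100 (PCIe)": (7.0, 250),
    "Tesla V100 (SXM2)": (7.8, 300),
    "Tesla V100S (PCIe)": (8.2, 250),
}

# Peak FP64 GFLOPs per watt for each card
efficiency = {
    name: tflops * 1000 / tdp_watts
    for name, (tflops, tdp_watts) in cards.items()
}

# Print from most to least efficient
for name, value in sorted(efficiency.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<20} {value:5.1f} GFLOPs/W")
```

By this measure the Tesla P100 roughly doubles the Kepler-based K40, while the Maxwell-based M40, which has very few FP64 cores, trails far behind.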

About the author: A Software Engineer by training and a PC enthusiast by passion, Hassan Mujtaba serves as Wccftech's Senior Editor for the hardware section. With years of experience in the industry, he specializes in deep-dive technical analysis of next-generation CPU and GPU architectures, motherboards, and cooling solutions. His work involves not only breaking news on upcoming technologies but also extensive hands-on reviews and benchmarking.
