NVIDIA’s 5th Gen Flagship Pascal GPU is 70% Faster Than Maxwell in CUDA Deep Neural Network Workloads

• Apr 5, 2016 at 07:58am EDT

NVIDIA will unveil its Pascal GPU architecture, aimed at the HPC and AI markets, at GTC 2016 later today. Pascal is the codename for NVIDIA's 5th generation graphics architecture, which delivers a range of technologies such as NVLINK, HBM2 and mixed precision. While we will hear more details at Jen-Hsun Huang's keynote in a few hours, some seminars at GTC 2016 have already revealed the performance improvement Pascal brings in AI (Artificial Intelligence) specific workloads.

Image Credits/Source: Hardware.Fr

NVIDIA's Pascal GPU Boosts Performance By 70% in Deep Neural Network / AI Workloads

Spotted by Videocardz (via Hardware.fr), a slide discussing the NVIDIA cuDNN improvements was displayed in a session related to AI at GTC 2016. The slide displays the general performance improvement that NVIDIA brings with their updated cuDNN (CUDA Deep Neural Network) library. The new cuDNN v5 library comes with updates that include:

High Performance Deep Neural Network Training
Accelerates Deep Learning: Caffe, CNTK, TensorFlow, Theano, Torch
Performance continues to improve over time

The slide shows that Pascal with cuDNN v5 can deliver up to a 12 times performance increase in general. The Maxwell-based Tesla M40 with cuDNN v3 and the Kepler-based Tesla K40 with cuDNN v1 deliver 6 times and 4 times performance increases, respectively. To sum it up, Pascal with the updated library delivers 70% better performance in AlexNet training throughput compared to the fastest single-chip Maxwell Tesla solution available today.

Image Credits/Source: Computerbase and ServerTheHome

While this is a relatively big increase in performance, we can't evaluate the overall performance of Pascal GPUs with just one metric, and we hope to learn more about them in the main keynote today. Pascal is also confirmed to ship in a range of HPC-optimized racks from SuperMicro and Quanta later this year.

Both companies have showcased their latest solutions based on the Pascal GPU architecture and NVLINK. The new QuantaPlex T21W-3W is the first x86 server with NVIDIA Pascal NVLINK technology, which means that NVIDIA is ready to ship such solutions through its partners. SuperMicro will be shipping the flagship Pascal-based 1U DP SYS-1028GQ-TR(T) rack this year, following the announcement today.

The NVLINK Interconnect Allows Faster GPU-to-GPU Access in Servers!

The NVLINK interconnect will give the multiple processors inside HPC blocks a faster link than traditional PCI-e Gen3 lanes, at speeds of up to 200 GB/s. Pascal GPUs will also feature unified memory support, allowing the CPU and GPU to share the same memory pool, and finally mixed precision support. While NVLINK isn’t planned for commercial integration right now, it will be featured in servers using ARM64 chips and x86 powered HPC platforms that utilize OpenPower, Tyan and Quanta solutions. Expect to hear more on Pascal GPUs in a few hours.
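
To put the quoted 200 GB/s figure in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the comparison point is a single PCI-e Gen3 x16 slot (8 GT/s per lane with 128b/130b encoding, per the public PCIe 3.0 specification), which is our assumption and not something the slide states. The article also does not say whether 200 GB/s is a per-direction or aggregate number, so treat the resulting ratio as indicative only. The function names are purely illustrative.

```python
# Back-of-the-envelope comparison of the quoted NVLINK peak against PCIe 3.0.
# PCIe 3.0 constants come from the public spec; the x16 link width is an assumption.
PCIE3_GT_PER_S = 8.0          # transfers per second per lane, in GT/s
PCIE3_ENCODING = 128 / 130    # 128b/130b line encoding efficiency
PCIE3_LANES = 16              # assumed: a full x16 slot

# Peak figure quoted in the article; per-direction vs. aggregate is not stated.
NVLINK_PEAK_GB_S = 200.0


def pcie3_bandwidth_gb_s(lanes: int = PCIE3_LANES) -> float:
    """Theoretical one-direction bandwidth of a PCIe 3.0 link in GB/s."""
    return PCIE3_GT_PER_S * PCIE3_ENCODING * lanes / 8  # divide by 8 bits per byte


def nvlink_advantage(lanes: int = PCIE3_LANES) -> float:
    """Ratio of the quoted NVLINK peak to the theoretical PCIe 3.0 bandwidth."""
    return NVLINK_PEAK_GB_S / pcie3_bandwidth_gb_s(lanes)


if __name__ == "__main__":
    print(f"PCIe 3.0 x16: {pcie3_bandwidth_gb_s():.2f} GB/s per direction")
    print(f"Quoted NVLINK peak is about {nvlink_advantage():.1f}x that figure")
```

Under these assumptions a Gen3 x16 slot tops out just under 16 GB/s per direction, so the quoted NVLINK peak works out to roughly an order of magnitude more bandwidth on paper. Real-world throughput will depend on how many links a given GPU and platform actually expose.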

About the author: A Software Engineer by training and a PC enthusiast by passion, Hassan Mujtaba serves as Wccftech's Senior Editor for hardware section. With years of experience in the industry, he specializes in deep-dive technical analysis of next-generation CPU and GPU architectures, motherboards, and cooling solutions. His work involves not only breaking news on upcoming technologies but also extensive hands-on reviews and benchmarking.
