Deep Learning Breakthrough Results In A 44-Core Intel Xeon Destroying NVIDIA Tesla V100 GPU

Usman Pirzada • Mar 5, 2020 at 12:07pm EST


[Edited 1:26 PM GMT+5] It would appear that the press release was a bit misleading. The actual comparison here is between a 2P system housing two 22-core Xeon CPUs with hyperthreading disabled vs one single Tesla V100. It is still an absolutely substantial speedup and the ramifications are more or less the same but I apologise for the error. Changes have been made in the original text wherever needed.

Something that will almost certainly be followed very closely by investors and professionals alike just came out of a collaboration between Rice University and Intel Corporation. In what appears to be an absolutely insane speedup, researchers were able to use a 44-core Intel Xeon setup to train a deep neural network 3.5 times faster than an NVIDIA Tesla V100! CPUs usually perform far worse than GPUs when it comes to training deep neural networks (because the highly parallel architecture of GPUs suits the matrix multiplication involved), and this would be the first time a CPU has been leveraged this effectively for deep learning.

SLIDE algorithm makes a 44-core Intel Xeon CPU setup 3.5 times faster than NVIDIA Tesla V100 GPUs in AI deep learning

It has become almost common sense that GPUs will always be far superior to CPUs when it comes to training deep neural networks, but these researchers from Rice University have succeeded in questioning this very basic tenet of deep learning (DL). For what seems to be the very first time, a CPU-based implementation has not only matched but absolutely destroyed a GPU-based one, delivering a confoundingly huge speedup.

SLIDE lead inventor Anshumali Shrivastava is an assistant professor of computer science in Rice University’s Brown School of Engineering. (Photo by Jeff Fitlow/Rice University)

Before we go any further, here is an extract from their press release:

Rice University computer scientists have overcome a major obstacle in the burgeoning artificial intelligence industry by showing it is possible to speed up deep learning technology without specialized acceleration hardware like graphics processing units (GPUs).

SLIDE doesn’t need GPUs because it takes a fundamentally different approach to deep learning. The standard “back-propagation” training technique for deep neural networks requires matrix multiplication, an ideal workload for GPUs. With SLIDE, Shrivastava, Chen and Medini turned neural network training into a search problem that could instead be solved with hash tables.

This radically reduces the computational overhead for SLIDE compared to back-propagation training. For example, a top-of-the-line GPU platform like the ones Amazon, Google and others offer for cloud-based deep learning services has eight Tesla V100s and costs about $100,000, Shrivastava said.

“We have one in the lab, and in our test case we took a workload that’s perfect for V100, one with more than 100 million parameters in large, fully connected networks that fit in GPU memory,” he said. “We trained it with the best (software) package out there, Google’s TensorFlow, and it took 3 1/2 hours to train.

“We then showed that our new algorithm can do the training in one hour, not on GPUs but on a 44-core Xeon-class CPU,” Shrivastava said. A copy of the research paper is available here.
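To make the "search problem solved with hash tables" idea from the press release more concrete, here is a minimal Python sketch. It assumes locality-sensitive hashing with random hyperplanes (SimHash) as the hashing scheme. The names `NeuronIndex`, `simhash` and `sparse_forward` are hypothetical and chosen for illustration; this is not the researchers' code. The idea is that each neuron's weight vector is hashed into several tables up front, and for a given input only the neurons that land in the same buckets are evaluated, instead of multiplying the input against every neuron.

```python
import random
from collections import defaultdict


def simhash(vector, planes):
    """Hash a vector to a tuple of sign bits, one bit per random hyperplane."""
    return tuple(
        1 if sum(v * p for v, p in zip(vector, plane)) >= 0 else 0
        for plane in planes
    )


class NeuronIndex:
    """Hypothetical index: several hash tables mapping bucket -> neuron ids."""

    def __init__(self, weights, num_bits=6, num_tables=8, seed=0):
        rng = random.Random(seed)
        dim = len(weights[0])
        self.weights = weights
        # One independent set of random hyperplanes per table.
        self.planes = [
            [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_bits)]
            for _ in range(num_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(num_tables)]
        # Insert every neuron into each table under the hash of its weights.
        for neuron_id, w in enumerate(weights):
            for table, planes in zip(self.tables, self.planes):
                table[simhash(w, planes)].append(neuron_id)

    def query(self, x):
        """Return ids of neurons sharing a bucket with x in any table."""
        active = set()
        for table, planes in zip(self.tables, self.planes):
            active.update(table.get(simhash(x, planes), []))
        return active


def sparse_forward(index, x):
    """Compute pre-activations only for the neurons retrieved by the hash lookup."""
    return {
        n: sum(w * v for w, v in zip(index.weights[n], x))
        for n in index.query(x)
    }


if __name__ == "__main__":
    rng = random.Random(1)
    weights = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(1000)]
    index = NeuronIndex(weights)
    out = sparse_forward(index, weights[42])
    print(len(out), "of", len(weights), "neurons evaluated")
```

Neurons whose weights point in a similar direction to the input are more likely to share a bucket with it, so the lookup tends to retrieve the neurons with large activations while skipping most of the layer. How many neurons are retrieved depends on the assumed `num_bits` and `num_tables` settings, which are illustrative defaults rather than values taken from the paper.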

Interestingly, however, Intel doesn't have a publicly available 44-core Xeon out right now. So, one of three possible things has happened here: 1) this is an unreleased and upcoming Intel Xeon, 2) the test was conducted using a single 22-core processor (which has 44 threads, and the researchers erroneously referred to it as 44 cores) or 3) the test was conducted using two 22-core processors in a 2P system. As the edit note at the top of this article explains, it turned out to be the third.

The algorithm, dubbed SLIDE (Sub-LInear Deep learning Engine), is currently only executable on Intel processors. If an implementation of this algorithm were to go mainstream, it would almost instantly disrupt the dynamics of the deep learning ecosystem. Valuations of companies could change overnight (assuming what the researchers are claiming has no caveat attached). It also raises the interesting question of whether the approach can be replicated on an AMD processor.

In any event, pending validation of this technique, we should see a significant amount of demand added to Intel's already lopsided supply equation. It would seem that as long as Intel can produce its processors, it has pent-up demand as far as the eye can see.

News Source: Deep learning rethink overcomes major obstacle in AI industry

About the author: PC Hardware and Technology Enthusiast, Blood of Silicon (1 nm),
