Intel Preparing 72-Core Xeon Phi Monster Chip For High Performance Parallel Computing


The new monstrous arrival from Intel features 72 Silvermont cores, 16GB of stacked high bandwidth memory and support for 6 DDR4 channels. If you think those figures are impressive, wait till you hear the rest. But first, let's take a small step back.

The chip is part of Intel's Xeon Phi line of professional co-processors. We talked about this new chip just over two weeks ago. Today, however, we bring you more details directly from Intel as we discuss an interesting paper that has surfaced ahead of the 2015 Intel Developer Forum. The paper confirms the new 72-core part ahead of its launch; until now, the only Xeon Phi "Knights Landing" co-processor we had known about was the 60-core part.

The paper details how Intel intends to continue the push for even faster parallel computing with the x86 ISA.
High performance parallel computing relies on a few crucial cornerstones for success: a very wide, very high performance, compute-focused processor and mammoth amounts of memory bandwidth to keep that chip fed.

Intel has all of these essentials covered with Knights Landing, which features up to 72 Silvermont cores, each capable of handling 4 threads. It packs up to 36MB of shared L2 cache and up to 16GB of stacked High Bandwidth Memory (HBM) on package, in addition to a six-channel DDR4 memory controller that supports speeds up to 2400MHz and a total capacity of 384GB. Intel did not state the total compute performance of its new processor in this particular paper, which was peculiar to say the least. However, we do know that the chip is capable of 3 TFLOPS of double precision (FP64) compute performance, which will be faster than Nvidia's and AMD's current graphics offerings on the HPC front.
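
To put those numbers in perspective, here is a quick back-of-the-envelope sketch in Python. It derives the total hardware thread count, the theoretical peak DDR4 bandwidth and the DDR4 capacity per channel from the figures quoted above. The 8 bytes per transfer is our assumption (a standard 64-bit DDR4 channel), not something stated in the paper, and the function names are purely illustrative. Real-world sustained bandwidth will be lower than this theoretical peak.

```python
# Rough arithmetic on the quoted Knights Landing specs (a sketch, not official figures).

CORES = 72                      # up to 72 Silvermont cores
THREADS_PER_CORE = 4            # 4 threads per core
DDR4_CHANNELS = 6               # six-channel DDR4 controller
DDR4_MEGATRANSFERS = 2400       # DDR4-2400: 2400 million transfers per second
BYTES_PER_TRANSFER = 8          # ASSUMPTION: standard 64-bit wide DDR4 channel
DDR4_TOTAL_CAPACITY_GB = 384    # maximum DDR4 capacity quoted


def hardware_threads(cores, threads_per_core):
    """Total hardware threads exposed by the chip."""
    return cores * threads_per_core


def peak_ddr4_bandwidth_gb_s(channels, megatransfers, bytes_per_transfer):
    """Theoretical peak DDR4 bandwidth in GB/s (1 GB = 10**9 bytes)."""
    return channels * megatransfers * 10**6 * bytes_per_transfer / 1e9


def capacity_per_channel_gb(total_gb, channels):
    """DDR4 capacity each channel must hold to reach the quoted total."""
    return total_gb / channels


if __name__ == "__main__":
    print("Hardware threads:", hardware_threads(CORES, THREADS_PER_CORE))
    print("Peak DDR4 bandwidth (GB/s):",
          peak_ddr4_bandwidth_gb_s(DDR4_CHANNELS, DDR4_MEGATRANSFERS, BYTES_PER_TRANSFER))
    print("DDR4 capacity per channel (GB):",
          capacity_per_channel_gb(DDR4_TOTAL_CAPACITY_GB, DDR4_CHANNELS))
```

Under these assumptions the chip exposes 288 hardware threads, and the DDR4 side alone tops out at roughly 115 GB/s in theory, which helps explain why Intel pairs it with on-package stacked memory.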

Interestingly enough, in the particular configuration mentioned above, the very fast stacked HBM takes on the role of L3 cache. HBM does sacrifice some memory speed compared to a traditional SRAM or DRAM L3. But thanks to HBM's massively wide memory interface and its enormous capacity advantage over a traditional SRAM or DRAM based L3 cache, the end result is a very sizable increase in memory bandwidth and capacity.
Apart from the highly impressive HBM system, which will soon become the industry standard for graphics memory as AMD introduces it in its consumer graphics cards this year and Nvidia follows next year, Knights Landing features a very impressive, re-worked and highly improved microarchitecture as well. With the repurposed 14nm Silvermont, Intel is promising up to 3 times the single threaded performance of the last generation of Xeon Phi co-processors and up to a 300% gain in power efficiency.

Xeon Phi co-processors have carved a niche in the market for high performance parallel computing, an area that has been, and still is, largely dominated by General Purpose Graphics Processing Units (GPGPUs). These are processors from Nvidia and AMD that were originally designed for computer graphics but were later adapted for general purpose computing. These graphics processors continue to hold the crown for performance and power efficiency in parallel computing.

However, an x86 solution still maintains a few genuine advantages in ease of use and programmability for developers. GPGPU solutions are closing the gap in this regard, just as Intel continues to close the gap with GPGPUs in performance and efficiency. It'll be interesting to see how this market develops, but in the meantime Xeon Phi co-processors will continue to address a genuine need in the market that no other product can fulfill. And that's enough for Intel to keep pushing ahead.