AMD Manages To Pack A PetaFLOPs Capable Super Computer In A Rack With Project 47

Aug 1

AMD unveiled something truly remarkable today: a server rack with a total processing power of 1 PetaFLOPs. That’s 10 to the power of 15 single precision floating point operations per second, or 2 x 10 to the power of 15 half precision FLOPs. Here’s the kicker though: a decade ago, in 2007, a computer of the same power would have required roughly 6000 square feet of floor space and thousands of processors. Back then, this would have been one of the most powerful supercomputers on Earth; today, it’s a server rack.

AMD’s Project 47 unveiled: 1 PetaFLOPs of single precision compute at 30 GFLOPs per watt in a single rack footprint

The server rack, ahem, supercomputer, named Project 47, is powered by 20x EPYC 7601 32-core processors and around 80x Radeon Instinct GPUs. It supports around 10 TB of Samsung memory and 20x Mellanox 100G cards, as well as 1 switch. All of this is fitted into a server rack that is roughly the height of 1.25 Lisa Sus, with an energy efficiency of 30 GFLOPs per watt. That means the Project 47 supercomputer consumes around 33,333 watts of electricity (1 PetaFLOPs divided by 30 GFLOPs per watt). Project 47 will be available from Inventec and their principal distributor AMAX sometime in Q4 of this year.
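The 33,333 watt figure follows directly from the two headline numbers. Here is a minimal sketch of that arithmetic; the constant and function names are made up for illustration, and the result is an implied peak figure, not a measured power draw.

```python
# Headline figures quoted for Project 47 (names here are illustrative).
PEAK_FLOPS = 1e15        # 1 PetaFLOPs, single precision
FLOPS_PER_WATT = 30e9    # 30 GFLOPs per watt


def implied_power_watts(peak_flops: float, flops_per_watt: float) -> float:
    """Power implied by a peak throughput and an efficiency rating."""
    return peak_flops / flops_per_watt


power_w = implied_power_watts(PEAK_FLOPS, FLOPS_PER_WATT)
print(f"Implied power draw: {power_w:,.0f} W")
```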

Today at Capsaicin SIGGRAPH, AMD showcased what can be achieved when the world’s greatest server CPU is combined with the world’s greatest GPU, based on AMD’s revolutionary “Vega” architecture. Developed by AMD in collaboration with Inventec, Project 47 is based on Inventec’s P-series massively parallel computing platform, and is a rack designed to excel in a range of tasks, from graphics virtualization to machine intelligence.

Back in 2007, you would have found the same power in a supercomputer called the IBM Roadrunner. This was a project that produced the most powerful supercomputer of its time, built by AMD and IBM for the Los Alamos National Laboratory. The cluster had 696 racks spanning an area of 6000 square feet and consumed 2,350,000 watts of electricity. It consisted primarily of around 64,000 dual core Opteron CPUs and some accelerators.

So basically, in a little over 10 years, AMD has managed to make a system that consumes more than 98% less power and takes up 99.93% less space. We are not yet sure how much Project 47 will cost, but we are pretty sure it will be less than the US $100 million cost of the original Roadrunner. If that isn’t the epitome of modern computational advances, I don’t know what is.
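The power saving can be checked from the numbers already quoted. This is a rough sketch with illustrative names, using Roadrunner's 2,350,000 watts against the power implied by 1 PetaFLOPs at 30 GFLOPs per watt. The space figure is left out because it depends on the rack footprint one assumes, which is not stated here.

```python
# Figures quoted in the article (names are illustrative).
ROADRUNNER_WATTS = 2_350_000
P47_WATTS = 1e15 / 30e9   # implied by 1 PetaFLOPs at 30 GFLOPs per watt


def percent_reduction(old: float, new: float) -> float:
    """Percentage by which `new` is smaller than `old`."""
    return (1 - new / old) * 100


power_saving = percent_reduction(ROADRUNNER_WATTS, P47_WATTS)
print(f"Project 47 uses {power_saving:.1f}% less power than Roadrunner")
```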

So how exactly did AMD manage this feat? Usually, when talking about a decade, there are several node shrinks involved as well as architectural gains. However, it is clear from the specifications that the rockstar of Project 47 isn’t the CPU, it’s the GPU. AMD’s CPUs have certainly progressed from the architecture of 2007, node shrinks included, but the progress on the CPU front hasn’t been anywhere near large enough to justify the simply ridiculous gains seen here. In fact, with 20 EPYC 7601 CPUs you are looking at a core count of just 640, which simply pales in comparison to the 128,000 cores in the original Roadrunner. Since we certainly did not see an IPC increase of 20,000% (200x), it is clear that the star of Project 47 is the Radeon Instinct GPU.
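The core-count comparison works out as follows. This is a back-of-the-envelope sketch using only the counts quoted above, with illustrative variable names.

```python
# Core counts derived from the figures quoted in the article.
P47_CORES = 20 * 32              # 20 EPYC 7601 CPUs, 32 cores each
ROADRUNNER_CORES = 64_000 * 2    # ~64,000 dual core Opterons

core_ratio = ROADRUNNER_CORES / P47_CORES

print(f"Project 47 CPU cores: {P47_CORES}")
print(f"Roadrunner CPU cores: {ROADRUNNER_CORES:,}")
# Each P47 core would need to do this much more work to match on CPUs alone.
print(f"Required per-core speedup: {core_ratio:.0f}x ({core_ratio:.0%})")
```

A 200x ratio is the same thing as the 20,000% figure, which is why CPU gains alone cannot explain the result.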

With 80 Radeon Instincts inside the server rack, you can account for roughly 960 TFLOPs (depending on the clock speed) of the 1000 TFLOPs that P47 is rated at. With 128 PCIe lanes per CPU, the EPYC processors will act as the drivers of the Radeon Instincts and won’t actually handle the brunt of the load. So basically, from an all-CPU based Roadrunner, we have come to P47, which is practically an all-GPU show. It really speaks volumes about the bonkers growth in power we have seen in the GPU department. The rapid scaling of core counts, architectural gains and node shrinks have really ushered in a new era of computational power.
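To put a number on the GPU share, here is a small sketch. The 12 TFLOPs per GPU is an assumption, simply what the 960 TFLOPs estimate implies when divided across 80 cards; the real figure depends on clock speed.

```python
# Assumption: ~12 TFLOPs single precision per GPU (960 TFLOPs / 80 cards).
GPU_COUNT = 80
TFLOPS_PER_GPU = 12.0
RACK_TFLOPS = 1000.0      # 1 PetaFLOPs rating for the whole rack

gpu_tflops = GPU_COUNT * TFLOPS_PER_GPU
gpu_share = gpu_tflops / RACK_TFLOPS

print(f"{gpu_tflops:.0f} TFLOPs from GPUs, {gpu_share:.0%} of the rack")
```

Under that assumption the GPUs supply about 96% of the rated throughput, leaving only around 40 TFLOPs for everything else.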