AMD's Senior Vice President, Sam Naffziger, had a lot to say about GPU efficiency, power numbers, chiplets, cache, and how the competition stacks up against AMD in an interview with VentureBeat.
AMD SVP & Tech Architect Talks Energy-Efficient Computing, GPUs, Chiplets & How The Company Is Addressing The Competition
Sam Naffziger has been at AMD for 16 years and currently serves as Senior Vice President, Corporate Fellow, and Product Technology Architect at the red camp. One of the key areas where Sam has focused is power efficiency and power technology. A few years ago, AMD laid out an ambitious goal of a 25x gain in energy efficiency by 2020, and the company actually achieved it. Currently, AMD-powered supercomputers land not only in the Top 500 but also on the Green 500 list. Now, the company is taking things further, aiming for 30x efficiency growth by 2025, just 3 years from now. Considering all of the tech that AMD has coming soon (& detailed at FAD '22), we are really excited to see AMD hit this goal.
During the interview, Sam talked about some of the key ways they are achieving these goals and how the competition compares. It is no longer a mystery that next-gen GPUs are going to get more power-hungry. AMD recently confirmed this themselves, and several NVIDIA leaks point to the same thing.
In a slide presented by AMD, the company expects GPUs to hit over 600W TDP figures even before 2025. The company states that 'Power consumption is exploding since demand is outstripping the gains'. To address this, AMD has a few key technologies up its sleeve that will allow it to ship a GPU that's compelling in both performance and wattage versus the competition.
We’ve driven the frequency up, and that is something unique to AMD. Our GPU frequencies are 2.5 GHz plus now, which is hitting levels not before achieved. It’s not that the process technology is that much faster, but we’ve systematically gone through the design, re-architected the critical paths at a low level, the things that get in the way of high frequency, and done that in a power-efficient way.
Frequency tends to have a reputation of resulting in high power. But in reality, if it’s done right, and we just re-architect the paths to reduce the levels of logic required, without adding a bunch of huge gates and extra pipe stages and such, we can get the work done faster. If you know what drives power consumption in silicon processors, it’s voltage. That’s a quadratic effect on power. To hit 2.5 GHz, Nvidia could do that, and in fact they do it with overclocked parts, but that drives the voltage up to very high levels, 1.2 or 1.3 volts. That’s a squared impact on power. Whereas we achieve those high frequencies at modest voltages and do so much more efficiently.
We analyze our design pre-silicon, as we’re in the process of developing it, to assess that efficiency.
We absolutely analyzed heavily the Nvidia designs and what they were doing, and of course targeted doing much better.
This can already be seen in RDNA 2, which hits over 2.5 GHz clock speeds while retaining a lower wattage than its direct competition from NVIDIA. Sam highlights that the high power levels come directly from voltage. He states that NVIDIA GPUs can achieve the same clock speeds, and even hit those in custom variants, but to do so they have to drive voltages to the extreme (1.2-1.3V). That's much higher than the voltages AMD ships its GPUs at to achieve the same or even higher clock speeds.
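The quadratic voltage effect Naffziger describes follows from the standard dynamic-power approximation P ≈ C·V²·f (effective switched capacitance times voltage squared times frequency). A minimal sketch below illustrates why a voltage bump costs far more than the clock speed alone would suggest; the capacitance figure is hypothetical, picked only to make the ratio visible:

```python
# Illustrative sketch of the dynamic-power relationship Naffziger refers to:
# P_dynamic ~ C * V^2 * f. The capacitance value is hypothetical.

def dynamic_power(c_eff, voltage, freq_ghz):
    """Approximate dynamic switching power in watts."""
    return c_eff * voltage**2 * freq_ghz * 1e9

C_EFF = 1e-7  # hypothetical effective switched capacitance (farads)

# Same 2.5 GHz target, reached at two different voltages:
p_modest = dynamic_power(C_EFF, 1.0, 2.5)  # modest voltage
p_high = dynamic_power(C_EFF, 1.3, 2.5)    # "overclocked" 1.3 V

print(f"{p_high / p_modest:.2f}x")  # 1.3^2 / 1.0^2 = 1.69x more power
```

At identical frequency, the 1.3 V part burns roughly 69% more dynamic power than the 1.0 V part, which is the squared impact the quote points at.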
The Infinity Cache in particular was an exciting thing to bring to market. That, as well as some of the power optimizations, was a CPU-leveraged capability. At the core of that is the same dense SRAM array that we use in our CPU designs for the L3 cache. It’s very power-efficient, very high bandwidth, and it turned out it was a great fit for graphics. No one had done such a large last-level cache like that. In fact, there was a lot of uncertainty as to whether the rates would be high enough to justify it. But we placed a bet, because going to a much wider GDDR6 interface is certainly a high-power solution for getting that bandwidth. We placed a bet on that. We went with a narrower bus interface and a large cache. That’s worked well for us. We see Nvidia following suit with larger last-level caches. But no one’s at 128MB yet.
So to achieve better performance without compromising efficiency, features such as cache and chiplets are emphasized. AMD says that it knows its competitors at NVIDIA are going for a larger last-level cache. NVIDIA's next-gen Ada Lovelace GPUs are expected to pack up to 96 MB of L2 cache, a massive increase from the 6 MB that their current flagship features. But AMD notes that while NVIDIA is following suit, no one is at 128 MB like they are. Furthermore, that 128 MB cache is rumored to double or even triple in the coming RDNA 3 GPU generation.
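The bet Naffziger describes, a narrower bus plus a large on-die cache instead of a wide GDDR6 interface, can be sketched as a hit-rate-weighted blend of bandwidths: if enough accesses hit the SRAM, effective bandwidth exceeds what the wider bus would deliver. All the bandwidth and hit-rate numbers below are hypothetical, chosen only to show the shape of the trade-off:

```python
# Illustrative sketch of the Infinity Cache trade-off. All numbers are
# hypothetical, not actual AMD or NVIDIA specifications.

def effective_bandwidth(hit_rate, cache_bw, dram_bw):
    """Blend on-die cache and DRAM bandwidth by cache hit rate (GB/s)."""
    return hit_rate * cache_bw + (1.0 - hit_rate) * dram_bw

WIDE_BUS_BW = 1000.0   # hypothetical wide GDDR6 interface, GB/s
NARROW_BUS_BW = 512.0  # hypothetical narrower interface, GB/s
CACHE_BW = 1600.0      # hypothetical on-die SRAM bandwidth, GB/s

for hit_rate in (0.3, 0.5, 0.7):
    eff = effective_bandwidth(hit_rate, CACHE_BW, NARROW_BUS_BW)
    verdict = "beats" if eff > WIDE_BUS_BW else "trails"
    print(f"hit rate {hit_rate:.0%}: {eff:.0f} GB/s ({verdict} the wide bus)")
```

This is why the quote calls out uncertainty over "whether the rates would be high enough": below some hit rate the narrow-bus design trails the wide bus, and above it the cache wins while also avoiding the power cost of the wider interface.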
VentureBeat: Compared to Nvidia and Intel, do you feel like we’re in a state of divergence when it comes to designs, or some kind of convergence?
Naffziger: It’s hard to speculate. Nvidia certainly hasn’t jumped on the chiplet bandwagon yet. We have a big lead there and we see big opportunities with that. They’ll be forced to do so. We’ll see when they deploy it. Intel certainly has jumped on that. Ponte Vecchio is the poster child for chiplet extremes. I would say that there’s more convergence than divergence. But the companies that innovate in the right space the soonest gain an advantage. It’s when you deliver the new technology as much as what the technology is. Whoever is first with innovation has the advantage.
As for chiplets, Sam welcomes Intel's approach in the Ponte Vecchio GPU, which he calls the 'poster child for chiplet extremes', and also states that NVIDIA certainly hasn't jumped on the chiplet bandwagon yet, so the red team, which was first to chiplet innovation, will definitely have an upper hand in the segment.
Raja is a visionary. He paints a great and compelling picture of the gaming future and features that are required to drive the gaming experience to the next level. He’s great at that. As far as hands-on silicon execution, his background is in software. He definitely helped AMD to improve our software game and feature sets. I worked closely with Raja, but I didn’t join the graphics group until after he had left. He had a sabbatical there and went to Intel. So as far as the performance-per-watt, that was not really Raja’s footprint. But some of the software dimensions and such.
Finally, in a question regarding Raja Koduri's handling of the AMD graphics division, Sam said that Raja is a visionary who paints a great and compelling picture of the gaming future and its features. The AMD SVP states that Raja was more of a software guy than a silicon one, but he really helped improve AMD's software side of things along with its gaming features. Raja has since moved to Intel and taken charge of its graphics division, which is responsible for the upcoming Arc and server GPU lineups.