AMD EPYC Rome 7nm Server Processors Could Feature Up To 162 PCIe Gen 4 Lanes, More Than Twice As Many As Intel’s Xeon 9200 CPUs
AMD's EPYC Rome processors, based on the 7nm Zen 2 architecture, are not far from launch, as the company is expected to introduce them at Computex 2019 next month. While we have taken a good look at the underlying architecture of the Rome processors and their innovative chiplet design, there are still some key features that AMD hasn't disclosed. Thanks to STH, we now know of one feature of the upcoming processors which, if true, would absolutely crush the competition.
AMD EPYC Rome Processors Rock Up To 162 PCIe Gen 4 Lanes, Possibly More - Twice The Amount of Intel's Flagship Xeon Platinum 9200
After some fact finding, ServeTheHome (STH) has concluded that AMD's EPYC Rome processors would feature a higher number of PCI Express lanes than anticipated. We know that a single EPYC Rome processor would offer 128 PCIe Gen 4 lanes, but it is the dual-socket servers in particular that we are talking about here.
The dual-socket configurations would be pitted directly against Intel's Xeon Platinum 9200 processor lineup, which comes only as a dual-socket solution. Each Xeon Platinum 9200 processor offers 40 PCIe Gen 3 lanes, and since there are two chips in the 2P solution, we get a total of 80 PCIe Gen 3 lanes. Compare that to a single EPYC Rome processor, which already offers more PCIe lanes than Intel's dual-socket solution. Only Intel's 4P and 8P solutions can offer more lanes. The PCIe lane count for each Intel server solution is listed below (via STH):
- Xeon Platinum 9200: 2 CPUs with 40x PCIe Gen3 lanes each for 80 lanes total
- Xeon Scalable Mainstream: 2 CPUs with 48x PCIe Gen3 lanes each for 96 lanes total
- Xeon Scalable 4P: 4 CPUs with 48x PCIe Gen3 lanes each for 192 lanes total
- Xeon Scalable 8P: 8 CPUs with 48x PCIe Gen3 lanes each for 384 lanes total
The main advantage that AMD gains over Intel is that PCIe Gen 4 offers twice the per-lane bandwidth of PCIe Gen 3. This is crucial, along with the updated Infinity Fabric that AMD is using on its server processors. While the previous Infinity Fabric relied on PCIe Gen 3 speeds for inter-chip communication, moving to PCIe Gen 4 means the Infinity Fabric would eat into PCIe capacity less this time, directly improving chip-to-chip, socket-to-socket, and I/O bandwidth.
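As a rough sketch of what the doubled per-lane rate means in aggregate, the snippet below multiplies lane counts by approximate per-lane throughput. The 8 GT/s and 16 GT/s signalling rates and the 128b/130b encoding come from the PCIe Gen 3 and Gen 4 specifications rather than from STH's report. The function and dictionary names are purely illustrative, and real-world throughput would be lower once protocol overhead is included.

```python
# Raw signalling rate per lane in GT/s; both generations use 128b/130b encoding.
GT_PER_SEC = {3: 8.0, 4: 16.0}
ENCODING_EFFICIENCY = 128 / 130


def lane_gbytes_per_sec(gen: int) -> float:
    """Approximate one-direction throughput of a single lane in GB/s."""
    return GT_PER_SEC[gen] * ENCODING_EFFICIENCY / 8  # 8 bits per byte


def platform_gbytes_per_sec(cpus: int, lanes_per_cpu: int, gen: int) -> float:
    """Approximate one-direction throughput summed over every lane of a platform."""
    return cpus * lanes_per_cpu * lane_gbytes_per_sec(gen)


# (CPU count, lanes per CPU, PCIe generation), using the lane counts quoted above.
platforms = {
    "Xeon Platinum 9200 (2P)": (2, 40, 3),
    "Xeon Scalable Mainstream (2P)": (2, 48, 3),
    "EPYC Rome (1P)": (1, 128, 4),
}

for name, (cpus, lanes, gen) in platforms.items():
    total = platform_gbytes_per_sec(cpus, lanes, gen)
    print(f"{name}: {cpus * lanes} Gen{gen} lanes, ~{total:.0f} GB/s per direction")
```

Under these assumptions, the 80 Gen 3 lanes of a dual Xeon Platinum 9200 system work out to roughly 79 GB/s per direction, while the 128 Gen 4 lanes of a single Rome processor come to roughly 252 GB/s.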
Since there's excess bandwidth available, there would be less reliance on the x16 links between the two processors, and it is said that this would open up some flexibility, allowing partners who don't need the excess bandwidth to use those lanes for other purposes rather than have them serve as a high-speed interlink. Having just three x16 links instead of four would free up additional PCIe lanes for use outside the IF communication channel.
This would allow for additional PCIe Gen 4 connectivity, giving users up to 162 PCIe Gen 4 lanes. It is reasonable to assume that most won't go this route, since lower bandwidth for chip-to-chip I/O is not an ideal approach, but AMD has given partners the choice. There's also the possibility that some customers could gain up to 192 PCIe Gen 4 lanes by disabling two x16 links, but STH reports that no OEM is currently supporting the configuration with two inter-socket links (192 PCIe Gen 4 lanes), although it would match the interconnect speeds of the first-generation EPYC "Naples" processors.
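The lane counts quoted above follow from simple arithmetic, sketched below. The model is an assumption pieced together from the article's own numbers: each socket exposes 128 lanes, every inter-socket Infinity Fabric link consumes one x16 group on each socket, and the 162 figure includes the one extra lane per CPU that the report attributes to the SCH. The function name and parameters are hypothetical and are not AMD terminology.

```python
LANES_PER_SOCKET = 128  # from the article: a single Rome CPU offers 128 lanes
LINK_WIDTH = 16         # each inter-socket Infinity Fabric link is an x16


def usable_pcie_lanes(if_links: int, sockets: int = 2,
                      extra_lanes_per_cpu: int = 0) -> int:
    """Lanes left for devices after reserving x16 groups for inter-socket links."""
    per_socket = LANES_PER_SOCKET - if_links * LINK_WIDTH
    return sockets * (per_socket + extra_lanes_per_cpu)


print(usable_pcie_lanes(4))                         # 128: four links, the default
print(usable_pcie_lanes(3))                         # 160: three links
print(usable_pcie_lanes(3, extra_lanes_per_cpu=1))  # 162: three links plus SCH lanes
print(usable_pcie_lanes(2))                         # 192: two links
```

Note that the 192 figure as reported does not appear to count the extra per-CPU lane, which is why that parameter defaults to zero here.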
AMD CPU Roadmap (2017-2022)
| Ryzen Family | Ryzen 1000 Series | Ryzen 2000 Series | Ryzen 3000 Series | Ryzen 4000/5000 Series | Ryzen 5000 Series | Ryzen 6000 Series | Ryzen 7000 Series |
|---|---|---|---|---|---|---|---|
| Architecture | Zen (1) | Zen (1) / Zen+ | Zen (2) / Zen+ | Zen (3) / Zen 2 | Zen (3) / Zen 3 (+) | Zen (4) / Zen 3 (+) | Zen (4) |
| Process Node | 14nm | 14nm / 12nm | 7nm | 7nm | 7nm | 5nm / 6nm | 5nm |
| Server | EPYC 'Naples' | EPYC 'Naples' | EPYC 'Rome' | EPYC 'Rome' | EPYC 'Milan' | EPYC 'Genoa' | TBD |
| Max Server Cores / Threads | 32/64 | 32/64 | 64/128 | 64/128 | 64/128 | TBD | TBD |
| High End Desktop | Ryzen Threadripper 1000 Series (Whitehaven) | Ryzen Threadripper 2000 Series (Colfax) | Ryzen Threadripper 3000 Series (Castle Peak) | Ryzen Threadripper 3000 Series (Castle Peak) | Ryzen Threadripper 5000 Series (Chagall) | Ryzen Threadripper 6000 Series | Ryzen Threadripper 7000 Series |
| Max HEDT Cores / Threads | 16/32 | 32/64 | 64/128 | 64/128 | 64/128 | TBD | TBD |
| Mainstream Desktop | Ryzen 1000 Series (Summit Ridge) | Ryzen 2000 Series (Pinnacle Ridge) | Ryzen 3000 Series (Matisse) | Ryzen 5000 Series (Vermeer) | Ryzen 5000/6000 Series (Warhol) | Ryzen 6000/7000 Series (Raphael) | TBD |
| Max Mainstream Cores / Threads | 8/16 | 8/16 | 16/32 | 16/32 | 16/32 | 16/32 | TBD |
| Budget APU | N/A | Ryzen 2000 Series (Raven Ridge) | Ryzen 3000 Series (Picasso Zen+) | Ryzen 4000 Series (Renoir Zen 2) | Ryzen 5000 Series (Cezanne Zen 3) | Ryzen 6000 Series (Rembrandt Zen 3+) | Ryzen 7000 Series (Phoenix Zen 4) |
For comparison's sake, the first-generation EPYC Naples Infinity Fabric ran at 10.7 GT/s, and four x16 IF links were required to meet the bandwidth demand. In EPYC Rome, the Infinity Fabric runs at 25.6 GT/s, which is more than twice the speed of the first-generation EPYC processors. This means that only two x16 IF links are strictly needed for socket-to-socket communication, although the more IF links used, the better the latency and bandwidth. One thing to consider, however, is that PCIe Gen 4 on EPYC processors would require a revised platform with an updated PCB design.
In EPYC NAPLES, IF is 10.7 GT/s (8X MEMCLK). In order to provide sufficient bi-section bandwidth, you need 4 x16 IF links between the 2 sockets. In ROME, however, IF 2.0 is much faster at 25.6 GT/s (16x MEMCLK). You only need 2 x16 IF links between the 2 sockets.
— RetiredEngineer® (@chiakokhua) April 1, 2019
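The figures in the quote can be checked with a few lines of arithmetic. The sketch below derives the memory clock implied by each multiplier and sums the raw transfer rate over all lanes of all links. It counts raw transfers only, ignoring encoding and protocol overhead, so it is a relative comparison rather than a usable-bandwidth estimate. The function names are hypothetical.

```python
LINK_WIDTH = 16  # lanes per x16 Infinity Fabric link


def memclk_ghz(if_rate_gt: float, multiplier: int) -> float:
    """Memory clock implied by an IF rate quoted as a multiple of MEMCLK."""
    return if_rate_gt / multiplier


def aggregate_rate_gt(links: int, if_rate_gt: float) -> float:
    """Raw transfer rate summed over every lane of every link, in GT/s."""
    return links * LINK_WIDTH * if_rate_gt


print(f"Naples MEMCLK: {memclk_ghz(10.7, 8):.4f} GHz")
print(f"Rome MEMCLK:   {memclk_ghz(25.6, 16):.4f} GHz")

configs = [
    ("Naples, 4 links", 4, 10.7),
    ("Rome, 2 links", 2, 25.6),
    ("Rome, 3 links", 3, 25.6),
    ("Rome, 4 links", 4, 25.6),
]
for name, links, rate in configs:
    print(f"{name}: {aggregate_rate_gt(links, rate):.1f} GT/s aggregate")
```

By this measure, two Rome links (819.2 GT/s) already exceed four Naples links (684.8 GT/s), which is consistent with the claim that two x16 links are sufficient, and three links leave considerable headroom.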
The other main feature of the new EPYC Rome processors would be the SCH (Integrated Server Controller Hub), which is described as the standalone 14nm I/O die. In previous processors, AMD had to share a lot of resources, including PCIe lanes, by relying on low-speed third-party controllers.
AMD plans to provide an extra lane per CPU to the SCH to drive NVMe and other essential I/O, though not necessarily high-speed connectivity devices, which would be served by the main x16 links. This extra lane wouldn't be part of the core x16 links but an independent link provided to the EPYC Rome I/O chip.
It looks like AMD is in the leading position if this research is correct. There are already reports of the company reaching double-digit server market share by 2020. Unless Intel makes some drastic changes with its 10nm Xeon CPU lineup, codenamed Ice Lake-SP, things don't look good for its Xeon server efforts.