AMD Vega 20 GPUs To Feature XGMI GPU-To-GPU PCIe 4.0 Interconnect, Will Compete Against NVIDIA’s 300 GB/s NVLINK 2.0

Sep 6, 2018

AMD Vega 20 GPUs for high-performance computing and server environments will use the new XGMI GPU-to-GPU interconnect, as revealed in the latest Linux patches (via Phoronix). It has been rumored since 2016 that the Vega 20 series of graphics cards would use a new interconnect to compete against NVIDIA’s NVLINK solution, which debuted on the Tesla HPC lineup and, since the announcement of the Turing GPU, has been extended to the consumer-aimed Quadro and GeForce graphics cards too.

AMD Vega 20 With XGMI (PCIe 4.0) Interconnect Revealed in Linux Patch – Aiming for The HPC Market

There has been endless speculation about the AMD Radeon Vega 20 GPU, but AMD made it very clear during the announcement of the 7nm Radeon Instinct Vega that the chip is optimized for the server, workstation and compute markets. AMD also mentioned a new high-speed interconnect, hardware-based virtualization and new deep learning operations as additions to the enhanced Vega 20 graphics core.


The latest AMDGPU Linux drivers include XGMI (inter-chip global memory interconnect) patches, which are part of the Vega 20 enabling stack and are said to be queued for introduction in the Linux 4.20~5.0 kernel. There’s no telling yet how fast XGMI will be in comparison to NVIDIA’s NVLINK solution, which it will compete against in the HPC market. Currently, NVIDIA’s Tesla V100 lineup offers interconnect speeds of up to 300 GB/s. NVLINK has since been extended to Quadro and GeForce cards, with speeds of 200 GB/s (Quadro GV100), 100 GB/s (Quadro RTX 8000 and RTX 6000), and 50 GB/s (Quadro RTX 5000).
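The NVLINK figures above follow a simple pattern: each NVLINK 2.0 link carries 25 GB/s in each direction (50 GB/s bidirectional), and the quoted totals are that per-link figure multiplied by the number of links on the card. The short Python sketch below reproduces the numbers. The link counts are an assumption taken from NVIDIA's published specifications rather than from the Linux patches or this article, and the function name is purely illustrative.

```python
# Bidirectional bandwidth of one NVLINK 2.0 link in GB/s (25 GB/s each way).
NVLINK2_GBS_PER_LINK = 50

# Assumed link counts per card, based on NVIDIA's public specifications.
NVLINK_LINKS = {
    "Tesla V100": 6,
    "Quadro GV100": 4,
    "Quadro RTX 8000 / RTX 6000": 2,
    "Quadro RTX 5000": 1,
}


def nvlink_total_gbs(links, per_link=NVLINK2_GBS_PER_LINK):
    """Aggregate bidirectional NVLINK bandwidth in GB/s for a given link count."""
    return links * per_link


if __name__ == "__main__":
    for card, links in NVLINK_LINKS.items():
        print(f"{card}: {links} link(s) -> {nvlink_total_gbs(links)} GB/s")
```

Under these assumptions the totals come out to 300, 200, 100 and 50 GB/s, matching the figures quoted for each card.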



The new interconnect will effectively double the bandwidth available to Radeon GPUs compared with the PCIe 3.0 interface, raising the bit rate from 8 GT/s to 16 GT/s per lane and the total bidirectional bandwidth of an x16 link from roughly 32 GB/s on PCIe 3.0 to roughly 64 GB/s.
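For readers who want to see where the 32 GB/s and 64 GB/s figures come from, here is a minimal Python sketch of the arithmetic. It assumes the 128b/130b line encoding used by both PCIe 3.0 and PCIe 4.0 and ignores protocol overhead such as packet headers, so real-world throughput will be lower. The function name is illustrative, not part of any driver or library.

```python
def pcie_bandwidth_gbs(gt_per_s, lanes=16, encoding=128 / 130):
    """Return (per-direction, bidirectional) link bandwidth in GB/s.

    gt_per_s: raw transfer rate per lane in GT/s (one bit per transfer).
    encoding: fraction of transferred bits that carry payload (128b/130b).
    """
    per_direction = gt_per_s * lanes * encoding / 8  # divide by 8: bits -> bytes
    return per_direction, 2 * per_direction


if __name__ == "__main__":
    for name, rate in (("PCIe 3.0", 8.0), ("PCIe 4.0", 16.0)):
        one_way, both_ways = pcie_bandwidth_gbs(rate)
        print(f"{name} x16: {one_way:.2f} GB/s per direction, "
              f"{both_ways:.2f} GB/s bidirectional")
```

The results are about 31.5 GB/s bidirectional for PCIe 3.0 x16 and about 63 GB/s for PCIe 4.0 x16, which are commonly rounded to the 32 GB/s and 64 GB/s quoted here.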


This aligns with some of the leaked roadmaps and slides we have seen in the past, which revealed PCIe 4.0 and xGMI support. The leak also mentioned that the PCIe 4.0 link would be available on Vega 20 graphics cards arriving in late 2018.


Based on the 7nm process node, the new Vega 20 GPUs will offer 2x the density of Vega 10, since AMD is packing more into a smaller die. They will also be twice as efficient due to architectural and process maturation. Finally, AMD has several more 7nm products planned in the pipeline.

PCIe 4.0 and PCIe 5.0 Are Fast But Also Very Expensive To Design For – Still a Long Way From Consumer Adoption

Most of us will be excited by the idea that PCIe 4.0 is arriving in the consumer space too, since it is going to be available on 7nm Vega GPUs. The reality is far from it. The industry is aware of what it costs to design a platform specifically for PCIe 4.0 and beyond, and to be clear, it’s really expensive and not something we will be looking at in the consumer space for some time.

A report by EETimes mentions that the cost associated with a PCIe 4.0 specific platform would only make sense for an HPC environment, and that many design changes will be involved. So far, the only things that have accelerated the PCIe roadmap are cloud computing, deep learning machines, and data centers, which require faster speeds and data transfer rates.

“Speed costs money, so as we go to higher signaling rates, we will see how much people are willing to pay for it and how,” said Michael Krause, an interconnect expert at Hewlett Packard Enterprise.

But as the transfer speed increases, the signal decays faster, and devices and components need to be placed closer together to maintain signal integrity. It is stated that with PCIe Gen 1.0, the signal retained its strength for up to 20 inches on mainstream FR4 boards. With PCIe 4.0, the signal decays after only 3-5 inches, requiring extender cables and signal-amplifying chips.

Retimer chips for a full 16-lane PCIe 4.0 link could cost $15 to $25, if you can find them. Upgrading an adapter card from Megtron-2 to Megtron-4 materials might only add a dollar or so. However, the cost of a similar upgrade for a motherboard is about $100, and an upgrade to the even higher quality Megtron-6 would cost about $300.

“The data center will go to Megtron-4 for PCIe 4.0 and that will add maybe $10 cost — and you may still need retimers,” said Krause. “For version 5.0, people will weigh even higher-cost PCB materials and retimers or move to cables.”

“What we have been using for 4.0 and expect to use for 5.0 is twinax cables and firefly connectors,” he added. “The cost is very low compared to retimers, you can get whatever you want in distance, and the latency is really good.”

Indeed, Krause noted that “there’s been a lot of interest in using cables … for every inch on a board, you can go 10 inches on cables for the same power and loss budget, but cables have costs in being routed and connected.”

via EETimes

It will be interesting to see how much performance and interconnect bandwidth XGMI has to offer, since next year NVIDIA will be releasing the third iteration of its NVLINK I/O architecture for IBM Power9 based platforms. The new NVLINK 3.0 interface would deliver higher speeds than the 300 GB/s currently available and will also be aimed at the server/HPC compute market, before a complete move to PCIe Gen 5 and next-gen NVLINK in 2020.

Do you think AMD will launch Vega 20 for consumers?