NVIDIA Announces Xavier Tegra SOC – Features Volta GPU With 7 Billion Transistors, 512 CUDA Cores and 8 ARM64 Custom Cores
NVIDIA has announced its next-generation Xavier SOC at GTC Europe 2016. The latest SOC is part of NVIDIA's Tegra family and will replace the currently available Parker SOC in late 2017. Xavier is the first announced product to feature NVIDIA's next-generation Volta GPU, paired with custom ARM cores.
NVIDIA Xavier Tegra SOC Announced - Powered by Volta GPU and Custom ARM64 Cores
Since Parker, NVIDIA has moved its Tegra business from gaming and mobility over to the automotive and AI markets. This in turn has given NVIDIA room to improve the design of its Tegra chips drastically, with the current-generation SOC featuring 20 Deep Learning Tera-OPs (DLTOPs). Parker is available on two different Drive PX 2 boards: one that can feature discrete GPUs (MXM) and a standalone board with just one chip. The former delivers 20 DLTOPs at a rated TDP of 80W, but Xavier is going to change that.
The Xavier SOC will start sampling in Q4 2017, so the chip is still a long way from availability. This also suggests that Volta will ship in higher-end variants for HPC sometime in 2017. Current rumors point to a formal unveiling of the new GPU architecture at GTC 2017. NVIDIA did share some technical details of Xavier at the GTC Europe keynote, and they are really interesting.
“This is the greatest SoC endeavor I have ever known, and we have been building chips for a very long time,” Huang said. “Just imagine what an autonomous vehicle can do in the near future with Xavier.”
It looks like Volta-based chips will be NVIDIA's most efficient products to date. Before we get into details on efficiency, let's take a look at the specifications. The Xavier SOC is built on TSMC's 16nm FinFET+ process node. The chip itself will pack a total of 7 billion transistors. That transistor count puts Xavier in the same league as Pascal GP104, which houses 7.2 billion transistors on a 314 mm² die. With a refined process and architecture design, we could be looking at a die very close to 300 mm².
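The die-size guess can be checked with a quick back-of-the-envelope calculation. The sketch below assumes Xavier reaches the same transistor density as GP104, which is a simplification (an SOC mixes CPU, GPU, and I/O logic with different densities); the function and constant names are illustrative and do not come from NVIDIA.

```python
def estimate_die_area_mm2(transistors, ref_transistors, ref_area_mm2):
    """Estimate die area by assuming the same density as a reference chip."""
    density_per_mm2 = ref_transistors / ref_area_mm2  # transistors per mm^2
    return transistors / density_per_mm2


# Figures quoted in the article
GP104_TRANSISTORS = 7.2e9
GP104_AREA_MM2 = 314.0
XAVIER_TRANSISTORS = 7.0e9

# Assumption: Xavier matches GP104's transistor density
xavier_area_mm2 = estimate_die_area_mm2(
    XAVIER_TRANSISTORS, GP104_TRANSISTORS, GP104_AREA_MM2
)
print(f"Estimated Xavier die area: {xavier_area_mm2:.0f} mm^2")
```

At equal density the estimate lands at roughly 305 mm², so a die near 300 mm² would only need a modest density improvement over GP104.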
Xavier Comes With 512 Volta Cores and 8 ARM64 Denver Cores
It comes with 512 CUDA cores based on the Volta GPU architecture. This is significant because it confirms that Volta will be available in 2017. On the CPU side, the chip boasts 8 custom ARM64 cores. NVIDIA hasn't mentioned the Denver codename in its slides, but since these are custom-designed 64-bit ARM cores, we could be looking at a next-gen Denver design that is yet to be disclosed. The chip also comes with dual 8K HDR video processors and a fast computer vision accelerator for automotive applications.
NVIDIA also shared some performance numbers for Xavier. The chip is said to deliver up to 20 DLTOPs at a TDP of 20W. This will be achieved on a single SOC rather than a multi-chip board like the Drive PX 2. To put things into perspective, Drive PX 2 delivers 24 DLTOPs at 80W, and it needs two Parker SOCs and two discrete Pascal GPUs to do so. Xavier works out to 1 DLTOP (INT8) per watt, and it's just a glimpse of the power efficiency coming from the Volta architecture.
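The efficiency gap is easy to quantify from the figures quoted above. The following is a minimal sketch using only the DLTOPs and TDP numbers in this article; the helper name is hypothetical, and the result is a paper comparison of rated numbers, not a measured benchmark.

```python
def dltops_per_watt(dltops, tdp_watts):
    """Rated deep learning throughput divided by rated TDP."""
    return dltops / tdp_watts


# Rated figures quoted in the article
xavier_eff = dltops_per_watt(20, 20)      # single Xavier SOC
drive_px2_eff = dltops_per_watt(24, 80)   # 2x Parker + 2x Pascal GPUs

improvement = xavier_eff / drive_px2_eff

print(f"Xavier:      {xavier_eff:.2f} DLTOPs/W")
print(f"Drive PX 2:  {drive_px2_eff:.2f} DLTOPs/W")
print(f"Improvement: {improvement:.2f}x")
```

On these rated numbers, Xavier comes out a little over three times as efficient as the full Drive PX 2 board while delivering slightly less total throughput.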
The new Xavier SOC sounds like a massive improvement in performance per watt. With Volta being disclosed this early, we are sure that GTC 2017 will bring us a lot more information on the architecture itself. As for the new Drive PX board, expect it to start sampling in late 2017, followed by volume shipments in early 2018.
NVIDIA Drive PX Generation Comparison:
| Product Name | NVIDIA Drive PX | NVIDIA Drive PX 2 | NVIDIA Drive Xavier | NVIDIA Drive Pegasus | NVIDIA Drive AGX Orin |
|---|---|---|---|---|---|
| SOC Name | Tegra X1 | Parker | Xavier | Xavier | Orin |
| Process Technology | 20nm SOC | 16nm FinFET | 12nm FinFET | 12nm FinFET | TBA |
| SOC Transistors | 2 Billion (Tegra X1) | N/A | 7 Billion (Xavier) | 7 Billion (Xavier) | 17 Billion (Orin) |
| GPU Architecture | Maxwell (256 Core) | Pascal (256 Core) | Volta (512 Core) | Volta (512 Core) | Ampere? |
| CPU | 16 Core ARM CPU | 12 Core ARM CPU | 8 Core ARM CPU | 16 Core ARM CPU | 12 Core ARM CPU |
| CPU Architecture | 8x Cortex A57, 8x Cortex A53 | 8x Cortex A57 | Carmel ARM64 8 Core CPU (8 MB L2 + 4 MB L3) | Carmel ARM64 8 Core CPU (8 MB L2 + 4 MB L3) | ARM Hercules Cores |
| Compute DLTOPs | N/A | 20 DLTOPs | 30 TOPs | 320 TOPs | 200 TOPs |
| Total Chips | 2 x Tegra X1 | 2 x Tegra X2, 2 x Pascal MXM GPUs | 1 x Xavier | 2 x Volta, 2 x Turing | 1 x Ampere |
| System Memory | LPDDR4 | 8 GB LPDDR4 (50+ GB/s) | 16 GB 256-bit LPDDR4 | LPDDR4 + GDDR6 | N/A |
| Graphics Memory | N/A | 4 GB GDDR5 (80+ GB/s) | 137 GB/s | 1 TB/s | 200 GB/s |