
NVIDIA Unveils Tegra Parker SOC at Hot Chips – Built on 16nm TSMC Process, Features Pascal and Denver 2 Duo Architecture

Aug 22, 2016

NVIDIA has unveiled the first architectural details of its next-generation Tegra Parker SOC at Hot Chips. The latest Tegra SOC is built on TSMC's advanced FinFET process and combines the Pascal GPU and Denver 2 CPU architectures for an unprecedented increase in performance and efficiency. NVIDIA revealed at Hot Chips that it is currently targeting the SOC at the automotive market, but the chip could arrive in other products too.

NVIDIA Tegra Parker SOC Detailed - TSMC FinFET Process, Pascal GPU and Denver 2 CPU Architecture

Starting with the details, the Tegra Parker SOC is based on the 16nm FinFET process node from TSMC and uses NVIDIA's latest CPU and GPU architectures. Most of the chip is dedicated to the Pascal GPU cores and the Denver 2 CPU cores. The chip features 256 CUDA cores based on the same DNA as the Titan X (Pascal) graphics card. The ARMv8 CPU complex comprises two Denver 2 cores and four A57 cores in a coherent HMP (Heterogeneous Multi-Processor) architecture.

The Denver 2 and A57 clusters each pack 2 MB of L2 cache and are linked via the HMP architecture for a combined 4 MB of L2 cache. The Denver 2 cores also pack 128 KB + 64 KB of L1 cache (instruction + data), while the A57 cores include 48 KB + 32 KB of L1 cache. In addition to the CPU cores, the chip has a 128-bit LPDDR4 memory interface with ECC support and 50 GB/s of bandwidth. Display output is handled by a triple pipeline (4K at 60 FPS), while camera features include auto-HDR technology on up to 12 cameras.
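As a rough sanity check on the memory figures, peak bandwidth is the bus width in bytes multiplied by the transfer rate. The sketch below works backward from the quoted 128-bit bus and 50 GB/s to the per-pin transfer rate those numbers imply. The function names are illustrative, 1 GB is assumed to mean 10^9 bytes, and the result is an inference from the article's round numbers, not an NVIDIA specification.

```python
def peak_bandwidth_gbs(bus_width_bits: int, transfer_rate_mts: float) -> float:
    """Peak bandwidth in GB/s (assuming 1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfer_rate_mts * 1e6 / 1e9


def implied_transfer_rate_mts(bus_width_bits: int, bandwidth_gbs: float) -> float:
    """Transfer rate in MT/s needed to reach a given bandwidth on a given bus."""
    bytes_per_transfer = bus_width_bits / 8
    return bandwidth_gbs * 1e9 / bytes_per_transfer / 1e6


if __name__ == "__main__":
    # Figures quoted in the article: 128-bit LPDDR4, 50 GB/s.
    rate = implied_transfer_rate_mts(128, 50.0)
    print(f"Implied transfer rate: {rate:.0f} MT/s")
    print(f"Round trip: {peak_bandwidth_gbs(128, rate):.1f} GB/s")
```

A 128-bit bus moves 16 bytes per transfer, so 50 GB/s works out to a little over 3,000 MT/s per pin.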

NVIDIA says its Denver 2 cores are the most advanced and highest-performance ARM CPU cores, with significant performance improvements over the first-generation Denver cores. The new cores feature dynamic code optimization, a 7-wide superscalar architecture and several low-power retention states. This leads to a 40% increase in CPU performance over Apple's A9X chip.

While NVIDIA may not reveal the full purpose of Tegra Parker beyond automotive, I believe it has hinted that Parker is as good a gaming chip as it is an automotive processor. The multiprocessor architecture, which combines big and super cores, is said to deliver strong single-threaded performance, maximize aggregate performance and provide a sufficient thread count for automotive and gaming applications.

NVIDIA Tegra Parker SOC Supports Hardware Enabled Virtualization

The Tegra Parker SOC is also the first Tegra chip to support hardware-enabled CPU, GPU and SOC virtualization. The chip can drive up to eight virtual machines, each with its own dedicated display pipeline. NVIDIA will also provide its own software solutions to deliver the best virtualization experience on the Tegra SOC.

The Tegra SOC also supports 4K 60 FPS encode/decode; Ethernet-AVB, dual CAN and QSPI for automotive; eMMC 5.2 and SATA for storage; PCI-E; and a dedicated audio-processing chip.

The Tegra Parker SOC is first featured on the Drive PX 2, which carries two such Tegra chips plus room for dedicated MXM graphics cards. This product packs 12 CPU cores and four Pascal GPUs (2 Tegra / 2 MXM) with 8 TFLOPs of FP32 and 24 TOPs of INT8 compute. We have already seen the product packing two GP106 GPUs in MXM form factor, so we expect something close to the GTX 1060 (Notebook) on NVIDIA's Drive PX 2 solution.
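The Drive PX 2 totals follow directly from the per-chip figures given earlier (two Denver 2 cores and four A57 cores per Parker SOC, one integrated Pascal GPU each). Here is a minimal sketch of that arithmetic. The class and function names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParkerSoc:
    """Per-chip figures as quoted in the article."""
    denver2_cores: int = 2
    a57_cores: int = 4
    cuda_cores: int = 256

    @property
    def cpu_cores(self) -> int:
        return self.denver2_cores + self.a57_cores


def drive_px2_totals(soc: ParkerSoc, soc_count: int = 2, mxm_gpu_count: int = 2) -> dict:
    """Aggregate CPU core and GPU counts for a board with several SOCs and MXM GPUs."""
    return {
        "cpu_cores": soc.cpu_cores * soc_count,
        # One integrated GPU per SOC plus the discrete MXM GPUs.
        "gpus": soc_count + mxm_gpu_count,
    }


if __name__ == "__main__":
    print(drive_px2_totals(ParkerSoc()))
```

With the defaults, this reproduces the 12 CPU cores and four GPUs quoted for the Drive PX 2.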

NVIDIA Drive PX Generation Comparison:

| Product Name | NVIDIA Drive PX | NVIDIA Drive PX 2 | NVIDIA Drive Xavier | NVIDIA Drive Pegasus | NVIDIA Drive AGX Orin |
|---|---|---|---|---|---|
| SOC Name | Tegra X1 | Parker | Xavier | Xavier | Orin |
| Process Technology | 20nm SOC | 16nm FinFET | 12nm FinFET | 12nm FinFET | TBA |
| SOC Transistors | 2 Billion (Tegra X1) | N/A | 7 Billion (Xavier) | 7 Billion (Xavier) | 17 Billion (Orin) |
| GPU Architecture | Maxwell (256 Core) | Pascal (256 Core) | Volta (512 Core) | Volta (512 Core) | Ampere? |
| CPU | 16 Core ARM CPU | 12 Core ARM CPU | 8 Core ARM CPU | 16 Core ARM CPU | 12 Core ARM CPU |
| CPU Architecture | 8x Cortex A57, 8x Cortex A53 | 4x Denver, 8x Cortex A57 | Carmel ARM64 8 Core CPU (8 MB L2 + 4 MB L3) | Carmel ARM64 8 Core CPU (8 MB L2 + 4 MB L3) | ARM Hercules Cores |
| Compute (DLTOPs) | N/A | 20 DLTOPs | 30 TOPs | 320 TOPs | 200 TOPs |
| Total Chips | 2 x Tegra X1 | 2 x Tegra X2, 2 x Pascal MXM GPUs | 1 x Xavier | 2 x Volta, 2 x Turing | 1 x Ampere |
| System Memory | LPDDR4 | 8 GB LPDDR4 (50+ GB/s) | 16 GB 256-bit LPDDR4 | LPDDR4 + GDDR6 | N/A |
| Graphics Memory | N/A | 4 GB GDDR5 (80+ GB/s) | 137 GB/s | 1 TB/s | 200 GB/s |
| TDP | 20W | 80W | 30W | 500W | TBA |