NVIDIA Announces Tesla T4 Based on Turing GPU For Inferencing – 65 TFLOPs FP16, 130 TOPs INT8, 260 TOPs INT4 at Just 75W

Hassan Mujtaba • Sep 12, 2018 at 10:30pm EDT

NVIDIA has just announced its latest Turing-based Tesla T4 graphics card for inference acceleration. The card was announced by NVIDIA's CEO, Jensen Huang, at the GTC 2018 Japan keynote as the first Tesla graphics card featuring the brand-new Turing GPU.

NVIDIA Tesla T4 With Turing GPU Announced at GTC Japan - Aiming At The Inferencing Market With Multi-TFLOPs of Performance at Just 75W, 2560 Cores

The Turing-based NVIDIA Tesla T4 graphics card is aimed at the inference acceleration market. It is designed to accelerate deep learning performance by an order of magnitude over its predecessors and to deliver breakthrough performance for AI video applications. NVIDIA's own estimates put the card at twice the video decoding performance of the previous generation, enabling users to decode up to 38 full-HD video streams, which just wasn't possible before.

The NVIDIA Tesla T4 GPU is the world’s most advanced inference accelerator. Powered by NVIDIA Turing Tensor Cores, T4 brings revolutionary multi-precision inference performance to accelerate the diverse applications of modern AI. Packaged in an energy-efficient 75-watt, small PCIe form factor, T4 is optimized for scale-out servers and is purpose-built to deliver state-of-the-art inference in real time.

As the volume of online videos continues to grow exponentially, demand for solutions to efficiently search and gain insights from video continues to grow as well. Tesla T4 delivers breakthrough performance for AI video applications, with dedicated hardware transcoding engines that bring twice the decoding performance of prior-generation GPUs. T4 can decode up to 38 full-HD video streams, making it easy to integrate scalable deep learning into video pipelines to deliver innovative, smart video services.

via NVIDIA

The specifications of the Tesla T4 are very impressive given its single-slot PCIe form factor. The graphics card packs the Turing TU104 GPU with 2560 CUDA cores and 320 Tensor Cores. It delivers 8.1 TFLOPs of FP32 performance, 65 TFLOPs of FP16 mixed-precision, 130 TOPs of INT8 and 260 TOPs of INT4 performance. All of this compute performance is achieved with a TDP of just 75W. That means the card needs no external power connector, since it draws all of its power from the PCIe slot, and its small form factor lets it fit in 1U, 4U and other rack servers, making it compatible with a wide range of systems.
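To put those numbers in perspective, here is a rough sketch that divides each quoted peak figure by the 75W TDP and shows how the reduced-precision modes scale relative to FP16. It uses only the figures quoted above; the function names are our own, and peak throughput per watt is an idealized ratio, not a measured efficiency.

```python
# Peak throughput figures quoted in the article for the Tesla T4.
T4_PEAK = {
    "FP32 (TFLOPs)": 8.1,
    "FP16 (TFLOPs)": 65.0,
    "INT8 (TOPs)": 130.0,
    "INT4 (TOPs)": 260.0,
}
TDP_WATTS = 75.0  # board power quoted in the article


def per_watt(peak, tdp_watts=TDP_WATTS):
    """Return peak throughput divided by board power (idealized ratio)."""
    return peak / tdp_watts


def scaling_vs_fp16(peaks):
    """Return each precision's peak as a multiple of the FP16 figure."""
    fp16 = peaks["FP16 (TFLOPs)"]
    return {name: value / fp16 for name, value in peaks.items()}


if __name__ == "__main__":
    ratios = scaling_vs_fp16(T4_PEAK)
    for name, peak in T4_PEAK.items():
        print(f"{name}: {per_watt(peak):.2f} per watt, {ratios[name]:.2f}x FP16")
```

The quoted peaks double with each halving of precision below FP16: INT8 is twice the FP16 figure and INT4 is four times.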

Additionally, the graphics card comes with 16 GB of GDDR6 memory, which delivers more than 320 GB/s of bandwidth.

The NVIDIA TensorRT Hyperscale Platform includes a comprehensive set of hardware and software offerings optimized for powerful, highly efficient inference. Key elements include:

NVIDIA Tesla T4 GPU – Featuring 320 Turing Tensor Cores and 2,560 CUDA cores, this new GPU provides breakthrough performance with flexible, multi-precision capabilities, from FP32 to FP16 to INT8, as well as INT4. Packaged in an energy-efficient, 75-watt, small PCIe form factor that easily fits into most servers, it offers 65 teraflops of peak performance for FP16, 130 teraflops for INT8 and 260 teraflops for INT4.
NVIDIA TensorRT 5 – An inference optimizer and runtime engine, NVIDIA TensorRT 5 supports Turing Tensor Cores and expands the set of neural network optimizations for multi-precision workloads.
NVIDIA TensorRT inference server – This containerized microservice software enables applications to use AI models in data center production. Freely available from the NVIDIA GPU Cloud container registry, it maximizes data center throughput and GPU utilization, supports all popular AI models and frameworks, and integrates with Kubernetes and Docker.
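For readers wondering what running a network at INT8 involves, the sketch below shows a simple symmetric quantization scheme in plain Python. This is a generic textbook illustration, not how TensorRT 5 calibrates or quantizes models; the function names are hypothetical and no NVIDIA API is used.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to integers in [-127, 127].

    Illustrative only: a generic scheme, not TensorRT's calibration method.
    """
    max_abs = max(abs(v) for v in values)
    # One scale factor for the whole tensor; fall back to 1.0 for all-zero input.
    scale = max_abs / 127 if max_abs else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map quantized integers back to approximate float values."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [1.27, -0.64, 0.0, 0.3]
    q, scale = quantize_int8(weights)
    print(q, scale)
    print(dequantize(q, scale))
```

The trade-off is the usual one: each value takes a quarter of the storage of FP32 and can be processed at a higher rate, at the cost of a rounding error of up to half the scale factor per value.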

NVIDIA Tesla T4 GPU Specifications

Product Name	Tesla M4	Tesla P4	Tesla T4
GPU Architecture	Maxwell GM206	Pascal GP104	Turing TU104
GPU Process	28nm	16nm FinFET	12nm FinFET
CUDA Cores	1024	2560	2560
Clock Speed	1072 MHz	1063 MHz	1582 MHz
FP32 Compute	2.20 TFLOPs	5.50 TFLOPs	8.1 TFLOPs
FP16 Compute	N/A	11 TFLOPs	65 TFLOPs
INT8 Compute	N/A	22 DLTOPs	130 DLTOPs
INT4 Compute	N/A	N/A	260 DLTOPs
VRAM	4 GB GDDR5	8 GB GDDR5	16 GB GDDR6
Memory Clock	5.5 GHz	6.0 GHz	10 GHz
Memory Bus	128-bit	256-bit	256-bit
Memory Bandwidth	88.0 GB/s	192.0 GB/s	320 GB/s+
TDP	~75W	~75W	75W
Launch	2015	2016	2018
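The table's compute and bandwidth rows can be sanity-checked from the other rows. The sketch below assumes 2 FLOPs per CUDA core per clock (one fused multiply-add) and treats the "Memory Clock" row as the effective per-pin data rate in Gbps; both are our assumptions rather than anything stated by NVIDIA here, and the helper names are our own.

```python
def peak_fp32_tflops(cuda_cores, clock_mhz):
    """Peak FP32 throughput, assuming 2 FLOPs per core per clock (one FMA)."""
    return cuda_cores * 2 * clock_mhz * 1e6 / 1e12


def memory_bandwidth_gbs(bus_width_bits, data_rate_gbps):
    """Memory bandwidth in GB/s from bus width and effective per-pin data rate."""
    return bus_width_bits / 8 * data_rate_gbps


if __name__ == "__main__":
    # Values taken from the specification table above.
    print(f"Tesla P4 FP32: {peak_fp32_tflops(2560, 1063):.2f} TFLOPs")
    print(f"Tesla T4 FP32: {peak_fp32_tflops(2560, 1582):.2f} TFLOPs")
    print(f"Tesla M4 bandwidth: {memory_bandwidth_gbs(128, 5.5):.1f} GB/s")
    print(f"Tesla P4 bandwidth: {memory_bandwidth_gbs(256, 6.0):.1f} GB/s")
    print(f"Tesla T4 bandwidth: {memory_bandwidth_gbs(256, 10.0):.1f} GB/s")
```

Under those assumptions the results line up with the table: about 8.1 TFLOPs of FP32 for the T4, and 88, 192 and 320 GB/s for the three cards.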

There's no word on pricing or availability yet, but we will keep you updated as we get more info on the new Tesla T4 graphics card.

About the author: A Software Engineer by training and a PC enthusiast by passion, Hassan Mujtaba serves as Wccftech's Senior Editor for the hardware section. With years of experience in the industry, he specializes in deep-dive technical analysis of next-generation CPU and GPU architectures, motherboards, and cooling solutions. His work involves not only breaking news on upcoming technologies but also extensive hands-on reviews and benchmarking.
