NVIDIA Launches Tesla M40 and Tesla M4 GPUs For Data Centers – Tegra X1 Powered Jetson TX1 Module Announced Too

• Nov 11, 2015 at 07:49pm EST

NVIDIA has made several announcements this week, covering new products in its Tesla lineup along with a Tegra-based launch. A total of three new products have been announced for the data center and Tegra (Android/Linux) development markets. All three are powered by NVIDIA's current-generation Maxwell architecture and are aimed at the emerging machine learning and deep learning markets.

NVIDIA New Maxwell Tesla Cards - GM200 Powered Tesla M40 and GM206 Powered Tesla M4

The launch of the new Tesla cards comes two months after NVIDIA launched its first Maxwell-based Tesla cards. The Tesla M60 and Tesla M6 were launched to power GRID but were also available to customers through NVIDIA's OEM partners. Those first two cards were based on the GM204 core, and the Tesla M60 was the first dual-chip Maxwell offering, featuring two full GM204 GPUs. Today, NVIDIA is launching its first GM200 and GM206 powered Tesla parts, the Tesla M40 and Tesla M4. Dubbed "Hyperscale Accelerators", these new cards target the deep learning sector, which NVIDIA has focused on heavily since 2014.

Together, they enable developers to use the powerful Tesla Accelerated Computing Platform to drive machine learning in hyperscale data centers and create unprecedented AI-based applications.

"The artificial intelligence race is on," said Jen-Hsun Huang, co-founder and CEO of NVIDIA. "Machine learning is unquestionably one of the most important developments in computing today, on the scale of the PC, the internet and cloud computing. Industries ranging from consumer cloud services, automotive and health care are being revolutionized as we speak.

"Machine learning is the grand computational challenge of our generation. We created the Tesla hyperscale accelerator line to give machine learning a 10X boost. The time and cost savings to data centers will be significant," he said. via NVIDIA

So we have two new products, starting with the heavyweight Tesla M40, powered by the full GM200 core with 3072 CUDA cores, 192 TMUs and 96 ROPs. The card is configured to run at boost clocks of 1140 MHz. The Tesla M40 features 12 GB of GDDR5 VRAM on a 384-bit bus interface, clocked at a 6.00 GHz effective memory frequency for a total bandwidth of 288.0 GB/s. The card has a peak FP32 throughput of 7.00 TFLOPs and just 0.21 TFLOPs of double precision (FP64) throughput, since Maxwell GPUs lack the necessary double precision hardware. That is going to change with the upcoming Pascal GPUs, which are built for peak compute performance and aimed at the HPC markets that require higher computational throughput. The card features a TDP of 250W, is powered by a single 8-pin and a single 6-pin connector, and comes with passive cooling, since the servers these cards are installed in provide the cooling needed to keep them stable under load.
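As a rough cross-check of the Tesla M40 figures above, the bandwidth and FP32 numbers follow from simple arithmetic. The sketch below assumes each CUDA core retires one fused multiply-add (counted as 2 FLOPs) per clock and that FP64 runs at 1/32 of the FP32 rate on GM200; neither assumption is stated in the article, and the function names are purely illustrative.

```python
def peak_fp32_tflops(cuda_cores: int, boost_mhz: float) -> float:
    """Peak FP32 rate, assuming one fused multiply-add (2 FLOPs) per core per clock."""
    return cuda_cores * 2 * boost_mhz * 1e6 / 1e12


def memory_bandwidth_gbs(bus_width_bits: int, effective_ghz: float) -> float:
    """Bandwidth in GB/s: bytes moved per transfer times effective transfer rate."""
    return bus_width_bits / 8 * effective_ghz


# Tesla M40 figures quoted in the article
m40_fp32 = peak_fp32_tflops(cuda_cores=3072, boost_mhz=1140)
m40_bandwidth = memory_bandwidth_gbs(bus_width_bits=384, effective_ghz=6.0)

# Assumption (not from the article): FP64 runs at 1/32 of the FP32 rate
m40_fp64 = m40_fp32 / 32

print(f"FP32: {m40_fp32:.2f} TFLOPs")
print(f"FP64: {m40_fp64:.2f} TFLOPs")
print(f"Bandwidth: {m40_bandwidth:.1f} GB/s")
```

Under these assumptions the results come out at roughly 7.0 TFLOPs FP32, 288 GB/s and about 0.22 TFLOPs FP64, in line with the quoted 7.00, 288.0 and 0.21 figures.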

The second card is the Tesla M4, a surprisingly tiny card and the first Tesla to use the GM206 GPU core. It comes in a low-profile form factor and is the smallest Maxwell card we have seen yet that features the full GM206 GPU. The card features 1024 CUDA cores, 64 TMUs and 32 ROPs. Boost clocks top out at 1075 MHz. The Tesla M4 features 4 GB of GDDR5 memory on a 128-bit bus interface, clocked at a 5.5 GHz effective frequency for 88.0 GB/s of bandwidth. The card is passively cooled and has a configurable TDP of 50W to 75W. Its peak performance is rated at 2.2 TFLOPs (FP32) and 0.07 TFLOPs (FP64).
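To put the two cards side by side, rated FP32 throughput can be divided by TDP for a rough efficiency figure. This is only a sketch using the peak numbers quoted in this article; TDP is not measured power draw, and the helper name is illustrative.

```python
def gflops_per_watt(peak_tflops: float, tdp_watts: float) -> float:
    """Paper efficiency: rated peak FP32 throughput divided by rated TDP."""
    return peak_tflops * 1000 / tdp_watts


# Rated figures as quoted in the article
m40_efficiency = gflops_per_watt(peak_tflops=7.00, tdp_watts=250)

# The Tesla M4 has a configurable TDP, so evaluate both ends of the range
m4_efficiency_at_50w = gflops_per_watt(peak_tflops=2.2, tdp_watts=50)
m4_efficiency_at_75w = gflops_per_watt(peak_tflops=2.2, tdp_watts=75)

print(f"Tesla M40: {m40_efficiency:.1f} GFLOPs/W")
print(f"Tesla M4 @ 50W: {m4_efficiency_at_50w:.1f} GFLOPs/W")
print(f"Tesla M4 @ 75W: {m4_efficiency_at_75w:.1f} GFLOPs/W")
```

On these paper numbers the Tesla M4 lands between roughly 29 and 44 GFLOPs per watt depending on the TDP setting, against about 28 for the Tesla M40.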

NVIDIA believes that machine learning is an emerging market, and that is where these two cards are focused. The target workloads consist of video transcoding, media processing, data analytics and deep learning inference. NVIDIA has also introduced its new NVIDIA Hyperscale Suite, which is aimed at maximizing utilization of the cards in such workloads. It offers real-time accelerated services for developers, optimized GPU support in the FFmpeg video processing framework and efficient image compute engines for dynamic image resizing at scale. NVIDIA has not announced pricing for these two products, but the Tesla M40 will be available in late Q4 (end of 2015) and the Tesla M4 in Q1 2016.

NVIDIA Tesla Maxwell GPUs:

NVIDIA Tesla Maxwell Lineup:

Grid 2.0 Board Name	NVIDIA Tesla M60	NVIDIA Tesla M40	NVIDIA Tesla M10	NVIDIA Tesla M6	NVIDIA Tesla M4
GPU	GM204	GM200	GM107	GM204	GM206
GPU Cores	2048 x 2 (Dual Config) 4096 CUDA Cores	3072 CUDA Cores	2560 CUDA Cores	1536 CUDA Cores	1024 CUDA Cores
Memory	16 GB GDDR5 (8 GB x 2)	12 GB GDDR5	32 GB GDDR5	8 GB GDDR5	4 GB GDDR5
Memory Bus	256-bit x 2	384-bit	128-bit x 4	256-bit	128-bit
Max Users	36	Deep Learning Focused	64	18	Deep Learning Focused
H.264 (1080P @ 30 FPS) Streams	2-32	Deep Learning Focused	28	1-16	Deep Learning Focused
Form Factor	Dual-Slot PCI-Express	Dual Slot PCI-Express (Passive Cooling)	Dual Slot PCI-Express (Passive Cooling)	MXM Card	Single Slot PCI-Express (Low Profile Passive Cooling)
TDP	300W	250W	225W	100W	50-75W

NVIDIA Jetson TX1 Announced - Tegra X1 Maxwell Powered Module and Development Kit

The third product is the Jetson TX1, a Tegra X1 (Maxwell) powered module and development kit. As the successor to the Tegra K1 based Jetson TK1, the Jetson TX1 improves on it across the board, and the most notable difference is that the platform now comes as a credit card sized module rather than a full M-ATX form factor board. Aimed at smaller developers working on relatively small projects such as embedded systems and even mobile devices, the Jetson board offers all the hardware needed to begin development. The Jetson TX1 is offered as a standalone module, which is a complete working system, and as a second variant that adds a separate board providing the necessary I/O.

Jetson TX1 is the first embedded computer designed to process deep neural networks -- computer software that can learn to recognize objects or interpret information. This new approach to program computers is called machine learning and can be used to perform complex tasks such as recognizing images, processing conversational speech, or analyzing a room full of furniture and finding a path to navigate across it. Machine learning is a groundbreaking technology that will give autonomous devices a giant leap in capability. via NVIDIA

NVIDIA Jetson TX1 Specifications:

GPU: 1 teraflops, 256-core Maxwell architecture-based GPU offering best-in-class performance
CPU: 64-bit ARM A57 CPUs
Video: 4K video encode and decode
Camera: Support for 1400 megapixels/second
Memory: 4GB LPDDR4; 25.6 gigabytes/second
Storage: 16GB eMMC
Wi-Fi/Bluetooth: 802.11ac 2x2 Bluetooth ready
Networking: Gigabit Ethernet
OS Support: Linux for Tegra
Size: 50mm x 87mm, slightly smaller than a credit card

The NVIDIA Tegra X1 SoC is built on a 20nm process, pairing ARM CPU cores with the ultra-efficient Maxwell graphics core. The Tegra X1 (formerly codenamed Erista) features eight 64-bit ARM CPU cores and a full-fledged Maxwell GPU with 2 SMM units enabled on the die, giving 256 CUDA cores. The CPU side combines four Cortex-A57 and four Cortex-A53 64/32-bit cores, while the GPU delivers 1.0 TFLOPs of compute in 16-bit workloads (FP16) and around 500 GFLOPs in 32-bit workloads (FP32). The Jetson TX1 module consumes 10W of peak power while delivering the advertised throughput.

Compared to the 192 CUDA cores on the Kepler-based Tegra K1, it should be noted that Maxwell cores offer 40% better performance and twice the efficiency, delivering increased speed in gaming and other GPGPU applications suited to devices based on the Tegra X1 chip. At a high level, the Maxwell architecture is similar to its predecessor, Kepler, in that it is built from fundamental compute cores called CUDA cores, along with Streaming Multiprocessors (SMs), PolyMorph Engines, warp schedulers, texture caches and other hardware elements. But each hardware block on Maxwell has been optimized and upgraded with an intensive focus on power efficiency.

Specifications wise, the 2 SMMs of the Maxwell GPU provide a total of 256 CUDA cores with 16 ROPs and 16 texture units. The clock speed isn't mentioned, but the chip delivers a 16 GTexels/s fill rate. The Maxwell GPU is manufactured on the 20nm process, which should deliver improved energy efficiency compared to desktop variants. The memory clock is maintained at 1.6 GHz, delivering 25.6 GB/s of bandwidth, and the GPU has a 256 KB L2 cache. NVIDIA's Jetson TX1 module comes with 4 GB of LPDDR4 memory clocked at 3200 MHz (effective), a 16 GB eMMC flash module, 2x2 802.11ac / Bluetooth connectivity and a Gigabit Ethernet controller. The separately offered board has plenty of I/O options, including WiFi, Bluetooth, HDMI, an M.2 SSD slot, USB ports, a PCI-e 2.0 x4 slot, a 5 MP camera interface and an Ethernet port. The Jetson TX1 is expected to hit retail on 16th November, with pre-orders starting on 12th November. The retail kits will be available for $599 US and $299 for education. The stand-alone module is expected to go on sale later in Q1 2016 for $299 and only 1000 units will be available.
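Although the GPU clock isn't given, the figures quoted for the Tegra X1 can be roughly cross-checked against each other. The sketch below assumes each texture unit filters one texel per clock, each CUDA core performs one fused multiply-add (2 FLOPs) per clock, FP16 runs at twice the FP32 rate, and the LPDDR4 interface is 64 bits wide with two transfers per clock; these assumptions are not stated in the article.

```python
# Figures quoted in the article
TEXTURE_UNITS = 16
CUDA_CORES = 256
FILL_RATE_GTEXELS = 16.0
MEMORY_CLOCK_GHZ = 1.6

# Assumption (not from the article): 64-bit LPDDR4 memory interface
BUS_WIDTH_BITS = 64

# Assumption: one texel per texture unit per clock, so fill rate / TMUs = clock
implied_gpu_clock_ghz = FILL_RATE_GTEXELS / TEXTURE_UNITS

# Assumption: one fused multiply-add (2 FLOPs) per CUDA core per clock
fp32_gflops = CUDA_CORES * 2 * implied_gpu_clock_ghz

# Assumption: FP16 is processed at twice the FP32 rate
fp16_gflops = fp32_gflops * 2

# Double data rate: two transfers per memory clock
transfer_rate_gtps = MEMORY_CLOCK_GHZ * 2
memory_bandwidth_gbs = BUS_WIDTH_BITS / 8 * transfer_rate_gtps

print(f"Implied GPU clock: {implied_gpu_clock_ghz:.2f} GHz")
print(f"FP32: {fp32_gflops:.0f} GFLOPs, FP16: {fp16_gflops:.0f} GFLOPs")
print(f"Memory bandwidth: {memory_bandwidth_gbs:.1f} GB/s")
```

Under these assumptions the implied GPU clock is about 1 GHz, which gives roughly 512 GFLOPs FP32 and 1 TFLOPs FP16 along with 25.6 GB/s of memory bandwidth, consistent with the quoted numbers.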

NVIDIA Jetson TX1 Module and Development Kit:

About the author: A Software Engineer by training and a PC enthusiast by passion, Hassan Mujtaba serves as Wccftech's Senior Editor for hardware section. With years of experience in the industry, he specializes in deep-dive technical analysis of next-generation CPU and GPU architectures, motherboards, and cooling solutions. His work involves not only breaking news on upcoming technologies but also extensive hands-on reviews and benchmarking.
