Hardware Report

Microsoft Azure Upgraded To AMD Instinct MI200 GPU Clusters For ‘Large-Scale’ AI training, Offers 20% Performance Improvement Over NVIDIA A100 GPUs

Jason R. Wilson • May 27, 2022 at 10:18am EDT

Yesterday, Microsoft Azure disclosed the plan to use AMD Instinct MI200 Instinct GPUs to expand AI machine learning in the widely used cloud on a larger scale. AMD revealed the MI200 GPU series at the company's exclusive Accelerated Datacenter event at the end of 2021. AMD MI200 accelerators utilize the CDNA 2 architecture, offering 58 billion transistors with high-bandwidth memory of 128 GB packed into a dual-die layout.

Microsoft Azure will use AMD Instinct MI200 GPUs to conduct extensive AI training in the cloud-based platform

Forrest Norrod, senior vice president and general manager of data center and embedded solutions at AMD, assures that the next-gen chips are close to five times more efficient than NVIDIA's top-level A100 GPU. This calculation regards FP64 measures that the company felt were "highly precise." In FP16 workloads, the gap was primarily narrowed in standard workloads, even though AMD stated the chips were up to 20 percent more instantaneous than the current NVIDIA A100, where the company remains the data center GPU leader.

Azure will be the first public cloud to deploy clusters of AMD's flagship MI200 GPUs for large-scale AI training. We've already started testing these clusters using some of our own AI workloads with great performance.

— Kevin Scott, chief technical officer, Microsoft

It is unknown when Azure instances utilizing AMD Instinct MI200 GPUs will be widely available or if the series will be used in internal workload situations.

Microsoft is reported to work with AMD to advance the company's GPUs for machine learning workloads under the open-source machine learning framework, PyTorch.

We're also deepening our investments in the open-source PyTorch framework, working with the PyTorch core team and AMD both to optimize the performance and developer experience for customers running PyTorch on Azure, and to ensure that developers' PyTorch projects work great on AMD hardware.

Microsoft's recent partnership with Meta AI was PyTorch development to help boost the infrastructure of the framework's workloads. Meta AI did reveal that the company plans to utilize next-gen machine learning workloads on a reserved cluster on Microsoft Azure that would enlist 5,400 A100 GPUs from NVIDIA.

This strategic placement by NVIDIA allowed the company to gain $3.75 billion in the last quarter, topping the company's gaming market, which ended at $3.62 billion — a first for the company.

Intel Ponte Vecchio GPUs are anticipated to launch later in the year alongside the manufacturer's Sapphire Rapids Xeon Scalable processors, which will be the first time for Intel to compete with NVIDIA's H100 and AMD's Instinct MI200 GPUs for the cloud service marketplace. The company also introduced the next-generation AI accelerators for training and inferences and reported higher performance than NVIDIA's A100 GPUs.

News Source: The Register

About the author: Jason R. Wilson is a member of the Hardware news team at Wccftech. Equipped with a background in graphic design and writing, Jason works daily to improve his craft and continues to create new and innovative ideas every day.

Follow Wccftech on Google to get more of our news coverage in your feeds.

Read all comments on Microsoft Azure Upgraded To AMD Instinct MI200 GPU Clusters For ‘Large-Scale’ AI training, Offers 20% Performance Improvement Over NVIDIA A100 GPUs

Microsoft Azure Upgraded To AMD Instinct MI200 GPU Clusters For ‘Large-Scale’ AI training, Offers 20% Performance Improvement Over NVIDIA A100 GPUs

Microsoft Azure will use AMD Instinct MI200 GPUs to conduct extensive AI training in the cloud-based platform

Trending Stories

Xbox Studio Leaders Reportedly Detest Game Pass, Arguing it Destroyed the Value of Their $40+ Games Now Available for Pennies

Over 80% Of Samsung Foundry Workers Are Planning To Leave Amid A Yawning Pay Gap With The Memory Division

CXMT Supply Chain To Witness Major Process Transition To Seize DDR6 Opportunity Before Commercialization, Threatening Samsung’s And SK hynix’s Global Hold

SpaceX Awards Foxconn A Part In A Huge $52 Billion Order For 13,000 Racks Of NVIDIA GB300 AI Servers, Where Each Rack Costs $4 Million And The Total Order Spans Nearly 1 Million GPUs

Apple Reportedly Grabs The OG Team Behind Open-Source Qwen, Betting on Alibaba’s AI to Rescue Siri in China

Popular Discussions

AMD Medusa Point 10-Core “Zen 6” CPU Beats Strix Point 10-Core “Zen 5” By Nearly 35% While Operating at 5.4 GHz

AMD Ryzen 7 7700X3D 4.5 GHz “3D V-Cache” CPU Review: The Budget X3D Champ For AM5

NVIDIA GeForce RTX 50 SUPER GPUs Have Reportedly Arrived At AIBs, But Are On Hold Due To Undecided Memory Prices

AMD Ryzen 7 5800X3D Outsells Ryzen 7 7800X3D For The Same Price On Amazon Despite Being Weaker

AMD Ryzen 7 7800X3D CPU Drops To $299 A Day Ahead of 7700X3D’s Launch, Bringing 3D V-Cache Goodness To Mainstream Gamers

Microsoft Azure Upgraded To AMD Instinct MI200 GPU Clusters For ‘Large-Scale’ AI training, Offers 20% Performance Improvement Over NVIDIA A100 GPUs

Microsoft Azure will use AMD Instinct MI200 GPUs to conduct extensive AI training in the cloud-based platform

Related Story Data Center Electricians Are Now Earning $280,000 Per Year At A Time When Computer Engineering Graduates Face Chronic Unemployment

Further Reading

Trending Stories

Popular Discussions