Microsoft Azure Gets An Ultra Upgrade With NVIDIA’s GB300 “Blackwell Ultra” GPUs, 4600 GPUs Connected Together To Run Over Trillion Parameter AI Models

Oct 10, 2025 at 05:40am EDT

Microsoft has announced its first at-scale production cluster featuring NVIDIA's GB300 "Blackwell Ultra" GPUs for massively-sized AI models.

NVIDIA GB300 "Blackwell Ultra" Crunches Through AI Models With Hundreds of Trillions of Parameters Within Microsoft's Latest Azure Platform

Microsoft Azure has received the Blackwell Ultra upgrade. The latest large-scale production cluster integrates over 4,600 GPUs based on NVIDIA's GB300 NVL72 architecture, all connected using the next-generation InfiniBand interconnect fabric. This deployment paves the way for Microsoft to scale to hundreds of thousands of Blackwell Ultra GPUs deployed across datacenters worldwide, all tackling a single workload: AI.
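To put the deployment in perspective, a quick back-of-the-envelope sketch shows how many NVL72 racks a cluster of that size implies. This assumes the figures reported above: 72 GPUs per GB300 NVL72 rack (per NVIDIA's naming) and roughly 4,600 GPUs in total.

```python
# Back-of-the-envelope sizing for the Azure cluster described above.
# Assumption: each GB300 NVL72 rack holds 72 GPUs.
GPUS_TOTAL = 4600
GPUS_PER_RACK = 72

# Ceiling division: racks needed to house all GPUs.
racks = -(-GPUS_TOTAL // GPUS_PER_RACK)
print(racks)  # → 64 racks
```

So "over 4,600 GPUs" translates to roughly 64 tightly coupled NVL72 racks stitched together by the InfiniBand fabric.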


According to Microsoft, the Azure cluster with NVIDIA GB300 NVL72 "Blackwell Ultra" GPUs can reduce training times from months to weeks, and opens the door to training models with hundreds of trillions of parameters. NVIDIA also leads in inference performance, demonstrated repeatedly in MLPerf benchmarks and, most recently, in the InferenceMAX AI tests.

The new Microsoft Azure ND GB300 v6 VMs are optimized for reasoning models, agentic AI systems, and multimodal generative AI workloads. Each rack houses 18 VMs and a total of 72 GPUs, which works out to four GPUs per VM. The main spec highlights are as follows:

At the rack level, NVLink and NVSwitch reduce memory and bandwidth constraints, enabling up to 130TB per second of intra-rack data transfer connecting 37TB of total fast memory. Each rack becomes a tightly coupled unit, delivering higher inference throughput at reduced latencies on larger models and longer context windows, empowering agentic and multimodal AI systems to be more responsive and scalable than ever.
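The rack-level figures above can be broken down per GPU, which makes them easier to compare against per-chip specs. This is a rough sketch assuming the 130TB/s of bandwidth and 37TB of fast memory divide evenly across the 72 GPUs in one NVL72 rack.

```python
# Per-GPU share of the rack-level figures quoted above.
# Assumption: resources divide evenly across the rack's 72 GPUs.
RACK_BW_TBPS = 130       # intra-rack NVLink data transfer, TB/s
RACK_FAST_MEM_TB = 37    # pooled "fast memory" per rack, TB
GPUS_PER_RACK = 72

bw_per_gpu = RACK_BW_TBPS / GPUS_PER_RACK              # TB/s per GPU
mem_per_gpu_gb = RACK_FAST_MEM_TB * 1000 / GPUS_PER_RACK  # GB per GPU
print(f"~{bw_per_gpu:.1f} TB/s and ~{mem_per_gpu_gb:.0f} GB per GPU")
```

That works out to roughly 1.8TB/s of NVLink bandwidth and about 514GB of fast memory per GPU, which is why a full rack can serve a trillion-parameter model as if it were a single accelerator.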

To scale beyond the rack, Azure deploys a full fat-tree, non-blocking architecture using NVIDIA Quantum-X800 InfiniBand, the fastest networking fabric available today, delivering 800Gb/s of bandwidth per GPU. This ensures that customers can scale up training of ultra-large models efficiently to tens of thousands of GPUs with minimal communication overhead, thus delivering better end-to-end training throughput. Reduced synchronization overhead also translates to maximum utilization of GPUs, which helps researchers iterate faster and at lower costs despite the compute-hungry nature of AI training workloads. Azure's co-engineered stack, including custom protocols, collective libraries, and in-network computing, ensures the network is highly reliable and fully utilized by the applications. Features like NVIDIA SHARP accelerate collective operations and double effective bandwidth by performing math in the switch, making large-scale training and inference more efficient and reliable.

Azure’s advanced cooling systems use standalone heat exchanger units and facility cooling to minimize water usage while maintaining thermal stability for dense, high-performance clusters like GB300 NVL72. We also continue to develop and deploy new power distribution models capable of supporting the high energy density and dynamic load balancing required by the ND GB300 v6 VM class of GPU clusters.

via Microsoft

As per NVIDIA, the Microsoft Azure partnership marks a leadership moment for the United States in the AI race. The newest Azure VMs are now deployed and ready for use by customers.
