Intel Gaudi 3 AI Accelerator Official: 5nm, 128 GB HBM2e, Up To 900W, 50% Faster Than NVIDIA H100 & 40% More Efficient

Apr 9, 2024 at 12:01pm EDT

Intel has finally revealed its next-gen AI Accelerator, the Gaudi 3, based on a 5nm process node and competing directly against NVIDIA's H100 GPUs.

Intel Gaudi 3 AI Accelerators Take the Fight To NVIDIA, Offer 50% Faster AI Performance On Average While Being 40% More Efficient

Intel's Gaudi AI accelerators have been a major competitor and one of the few alternatives to NVIDIA's GPUs in the AI segment. We recently saw some heated benchmark comparisons between the Gaudi 2 and the NVIDIA A100/H100 GPUs, with Intel showcasing its strong perf/$ lead while NVIDIA remained the overall AI leader in terms of performance. Now begins the third chapter in Intel's AI journey with its Gaudi 3 accelerator, which has been fully detailed.

Intel introduced the Intel Gaudi 3 AI accelerator on April 9, 2024, at the Intel Vision event in Phoenix, Arizona. It is designed to bring global enterprises choice for generative AI, building on the performance and scalability of its Gaudi 2 predecessor. (Credit: Intel Corporation)

The company announced the Gaudi 3 accelerator, which features the latest (5th Gen) Tensor Processor Core (TPC) architecture with a total of 64 TPCs packed within two compute dies. The accelerator has a 96 MB cache pool shared across both dies, and there are eight HBM sites, each featuring 8-hi stacks of 16 Gb HBM2e DRAM, for up to 128 GB of capacity and up to 3.7 TB/s of bandwidth. The entire chip is fabricated on TSMC's 5nm process node and has a total of 24 200GbE interconnect links.
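The headline memory and networking figures follow directly from the per-component numbers quoted above. The short Python sketch below only redoes that arithmetic; the function names are illustrative and not part of any Intel tool, and the aggregate Ethernet figure is a simple per-direction sum of the link speeds rather than a number stated in the article.

```python
# Figures taken from the spec paragraph above.
HBM_SITES = 8             # eight HBM sites on the package
DIES_PER_STACK = 8        # "8-hi" stacks
DIE_DENSITY_GBIT = 16     # 16 Gb HBM2e DRAM dies
ETHERNET_LINKS = 24       # 24 interconnect links
LINK_SPEED_GBIT = 200     # 200GbE each


def hbm_capacity_gb(sites: int, dies_per_stack: int, die_density_gbit: int) -> int:
    """Total HBM capacity in gigabytes (8 bits per byte)."""
    return sites * dies_per_stack * die_density_gbit // 8


def aggregate_ethernet_tbit(links: int, link_speed_gbit: int) -> float:
    """Sum of all link speeds in terabits per second, per direction."""
    return links * link_speed_gbit / 1000


if __name__ == "__main__":
    print(hbm_capacity_gb(HBM_SITES, DIES_PER_STACK, DIE_DENSITY_GBIT), "GB of HBM2e")
    print(aggregate_ethernet_tbit(ETHERNET_LINKS, LINK_SPEED_GBIT), "Tb/s of Ethernet")
```

Eight stacks of eight 16 Gb dies work out to 1,024 Gb, or 128 GB, which matches the quoted capacity.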

In terms of product offerings, the Intel Gaudi 3 AI accelerators will come in a Mezzanine OAM (HL-325L) form factor, with up to 900W standard and over 900W liquid-cooled variants, and as a PCIe add-in card (AIC) with a full-height, double-wide, 10.5"-long design. The Gaudi 3 HL-338 PCIe cards will be passively cooled and support up to 600W TDP with the same specifications as the OAM variant.

The company also announced its own HLB-325 baseboard and HLFB-325L integrated subsystem, which can carry up to 8 Gaudi 3 accelerators. This system has a combined TDP of 7.6 kilowatts and measures 19".
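As a rough sanity check on the 7.6 kW figure, the sketch below compares it with eight OAM modules running at their rated power. It assumes every module is the 900W standard variant; the article does not say how Intel splits the system budget, so the remainder is only an estimate of what is left for everything other than the accelerators.

```python
ACCELERATORS = 8        # modules carried by the baseboard
OAM_TDP_W = 900         # assumption: each module is the 900W standard variant
SYSTEM_TDP_W = 7600     # 7.6 kW combined TDP quoted for the system


def accelerator_power_w(count: int, tdp_w: int) -> int:
    """Power drawn by all accelerator modules at their rated TDP."""
    return count * tdp_w


def remaining_budget_w(system_tdp_w: int, count: int, tdp_w: int) -> int:
    """System TDP left over once the accelerator modules are accounted for."""
    return system_tdp_w - accelerator_power_w(count, tdp_w)


if __name__ == "__main__":
    print(accelerator_power_w(ACCELERATORS, OAM_TDP_W), "W for the accelerators")
    print(remaining_budget_w(SYSTEM_TDP_W, ACCELERATORS, OAM_TDP_W), "W for the rest of the system")
```

Under that assumption the accelerators account for 7.2 kW of the total, leaving about 400W for the rest of the subsystem.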

The follow-up to Gaudi 3 will be Falcon Shores, which is expected in 2025 and will combine both Gaudi and Xe IPs under a single GPU programming interface built around the Intel oneAPI specification.

Press Release: At Intel Vision, Intel introduces the Intel Gaudi 3 AI accelerator, which delivers 4x AI compute for BF16, a 1.5x increase in memory bandwidth, and 2x networking bandwidth for massive system scale-out compared to its predecessor. This is a significant leap in performance and productivity for AI training and inference on popular large language models (LLMs) and multimodal models.

The Intel Gaudi 3 accelerator will meet these requirements and offer versatility through open community-based software and open industry-standard Ethernet, helping businesses flexibly scale their AI systems and applications.

How Custom Architecture Delivers GenAI Performance and Efficiency: The Intel Gaudi 3 accelerator, architected for efficient large-scale AI compute, is manufactured on a 5 nanometer (nm) process and offers significant advancements over its predecessor. It is designed to allow all engines, namely the Matrix Multiplication Engine (MME), Tensor Processor Cores (TPCs) and Networking Interface Cards (NICs), to be activated in parallel, enabling the acceleration needed for fast, efficient deep learning computation and scale. Key features include:

Intel introduced the Gaudi 3 AI accelerator on April 9, 2024, at the Intel Vision event in Phoenix, Arizona. The accelerator delivers 4x AI compute for BF16 and a 1.5x increase in memory bandwidth compared with its predecessor. (Credit: Intel Corporation)

The Intel Gaudi 3 accelerator will deliver significant performance improvements for training and inference tasks on leading GenAI models. Specifically, the Intel Gaudi 3 accelerator is projected to deliver, on average versus NVIDIA H100, roughly 50% faster AI performance and 40% better efficiency.

About Market Adoption and Availability: The Intel Gaudi 3 accelerator will be available to original equipment manufacturers (OEMs) in the second quarter of 2024 in industry-standard configurations of Universal Baseboard and open accelerator module (OAM). Among the notable OEM adopters that will bring Gaudi 3 to market are Dell Technologies, HPE, Lenovo, and Supermicro. General availability of Intel Gaudi 3 accelerators is anticipated for the third quarter of 2024, and the Intel Gaudi 3 PCIe add-in card is anticipated to be available in the last quarter of 2024.

Intel introduced the Intel Gaudi 3 AI accelerator on April 9, 2024, at the Intel Vision event in Phoenix, Arizona. The AI accelerator is designed to break down proprietary walls to bring choice to the enterprise generative AI market. (Credit: Intel Corporation)

The Intel Gaudi 3 accelerator will also power several cost-effective cloud LLM infrastructures for training and inference, offering price-performance advantages and choices to organizations that now include NAVER.

About the author: A Software Engineer by training and a PC enthusiast by passion, Hassan Mujtaba serves as Wccftech's Senior Editor for the hardware section. With years of experience in the industry, he specializes in deep-dive technical analysis of next-generation CPU and GPU architectures, motherboards, and cooling solutions. His work involves not only breaking news on upcoming technologies but also extensive hands-on reviews and benchmarking.
