Intel Arc Pro B70 Delivers An 80% Boost in MLPerf Inference v6.0, Existing Arc Pro B60 GPUs Get An 18% Boost Thanks To AI Optimizations

Apr 1, 2026 at 12:55pm EDT

Intel has just published its latest MLPerf Inference v6.0 benchmarks, showcasing strong performance with the Arc Pro B70 & Arc Pro B60 GPUs.

Intel's Continuous AI Optimizations Deliver An 18% Boost for Existing Arc Pro GPUs, While Arc Pro B70 Offers An 80% Uplift Over Arc Pro B60.

MLCommons has just published its latest benchmarks in MLPerf Inference v6.0, which showcase the AI inference performance of various GPUs and CPUs. Today's results are special for Intel since they are the first showcase of its latest Arc Pro GPUs, the Arc Pro B70 and Arc Pro B65, which were recently unveiled and feature the Big Battlemage chip.


Today's results were obtained on a four-GPU system built around Intel's Arc Pro B70 and Arc Pro B65 GPUs, carrying up to 128 GB of combined VRAM, which is enough to run 120B-parameter models. The system used Intel's latest Xeon 6 CPUs and delivered 80% higher inference performance than the previous flagship, the Arc Pro B60 (24 GB per GPU).

Intel MLPerf v6.0 GPT-OSS-120B Inference Benchmark:

GPU Config | Offline (Tokens/s) | Server (Tokens/s)
4 x Arc Pro B70 (128 GB) | 1536.90 | 951.67
4 x Arc Pro B60 Dual (192 GB) | 1601.91 | 884.24
4 x Arc Pro B60 (96 GB) | 841.04 | 452.19

Intel MLPerf v6.0 llama2-70b-99 Inference Benchmark:

GPU Config | Offline (Tokens/s) | Server (Tokens/s)
4 x Arc Pro B70 (128 GB) | 2459.18 | 1698.57
4 x Arc Pro B60 Dual (192 GB) | 3270.66 | 2199.50
4 x Arc Pro B60 (96 GB) | 1697.66 | 1106.26

Intel MLPerf v6.0 llama3.1 8b Inference Benchmark:

GPU Config | Offline (Tokens/s) | Server (Tokens/s)
4 x Arc Pro B60 Dual (192 GB) | 52.83 | 49.17
4 x Arc Pro B70 (128 GB) | 36.07 | 32.58
4 x Arc Pro B60 (96 GB) | 26.15 | 24.57
4 x Arc Pro B50 (64 GB) | 13.45 | 9.27
2 x Xeon 6 (128 Cores) | 9.61 | 3.68
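To see where the headline uplift comes from, the B70-versus-B60 gain can be recomputed directly from the three tables above. The snippet below is a minimal sketch: the throughput values are copied from the tables, while the dictionary layout, the short "B70"/"B60" keys, and the function names are illustrative choices and not part of any MLPerf tooling.

```python
# Throughput (tokens/s) for the 4 x Arc Pro B70 (128 GB) and
# 4 x Arc Pro B60 (96 GB) rows, copied from the tables above.
# The key names and structure are illustrative assumptions.
RESULTS = {
    ("gpt-oss-120b", "offline"): {"B70": 1536.90, "B60": 841.04},
    ("gpt-oss-120b", "server"): {"B70": 951.67, "B60": 452.19},
    ("llama2-70b-99", "offline"): {"B70": 2459.18, "B60": 1697.66},
    ("llama2-70b-99", "server"): {"B70": 1698.57, "B60": 1106.26},
    ("llama3.1-8b", "offline"): {"B70": 36.07, "B60": 26.15},
    ("llama3.1-8b", "server"): {"B70": 32.58, "B60": 24.57},
}


def uplift_percent(new: float, old: float) -> float:
    """Relative gain of `new` over `old`, expressed in percent."""
    return (new / old - 1.0) * 100.0


def b70_uplifts() -> dict:
    """Percentage uplift of the B70 system over the B60 system per test."""
    return {
        key: round(uplift_percent(row["B70"], row["B60"]), 1)
        for key, row in RESULTS.items()
    }


if __name__ == "__main__":
    for (benchmark, scenario), pct in b70_uplifts().items():
        print(f"{benchmark:14s} {scenario:8s} +{pct:.1f}%")
```

On these numbers the uplift varies by workload, from roughly 33% (llama3.1 8b, Server) to roughly 110% (GPT-OSS-120B, Server). The 80% headline figure sits closest to the GPT-OSS-120B Offline result, which works out to about 83%.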

While the performance improvement from GPU to GPU is great, Intel has also showcased the AI software optimizations it has made, achieving an 18% boost on existing GPUs such as the Arc Pro B60.

In addition to the GPUs, Intel is also submitting its Xeon 6 CPUs to MLPerf Inference v6.0. The latest Xeon 6 lineup with P-Cores delivers up to a 90% (1.9x) generational performance gain, aided by built-in AI acceleration features such as AMX and AVX-512.

Intel GPU Systems, featuring newly launched Intel Arc Pro B70/B65 GPUs, are designed to meet the needs of modern AI inference and provide an all-in-one inference platform combining full-stack validated hardware and software. With enhanced memory capacity, they aim to simplify the adoption and ease of use with a containerized solution built for Linux environments, optimized to deliver incredible inference performance with multi-GPU scaling and PCIe P2P data transfers, and designed to include enterprise-class reliability and manageability features such as ECC, SRIOV, telemetry and remote firmware updates. For example, when compared to comparable competitor GPU solutions the Intel Arc Pro B70 is able to handle significantly larger models and context windows in multi-GPU setups – powering up to 1.6x as much KV cache capacity when running larger models.

AI inference is increasingly defined not only by GPU throughput but also by CPU-accelerated system performance. The CPU, shaping overall cluster efficiency and total cost of ownership, is also responsible for critical functions such as memory management, task orchestration, and workload distribution, while ensuring the security, reliability, and operational continuity essential to modern AI infrastructure.

Intel continues to be the only server processor vendor to submit stand-alone CPU results for MLPerf inference benchmarks, underscoring its leadership and strong commitment to advancing AI inference across both compute and accelerator centric platforms. As the most widely used host CPU in AI accelerated systems—with over half of MLPerf 6.0 submissions powered by Xeon—Intel further reinforces its position at the core of the industry’s AI infrastructure. This leadership extends to the silicon itself: Intel Xeon 6 processors with P-cores delivered up to a 1.9x generational performance gain in MLPerf Inference v5.1, while built-in AI acceleration technologies such as AMX and AVX512 allow workloads like LLM inference, fine tuning, and classical machine learning to run efficiently without the need for dedicated accelerator hardware.

via Intel
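Intel's statement ties larger VRAM to more KV cache capacity. The sketch below illustrates why: once the model weights are loaded, every remaining byte of VRAM can hold cached keys and values, so extra memory translates into longer contexts or more concurrent requests. It uses the standard per-token KV cache size (keys plus values, per layer, per KV head) with Llama 2 70B's published shape (80 layers, 8 KV heads, head dimension 128). The 8-bit weight precision, the FP16 cache, the pooled-VRAM simplification, and all names are assumptions for illustration. This is not Intel's methodology and does not attempt to reproduce the 1.6x figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelShape:
    n_layers: int      # transformer layers
    n_kv_heads: int    # key/value heads (grouped-query attention)
    head_dim: int      # dimension per attention head
    n_params: float    # total parameter count


# Llama 2 70B shape; the parameter count is rounded to 70e9.
LLAMA2_70B = ModelShape(n_layers=80, n_kv_heads=8, head_dim=128, n_params=70e9)


def kv_bytes_per_token(m: ModelShape, bytes_per_elem: int = 2) -> int:
    """Bytes of KV cache one token occupies (factor 2 = keys + values)."""
    return 2 * m.n_layers * m.n_kv_heads * m.head_dim * bytes_per_elem


def kv_token_capacity(total_vram_gb: float, m: ModelShape,
                      weight_bytes_per_param: float = 1.0,
                      bytes_per_elem: int = 2) -> int:
    """Tokens of KV cache that fit after loading the weights.

    Assumptions: VRAM is pooled across GPUs, 1 GB = 1e9 bytes, and
    activations and runtime overhead are ignored.
    """
    free_bytes = total_vram_gb * 1e9 - m.n_params * weight_bytes_per_param
    if free_bytes <= 0:
        return 0
    return int(free_bytes // kv_bytes_per_token(m, bytes_per_elem))


if __name__ == "__main__":
    # VRAM totals taken from the article's 4-GPU configurations.
    for name, vram_gb in [("4 x Arc Pro B70", 128), ("4 x Arc Pro B60", 96)]:
        tokens = kv_token_capacity(vram_gb, LLAMA2_70B)
        print(f"{name}: {vram_gb} GB -> ~{tokens:,} cached tokens")
```

Because the weights take a fixed share of memory, the KV cache headroom grows faster than the raw VRAM ratio: under these assumptions, going from 96 GB to 128 GB more than doubles the number of tokens that can be cached.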

It's nice to see Intel continuing to optimize its existing GPUs and CPUs for inference, and that shows in the MLPerf Inference v6.0 benchmarks. The Arc Pro B70 is a very potent card for AI, as it brings 32 GB of VRAM and lots of AI compute for under $1,000 US. The GPU is expected to hit retail availability soon.

About the author: A Software Engineer by training and a PC enthusiast by passion, Hassan Mujtaba serves as Wccftech's Senior Editor for the hardware section. With years of experience in the industry, he specializes in deep-dive technical analysis of next-generation CPU and GPU architectures, motherboards, and cooling solutions. His work involves not only breaking news on upcoming technologies but also extensive hands-on reviews and benchmarking.
