Intel has released the first major software update for its Arc Pro "Project Battlematrix" solution, LLM Scaler v1.0, bringing sizable performance gains and new feature support.
Intel Arc Pro GPUs Get Major Software Update For Project Battlematrix With LLM Scaler 1.0, Massive Improvements & Support Added
During Computex 2025, Intel unveiled Project Battlematrix alongside its Arc Pro GPUs. Battlematrix is designed as a one-stop solution for inference workstation platforms running multiple Arc Pro GPUs. The company promised in its roadmap to offer the first container deployment along with vLLM staging and basic telemetry support in Q3 as an "Inference Optimized" container, and it's finally here with LLM Scaler v1.0.
The following is the full list of features and optimizations included in LLM Scaler container v1.0:
- vLLM:
- Performance optimization of TPOP for long input length (>4K): up to 1.8x perf for 40K seq length on 32B KPI model, and 4.2x perf for 40K seq length on 70B KPI model
- Performance optimizations with ~10% output throughput improvement for 8B-32B KPI models compared to the last drop
- By-layer online quantization to reduce the required GPU memory
- PP (pipeline parallelism) support in vLLM (experimental)
- torch.compile (experimental)
- speculative decoding (experimental)
- Support for embedding and rerank models
- Enhanced multi-modal model support
- Automatic maximum length detection
- Data parallelism support
- OneCCL benchmark tool enablement
- XPU Manager:
- GPU Power
- GPU Firmware update
- GPU Diagnostic
- GPU Memory Bandwidth
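To put the online quantization item in context, a rough back-of-the-envelope estimate shows why reducing weight precision matters on a multi-GPU workstation. The sketch below is not Intel's code and does not use the LLM Scaler container. It only computes the weight-only memory footprint for the 8B, 32B, and 70B model sizes mentioned in the list. The bytes-per-parameter table and the 24 GB per-GPU budget are illustrative assumptions, and KV cache, activations, and runtime overhead are ignored, so real requirements will be higher.

```python
import math

# Assumed storage cost per parameter for common precisions (illustrative only).
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "int4": 0.5}


def weight_gib(params_billion: float, dtype: str) -> float:
    """Approximate weight-only footprint in GiB (no KV cache or activations)."""
    return params_billion * 1e9 * BYTES_PER_PARAM[dtype] / 2**30


def min_gpus(params_billion: float, dtype: str, gpu_gib: float = 24.0) -> int:
    """Smallest GPU count whose combined memory holds the weights.

    gpu_gib is a hypothetical per-GPU memory budget, not a figure from the article.
    """
    return math.ceil(weight_gib(params_billion, dtype) / gpu_gib)


if __name__ == "__main__":
    for size in (8, 32, 70):
        for dtype in BYTES_PER_PARAM:
            print(
                f"{size}B @ {dtype}: {weight_gib(size, dtype):6.1f} GiB, "
                f"at least {min_gpus(size, dtype)} GPU(s)"
            )
```

Under these assumptions, a 32B model drops from needing three GPUs at FP16 to fitting on one at INT4, which is the kind of saving that quantization is meant to deliver.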
According to Intel, the new software stack is built with ease of use and industry standards in mind. The new container, which targets Linux, is optimized to deliver up to an 80% performance uplift with multi-GPU scaling and PCIe P2P data transfers. It also comes with enterprise-class reliability and manageability features such as ECC, SR-IOV, telemetry, and remote firmware updates.
As per the previous roadmap, this update will be followed by a more hardened container release in the same quarter, offering improved performance and vLLM serving. Finally, in Q4, Intel plans to move to a full-feature-set release.
