AMD has officially announced ROCm 7, the next version of its open software stack, which the company says further accelerates AI and developer productivity.
AMD Unveils ROCm 7: The Next-Generation of Open Stack Software Innovations With Focus on AI Inferencing
With the announcement of ROCm 7, AMD is finally moving on from its ROCm 6 software stack, which has itself received numerous updates over the last few years as AI computing took off. The following are some of the main features that AMD is focusing on with ROCm 7:
- Latest Algorithms & Models
- Advanced Features for Scaling AI
- MI350 series support
- Cluster Management
- Enterprise Capabilities
With ROCm 7, AMD says that it is focusing more on inference capabilities within its software stack. The ROCm 7 stack will include enhanced frameworks such as vLLM v1, llm-d, and SGLang, along with serving optimizations such as distributed inference, prefill, and disaggregation. New kernels and algorithms coming to ROCm 7 include GEMM autotuning, MoE, attention, and Python-based kernel authoring.
AMD has already announced FP6 and FP4 support for its MI350 series, and ROCm 7 includes full support for these lower-precision datatypes, covering FP8, FP6, FP4, and mixed precision.
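To put the datatypes above in perspective, here is a rough sketch of how much storage the weights of a 70B-parameter model would need at each precision. It is a back-of-the-envelope estimate only: it assumes weights are stored at exactly the nominal bit width, ignores scaling factors, KV cache, and activations, and includes FP16 purely as a familiar baseline that is not part of AMD's announcement.

```python
# Back-of-the-envelope weight storage for a 70B-parameter model.
# Assumptions (not from AMD): weights only, no scaling factors, KV cache,
# or activations; FP16 is included purely as a familiar baseline.
PARAMS = 70e9  # parameter count of a 70B-class model

BITS_PER_WEIGHT = {"FP16": 16, "FP8": 8, "FP6": 6, "FP4": 4}


def weight_gigabytes(params: float, bits: int) -> float:
    """Return the storage needed for `params` weights, in decimal GB."""
    return params * bits / 8 / 1e9


for name, bits in BITS_PER_WEIGHT.items():
    print(f"{name}: {weight_gigabytes(PARAMS, bits):.1f} GB")
```

Under these assumptions, each step down in precision shrinks the weight footprint in proportion to the bit width, which is the main reason FP6 and FP4 matter for serving large models.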
In terms of performance, AMD says that inference has been the largest area of focus with ROCm 7, which delivers an uplift of roughly 3.5x on average in AI inference workloads. Breaking down the performance uplifts versus ROCm 6, we can see up to a 3.2x increase in Llama 3.1 70B, a 3.4x increase in Qwen2-72B, and up to 3.8x in DeepSeek R1.
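The headline 3.5x figure lines up with the three per-model numbers quoted above. As a quick check, assuming a simple unweighted mean (how AMD actually aggregates its results is not stated here):

```python
# Per-model inference uplifts of ROCm 7 over ROCm 6, as quoted in the article.
uplifts = {"Llama 3.1 70B": 3.2, "Qwen2-72B": 3.4, "DeepSeek R1": 3.8}

# Unweighted arithmetic mean; this aggregation method is an assumption.
mean_uplift = sum(uplifts.values()) / len(uplifts)

print(f"Mean uplift: {mean_uplift:.2f}x")
```

The mean comes out just under 3.5x, so the headline number reads as a rounded average rather than a peak.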
In DeepSeek R1, AMD also compares its ROCm 7 stack running on an Instinct MI355X GPU against the NVIDIA Blackwell B200 platform running CUDA. ROCm 7 achieves 30% higher FP8 throughput in DeepSeek R1 than NVIDIA's CUDA-based platform.
As for training performance, ROCm 7 also delivers a significant gain over ROCm 6, with a 3x uplift across Llama 2 70B, Llama 3.1 8B, and Qwen 1.5 7B.
The new ROCm software stack will also be extended to Enterprise AI with complete end-to-end solutions, secure data integration, and ease of deployment. The software stack will work across GPUs, CPUs, and DPUs, and will support a range of workloads with a key focus on GenAI.
Finally, AMD is bringing ROCm support to Ryzen-based laptops and workstations later this year, with in-box Linux support and full Windows support arriving in the second half of the year.
