AMD EPYC Gets Performance Boost In Linux 5.18, Improvements To Scheduler/NUMA Benchmarked
Michael Larabel, site owner of Phoronix, Linux software engineer, and open-source analyst, recently tested the new AMD EPYC processors in several Linux workloads. He also ran comparison tests against a selection of AMD Zen processors to see how the new chips hold up against previous iterations. Recently, he reported on an upcoming change to sched/core for Linux 5.18 that is intended to improve the performance of AMD processors, specifically EPYC and later Zen-series chips. The change adjusts NUMA balancing on systems where a NUMA node spans several last-level caches (LLCs), and Larabel has now shared his benchmark results with the public.
AMD EPYC processors see significant improvements in performance for the upcoming Linux 5.18 kernel
The new change originates in "sched/core," where it was queued ahead of Linux 5.18 and next month's kernel release, and it is not focused solely on AMD processors. However, it does offer better results for Zen-series processors because of their specific cache layouts.
Mel Gorman, the author of the patch, explains in further detail:
[A kernel scheduler change from 2020] allowed an imbalance between NUMA nodes such that communicating tasks would not be pulled apart by the load balancer. This works fine when there is a 1:1 relationship between LLC and node but can be suboptimal for multiple LLCs if independent tasks use CPUs sharing cache. Zen* has multiple LLCs per node with local memory channels, and due to the allowed imbalance, it's far harder to tune some workloads to run optimally than it is on hardware with 1 LLC per node. This patch allows an imbalance to exist up to the point where LLCs should be balanced between nodes.
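The rule Gorman describes can be illustrated with a small toy model. This is a simplified sketch, not the kernel's code: the function names, the `single_llc_fraction` parameter, and its 25% default are assumptions chosen for illustration. The only idea taken from the quote is that, with several LLCs per node, the tolerated imbalance stops at roughly one task per LLC.

```python
def allowed_numa_imbalance(cpus_per_node: int, cpus_per_llc: int,
                           single_llc_fraction: float = 0.25) -> int:
    """Toy model: how many tasks may pile up on one node before balancing.

    single_llc_fraction is a hypothetical tunable for the 1-LLC-per-node
    case; it is not taken from the kernel source.
    """
    if cpus_per_llc <= 0 or cpus_per_node < cpus_per_llc:
        raise ValueError("need 0 < cpus_per_llc <= cpus_per_node")

    llcs_per_node = cpus_per_node // cpus_per_llc
    if llcs_per_node == 1:
        # One LLC per node: communicating tasks share cache anyway,
        # so tolerate an imbalance proportional to the node size.
        limit = int(cpus_per_node * single_llc_fraction)
    else:
        # Several LLCs per node: stop at about one task per LLC so
        # independent tasks do not end up competing for the same cache.
        limit = llcs_per_node
    return max(1, limit)


def keep_on_local_node(running_tasks: int, cpus_per_node: int,
                       cpus_per_llc: int) -> bool:
    """True while the node's task count is within the allowed imbalance."""
    return running_tasks <= allowed_numa_imbalance(cpus_per_node, cpus_per_llc)
```

For a hypothetical node with 32 CPUs split across four 8-CPU LLCs, this model tolerates up to four tasks on the node before it would spread work to another node.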
Gorman's initial benchmark tests show a 272% improvement in the Stream memory benchmark. He also saw Coremark performance rise by 10% (with a maximum of 17%) and SPECjbb Java performance rise by 18%. The NPB parallel benchmark improved by 17%, which is surprising given its less-than-adequate results in past tests. The patch adds only 50 lines to the Linux 5.18 kernel, yet it delivers a substantial improvement overall.
We will post all the results at the end of the article and highlight some significant advancements made to the AMD EPYC performance in Linux.
- An AMD EPYC 75F3 2P server built around the ASRockRack ROME2D16-2T motherboard and running Ubuntu 21.10 was used to compare Linux 5.17 Git against sched/core Git. Larabel used the same kernel Kconfig for both kernels in the tests. Both Git trees added the "sched/fair: Adjust the allowed NUMA imbalance when SD_NUMA spans multiple LLCs" patch.
- Several workloads perform excellently on the AMD EPYC Zen 3 2P server with "sched/core" code slated for Linux 5.18 and outperform Linux 5.17 Git.
- Improvements were quite significant and consistent with Gorman's successful initial tests.
- The Graph500 HPC benchmark was the only case where sched/core seemingly regressed relative to the current Linux scheduler code.
- Overall, the sched/core Git state showed several positive results ahead of Linux 5.18.
- PostgreSQL database server tests on the AMD EPYC 75F3 2P server improved considerably with the sched/core Git code that has not yet been merged into the mainline kernel, showing higher throughput and lower latency when processing requests.
- The RocksDB key-value store shows consistent improvements with the new kernel build from sched/core thanks to NUMA imbalance changes for processors where numerous LLCs were used per node.
- For additional test coverage, the tests were repeated on an AMD EPYC 72F3 server built around a Supermicro H12SSL-i motherboard. The comparison was again between Linux 5.17 Git and the identical sched/core Git kernel build, using the same Kconfig as the 5.17 state.
- Graph500 was the only test to regress on the EPYC 75F3 2P server, but on the much smaller EPYC 72F3 server it shows a clear improvement with sched/core Git.
- MariaDB shows improvement. However, its results tended to show higher deviation on the server used.
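Whether a given machine falls into the "multiple LLCs per node" category discussed above can be checked from sysfs. The following is a minimal sketch, assuming the standard Linux layout under `/sys/devices/system` (`node/nodeN/cpulist` and `cpu/cpuN/cache/indexM/{level,shared_cpu_list}`); the function names are hypothetical, and the highest reported cache level is treated as the LLC.

```python
from pathlib import Path
from typing import Dict, Set


def parse_cpulist(text: str) -> Set[int]:
    """Expand a sysfs CPU list such as '0-3,8-11' into a set of CPU ids."""
    cpus: Set[int] = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty list, e.g. a memory-only node
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def llcs_per_node(sysfs_root: str = "/sys/devices/system") -> Dict[int, int]:
    """Count distinct last-level caches among the CPUs of each NUMA node."""
    root = Path(sysfs_root)
    counts: Dict[int, int] = {}
    for node_dir in sorted(root.glob("node/node[0-9]*")):
        node_id = int(node_dir.name[len("node"):])
        node_cpus = parse_cpulist((node_dir / "cpulist").read_text())
        llc_groups = set()
        for cpu in node_cpus:
            cache_dir = root / "cpu" / f"cpu{cpu}" / "cache"
            # Collect (level, directory) pairs and keep the highest level.
            levels = [
                (int((index_dir / "level").read_text()), index_dir)
                for index_dir in cache_dir.glob("index[0-9]*")
            ]
            if not levels:
                continue
            _, llc_dir = max(levels, key=lambda item: item[0])
            # CPUs sharing an LLC report the same shared_cpu_list string.
            llc_groups.add((llc_dir / "shared_cpu_list").read_text().strip())
        counts[node_id] = len(llc_groups)
    return counts
```

Calling `llcs_per_node()` on a Linux system returns a mapping from NUMA node id to LLC count; any value above 1 indicates the kind of topology the patch targets.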
The great news for AMD Linux users is that the "sched/fair: Adjust the allowed NUMA imbalance when SD_NUMA spans multiple LLCs" patch is now within TIP's sched/core. With plenty of promising results for EPYC performance across several workloads, it looks to be a win for AMD at the moment. Larabel plans to run other benchmarks once Linux 5.18 is officially released and to cover Intel's and NVIDIA's additions to the new kernel.
Provided images from Phoronix benchmark tests: