AMD Ryzen Performance Negatively Affected by Windows 10 Scheduler Bug
A newly discovered bug in Windows 10's scheduler has been found to be negatively affecting performance of AMD Ryzen CPUs. The bug has been confirmed to affect all Windows 10 versions but not Windows 7. It's not clear yet if Windows 8.1 is affected.
Growing Pains With AMD's New Simultaneous Multi-Threading
Ryzen processors are AMD's first to feature simultaneous multi-threading (SMT), which enables each CPU core to execute two threads simultaneously: a primary thread for each core plus one auxiliary thread for added throughput in highly threaded workloads. The principal thread executed by each core is allocated maximum instructions-per-clock throughput, i.e. maximum performance. The additional SMT thread, on the other hand, can only opportunistically leverage underutilized resources in a given core.
Intel's hyper-threading technology works in a very similar fashion, providing each "hyper-thread" with only a fraction of the resources available to the principal thread in any given CPU core. In best-case scenarios, SMT provides roughly 20-30% additional throughput on both Intel's latest Skylake microarchitecture and AMD's Zen microarchitecture.
Not All Threads Are Created Equal
Windows 10's scheduler correctly identifies Intel's hyper-threads as lower performing than principal core threads and schedules tasks in a way that takes advantage of the additional throughput without negatively impacting performance. Unfortunately, the scheduler is currently unable to differentiate principal core threads from virtual SMT threads on Ryzen and in fact sees 16-thread Ryzen 7 processors as having 16 physical cores with equal resources per thread.
Because the scheduler does not prioritize primary threads over SMT threads as it does on Intel platforms, a much larger share of tasks can and do end up scheduled on a virtual SMT thread rather than a principal core thread, resulting in significant artificial performance degradation.
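To make the difference concrete, here is a toy placement model in Python. It is only a sketch and not how the Windows scheduler actually works: the function names are hypothetical, the numbering of logical CPUs (2k and 2k+1 sharing core k) is an assumption, and both placement policies are deliberately simplistic. It counts how many tasks end up sharing a physical core when the scheduler does or does not know which logical CPUs are SMT siblings.

```python
from typing import List

CORES = 8              # Ryzen 7: 8 physical cores
THREADS_PER_CORE = 2   # SMT: 2 hardware threads per core


def core_of(logical_cpu: int) -> int:
    """ASSUMED numbering: logical CPUs 2k and 2k+1 share physical core k."""
    return logical_cpu // THREADS_PER_CORE


def place_smt_unaware(n_tasks: int) -> List[int]:
    """Treat all 16 logical CPUs as equal and fill them in index order."""
    return list(range(n_tasks))


def place_smt_aware(n_tasks: int) -> List[int]:
    """Use one thread on every core first, then fall back to sibling threads."""
    primaries = [c * THREADS_PER_CORE for c in range(CORES)]
    siblings = [c * THREADS_PER_CORE + 1 for c in range(CORES)]
    return (primaries + siblings)[:n_tasks]


def tasks_sharing_a_core(placement: List[int]) -> int:
    """Count tasks placed on a core that runs more than one task."""
    per_core = [0] * CORES
    for cpu in placement:
        per_core[core_of(cpu)] += 1
    return sum(n for n in per_core if n > 1)


if __name__ == "__main__":
    # Columns: task count, shared-core tasks (unaware), shared-core tasks (aware)
    for n in (4, 8, 16):
        print(n,
              tasks_sharing_a_core(place_smt_unaware(n)),
              tasks_sharing_a_core(place_smt_aware(n)))
```

In this model, four or eight tasks all end up doubled up on cores under the SMT-unaware policy and none do under the SMT-aware one, while at sixteen tasks the two policies are indistinguishable because every hardware thread is busy either way.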
Ryzen In The Eyes Of The Windows 10 Scheduler
The scheduler also incorrectly identifies the amount of cache available per thread. Adding up the L2 and L3 cache that Windows 10's scheduler "thinks" is there gives an insane 136MB, when Ryzen 7 in fact has only 20MB of L2+L3 cache combined.
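The two totals can be checked with a little arithmetic. The 20MB figure follows directly from Ryzen 7's 512 KB of L2 per core plus 16 MB of L3. The breakdown behind 136MB is not spelled out here, so the second function below is only one combination that happens to reproduce it, and should be read as an assumption: one 512 KB L2 counted per logical CPU and one full 16 MB L3 counted per physical core.

```python
KB = 1024
MB = 1024 * KB

CORES = 8            # physical cores on Ryzen 7
LOGICAL_CPUS = 16    # hardware threads with SMT enabled
L2_PER_CORE = 512 * KB
L3_TOTAL = 16 * MB


def actual_cache_bytes() -> int:
    """One L2 per physical core plus the L3 counted once."""
    return CORES * L2_PER_CORE + L3_TOTAL


def misreported_cache_bytes() -> int:
    """ASSUMPTION: one L2 counted per logical CPU and a full 16 MB L3
    counted per physical core. This is a guess that matches the 136MB
    total, not a confirmed breakdown."""
    return LOGICAL_CPUS * L2_PER_CORE + CORES * L3_TOTAL


if __name__ == "__main__":
    print(actual_cache_bytes() // MB, "MB actual")
    print(misreported_cache_bytes() // MB, "MB under the assumed miscount")
```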
Windows 10 Scheduler Single Core Thread Mapping:
*--------------- Data Cache 0, Level 1, 32 KB, Assoc 8, LineSize 64
*--------------- Instruction Cache 0, Level 1, 64 KB, Assoc 4, LineSize 64
*--------------- Unified Cache 0, Level 2, 512 KB, Assoc 8, LineSize 64
*--------------- Unified Cache 1, Level 3, 16 MB, Assoc 16, LineSize 64
-*-------------- Data Cache 1, Level 1, 32 KB, Assoc 8, LineSize 64
-*-------------- Instruction Cache 1, Level 1, 64 KB, Assoc 4, LineSize 64
-*-------------- Unified Cache 2, Level 2, 512 KB, Assoc 8, LineSize 64
-*-------------- Unified Cache 3, Level 3, 16 MB, Assoc 16, LineSize 64
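For readers who want to tally a dump like this themselves, the following Python sketch parses the eight entries shown above and sums the reported cache per level. The regular expression and function names are hypothetical and written only against the line format in this listing. Each `*` in the leading mask is taken to mark a logical CPU that the cache is attributed to.

```python
import re
from typing import Dict, List

DUMP = """\
*--------------- Data Cache 0, Level 1, 32 KB, Assoc 8, LineSize 64
*--------------- Instruction Cache 0, Level 1, 64 KB, Assoc 4, LineSize 64
*--------------- Unified Cache 0, Level 2, 512 KB, Assoc 8, LineSize 64
*--------------- Unified Cache 1, Level 3, 16 MB, Assoc 16, LineSize 64
-*-------------- Data Cache 1, Level 1, 32 KB, Assoc 8, LineSize 64
-*-------------- Instruction Cache 1, Level 1, 64 KB, Assoc 4, LineSize 64
-*-------------- Unified Cache 2, Level 2, 512 KB, Assoc 8, LineSize 64
-*-------------- Unified Cache 3, Level 3, 16 MB, Assoc 16, LineSize 64
"""

# Hypothetical pattern, written to match only the line format shown above.
LINE_RE = re.compile(
    r"^(?P<mask>[*-]+)\s+(?P<kind>\w+) Cache\s+(?P<index>\d+), "
    r"Level (?P<level>\d+),\s+(?P<size>\d+) (?P<unit>KB|MB), "
    r"Assoc\s+(?P<assoc>\d+), LineSize\s+(?P<line>\d+)$"
)


def parse_dump(text: str) -> List[dict]:
    """Turn each dump line into a dict with its CPUs, level, and size in KB."""
    entries = []
    for raw in text.splitlines():
        m = LINE_RE.match(raw.strip())
        if m is None:
            continue  # skip blank or unrecognized lines
        size_kb = int(m["size"]) * (1024 if m["unit"] == "MB" else 1)
        cpus = [i for i, ch in enumerate(m["mask"]) if ch == "*"]
        entries.append({"cpus": cpus, "kind": m["kind"],
                        "level": int(m["level"]), "size_kb": size_kb})
    return entries


def total_kb_by_level(entries: List[dict]) -> Dict[int, int]:
    """Sum the reported cache size in KB for each cache level."""
    totals: Dict[int, int] = {}
    for e in entries:
        totals[e["level"]] = totals.get(e["level"], 0) + e["size_kb"]
    return totals


if __name__ == "__main__":
    print(total_kb_by_level(parse_dump(DUMP)))
```

For just the two logical CPUs listed, the L2 and L3 entries already add up to 33 MB, since each logical CPU is shown with its own 512 KB L2 and its own 16 MB L3.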
What To Do In The Meantime
First things first, we've been informed that AMD is aware of the issue. I'm sure they must've had some stern words for Microsoft over this mishap. Microsoft has been pushing hardware manufacturers to adopt its brand new OS for years, so it must've left a bitter taste in AMD's mouth to embrace the Windows 10 push only to be rewarded with poor hardware support. That said, it's safe to assume the pair are actively working together to get this issue resolved.
We've seen similar issues in the past, in the early days of Intel's hyper-threading. It took some time and a few patches for it to work as intended, and we imagine it'll be the same for the all-new Ryzen microarchitecture. The good news is that Windows 7 does not exhibit the same issue, and motherboard makers have thankfully released Windows 7 drivers for their AM4 motherboards. So users who choose to go this route can take some comfort in knowing that Windows 7 is free of this bug and is officially supported by the board makers.
If you're on Windows 10, there are still things you can do to bypass the scheduler issue and improve performance in specific workloads. For games, you can disable SMT and in most cases see an improvement in performance. This in fact explains some of the performance disparities we've seen in some games with SMT enabled. If most of your work is single-threaded or lightly threaded, or a mix of the two, we'd recommend disabling SMT until Microsoft releases an update to address the issue. If you're rendering or doing other heavily multi-threaded work, you should keep SMT enabled, as the scheduler issue should not affect performance in this scenario to any significant degree.
We will continue to experiment with Ryzen here in the lab and update you as we learn more.