The Ultimate CPU and GPU Floating Point Performance Benchmark Battle: AMD vs Intel
AnandTech recently ran one of the more interesting floating point showdowns to date: a benchmark of peak CPU and GPU floating point performance spanning Kaveri, Trinity, Llano, Haswell and Ivy Bridge.
CPU and iGPU Floating Point Performance of AMD and Intel Benchmarked
The biggest strength of APUs is supposed to be the HPC niche, or so they are billed. Therefore both the fp32 and fp64 capabilities of the CPU and iGPU are considered. AMD has disclosed that the fp64 capability is roughly 1/16th of its fp32 capability.
The figures are built from per-cycle throughput, and the peak is expressed in gigaflops. For the CPU, the base frequency (not the turbo frequency) is used for the peak calculation; for the iGPU, the turbo frequency is used. The instruction sets considered were SSE, AVX (without FMA) and AVX with FMA (either FMA3 or FMA4).
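To make the arithmetic behind a "peak" number concrete, here is a minimal sketch of the usual way a theoretical peak is derived: execution units times flops per cycle per unit times clock speed. The article does not spell out its exact formula, so treat this as an assumption. The function name peak_gflops and every input value below are made up for illustration and do not describe any of the benchmarked chips.

```python
def peak_gflops(units, flops_per_cycle_per_unit, freq_ghz):
    """Theoretical peak in GFLOPS: units x flops/cycle/unit x clock in GHz."""
    return units * flops_per_cycle_per_unit * freq_ghz


# Hypothetical chip for illustration only, NOT one of the benchmarked parts.
cpu_cores = 4
cpu_base_ghz = 3.0         # base clock, the one used for CPU peaks
cpu_flops_per_cycle = 16   # assumed fp32 flops per cycle per core

gpu_lanes = 256            # assumed number of fp32 ALU lanes in the iGPU
gpu_turbo_ghz = 0.5        # turbo clock, the one used for iGPU peaks
gpu_flops_per_cycle = 2    # one fused multiply-add counted as 2 flops

cpu_fp32 = peak_gflops(cpu_cores, cpu_flops_per_cycle, cpu_base_ghz)
gpu_fp32 = peak_gflops(gpu_lanes, gpu_flops_per_cycle, gpu_turbo_ghz)

# Applying a 1/16 fp64-to-fp32 ratio to the iGPU figure.
gpu_fp64 = gpu_fp32 / 16

print(f"CPU fp32 peak: {cpu_fp32} GFLOPS")
print(f"iGPU fp32 peak: {gpu_fp32} GFLOPS")
print(f"iGPU fp64 peak: {gpu_fp64} GFLOPS")
```

The same function covers both sides of the chip; only the choice of clock (base for the CPU, turbo for the iGPU) and the per-cycle figure for the instruction set in use change.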
Without further ado, the benchmarks:
As you can see, the fp64 scenario is somewhat of a mess. The problem is that Intel only enables fp64 for DirectCompute and NOT OpenCL, which is not the case with AMD. According to an AnandTech source, Intel’s fp64 speed is roughly 1/4 of its fp32 speed.
Sadly, even on the AMD side of things, fp64 is a mess. AMD’s Trinity and Richland do not have standards-compliant fp64 under OpenCL. Support apparently depends on an extension (which is proprietary, btw), cl_amd_fp64. It also seems that they do not support fp64 under DirectCompute at all. However, Kaveri is indeed a turning point for APUs, because its GCN-based GPU supports fp64 under all APIs. Not to mention that HSA and Mantle should open up more possibilities in the future.
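For anyone who wants to check which camp a given device falls into, here is a rough sketch. OpenCL devices report their extensions as a space-separated string (the CL_DEVICE_EXTENSIONS device query), where cl_khr_fp64 is the standard Khronos double-precision extension and cl_amd_fp64 is AMD's own. The function name fp64_support and the sample strings are hypothetical; in practice the string would come from the OpenCL runtime rather than being typed in by hand.

```python
def fp64_support(extensions: str) -> str:
    """Classify fp64 support from a space-separated OpenCL extension string."""
    names = set(extensions.split())
    if "cl_khr_fp64" in names:
        return "standard"      # Khronos double-precision extension
    if "cl_amd_fp64" in names:
        return "vendor-only"   # AMD-specific extension only
    return "none"


# Made-up extension strings for illustration, not dumps from real devices.
samples = {
    "vendor extension only": "cl_khr_byte_addressable_store cl_amd_fp64",
    "standard extension": "cl_khr_fp64 cl_amd_fp64 cl_khr_byte_addressable_store",
    "no fp64 extension": "cl_khr_byte_addressable_store",
}

for label, ext_string in samples.items():
    print(f"{label}: {fp64_support(ext_string)}")
```

Matching on whole extension names rather than substrings avoids false positives from similarly named extensions.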
Kaveri’s fp64 peak, including both the CPU and GPU, is 110 GFLOPS. Running the same code, optimized for AVX or FMA, on Haswell will give better results. If you have Windows 8, you can also optimize it through C++ and Iris Pro. That said, it is doubtful that Kaveri will ever excel at fp64-intensive work.
Where fp32 is concerned, Kaveri outperforms both the Haswell GT2 iGPU and Ivy Bridge. However, the GT3e should be more powerful in theory, considering the Haswell CPU cores and Iris Pro graphics.