
Intel Core i7-6700K Skylake-K CPU Review With ASUS Z170 Pro Gaming


Intel Skylake CPU Architecture Analysis

We have all been waiting for details on Intel's 6th generation Skylake processors, and the time has finally come to fully unwrap Intel's latest 14nm core architecture. To start off, Skylake focuses on four areas: scalability, power, performance, and media and graphics. Skylake development started off targeting just a 3x TDP scale, a 2x form factor range and a classic set of PC I/O features, but ended with a 20x TDP scale, a 4x form factor range and a wide range of I/O features across PC and tablet devices that are going to launch in Q4 2015.

Technically, each Skylake core is bigger and wider, and features better instructions per clock and improved power efficiency. The basic core block of a Skylake chip includes four of these cores, which share an LLC through an enhanced interconnect ring known as the SOC Ring. The die includes graphics processors that range from GT1, GT2 and GT3 to more performance-oriented designs such as GT3e and GT4e with embedded DRAM (L4 cache of up to 128 MB), all with support for OpenCL 2.0, DirectX 12 and OpenGL 4.4. The system agent includes the dual-channel DDR4 memory controller and the display system for embedded and external displays, while the PCI-Express lanes can be used to connect a discrete graphics card for higher performance on desktop setups. Audio DSPs and sensor hubs also get an update with Skylake, and a single integrated camera ISP can be found inside the Skylake die for better imaging quality. Lastly, Skylake delivers extended overclocking capabilities, which we will talk about in a moment.

So these are the general details of the Skylake microarchitecture, but the core needs to be examined a bit more. Skylake features a vastly improved front end with a higher-capacity branch predictor than Haswell, a wider instruction supply, deeper buffers and faster prefetch. The deeper out-of-order buffers extract more instruction-level parallelism, while the improved EUs (Execution Units) have lower latency, come in greater numbers, can power down when idle and improve AES-GCM by 17% and AES-CBC by 33%. Load and store bandwidth is also higher thanks to prefetcher improvements, a deeper store buffer, better L2 cache miss bandwidth, improved page miss handling and new instructions for better cache management. Hyper-Threading performance is improved with wider retirement, and Skylake's allocation queue grows to 64 entries per thread compared to 56 per thread on Haswell and 28 per thread on Sandy Bridge. To extract more parallelism out of the core design, Skylake enlarges the integer register file to 180 entries versus 168 on Haswell and the scheduler to 97 entries versus 60 on Haswell. The out-of-order window has been increased to 224 entries, compared to 168 on Sandy Bridge and 192 on Haswell.
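To put the buffer sizes quoted above into perspective, the relative growth from Haswell to Skylake can be worked out with a few lines of Python. This is only a back-of-the-envelope sketch: the dictionary keys and the `growth_percent` helper are illustrative names, and the entry counts are simply the figures given in the paragraph above.

```python
# Haswell -> Skylake entry counts, taken from the figures quoted in the text.
# The key names are illustrative labels, not official Intel terminology.
BUFFERS = {
    "out_of_order_window": (192, 224),
    "scheduler_entries": (60, 97),
    "register_file": (168, 180),
    "allocation_queue_per_thread": (56, 64),
}


def growth_percent(old: int, new: int) -> float:
    """Return the relative increase from old to new, in percent."""
    return (new - old) / old * 100.0


if __name__ == "__main__":
    for name, (haswell, skylake) in BUFFERS.items():
        pct = growth_percent(haswell, skylake)
        print(f"{name}: {haswell} -> {skylake} (+{pct:.1f}%)")
```

The scheduler shows by far the largest relative increase of the four, while the out-of-order window and register file grow more modestly.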

The new interconnect ring delivers double the throughput (bandwidth) without sacrificing power. LLC (Last Level Cache) throughput on cache misses is also doubled, DDR4 DRAM delivers vastly improved bandwidth to the system as a whole, and eDRAM-based chips can reduce latency and improve transfer speeds within the die. The system acts as a fully coherent design to share and manage memory, data transfers and stores and loads in the processor. We won't dive into the power details here since they are covered on the next page, so let's move on to the overclocking architecture built into Skylake.

Intel Skylake Receives Full Range BCLK Improvements and Finer Grain Tuning For DDR4 Memory

With Skylake, Intel is expanding overclocking support on its processors. Intel already added significant overclocking features on Haswell, with real-time overclocking software, ratio-based base clock overclocking and the latest Intel XMP modes through the Intel Extreme Tuning Utility. With Skylake, Intel adds full range BCLK tuning options and improved DDR4 memory overclocking. With a fully unlocked turbo design that is controllable through software and BIOS, BCLK can now be adjusted across its full range in 1 MHz increments, rather than Haswell's ratio-based tuning at 100/125/166 MHz. The unlocked core ratios can be tuned up to 83 in 100 MHz increments, with complete turbo overrides for voltage, power limits and IccMax. The Skylake processors also fully support DDR4 overclocking, with override capabilities of up to 4133 MT/s and finer-grained DDR step tuning in 100/133 MHz increments compared to 200/266 MHz before. Even the graphics clock can be tuned with ratios up to 60 in 50 MHz increments, with full turbo voltage controls.
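The arithmetic behind these settings is simple: the core clock is the base clock multiplied by the core ratio. The sketch below is purely illustrative and does not touch any hardware; the function names are hypothetical, and the only values taken from the text are the ratio limit of 83 and the 1 MHz BCLK granularity.

```python
MAX_CORE_RATIO = 83  # upper core ratio limit quoted in the text


def core_clock_mhz(bclk_mhz: float, ratio: int) -> float:
    """Core clock in MHz = base clock (BCLK) x core ratio."""
    if not 1 <= ratio <= MAX_CORE_RATIO:
        raise ValueError(f"ratio must be between 1 and {MAX_CORE_RATIO}")
    return bclk_mhz * ratio


def closest_bclk_mhz(target_mhz: float, ratio: int) -> int:
    """Nearest whole-MHz BCLK that approaches target_mhz at a given ratio.

    Models the 1 MHz BCLK steps described above; a real board may impose
    further limits that this sketch ignores.
    """
    if not 1 <= ratio <= MAX_CORE_RATIO:
        raise ValueError(f"ratio must be between 1 and {MAX_CORE_RATIO}")
    return round(target_mhz / ratio)


if __name__ == "__main__":
    # Stock-style setting: 100 MHz BCLK with a 42x ratio.
    print(core_clock_mhz(100, 42))
    # Fine tuning: aim for roughly 4650 MHz at a 46x ratio.
    bclk = closest_bclk_mhz(4650, 46)
    print(bclk, core_clock_mhz(bclk, 46))
```

With only the ratio to play with, the core clock moves in 100 MHz jumps at a 100 MHz BCLK; adding 1 MHz BCLK steps lets the target frequency be approached much more closely.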

Intel has also focused on the Top-down Microarchitecture Analysis Method (TMAM), an industry-proven systematic approach that identifies performance bottlenecks in out-of-order cores. Identifying true bottlenecks lets developers focus their software tuning on fixing them and improve efficiency on the same hardware. TMAM simplifies cycle accounting by using microarchitecture-independent metrics organized in a single hierarchy, which makes analysis simple. With TMAM, the steep learning curve associated with each microarchitecture generation is replaced by a structured drill-down that guides the user to the true performance limiters.
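For readers who want a feel for how the top of that hierarchy works, here is a minimal sketch of the level-one breakdown as it is commonly presented in Intel's published top-down material, which is not detailed in this article. It assumes a 4-wide pipeline, so each cycle offers four issue slots, and splits those slots into Frontend Bound, Bad Speculation, Retiring and Backend Bound. The function and parameter names are hypothetical; the comments note the performance counters they are assumed to correspond to.

```python
PIPELINE_WIDTH = 4  # assumed issue slots per cycle


def top_level_breakdown(
    cycles: int,               # assumed: CPU_CLK_UNHALTED.THREAD
    uops_not_delivered: int,   # assumed: IDQ_UOPS_NOT_DELIVERED.CORE
    uops_issued: int,          # assumed: UOPS_ISSUED.ANY
    uops_retired_slots: int,   # assumed: UOPS_RETIRED.RETIRE_SLOTS
    recovery_cycles: int,      # assumed: INT_MISC.RECOVERY_CYCLES
) -> dict:
    """Split all pipeline slots into the four level-one categories."""
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    slots = PIPELINE_WIDTH * cycles
    frontend = uops_not_delivered / slots
    bad_spec = (
        uops_issued - uops_retired_slots + PIPELINE_WIDTH * recovery_cycles
    ) / slots
    retiring = uops_retired_slots / slots
    # Whatever is left over is attributed to the back end.
    backend = 1.0 - (frontend + bad_spec + retiring)
    return {
        "frontend_bound": frontend,
        "bad_speculation": bad_spec,
        "retiring": retiring,
        "backend_bound": backend,
    }


if __name__ == "__main__":
    # Made-up counter values, purely for illustration.
    result = top_level_breakdown(1000, 800, 2400, 2000, 50)
    for category, fraction in result.items():
        print(f"{category}: {fraction:.1%}")
```

The category with the largest share is where the drill-down continues; a high Retiring share generally means the pipeline is doing useful work rather than stalling.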

Note: Before ending this section, I need to point out that Intel clarified that there is no Inverse Hyper-Threading on Skylake CPUs, and everything that has been discussed over the past few days is just rumor.
