Intel has officially detailed its next-generation Sapphire Rapids-SP CPU lineup which will be part of the 4th Gen Xeon Scalable family. The Intel Sapphire Rapids-SP lineup will consist of a range of new technologies with the most important being the seamless integration of multiple chiplets or 'Tiles', as Intel refers to them, through their EMIB technology.
Intel Fully Details Next-Gen Sapphire Rapids-SP Xeon CPUs, Multi-Tile Chiplet Design Based on 'Intel 7' Process Node
The Sapphire Rapids-SP family will be replacing the Ice Lake-SP family and will go all on board with the 'Intel 7' process node (formerly 10nm Enhanced SuperFin) that will be making its formal debut later this year in the Alder Lake consumer family. The server lineup will feature the performance-optimized Golden Cove core architecture which delivers a 20% IPC improvement over Willow Cove core architecture. Several cores are featured on multiple tiles and packaged together through the use of EMIB.
Sapphire Rapids: Combining Intel’s Performance-cores with new accelerator engines, Sapphire Rapids sets the standard for next-generation data center processors. At the heart of Sapphire Rapids is a tiled, modular SoC architecture that delivers significant scalability while still maintaining the benefits of a monolithic CPU interface thanks to Intel’s EMIB multi-die interconnect packaging technology and advanced mesh architecture.
For Sapphire Rapids-SP, Intel is using a quad multi-tile chiplet design which will come in HBM and non-HBM flavors. While each tile is its own unit, the chip itself acts as one singular SOC and each thread has full access to all resources on all tiles, consistently providing low-latency & high cross-section bandwidth across the entire SOC. Each tile is further composed of three main IP blocks & which are detailed below:
- Acceleration Engines
- CXL 1.1
- PCIe Gen 5
- UPI 2.0
We have already taken an in-depth look at the P-Core over here but some of the key changes that will be offered to the data center platform will include AMX, AiA, FP16, and CLDEMOTE capabilities. The Accelerator Engines will increase the effectiveness of each core by offloading common-mode tasks to these dedicated accelerator engines which will increase performance & decrease the time taken to achieve the necessary task.
In terms of I/O advancements, Sapphire Rapids-SP Xeon CPUs will introduce CXL 1.1 for accelerator and memory expansion in the data center segment. There's also an improved multi-socket scaling via Intel UPI, delivering up to 4 x24 UPI links at 16 GT/s and a new 8S-4UPI performance-optimized topology. The new tile architecture design also boosts the cache beyond 100 MB along with Optane Persistent Memory 300 series support.
Intel has also detailed its Sapphire Rapids-SP Xeon CPUs with HBM memory. From what Intel has shown, their Xeon CPUs will house up to four HBM packages, all offering significantly higher DRAM bandwidth versus a baseline Sapphire Rapids-SP Xeon CPU with 8-channel DDR5 memory. This is going to allow Intel to offer a chip with both increased capacity and bandwidth for customers that demand it. The HBM SKUs can be used in two modes, an HBM Flat mode & an HBM caching mode.
Intel also showed a demo of their Sapphire Rapids-SP Xeon CPUs running an internal GEMM Kernel with and without AMX instructions. The AMX enabled solution delivered a 7.8x improvement over the non-AMX solution. This demo was also from early silicon so final performance may further improve. Intel didn't disclose any additional details regarding the test platform.
Intel Sapphire Rapids-SP Xeon CPU Platform
The Sapphire Rapids lineup will make use of 8 channel DDR5 memory with speeds of up to 4800 Mbps & support PCIe Gen 5.0 on the Eagle Stream platform. The Eagle Stream platform will also introduce the LGA 4677 socket which will be replacing the LGA 4189 socket for Intel's upcoming Cedar Island & Whitley platform which would house Cooper Lake-SP and Ice Lake-SP processors, respectively. The Intel Sapphire Rapids-SP Xeon CPUs will also come with CXL 1.1 interconnect that will mark a huge milestone for the blue team in the server segment.
Coming to the configurations, the top part is started to feature 56 cores with a TDP of 350W. What is interesting about this configuration is that it is listed as a low-bin split variant which means that it will be using a tile or MCM design. The Sapphire Rapids-SP Xeon CPU will be composed of a 4-tile layout with each tile featuring 14 cores each.
Following are the leaked configurations:
- Sapphire Rapids-SP 24 Core / 48 Thread / 45.0 MB / 225W
- Sapphire Rapids-SP 28 Core / 56 Thread / 52.5 MB / 250W
- Sapphire Rapids-SP 40 Core / 48 Thread / 75.0 MB / 300W
- Sapphire Rapids-SP 44 Core / 88 Thread / 82.5 MB / 270W
- Sapphire Rapids-SP 48 Core / 96 Thread / 90.0 MB / 350W
- Sapphire Rapids-SP 56 Core / 112 Thread / 105 MB / 350W
It looks like AMD will still hold the upper hand in the number of cores & threads offered per CPU with their Genoa chips pushing for up to 96 cores whereas Intel Xeon chips would max out at 56 cores if they don't plan on making SKUs with a higher number of tiles. Intel will have a wider and more expandable platform that can support up to 8 CPUs at once so unless Genoa offers more than 2P (dual-socket) configurations, Intel will have the lead in the most number of cores per rack with an 8S rack packing up to 448 cores and 896 threads.
The Intel Saphhire Rapids CPUs will contain 4 HBM2 stacks with a maximum memory of 64 GB (16GB each). The presence of memory so near to the die would do absolute wonders for certain workloads that require huge data sets and will basically act as an L4 cache.
AMD has been taking away quite a few wins from Intel as seen in the recent Top500 charts from ISC '21. Intel would really have to up their game in the next couple of years to fight back the AMD EPYC threat. Intel is expected to launch Sapphire Rapids-SP in 2022 followed by HBM variants that are expected to launch around 2023.
Intel Xeon SP Families (Preliminary):
|Family Branding||Skylake-SP||Cascade Lake-SP/AP||Cooper Lake-SP||Ice Lake-SP||Sapphire Rapids||Emerald Rapids||Granite Rapids||Diamond Rapids|
|Process Node||14nm+||14nm++||14nm++||10nm+||Intel 7||Intel 7||Intel 3||Intel 3?|
|Platform Name||Intel Purley||Intel Purley||Intel Cedar Island||Intel Whitley||Intel Eagle Stream||Intel Eagle Stream||Intel Mountain Stream|
Intel Birch Stream
|Intel Mountain Stream
Intel Birch Stream
|Core Architecture||Skylake||Cascade Lake||Cascade Lake||Sunny Cove||Golden Cove||Raptor Cove||Redwood Cove?||Lion Cove?|
|IPC Improvement (Vs Prev Gen)||10%||0%||0%||20%||19%||8%?||35%?||39%?|
|MCP (Multi-Chip Package) SKUs||No||Yes||No||No||Yes||Yes||TBD (Possibly Yes)||TBD (Possibly Yes)|
|Socket||LGA 3647||LGA 3647||LGA 4189||LGA 4189||LGA 4677||LGA 4677||TBD||TBD|
|Max Core Count||Up To 28||Up To 28||Up To 28||Up To 40||Up To 56||Up To 64?||Up To 120?||Up To 144?|
|Max Thread Count||Up To 56||Up To 56||Up To 56||Up To 80||Up To 112||Up To 128?||Up To 240?||Up To 288?|
|Max L3 Cache||38.5 MB L3||38.5 MB L3||38.5 MB L3||60 MB L3||105 MB L3||120 MB L3?||240 MB L3?||288 MB L3?|
|Memory Support||DDR4-2666 6-Channel||DDR4-2933 6-Channel||Up To 6-Channel DDR4-3200||Up To 8-Channel DDR4-3200||Up To 8-Channel DDR5-4800||Up To 8-Channel DDR5-5600?||Up To 12-Channel DDR5-6400?||Up To 12-Channel DDR6-7200?|
|PCIe Gen Support||PCIe 3.0 (48 Lanes)||PCIe 3.0 (48 Lanes)||PCIe 3.0 (48 Lanes)||PCIe 4.0 (64 Lanes)||PCIe 5.0 (80 lanes)||PCIe 5.0 (80 Lanes)||PCIe 6.0 (128 Lanes)?||PCIe 6.0 (128 Lanes)?|
|TDP Range (PL1)||140W-205W||165W-205W||150W-250W||105-270W||Up To 350W||Up To 375W?||Up To 400W?||Up To 425W?|
|3D Xpoint Optane DIMM||N/A||Apache Pass||Barlow Pass||Barlow Pass||Crow Pass||Crow Pass?||Donahue Pass?||Donahue Pass?|
|Competition||AMD EPYC Naples 14nm||AMD EPYC Rome 7nm||AMD EPYC Rome 7nm||AMD EPYC Milan 7nm+||AMD EPYC Genoa ~5nm||AMD EPYC Bergamo||AMD EPYC Turin||AMD EPYC Venice|