Intel Sapphire Rapid-SP Xeon CPUs To Feature Up To 64 GB HBM2e Memory, Also Talks Next-Gen Xeon & Data Center GPUs For 2023+

At SC21 (Supercomputing 2021), Intel hosted a brief session where they discussed their next-generation data center roadmap and talked about their upcoming Ponte Vecchio GPUs & the Sapphire Rapids-SP Xeon CPUs.

Intel Talks Sapphire Rapids-SP Xeon CPUs & Ponte Vecchio GPUs at SC21 - Also Reveals Next-Gen Data Center Lineup For 2023+

Intel had already discussed most of the technical details regarding its next-gen data center CPU & GPU lineup at Hot Chips 33. They are reaffirming what they've said and also revealing a few more tidbits at SuperComputing 21.

Related StoryHassan Mujtaba
Intel Offering Arc Scavanger Hunt Winners Core i7-12700K & Core i5-12600K CPUs As An Alternative To Arc A770 & A750 Prizes

The current generation of Intel Xeon Scalable processors has been extensively adopted by our HPC ecosystem partners, and we are adding new capabilities with Sapphire Rapids – our next-generation Xeon Scalable processor that is currently sampling with customers. This next-generation platform delivers multi-capabilities for the HPC ecosystem, bringing for the first time in-package high bandwidth memory with HBM2e that leverages the Sapphire Rapids multi-tile architecture. Sapphire Rapids also brings enhanced performance, new accelerators, PCIe Gen 5 and other exciting capabilities optimized for AI, data analytics and HPC workloads.

HPC workloads are evolving rapidly. They are becoming more diverse and specialized, requiring a mix of heterogeneous architectures. While the x86 architecture continues to be the workhorse for scalar workloads, if we are to deliver orders-of magnitude performance gains and move beyond the exascale era, we must critically look at how HPC workloads are run within vector, matrix and spatial architectures, and we must ensure these architectures seamlessly work together.Intel has adopted an “entire workload” strategy, where workload-specific accelerators and graphics processing units (GPU) can seamlessly work with central processing units (CPU) from both hardware and software perspectives.

We are deploying this strategy with our next-generation Intel Xeon Scalable processors and Intel Xe HPC GPUs (code-named “Ponte Vecchio”) that will power the 2 exaflop Aurora supercomputer at Argonne National Laboratory. Ponte Vecchio has the highest compute density per socket and per nodes, packing 47 tiles with our advanced packaging technologies: EMIB and Foveros. There are over 100 HPC applications running on Ponte Vecchio. We are also working with partners and customers including – ATOS, Dell, HPE, Lenovo, Inspur, Quanta and Supermicro – to deploy Ponte Vecchio in their latest supercomputers.

via Intel

Intel Sapphire Rapids-SP Xeon Data Center CPUs

According to Intel, the Sapphire Rapids-SP will come in two package variants, a standard, and an HBM configuration. The standard variant will feature a chiplet design composed of four XCC dies that will feature a die size of around 400mm2. This is the die size for a singular XCC die and there will be four in total on the top Sapphire Rapids-SP Xeon chip. Each die will be interconnected via EMIB which has a pitch size of 55u and a core pitch of 100u.

The standard Sapphire Rapids-SP Xeon chip will feature 10 EMIB interconnects and the entire package will measure at a mighty 4446mm2. Moving over to the HBM variant, we are getting an increased number of interconnects which sit at 14 and are needed to interconnect the HBM2E memory to the cores.

The four HBM2E memory packages will feature 8-Hi stacks so Intel is going for at least 16 GB of HBM2E memory per stack for a total of 64 GB across the Sapphire Rapids-SP package. Talking about the package, the HBM variant will measure at an insane 5700mm2 or 28% larger than the standard variant. Compared to the recently leaked EPYC Genoa numbers, the HBM2E package for Sapphire Rapids-SP would end up 5% larger while the standard package will be 22% smaller.

Related StoryJason R. Wilson
AMD & NVIDIA GPU Sales Continue To Decline, Reports Estimates Shipments Drop By 50%
  • Intel Sapphire Rapids-SP Xeon (Standard Package) - 4446mm2
  • Intel Sapphire Rapids-SP Xeon (HBM2E Package) - 5700mm2
  • AMD EPYC Genoa (12 CCD Package) - 5428mm2

Intel also states that the EMIB link provides twice the bandwidth density improvement and 4 times better power efficiency compared to standard package designs. Interestingly, Intel calls the latest Xeon lineup Logically monolithic which means that they are referring to the interconnect that'll offer the same functionality as a single-die would but technically, there are four chiplets that will be interconnected together. You can read the full details regarding the standard 56 core & 112 thread Sapphire Rapids-SP Xeon CPUs here.

Intel Xeon SP Families (Preliminary):

Family BrandingSkylake-SPCascade Lake-SP/APCooper Lake-SPIce Lake-SPSapphire RapidsEmerald RapidsGranite RapidsDiamond Rapids
Process Node14nm+14nm++14nm++10nm+Intel 7Intel 7Intel 3Intel 3?
Platform NameIntel PurleyIntel PurleyIntel Cedar IslandIntel WhitleyIntel Eagle StreamIntel Eagle StreamIntel Mountain Stream
Intel Birch Stream
Intel Mountain Stream
Intel Birch Stream
Core ArchitectureSkylakeCascade LakeCascade LakeSunny CoveGolden CoveRaptor CoveRedwood Cove?Lion Cove?
IPC Improvement (Vs Prev Gen)10%0%0%20%19%8%?35%?39%?
MCP (Multi-Chip Package) SKUsNoYesNoNoYesYesTBD (Possibly Yes)TBD (Possibly Yes)
SocketLGA 3647LGA 3647LGA 4189LGA 4189LGA 4677LGA 4677TBDTBD
Max Core CountUp To 28Up To 28Up To 28Up To 40Up To 56Up To 64?Up To 120?Up To 144?
Max Thread CountUp To 56Up To 56Up To 56Up To 80Up To 112Up To 128?Up To 240?Up To 288?
Max L3 Cache38.5 MB L338.5 MB L338.5 MB L360 MB L3105 MB L3120 MB L3?240 MB L3?288 MB L3?
Vector EnginesAVX-512/FMA2AVX-512/FMA2AVX-512/FMA2AVX-512/FMA2AVX-512/FMA2AVX-512/FMA2AVX-1024/FMA3?AVX-1024/FMA3?
Memory SupportDDR4-2666 6-ChannelDDR4-2933 6-ChannelUp To 6-Channel DDR4-3200Up To 8-Channel DDR4-3200Up To 8-Channel DDR5-4800Up To 8-Channel DDR5-5600?Up To 12-Channel DDR5-6400?Up To 12-Channel DDR6-7200?
PCIe Gen SupportPCIe 3.0 (48 Lanes)PCIe 3.0 (48 Lanes)PCIe 3.0 (48 Lanes)PCIe 4.0 (64 Lanes)PCIe 5.0 (80 lanes)PCIe 5.0 (80 Lanes)PCIe 6.0 (128 Lanes)?PCIe 6.0 (128 Lanes)?
TDP Range (PL1)140W-205W165W-205W150W-250W105-270WUp To 350WUp To 375W?Up To 400W?Up To 425W?
3D Xpoint Optane DIMMN/AApache PassBarlow PassBarlow PassCrow PassCrow Pass?Donahue Pass?Donahue Pass?
CompetitionAMD EPYC Naples 14nmAMD EPYC Rome 7nmAMD EPYC Rome 7nmAMD EPYC Milan 7nm+AMD EPYC Genoa ~5nmAMD EPYC BergamoAMD EPYC TurinAMD EPYC Venice

Intel Ponte Vecchio Data Center GPUs

Moving over to Ponte Vecchio, Intel outlined some key features of its flagship data center GPU such as 128 Xe cores, 128 RT units, HBM2e memory, and a total of 8 Xe-HPC GPUs that will be connected together. The chip will feature up to 408 MB of L2 cache in two separate stacks that will connect via the EMIB interconnect. The chip will feature multiple dies based on Intel's own 'Intel 7' process and TSMC's N7 / N5 process nodes.

Intel also previously detailed the package and die size of its flagship Ponte Vecchio GPU based on the Xe-HPC architecture. The chip will consist of 2 tiles with 16 active dies per stack. The maximum active top die size is going to be 41mm2 while the base die size which is also referred to as the 'Compute Tile' sits at 650mm2.

The Ponte Vecchio GPU makes use of 8 HBM 8-Hi stacks and contains a total of 11 EMIB interconnects. The whole Intel Ponte Vecchio package would measure 4843.75mm2. It is also mentioned that the bump pitch for Meteor Lake CPUs using High-Density 3D Forveros packaging will be 36u.

Aside from these, Intel also posted a roadmap in which they confirm that the next-generation Xeon Sapphire Rapids-SP family and the Ponte Vecchio GPUs will be available in 2022 but there's also the next-generation product lineup which is planned for 2023 and beyond. Intel hasn't explicitly told what it plans to bring but we know that Sapphire Rapids successor will be known as Emerald and Granite Rapids and the successor to that will be known as Diamond Rapids.

For the GPU side, we don't know what the successor to Ponte Vecchio will be known but expect it to be competing with NVIDIA's and AMD's next-generation GPUs for the data center market.


Moving forward, Intel has several next-generation solutions for advanced packaging designs such as Forveros Omni and Forveros Direct as they enter the Angstrom Era of transistor development.

Next-Gen Data Center GPU Accelerators

GPU NameAMD Instinct MI250XNVIDIA Hopper GH100Intel Ponte Vecchio
Packaging DesignMCM (Infinity Fabric)MonolithicMCM (EMIB + Foveros)
GPU ArchitectureAldebaran (CDNA 2)Hopper GH100Xe-HPC
GPU Process Node6nm4N7nm (Intel 4)
GPU Cores14,08016,89616,384 ALUs
(128 Xe Cores)
GPU Clock Speed1700 MHz~1780 MHzTBA
L2 / L3 Cache2 x 8 MB50 MB2 x 204 MB
FP16 Compute383 TOPs2000 TFLOPsTBA
FP32 Compute95.7 TFLOPs1000 TFLOPs~45 TFLOPs (A0 Silicon)
FP64 Compute47.9 TFLOPs60 TFLOPsTBA
Memory Capacity128 GB HBM2E80 GB HBM3128 GB HBM2e
Memory Clock3.2 Gbps3.2 GbpsTBA
Memory Bus8192-bit5120-bit8192-bit
Memory Bandwidth3.2 TB/s3.0 TB/s~3 TB/s
CoolingPassive Cooling
Liquid Cooling
Passive Cooling
Liquid Cooling
Passive Cooling
Liquid Cooling
LaunchQ4 20212H 20222022?
WccfTech Tv
Filter videos by