Panmnesia Unveils “CXL Protocol”, Allowing AI GPUs To Utilize Memory From DRAM Or SSDs With Minimal Latency

Jul 3, 2024 at 02:50pm EDT

Panmnesia, a KAIST startup, has unveiled a cutting-edge IP that lets AI GPUs add external memory using the CXL protocol over PCIe, lifting the capacity ceiling imposed by onboard memory.

Panmnesia's Newest CXL Solution Aims To Resolve HBM Limitations By Providing An Effective Infrastructure

Current AI accelerators are confined to their onboard memory, since manufacturers can only fit a limited amount of HBM on each one. As datasets grow and the demand for compute power rises, the industry has responded by deploying ever more AI GPUs, an approach that isn't sustainable in the long run given the financial and manufacturing resources it consumes. In light of this, Panmnesia, a firm backed by the South Korean institute KAIST, has unveiled a CXL IP that lets GPUs draw on memory from DRAM or even SSDs, expanding beyond the built-in HBM.

For connectivity, CXL runs over PCIe links, which should ease adoption since the interface is already widespread. However, there's a catch. Traditional AI accelerators lack the subsystems needed to connect to CXL memory and use it for expansion directly, and existing workarounds such as UVM (Unified Virtual Memory) are quite slow, which defeats the purpose in the first place.

To address this, Panmnesia has developed its own CXL 3.1-compliant Root Complex chip. It has multiple ports that connect the GPU to external memory over a PCIe bus, while an HDM (Host-Managed Device Memory) decoder acts as the bridge for that connection, managing memory allocation and translation.
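To make the decoder's role concrete, here is a minimal sketch of range-based address decoding, the general idea behind mapping host addresses onto devices behind separate ports. It is an illustrative toy model, not Panmnesia's design: the class names, port indices, base addresses, and capacities are all assumptions, and real HDM decoders also handle details such as interleaving that are omitted here.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class HdmRange:
    """One decoder entry: a window of host physical addresses backed by a device."""
    base: int   # first host physical address in the window (assumed value)
    size: int   # window length in bytes (assumed value)
    port: int   # hypothetical root-port index the window routes to
    label: str  # human-readable endpoint name


class HdmDecoder:
    """Toy decoder: maps a host physical address to (port, offset inside the device)."""

    def __init__(self) -> None:
        self._ranges: List[HdmRange] = []

    def add_range(self, rng: HdmRange) -> None:
        # Reject overlapping windows so every address decodes to at most one target.
        for existing in self._ranges:
            if rng.base < existing.base + existing.size and existing.base < rng.base + rng.size:
                raise ValueError(f"range overlaps {existing.label}")
        self._ranges.append(rng)

    def decode(self, addr: int) -> Optional[Tuple[int, int]]:
        # Return (port, device offset), or None if no window claims the address.
        for rng in self._ranges:
            if rng.base <= addr < rng.base + rng.size:
                return rng.port, addr - rng.base
        return None


GIB = 1 << 30
decoder = HdmDecoder()
# Hypothetical layout: a DRAM expander on port 0 and an SSD-backed expander on port 1.
decoder.add_range(HdmRange(base=0x10_0000_0000, size=64 * GIB, port=0, label="DRAM expander"))
decoder.add_range(HdmRange(base=0x20_0000_0000, size=256 * GIB, port=1, label="SSD expander"))

print(decoder.decode(0x10_0000_1000))  # (0, 4096)
print(decoder.decode(0x20_0000_0040))  # (1, 64)
print(decoder.decode(0x0))             # None
```

In this model, the GPU issues ordinary loads and stores to a host physical address, and the lookup decides which port carries the request and where it lands inside the attached device.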

Interestingly, Panmnesia benchmarked its solution (CXL-Opt) against prototypes developed by Samsung and Meta, which it labels "CXL-Proto." CXL-Opt achieves a significantly lower round-trip latency, which is the time taken for data to travel from the GPU to the memory and back: it showed a two-digit nanosecond latency, while CXL-Proto came in at 250ns. Beyond that, CXL-Opt's execution time is far lower than the UVM solution's, as it achieves 3.22 times the IPC (instructions per cycle) of UVM.
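Because only "two-digit nanosecond" is given for CXL-Opt, the latency gap can be bounded but not pinned down. The short calculation below works out that bound; the 10 ns and 99 ns endpoints are simply the limits of "two-digit" and are an assumption, not measured figures.

```python
PROTO_LATENCY_NS = 250.0              # round-trip latency reported for CXL-Proto
OPT_LATENCY_BOUNDS_NS = (10.0, 99.0)  # assumed limits of "two-digit nanosecond"


def latency_ratio(baseline_ns: float, improved_ns: float) -> float:
    """How many times lower the improved latency is than the baseline."""
    return baseline_ns / improved_ns


fastest_ns, slowest_ns = OPT_LATENCY_BOUNDS_NS
best_case = latency_ratio(PROTO_LATENCY_NS, fastest_ns)   # CXL-Opt at 10 ns
worst_case = latency_ratio(PROTO_LATENCY_NS, slowest_ns)  # CXL-Opt at 99 ns

print(f"{worst_case:.1f}x to {best_case:.1f}x lower latency")  # 2.5x to 25.0x lower latency
```

So the reported figures imply a round-trip latency somewhere between roughly 2.5 and 25 times lower than CXL-Proto, depending on where in the two-digit range CXL-Opt actually falls.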

Panmnesia's solution could make significant strides in the market, as it offers a middle ground between stacking ever more HBM and moving toward a more efficient memory setup. Given that the company is among the first with such a CXL IP, Panmnesia stands to benefit significantly if it gains traction.

News Source: Panmnesia

About the author: Muhammad Zuhair is a hardware and technology reporter for Wccftech, specializing in the semiconductor industry and the complex interplay between technology, manufacturing, and geopolitics. His coverage focuses on the corporate strategies and technological roadmaps of industry giants like TSMC, NVIDIA, Samsung, and Intel. Zuhair's expertise lies in deconstructing complex topics such as fabrication nodes (e.g., 2nm process), the economic impact of policies like the CHIPS Act, and the strategic development of AI infrastructure from NVIDIA, AMD and Intel.
