Hardware Report

NVIDIA & IBM Working To Connect GPUs Directly to SSDs For Major Performance Boost Instead of Relying on CPUs

Jason R. Wilson • Mar 17, 2022 at 12:58am EDT


NVIDIA, IBM, and researchers from several universities have created an architecture that gives GPU-accelerated applications fast, "fine-grain access" to large amounts of data storage. The technology is aimed at areas such as artificial intelligence, analytics, and machine-learning training.

NVIDIA, IBM, and university researchers aim to boost GPU performance by connecting GPUs directly to SSDs instead of relying on the CPU

Big accelerator Memory, or BaM, is an effort to reduce the dependence of NVIDIA GPUs and comparable hardware accelerators on a general-purpose CPU for tasks such as accessing storage, with the aim of improving both performance and capacity.

The goal of BaM is to extend GPU memory capacity and enhance the effective storage access bandwidth while providing high-level abstractions for the GPU threads to easily make on-demand, fine-grain access to massive data structures in the extended memory hierarchy.

— BaM design paper written by the researchers

NVIDIA is the most prominent member of the BaM team, bringing its extensive resources to projects such as moving tasks traditionally handled by the CPU onto GPU cores. Instead of depending on virtual address translation, page-fault-based on-demand data loading, and other standard CPU-based mechanisms for managing large amounts of data, BaM provides a software and hardware architecture that lets NVIDIA GPUs fetch data directly from memory and storage and process it without relying on CPU cores.

BaM has two prominent features. The first is a software-managed cache in GPU memory. Moving data between storage and the graphics card is handled by threads running on the GPU's cores, using RDMA, PCI Express interfaces, and custom Linux kernel drivers that allow SSDs to read from and write to GPU memory when required. The second is a software library that lets GPU threads request data directly from NVMe SSDs by communicating with the drives themselves. GPU threads prepare drive commands only when the requested data is not already in the software-managed cache.

Algorithms running heavy workloads on the graphics processor can then access the data they need efficiently and, most importantly, in a way that is optimized for their specific data access patterns.
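The cache-first behavior described above can be illustrated with a small host-side Python sketch. This is not BaM's code, which runs in GPU threads: `FakeSSD`, `SoftwareCache`, the 512-byte block size, and the LRU eviction policy are all assumptions made for illustration. The point it shows is that a storage command is issued only when a requested block is missing from the software-managed cache.

```python
from collections import OrderedDict


class FakeSSD:
    """Hypothetical stand-in for an NVMe drive that counts read commands."""

    def __init__(self, block_size=512):
        self.block_size = block_size
        self.reads_issued = 0

    def read_block(self, block_id):
        self.reads_issued += 1
        # Deterministic fake payload so results are easy to check.
        return bytes([block_id % 256]) * self.block_size


class SoftwareCache:
    """Block cache managed in software (LRU eviction is an assumption)."""

    def __init__(self, ssd, capacity_blocks):
        self.ssd = ssd
        self.capacity = capacity_blocks
        self.lines = OrderedDict()  # block_id -> bytes, oldest entry first
        self.hits = 0
        self.misses = 0

    def read(self, byte_offset, length):
        """Return `length` bytes starting at `byte_offset` of the backing store."""
        out = bytearray()
        end = byte_offset + length
        while byte_offset < end:
            block_id, within = divmod(byte_offset, self.ssd.block_size)
            block = self._get_block(block_id)
            take = min(end - byte_offset, self.ssd.block_size - within)
            out += block[within:within + take]
            byte_offset += take
        return bytes(out)

    def _get_block(self, block_id):
        if block_id in self.lines:
            # Hit: served from the cache, no storage command is prepared.
            self.hits += 1
            self.lines.move_to_end(block_id)
            return self.lines[block_id]
        # Miss: only now is a read command sent to the drive.
        self.misses += 1
        block = self.ssd.read_block(block_id)
        self.lines[block_id] = block
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)  # evict the least recently used block
        return block


ssd = FakeSSD(block_size=512)
cache = SoftwareCache(ssd, capacity_blocks=2)
cache.read(0, 8)     # block 0 is not cached yet: one read command
cache.read(16, 8)    # still inside block 0: served from the cache
cache.read(510, 4)   # spans blocks 0 and 1: one hit, one miss
print(ssd.reads_issued, cache.hits, cache.misses)
```

In this run the three fine-grain reads touch four block lookups, two of which hit the cache, so the fake drive receives only two read commands.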

From L-R: Comparison of the traditional CPU-centric approach to accessing storage, the GPU-directed BaM approach, and how the GPU would physically be connected to the SSDs. Image source: Qureshi et al. via The Register

"A CPU-centric strategy causes excessive CPU-GPU synchronization overhead and/or I/O traffic amplification, diminishing the effective storage bandwidth for emerging applications with fine-grain data-dependent access patterns like graph and data analytics, recommender systems, and graph neural networks," the researchers stated in their paper this month.

"BaM provides a user-level library of highly concurrent NVMe submission/completion queues in GPU memory that enables GPU threads whose on-demand accesses miss from the software cache to make storage accesses in a high-throughput manner," they continued. "This user-level approach incurs little software overhead for each storage access and supports a high-degree of thread-level parallelism."
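As a rough illustration of the submission/completion queue idea in that quote, the Python sketch below spreads cache misses across several queue pairs so that many requests can be outstanding at once. It is a single-threaded simplification under stated assumptions: `QueuePair`, `submit_misses`, the queue depth, and the round-robin placement are hypothetical names and choices, whereas the real queues live in GPU memory and are driven concurrently by GPU threads.

```python
from collections import deque


class QueuePair:
    """One submission/completion queue pair with a fixed depth (simplified)."""

    def __init__(self, depth):
        self.depth = depth
        self.sq = deque()  # submission queue: (command id, block id)
        self.cq = deque()  # completion queue: (command id, data)
        self.next_cid = 0

    def submit(self, block_id):
        """Enqueue a read command; return its id, or None if the queue is full."""
        if len(self.sq) >= self.depth:
            return None
        cid = self.next_cid
        self.next_cid += 1
        self.sq.append((cid, block_id))
        return cid

    def device_process(self, read_fn):
        """Stand-in for the SSD: drain submissions and post completions."""
        while self.sq:
            cid, block_id = self.sq.popleft()
            self.cq.append((cid, read_fn(block_id)))

    def poll(self):
        """Collect all posted completions as {command id: data}."""
        done = dict(self.cq)
        self.cq.clear()
        return done


def submit_misses(queues, missed_blocks):
    """Place each missed block on a queue pair in round-robin order."""
    tickets = []
    for i, block_id in enumerate(missed_blocks):
        qp_index = i % len(queues)
        cid = queues[qp_index].submit(block_id)
        tickets.append((qp_index, cid, block_id))
    return tickets


queues = [QueuePair(depth=4) for _ in range(2)]
tickets = submit_misses(queues, [7, 3, 9, 12])
for qp in queues:
    # Fake device: the "data" is just the block's byte offset on the drive.
    qp.device_process(lambda block_id: block_id * 512)
results = [qp.poll() for qp in queues]
print(tickets)
print(results)
```

Each ticket records which queue pair and command id a miss was assigned to, so the requester can later match its completion. Using several independent queue pairs is what lets many requests proceed without waiting on one another.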

Researchers from the three groups tested a prototype Linux-based system that pairs BaM with standard GPUs and NVMe SSDs to demonstrate the design as a viable alternative to the current approach, in which the CPU directs all storage access. The researchers explain that storage accesses can be carried out in parallel, that synchronization limitations are removed, and that I/O bandwidth is used much more efficiently to boost application performance.

With the software cache, BaM does not rely on virtual memory address translation and thus does not suffer from serialization events like TLB misses.

— NVIDIA's chief scientist Bill Dally, who previously led Stanford's computer science department, and the paper's other authors

The details of the BaM design, covering both hardware and software optimizations, will be open-sourced so that other companies can create such designs of their own. A comparable earlier product is AMD's Radeon Solid State Graphics card, which placed flash storage next to the graphics processor.

News Source: The Register

About the author: Jason R. Wilson is a member of the Hardware news team at Wccftech. Equipped with a background in graphic design and writing, Jason works daily to improve his craft and continues to create new and innovative ideas every day.
