AMD's 3D V-Cache CPUs deliver a large boost over their non-X3D counterparts in AI benchmarks, showing why they are well suited to RAG pipelines.
AMD 3D V-Cache vs. Non-3D V-Cache CPU Benchmarks in AI Showcase a Massive Uplift for RAG Pipelines
Broadly, there are two common ways to run generative AI today. The first is a standalone LLM, which is currently the most popular approach. LLMs are AI models that have been pre-trained on a large set of data and come in various parameter sizes. Their limitations show, however, when they need to generate responses about data they weren't trained on.
That's where RAG (Retrieval-Augmented Generation) comes into play. A RAG pipeline retrieves relevant information from an external database and passes it to the model along with the query. This yields a more detailed, better-grounded answer, but can be slightly slower than using LLMs (Large Language Models) on their own.
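The retrieval step described above can be sketched in a few lines of Python. This is a toy illustration only: the store contents, the 3-dimensional embeddings, and the helper names (`retrieve`, `build_prompt`) are all hypothetical, and a real pipeline would use an embedding model and a vector database rather than a sorted list.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, store, k=2):
    """Return the k texts whose embeddings are most similar to the query."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

def build_prompt(question, passages):
    """Prepend the retrieved passages to the question before it goes to the LLM."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {question}"

# Hypothetical store of (text, embedding) pairs with made-up embeddings.
STORE = [
    ("3D V-Cache adds extra L3 cache to the CPU.", [0.9, 0.1, 0.0]),
    ("HNSW is a graph-based vector search algorithm.", [0.1, 0.9, 0.1]),
    ("TTFT measures time to first token.", [0.0, 0.2, 0.9]),
]

# The query embedding is also made up; it points mostly along the second axis.
prompt = build_prompt("What is HNSW?", retrieve([0.0, 1.0, 0.1], STORE, k=1))
print(prompt)
```

The similarity ranking here is the part that lands on the CPU in the setups discussed in this article, while the prompt that comes out of it is what the GPU-hosted LLM ultimately sees.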
RAG relies heavily on vector database searches. Although GPUs are the primary component for AI processing due to their highly parallel nature, a large portion of vector search work is performed on the CPU. The more requests there are, the more likely the CPU is to become the bottleneck in the system.
As Agentic AI workloads gain momentum, we will continue to see CPU processing becoming just as important as GPU compute. Better CPUs will be required to address the latency bottlenecks as workflows become more search-driven.
CPUs with larger caches are particularly useful in such scenarios. The HNSW (Hierarchical Navigable Small World) search algorithm is one example of a workload that runs on the CPU while the GPU handles LLM inference. A larger cache can keep more of the HNSW graph and its vectors close to the cores, reducing the time needed to traverse the graph. This leads to improved AI performance.
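To see why a graph search like this is sensitive to cache, here is a minimal sketch of a best-first search over a single graph layer, in the spirit of HNSW's per-layer search. It is a simplified illustration, not the benchmark's code or any library's implementation: real HNSW uses several layers and tuned data layouts, and the names `search_layer`, `VECTORS`, and `NEIGHBORS` are hypothetical.

```python
import heapq
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def search_layer(query, entry, vectors, neighbors, ef=3):
    """Best-first search over one graph layer, keeping the ef closest nodes found."""
    visited = {entry}
    d0 = dist(query, vectors[entry])
    candidates = [(d0, entry)]   # min-heap: closest unexplored node first
    best = [(-d0, entry)]        # max-heap via negation: worst kept result on top
    while candidates:
        d, node = heapq.heappop(candidates)
        if d > -best[0][0]:
            break                # the closest candidate is worse than every kept result
        for nb in neighbors[node]:
            # Each hop reads a neighbour's vector from a different memory location.
            if nb in visited:
                continue
            visited.add(nb)
            dn = dist(query, vectors[nb])
            if len(best) < ef or dn < -best[0][0]:
                heapq.heappush(candidates, (dn, nb))
                heapq.heappush(best, (-dn, nb))
                if len(best) > ef:
                    heapq.heappop(best)   # drop the current worst result
    return sorted((-d, n) for d, n in best)

# Toy graph: five 1-D points on a line, each linked to its immediate neighbours.
VECTORS = {0: [0.0], 1: [1.0], 2: [2.0], 3: [3.0], 4: [4.0]}
NEIGHBORS = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

result = search_layer([3.2], entry=0, vectors=VECTORS, neighbors=NEIGHBORS, ef=2)
print([node for _, node in result])
```

The inner loop is the point of interest: every step follows a link to a node whose vector may live anywhere in memory, so the search is dominated by scattered, dependent reads rather than by arithmetic. That access pattern is the kind that tends to benefit when more of the graph fits in cache.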
To test this theory, GiggleHD ran the X3D RAG Benchmark on a range of CPUs, including AMD's latest Ryzen 9000X3D lineup. The results are clear from the start.
X3D RAG Benchmark: An open-source benchmark for measuring how CPU cache and architecture affect graph-based vector search and related stages in local/on-prem RAG pipelines. Designed for x86 CPUs (tested on AMD and Intel systems).
This benchmark targets personal-PC and small-team, single-node setups (roughly 100K–200K vectors). It is not intended to represent large-scale, distributed vector database services.
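A rough back-of-the-envelope calculation shows why index sizes in this range are interesting for cache. The sketch below estimates the memory footprint of an index and compares it with an L3 cache size. Every number other than the 100K and 200K vector counts is an assumption for illustration: the embedding dimension (384), float32 storage, 32 graph links per node, and the example L3 sizes are not taken from the benchmark.

```python
def index_footprint_mb(num_vectors, dim, bytes_per_value=4,
                       links_per_node=32, bytes_per_link=4):
    """Rough size of the vectors plus graph links, in megabytes (10**6 bytes)."""
    vector_bytes = num_vectors * dim * bytes_per_value
    link_bytes = num_vectors * links_per_node * bytes_per_link
    return (vector_bytes + link_bytes) / 1e6

def cache_resident_fraction(footprint_mb, l3_mb):
    """Upper bound on the share of the index that could sit in L3 at once."""
    return min(1.0, l3_mb / footprint_mb)

# Assumed values: 384-dimensional float32 embeddings, illustrative L3 sizes in MB.
for n in (100_000, 200_000):
    size = index_footprint_mb(n, dim=384)
    for l3 in (32, 96):
        share = cache_resident_fraction(size, l3)
        print(f"{n} vectors: {size:.1f} MB index, {l3} MB L3 holds at most {share:.0%}")
```

This is an optimistic bound, since the cache is shared with everything else the process touches, but it illustrates the general idea: at this scale, a larger L3 can hold a meaningfully bigger slice of the working set.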
In the 100K Batch Search test, the AMD Ryzen 3D V-Cache CPUs were up to 88% faster than the non-3D V-Cache chips. In the 200K Batch Search test, the Ryzen 7 9850X3D offered a 50%+ boost over the Ryzen 7 9700X. Both of these are 8-core CPUs. The 8-core 3D V-Cache CPU was also much faster than the 16-core Ryzen 9 9950X.
The 100K Index Build test saw build times cut by 50%, and the 200K test saw them cut by 39%. Throughput was also higher on the 3D V-Cache chips. Lastly, in the Concurrent RAG Throughput tests, the 8-core Ryzen 3D V-Cache CPUs performed well, but in TTFT Throughput, the differences between all CPUs were slimmer because this task relies heavily on the GPU rather than the CPU.
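The figures above mix two kinds of percentages: "X% faster" and "time cut by Y%". The small sketch below converts between them, assuming that "faster" refers to throughput (work per unit of time) and that "cut" refers to elapsed time. The function names are hypothetical, and the inputs are the percentages quoted in this article, not raw benchmark data.

```python
def percent_faster(baseline_rate, new_rate):
    """Percentage increase in throughput relative to the baseline."""
    return (new_rate / baseline_rate - 1.0) * 100.0

def time_cut_to_speedup(percent_time_saved):
    """Speedup factor implied by cutting elapsed time by the given percentage."""
    remaining = 1.0 - percent_time_saved / 100.0
    return 1.0 / remaining

def percent_faster_to_time_cut(percent):
    """Percentage of time saved per unit of work when throughput rises by percent."""
    return (1.0 - 1.0 / (1.0 + percent / 100.0)) * 100.0

print(f"50% less build time -> {time_cut_to_speedup(50):.2f}x speedup")
print(f"39% less build time -> {time_cut_to_speedup(39):.2f}x speedup")
print(f"88% more throughput -> {percent_faster_to_time_cut(88):.1f}% less time per query")
```

Under these assumptions, a 50% cut in build time is a 2x speedup, while an 88% gain in search throughput works out to roughly 47% less time per query, so the two headline numbers describe improvements of similar size.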
Overall, this is an interesting showcase for chips with larger cache configurations, especially AMD's 3D V-Cache lineup, which not only offers strong gaming performance but also makes a capable CPU for RAG workloads. The main highlights are the Vector Search, Index Building, and Concurrent Processing results of these chips.
AMD is also going to launch the Ryzen 9 9950X3D2 CPU in a few days, featuring two 3D V-Cache dies. We can expect some strong numbers from that chip too, since it offers the highest cache capacity of any Ryzen desktop processor to date.
