QNAP Pairs a 6-Year-Old Zen 2 EPYC With NVIDIA’s 96GB RTX PRO 6000 Blackwell in Its New Edge AI NAS

May 2, 2026 at 02:00pm EDT
Two NVIDIA GPUs are placed on a QNAP storage unit, with a monitor displaying a desert scene in the background.

QNAP has introduced its new AI NAS, which packs a 16-Core AMD EPYC "Zen 2" CPU & can be paired with up to an RTX PRO 6000 Blackwell GPU.

Old Meets New In QNAP's "QAI-h1290FX" AI NAS: 16 Zen 2 Cores & 96 GB RTX PRO 6000 GPU

The "QAI-h1290FX" is QNAP's latest Edge AI offering that combines two distinct hardware components. But before we get into the details, it should be mentioned that QNAP's latest AI NAS is designed for LLM, RAG, and various GenAI applications.

Related Story Computex 2026 Will Be NVIDIA’s Biggest Event Of The Year. Here’s What To Expect

Two components power the server, the first is the AMD EPYC 7302P CPU, which features 16 cores and 32 threads. This chip is based on the Zen 2 core architecture and offers enough performance to handle AI inference tasks at edge.

The second component is the GPU, which QNAP offers two options: either the 32 GB RTX PRO 4500 Blackwell or NVIDIA's flagship 96 GB RTX PRO 6000 Blackwell. Both offer absolutely behemoth compute capabilities for AI, with the PRO 4500 aimed for up to ~30B LLMs, while the PRO 6000 is ideal for 70B+ AI LLMs.

Besides the CPU/GPU, the QNAP QAI-h1290FX features support for 12 U.2 NVMe/SATA SSDs, and there is also high-speed networking support in the form of dual 25GbE and dual 2.5GbE LAN ports. The PCIe slots can also carry additional 100GbE AICs, though those are sold separately. The NAS is also compatible with QNAP's JBOD expansion enclosures for large-scale AI data storage.

The main features of the NAS include:

QNAP also shares some real-world performance of its new AI NAS using the NVIDIA RTX PRO 6000 96 GB Blackwell GPU. The tests include various models of different sizes, offering up to 172 Tokens/second. The results can be seen below:

ModeToken/secVRAM Usage
gpt-oss:120b (MXFP4)90 Token/sec~63GB
deepseek-r1:70b (q4_K_M)24 Token/sec~41GB
qwen3:32b (q4_K_M)46 Token/sec~21GB
gemma3:27b (q4_K_M)54 Token/sec~19GB
deepseek-r1:8b (q4_K_M)140 Token/sec~7GB
qwen3:8b (q4_K_M)172 Token/sec~7GB

Besides the LLMs run natively through Ollama, QNAP is sharing the vLLM concurrent inference tests for the same configuration with the results provided below:

Tested Large Language Model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B (Hugging Face)

ThreadTotal Token/secavg Token/Thread/Sec
179 Token/sec79 Token/sec
2166 Token/sec83 Token/sec
5410 Token/sec82 Token/sec
10688 Token/sec68.8 Token/sec
20810 Token/sec40.5 Token/sec
50850 Token/sec17 Token/sec

Tested Large Language Model: openai/gpt-oss-20b (Hugging Face)

ThreadTotal Token/secavg Token/Thread/Sec
1218 Token/sec218 Token/sec
2340 Token/sec170 Token/sec
51045 Token/sec209 Token/sec
10880 Token/sec88 Token/sec
20600 Token/sec30 Token/sec

QNAP offers a wide range of storage, network, and interface expansion cards, which can be bought separately to expand the capabilities of the AI server. RAM is also sold separately with options ranging from 8 GB DDR4-3200 modules up to 64 GB DDR4-3200 kits. The system comes with a 5-year warranty, and is priced at $8999 for the 64 GB, $13,499 for the 128 GB, and $15,999 for the 256 GB variant.

About the author: A Software Engineer by training and a PC enthusiast by passion, Hassan Mujtaba serves as Wccftech's Senior Editor for hardware section. With years of experience in the industry, he specializes in deep-dive technical analysis of next-generation CPU and GPU architectures, motherboards, and cooling solutions. His work involves not only breaking news on upcoming technologies but also extensive hands-on reviews and benchmarking.

Follow Wccftech on Google to get more of our news coverage in your feeds.