QNAP Pairs a 6-Year-Old Zen 2 EPYC With NVIDIA’s 96GB RTX PRO 6000 Blackwell in Its New Edge AI NAS

Hassan Mujtaba • May 2, 2026 at 02:00pm EDT

Two NVIDIA GPUs are placed on a QNAP storage unit, with a monitor displaying a desert scene in the background.

QNAP has introduced its new AI NAS, which packs a 16-Core AMD EPYC "Zen 2" CPU & can be paired with up to an RTX PRO 6000 Blackwell GPU.

Old Meets New In QNAP's "QAI-h1290FX" AI NAS: 16 Zen 2 Cores & 96 GB RTX PRO 6000 GPU

The "QAI-h1290FX" is QNAP's latest Edge AI offering that combines two distinct hardware components. But before we get into the details, it should be mentioned that QNAP's latest AI NAS is designed for LLM, RAG, and various GenAI applications.

Two components power the server, the first is the AMD EPYC 7302P CPU, which features 16 cores and 32 threads. This chip is based on the Zen 2 core architecture and offers enough performance to handle AI inference tasks at edge.

The second component is the GPU, which QNAP offers two options: either the 32 GB RTX PRO 4500 Blackwell or NVIDIA's flagship 96 GB RTX PRO 6000 Blackwell. Both offer absolutely behemoth compute capabilities for AI, with the PRO 4500 aimed for up to ~30B LLMs, while the PRO 6000 is ideal for 70B+ AI LLMs.

Besides the CPU/GPU, the QNAP QAI-h1290FX features support for 12 U.2 NVMe/SATA SSDs, and there is also high-speed networking support in the form of dual 25GbE and dual 2.5GbE LAN ports. The PCIe slots can also carry additional 100GbE AICs, though those are sold separately. The NAS is also compatible with QNAP's JBOD expansion enclosures for large-scale AI data storage.

The main features of the NAS include:

All-Flash Storage Architecture: Twelve U.2 NVMe/SATA SSD slots enable ultra-fast I/O for high-frequency AI model execution and data streaming.
16-core AMD EPYC 7302P Processor: Provides 32 threads of server-class compute power—ideal for AI inference, virtualization, and heavy parallel workloads.
GPU-ready Architecture: Supports optional NVIDIA RTX PRO 6000 Blackwell Max-Q Workstation GPU, featuring up to 96GB of GPU memory and support for CUDA, Tensor, and Transformer Engine acceleration—significantly boosting performance for on-prem LLM inference, image generation, and deep learning workloads.
Containerized AI Environment & GPU Resource Management: Supports Docker and LXD with intuitive GPU allocation. Users can quickly launch AI tools via the built-in AI app center and assign GPU resources without command-line configuration.
Fully Local Deployment with No Cloud Dependency: Run AI-powered chat assistants, document search engines, or knowledge bases fully on-premises. Keep sensitive data in-house while accelerating AI workflows.
High-speed Networking and Scalable Architecture: Comes with dual 25GbE and dual 2.5GbE ports. PCIe slots support optional 100GbE upgrades. Compatible with QNAP JBOD expansion enclosures for large-scale AI data storage.

QNAP also shares some real-world performance of its new AI NAS using the NVIDIA RTX PRO 6000 96 GB Blackwell GPU. The tests include various models of different sizes, offering up to 172 Tokens/second. The results can be seen below:

Mode	Token/sec	VRAM Usage
gpt-oss:120b (MXFP4)	90 Token/sec	~63GB
deepseek-r1:70b (q4_K_M)	24 Token/sec	~41GB
qwen3:32b (q4_K_M)	46 Token/sec	~21GB
gemma3:27b (q4_K_M)	54 Token/sec	~19GB
deepseek-r1:8b (q4_K_M)	140 Token/sec	~7GB
qwen3:8b (q4_K_M)	172 Token/sec	~7GB

Besides the LLMs run natively through Ollama, QNAP is sharing the vLLM concurrent inference tests for the same configuration with the results provided below:

Tested Large Language Model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B (Hugging Face)

Thread	Total Token/sec	avg Token/Thread/Sec
1	79 Token/sec	79 Token/sec
2	166 Token/sec	83 Token/sec
5	410 Token/sec	82 Token/sec
10	688 Token/sec	68.8 Token/sec
20	810 Token/sec	40.5 Token/sec
50	850 Token/sec	17 Token/sec

Tested Large Language Model: openai/gpt-oss-20b (Hugging Face)

Thread	Total Token/sec	avg Token/Thread/Sec
1	218 Token/sec	218 Token/sec
2	340 Token/sec	170 Token/sec
5	1045 Token/sec	209 Token/sec
10	880 Token/sec	88 Token/sec
20	600 Token/sec	30 Token/sec

QNAP offers a wide range of storage, network, and interface expansion cards, which can be bought separately to expand the capabilities of the AI server. RAM is also sold separately with options ranging from 8 GB DDR4-3200 modules up to 64 GB DDR4-3200 kits. The system comes with a 5-year warranty, and is priced at $8999 for the 64 GB, $13,499 for the 128 GB, and $15,999 for the 256 GB variant.

About the author: A Software Engineer by training and a PC enthusiast by passion, Hassan Mujtaba serves as Wccftech's Senior Editor for hardware section. With years of experience in the industry, he specializes in deep-dive technical analysis of next-generation CPU and GPU architectures, motherboards, and cooling solutions. His work involves not only breaking news on upcoming technologies but also extensive hands-on reviews and benchmarking.

Follow Wccftech on Google to get more of our news coverage in your feeds.

Read all comments on QNAP Pairs a 6-Year-Old Zen 2 EPYC With NVIDIA’s 96GB RTX PRO 6000 Blackwell in Its New Edge AI NAS

QNAP Pairs a 6-Year-Old Zen 2 EPYC With NVIDIA’s 96GB RTX PRO 6000 Blackwell in Its New Edge AI NAS

Old Meets New In QNAP's "QAI-h1290FX" AI NAS: 16 Zen 2 Cores & 96 GB RTX PRO 6000 GPU

Trending Stories

Xbox Studio Leaders Reportedly Detest Game Pass, Arguing it Destroyed the Value of Their $40+ Games Now Available for Pennies

CXMT Supply Chain To Witness Major Process Transition To Seize DDR6 Opportunity Before Commercialization, Threatening Samsung’s And SK hynix’s Global Hold

Over 80% Of Samsung Foundry Workers Are Planning To Leave Amid A Yawning Pay Gap With The Memory Division

Square Enix’s Final Fantasy VII Rebirth Looks Like a Remaster on PC, as Shader Injector 2.0 Delivers Series’ Best Visuals

SpaceX Awards Foxconn A Part In A Huge $52 Billion Order For 13,000 Racks Of NVIDIA GB300 AI Servers, Where Each Rack Costs $4 Million And The Total Order Spans Nearly 1 Million GPUs

Popular Discussions

AMD Medusa Point 10-Core “Zen 6” CPU Beats Strix Point 10-Core “Zen 5” By Nearly 35% While Operating at 5.4 GHz

AMD Ryzen 7 7700X3D 4.5 GHz “3D V-Cache” CPU Review: The Budget X3D Champ For AM5

NVIDIA GeForce RTX 50 SUPER GPUs Have Reportedly Arrived At AIBs, But Are On Hold Due To Undecided Memory Prices

AMD Ryzen 7 5800X3D Outsells Ryzen 7 7800X3D For The Same Price On Amazon Despite Being Weaker

AMD Ryzen 7 7800X3D CPU Drops To $299 A Day Ahead of 7700X3D’s Launch, Bringing 3D V-Cache Goodness To Mainstream Gamers

QNAP Pairs a 6-Year-Old Zen 2 EPYC With NVIDIA’s 96GB RTX PRO 6000 Blackwell in Its New Edge AI NAS

Old Meets New In QNAP's "QAI-h1290FX" AI NAS: 16 Zen 2 Cores & 96 GB RTX PRO 6000 GPU

Related Story OpenAI’s First Custom Chip Is As Hot As A Jalapeño For AI, As The Firm Calls It The “Best Inference Platform” for LLMs

Further Reading

Trending Stories

Popular Discussions