Tenstorrent Vows to ‘Crush Everyone’ as Galaxy Blackhole Hits 350 Tokens/s on DeepSeek R1, Undercutting NVIDIA’s GB300 5x AI TCO

May 2, 2026 at 07:15am EDT
Several Tenstorrent server units are installed in a data center rack, displaying intricate geometric vent designs with visible green LED indicators.

Tenstorrent made a bold claim during its TT-Deploy livestream, saying it is going to crush everyone at everything, including AI, with its Galaxy servers.

Tenstorrent Galaxy Supercluster Offers 10x Faster GenAI Video, And Destroys Current-Gen GPUs With "Blitz Mode", Offering 350+ Tokens/s In DeepSeek R1

Jim Keller's Tenstorrent is on a mission to challenge the existing AI hierarchy with its RISC-V-powered platforms.


To that end, the company unveiled its latest Galaxy Blackhole servers for AI at scale. With Galaxy Blackhole, Tenstorrent offers a fully networked, AI-native solution that unifies compute, memory, and networking in a single system optimized for the latest AI workloads.

The chip inside Galaxy servers is called Blackhole and is based on the RISC-V architecture, which competes against ARM and x86. During the event, Jim Keller said that the A0 silicon is already shipping, but that there are software bugs the company is still addressing.

To showcase the performance of its Galaxy Blackhole supercluster, Tenstorrent ran various demos during the TT-Deploy livestream.

Let's start with the specifications shared by Tenstorrent. The tensor core powering the Blackhole chips is called Tensix and features five RISC processors along with matrix-multiply units, vector units, and local SRAM. Each RISC processor is fully programmable, and each core is attached to a high-bandwidth network-on-chip (NoC). Several of these Tensix cores are deployed together to make a chip.

Tenstorrent also compared Galaxy against competing GPUs such as NVIDIA's GB300. The company claims that competing platforms have to drastically reduce the number of users they serve in order to achieve higher token throughput. That's not the case with Tenstorrent's Galaxy servers, which it says retain a lower token cost ($6 vs ~$30) and deliver a much lower TCO for firms using these servers.

We talked about this last week, too, and Tenstorrent has now officially showcased up to 10x faster video GenAI performance on its Galaxy Supercluster. The system can generate an 81-frame 720p video in just 2.4 seconds. That is a 5-second clip produced in less than half its playback time, which is faster than real time.
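To put the "faster than real time" claim into numbers, here is a minimal Python sketch that uses only the figures quoted above (81 frames, a 5-second clip, 2.4 seconds of generation time). The function names are our own, and the implied frame rate is simple division rather than a figure stated by Tenstorrent.

```python
# Figures quoted for the Galaxy Supercluster video demo.
FRAMES = 81               # frames in the generated clip
CLIP_SECONDS = 5.0        # playback length of the clip
GENERATION_SECONDS = 2.4  # time taken to generate it


def realtime_factor(clip_seconds: float, generation_seconds: float) -> float:
    """Seconds of video produced per second of generation time."""
    return clip_seconds / generation_seconds


def implied_fps(frames: int, clip_seconds: float) -> float:
    """Playback frame rate implied by the frame count and clip length."""
    return frames / clip_seconds


if __name__ == "__main__":
    factor = realtime_factor(CLIP_SECONDS, GENERATION_SECONDS)
    fps = implied_fps(FRAMES, CLIP_SECONDS)
    print(f"Real-time factor: {factor:.2f}x")
    print(f"Implied playback rate: {fps:.1f} fps")
    print(f"Frames generated per second: {FRAMES / GENERATION_SECONDS:.2f}")
```

Anything above 1.0x means the clip is generated faster than it plays back; the quoted numbers work out to roughly 2x.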

In addition to the GenAI demo, Tenstorrent also showcased Blitz Mode for its Galaxy Blackhole server. Blitz Mode on Galaxy is optimized for premium, latency-sensitive AI workloads. With this mode, Galaxy servers can reach up to 350 tokens/s on DeepSeek R1-0528 671B, which Tenstorrent says outpaces the GPU competition. The two benchmarks demoed are listed below:

In terms of pricing and availability, the Tenstorrent Galaxy Blackhole server will be available in an air-cooled rack configuration with next-generation Blackhole chips and a fully open-source software stack, starting at $110,000. The system offers 23 PFLOPS of FP8 (AI) compute through 32 Blackhole chips, 6.2 GB of on-chip SRAM at 2.9 PB/s, 1 TB of DRAM at 16 TB/s, and 56 x 800G Ethernet ports for up to 11.2 TB/s of scale-out bandwidth.
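As a quick sanity check on these figures, the short Python sketch below converts the 56 x 800G Ethernet ports into aggregate bandwidth and divides the headline totals by the 32 chips. The helper names are ours, the per-chip values are plain division of rounded totals rather than official per-chip specs, and decimal units (1 TB = 1,000 GB) are assumed throughout.

```python
# Headline Galaxy Blackhole figures as quoted in the article.
CHIPS = 32
FP8_PFLOPS = 23.0
SRAM_GB = 6.2
DRAM_TB = 1.0
DRAM_BW_TB_S = 16.0
ETH_PORTS = 56
ETH_GBIT_PER_PORT = 800  # "800G" is gigabits per second per port


def ethernet_tbytes_per_s(ports: int, gbit_per_port: int,
                          bidirectional: bool = False) -> float:
    """Aggregate Ethernet bandwidth in terabytes per second."""
    total_gbit = ports * gbit_per_port
    tbytes = total_gbit / 8 / 1000  # bits -> bytes, then giga -> tera
    return tbytes * 2 if bidirectional else tbytes


def per_chip(total: float, chips: int = CHIPS) -> float:
    """Even split of a system-level total across all chips."""
    return total / chips


if __name__ == "__main__":
    one_way = ethernet_tbytes_per_s(ETH_PORTS, ETH_GBIT_PER_PORT)
    both = ethernet_tbytes_per_s(ETH_PORTS, ETH_GBIT_PER_PORT, bidirectional=True)
    print(f"Ethernet, one direction:   {one_way:.1f} TB/s")
    print(f"Ethernet, both directions: {both:.1f} TB/s")
    print(f"FP8 per chip:    {per_chip(FP8_PFLOPS) * 1000:.0f} TFLOPS")
    print(f"SRAM per chip:   {per_chip(SRAM_GB) * 1000:.0f} MB")
    print(f"DRAM per chip:   {per_chip(DRAM_TB) * 1000:.2f} GB")
    print(f"DRAM BW per chip: {per_chip(DRAM_BW_TB_S) * 1000:.0f} GB/s")
```

The port math gives 5.6 TB/s in one direction and 11.2 TB/s counting both directions, so the quoted 11.2 figure lines up with the port count only in terabytes per second.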

Customers can also purchase Galaxy Blackhole in supercluster configurations with 4-36 Galaxy servers. The base configuration with 4 Galaxy servers starts at $440,000.
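The base supercluster price matches four servers at the single-server starting price. The Python sketch below extends that to the rest of the 4-36 range, assuming the price scales linearly with server count and that peak FP8 compute simply adds up. Both are our assumptions for illustration; Tenstorrent has only quoted the 4-server starting price, and real configurations may include networking or other costs.

```python
SERVER_PRICE_USD = 110_000      # starting price of one Galaxy Blackhole server
SERVER_FP8_PFLOPS = 23          # peak FP8 compute quoted per server
MIN_SERVERS, MAX_SERVERS = 4, 36


def cluster_price(servers: int) -> int:
    """Estimated starting price, assuming linear per-server pricing."""
    if not MIN_SERVERS <= servers <= MAX_SERVERS:
        raise ValueError(f"superclusters span {MIN_SERVERS}-{MAX_SERVERS} servers")
    return servers * SERVER_PRICE_USD


def cluster_peak_pflops(servers: int) -> int:
    """Summed peak FP8 compute; a theoretical ceiling, not a benchmark."""
    if not MIN_SERVERS <= servers <= MAX_SERVERS:
        raise ValueError(f"superclusters span {MIN_SERVERS}-{MAX_SERVERS} servers")
    return servers * SERVER_FP8_PFLOPS


if __name__ == "__main__":
    for n in (MIN_SERVERS, MAX_SERVERS):
        print(f"{n:>2} servers: ${cluster_price(n):,} "
              f"for {cluster_peak_pflops(n)} PFLOPS peak FP8")
```

Only the 4-server figure of $440,000 is confirmed by the announcement; the 36-server estimate is an extrapolation.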

About the author: A Software Engineer by training and a PC enthusiast by passion, Hassan Mujtaba serves as Wccftech's Senior Editor for the hardware section. With years of experience in the industry, he specializes in deep-dive technical analysis of next-generation CPU and GPU architectures, motherboards, and cooling solutions. His work involves not only breaking news on upcoming technologies but also extensive hands-on reviews and benchmarking.
