US-based AI company, Tensordyne, has announced the successful tape-out of its Napier chip, which it claims to demolish NVIDIA's Blackwell & Rubin chips with leading token throughput and efficiency.
Tensordyne’s new Napier AI Chip arrives with one clear mission: to make NVIDIA’s Blackwell and Rubin chips look considerably less impressive
The Napier chip will be the core component of the Tensordyne Napier TDN system, which is designed in collaboration with Broadcom and HPE Juniper Networks. The Napier platform has one goal: to unify AI through novel logarithmic AI math, a tightly integrated memory architecture, and a high-performance scale-up interconnect that drives higher token throughput at low power.
Napier is built on TSMC's 3nm process, and with its successful tape-out, the chip is now in production. With the primary milestone achievement, Tensordyne is now working towards beta deployment and a broader infrastructure plan that represents over $200 million in forecasted Napier system demand. And the key area of focus is AI inferencing.

We just talked about how current AI infrastructure is constrained by power consumption, but to tackle these constraints, solutions such as 800V DC are going to incur a huge deployment cost. Infrastructures such as power and cooling alone make up 50% of the cost of major AI deployments, and to address these, Tensordyne has come up with a new inference stack across math, compute, memory, and networking:
TDN Math (Logarithmic Mathematics)
TDN replaces large-scale multiplication operations with simplified addition-based computation, significantly improving performance-per-watt efficiency across frontier AI models.
TDN AIP (Artificial Intelligence Processor)
Each TDN processor tightly integrates substantial fast SRAM alongside HBM memory, minimizing idle compute cycles and supporting efficient execution of the industry’s largest models.
TDN Link (Any-to-Any Scale-Up Interconnect)
Tensordyne’s proprietary scale-up fabric delivers sub-microsecond communication latency between processors, maximizing compute utilization and minimizing interconnect bottlenecks.

All of this is brought together in Tensordyne's TDN72 Inference Pod and Rack system. Each Pod is fitted with 72 Napier AI chips, which are composed of NVIDIA's NVL72 rack with 72 Blackwell or Rubin GPUs. It requires way less infrastructure capacity, and a Napier Rack combines for TDN72 pods to deliver:
- 17x more tokens per watt (vs NVIDIA Blackwell)
- 13x more tokens per second (vs NVIDIA Blackwell)
- Up to $33 million more annual revenue per rack
Tensordyne doesn't stop at just Blackwell comparison; they also compare the Napier solution against NVIDIA's upcoming Rubin platform. The company claims that its platform supports multi-trillion parameter models with a throughput of 1000 tokens/s per use in a single-rack configuration. To do the same, NVIDIA will require nine Rubin + Groq LPX racks.

Tensordyne’s Napier platform represents a bold leap forward in AI inference. By delivering 17× more tokens per watt and 13× higher throughput than NVIDIA Blackwell, while matching the performance of nine Rubin-based racks in a single compact footprint, it shatters the traditional speed-versus-cost and power-versus-performance trade-offs.
With dramatically lower infrastructure demands, up to $33 million more annual revenue per rack, and efficient scaling for multi-trillion parameter models, Napier doesn’t just compete with NVIDIA’s Blackwell & Rubin; it redefines what’s possible for next-generation AI deployment.
Follow Wccftech on Google to get more of our news coverage in your feeds.





