Google Bets On The Agentic AI Era With Its AI Hypercomputer, Merges 8th-Gen TPUs, NVIDIA Rubin, & Axion CPUs Together

Apr 22, 2026 at 09:10am EDT

Google has announced the AI Hypercomputer, which brings together the TPUv8 series, NVIDIA Rubin GPUs, and Axion CPUs to power the Agentic AI era.

Google Cloud Next 26: AI Hypercomputer Announcement Gives Agentic AI The Next Push, Leverages In-House TPUs, CPUs & Scales Beyond With NVIDIA Rubin

Gone are the days of supercomputers; the Agentic AI era will be all about hypercomputers, which will combine various compute options to deliver customers the most flexible and performant AI architecture ever built.


Today, at Google's Cloud Next 26 event, the company formally announced its AI Hypercomputer. The new high-performance computing datacenter for Agentic AI houses an advanced, purpose-built architecture that unifies performance-optimized hardware for compute, storage, networking, open software, and ML frameworks.

To make Google's AI Hypercomputer possible, the company had to go above and beyond. It will house its latest custom TPUv8 series, Axion Cloud CPUs, and will also deploy NVIDIA Rubin GPUs. Today's announcement also comes with the launch of Google's 8th Gen TPU lineup, which comes in two flavors: the TPU 8t and the TPU 8i.

Google TPU 8t - Training Chip

The Google TPU 8t chip is designed as a training powerhouse, cutting the time needed to develop and deploy frontier models from months to weeks. According to Google, the chip offers its highest compute throughput, shared memory, and interchip bandwidth to date in its most power-efficient package. A TPU 8t pod has a total FP4 compute capacity of 121 exaflops, 2.84x higher than Ironwood.


Google TPU 8i - Inferencing Chip

The second chip, the TPU 8i, is designed for inference and pairs 288 GB of HBM memory with 384 MB of on-chip SRAM, which is a 3x boost in capacity over the previous generation. With such a large SRAM, models can be kept active entirely on the chip. A TPU 8i pod has a total FP8 compute capacity of 331.8 exaflops, 6.74x higher than Ironwood.


When it comes to generation-over-generation improvements, the TPU 8t training chip offers 2.7x better performance per dollar than Ironwood "TPUv7" in large-scale training, while the TPU 8i inference chip offers an 80% performance per dollar improvement over Ironwood "TPUv7" at low-latency targets for MoE models. Both chips also deliver twice the performance per watt, which is vital for AI TCO.
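To put these multipliers on a common footing, the short Python sketch below converts each claim into the relative cost of a fixed amount of work. It assumes that an "80% performance per dollar improvement" means a 1.8x ratio and that the gains apply uniformly across a workload, neither of which the announcement spells out. The function and dictionary names are purely illustrative.

```python
def relative_cost(perf_ratio: float) -> float:
    """Cost (dollars or energy) of a fixed job relative to the Ironwood
    baseline, given a performance-per-unit ratio versus that baseline."""
    if perf_ratio <= 0:
        raise ValueError("perf_ratio must be positive")
    return 1.0 / perf_ratio


# Ratios versus Ironwood "TPUv7", as quoted in the article.
claims = {
    "TPU 8t training, perf per dollar": 2.7,
    "TPU 8i inference, perf per dollar": 1.8,  # assumption: 80% better = 1.8x
    "Both chips, perf per watt": 2.0,
}

for label, ratio in claims.items():
    print(f"{label}: {ratio:.1f}x -> {relative_cost(ratio):.0%} of baseline cost")
```

Under those assumptions, the same training job would cost roughly 37% of what it does on Ironwood, the same inference workload about 56%, and either would draw about half the energy.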

Both chips support Google's 4th Gen liquid cooling technology, which can sustain compute and performance densities that are not possible with air cooling.

Feature | TPU 8t | TPU 8i
Primary Workload | Large-scale pre-training | Sampling, serving, and reasoning
Network Topology | 3D torus | Boardfly
Specialized Chip Features | SparseCore (Embeddings) & LLM Decoder Engine | CAE (Collectives Acceleration Engine)
HBM Capacity | 216 GB | 288 GB
On-Chip SRAM (Vmem) | 128 MB | 384 MB
Peak FP4 PFLOPs | 12.6 | 10.1
HBM Bandwidth | 6,528 GB/s | 8,601 GB/s (~1.3x of TPU 8t)
CPU Header | Arm Axion | Arm Axion
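The table's figures can be cross-checked with a little arithmetic. The Python sketch below recomputes the TPU 8i to TPU 8t ratios and estimates how many TPU 8t chips the quoted 121 exaflops of FP4 compute per pod would imply. The pod estimate assumes that pod throughput is simply per-chip peak throughput multiplied by chip count, which the article does not state. The dictionary keys and variable names are illustrative.

```python
# Per-chip figures copied from the comparison table.
specs = {
    "TPU 8t": {"hbm_gb": 216, "sram_mb": 128, "fp4_pflops": 12.6, "hbm_bw_gbs": 6528},
    "TPU 8i": {"hbm_gb": 288, "sram_mb": 384, "fp4_pflops": 10.1, "hbm_bw_gbs": 8601},
}


def ratio(metric: str) -> float:
    """TPU 8i value divided by TPU 8t value for one metric."""
    return specs["TPU 8i"][metric] / specs["TPU 8t"][metric]


for metric in specs["TPU 8t"]:
    print(f"{metric}: TPU 8i / TPU 8t = {ratio(metric):.2f}x")

# Assumption: pod FP4 throughput = per-chip peak * number of chips.
POD_FP4_EXAFLOPS = 121.0
implied_chips = POD_FP4_EXAFLOPS * 1000 / specs["TPU 8t"]["fp4_pflops"]
print(f"Implied TPU 8t chips per pod: {implied_chips:.0f}")
```

The bandwidth ratio works out to about 1.32x, consistent with the table's "~1.3x" note, and the pod figure implies roughly 9,600 TPU 8t chips under the stated assumption.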


Google Cloud will also be one of the first AI infrastructure providers to offer NVIDIA VR200 (Vera Rubin) accelerators. The Rubin GPUs will be paired with Google's brand-new Virgo network, offering massive-scale training clusters alongside Google's own 8th Gen TPU family.

The Google AI Hypercomputer will be used by several customers, including big names such as the US DOE, Boston Dynamics, Citadel Securities, Thinking Machines Lab, and Axia Energy.

About the author: A Software Engineer by training and a PC enthusiast by passion, Hassan Mujtaba serves as Wccftech's Senior Editor for the hardware section. With years of experience in the industry, he specializes in deep-dive technical analysis of next-generation CPU and GPU architectures, motherboards, and cooling solutions. His work involves not only breaking news on upcoming technologies but also extensive hands-on reviews and benchmarking.
