Etched Pulls 400+ Engineers From NVIDIA, TSMC & More to Build a New Frontier Inference Cluster For AI Which Is Already Worth $1B in Demand

Jun 30, 2026 at 03:05pm EDT
A detailed underside view of a data center rack with dense black cabling, featuring an 'etched' logo on the metal surface.

Etched, a new startup with a team of 400+ engineers, has announced its latest AI solution, the Frontier Inference Clusters.

Etched Comes Out of Stealth To Unveil Its Frontier Inference Clusters, Achieves Successful A0 Tapeout & On-Track For $1B+ In AI Contracts

What happens when 400+ engineers from leading firms such as NVIDIA, Google, Broadcom, TSMC, SK Hynix, and more come together? Well, Etched happens.


This is a new AI startup that is co-designing chips, racks, software, and advanced manufacturing methods for frontier models. The company promises best-in-class throughput, latency, cost, and power efficiency for both prefill and decode workloads, and it says it has first silicon to back up its claims.

Today, Etched officially announced itself and the successful tapeout of its first A0 silicon. The tapeout was completed earlier this year on TSMC's N4P process technology, and the company has since been busy validating its first rack-scale product, which has already raked in $1B+ in AI customer demand. Etched has so far raised $800M across four unannounced financings, including a strategic investment from VentureTech Alliance, and is working to deepen and expand partnerships with leading semiconductor firms.

LVI Processor Pumps Out 80% Peak FLOPs At Half The Voltage of Existing AI Chips

On the infrastructure side, Etched has laid out its plans to tackle frontier models, including multi-trillion-parameter MoEs, long-context, and agentic AI workloads. To do so, the company had to develop a range of new chips, packages, PCBs, cold plates, interconnects, and more.

The first of these is the Low-Voltage Inference (LVI) processor for high-throughput workloads. The chip uses a new architecture that, according to Etched, lets it do math at half the voltage of most AI chips. The company says this addresses the sustained-performance problem seen in most AI chips, which throttle down as they draw more power in full-voltage mode and end up delivering under half of their peak FLOPs.
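For a rough sense of why supply voltage matters so much here, the textbook approximation for CMOS dynamic power is that it scales with capacitance times voltage squared times clock frequency. The short sketch below applies that rule of thumb. It is a back-of-the-envelope illustration, not Etched's own model: the function name is hypothetical, and it assumes switched capacitance and activity stay the same and ignores leakage power.

```python
def dynamic_power_ratio(voltage_scale: float, frequency_scale: float = 1.0) -> float:
    """Relative dynamic power under the rule of thumb P ~ C * V^2 * f.

    Assumes switched capacitance and activity factor are unchanged and
    ignores leakage, so this is only a first-order estimate.
    """
    return (voltage_scale ** 2) * frequency_scale


# Halving the voltage at the same clock: roughly a quarter of the dynamic power.
same_clock = dynamic_power_ratio(0.5)

# If the clock also had to drop by half (an assumption, not a claim from Etched),
# dynamic power would fall to roughly an eighth.
half_clock = dynamic_power_ratio(0.5, 0.5)

print(f"Half voltage, same clock: {same_clock:.3f}x dynamic power")
print(f"Half voltage, half clock: {half_clock:.3f}x dynamic power")
```

Under these assumptions, running at half the voltage leaves far more thermal headroom per operation, which fits the claim that the chip can avoid throttling. Real silicon will differ, since achievable clock speed and leakage both change with voltage.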

CSM Accelerator Offers HBM/SRAM Combo

The second part is the Cluster Scale Memory (CSM) accelerator for low-latency workloads. We've seen the move towards massive SRAM blocks over HBM for faster decode speeds, but SRAM-based chips don't offer good FLOPs throughput or memory capacity. Etched's answer with CSM is a lower-latency, shared memory pool that retains a high-bandwidth interconnect for faster memory access. According to the company, this HBM/SRAM hybrid tackles both memory capacity and memory latency while offering lower cost, higher reliability, better yields, and improved thermal characteristics.

All combined, Etched says it has achieved state-of-the-art throughput, latency, and power efficiency in early customer tests across inference workloads. Per the company, its LVI processor can run trillion-parameter sparse MoEs at 80% of peak FLOPs without thermal throttling.

Etched says it is currently scaling production at an unprecedented pace. It has built a 2MW datacenter in its offices and opened a factory in Taiwan for 24/7 engineering. The company is promising more updates on performance and roadmaps this summer.

About the author: A Software Engineer by training and a PC enthusiast by passion, Hassan Mujtaba serves as Wccftech's Senior Editor for the hardware section. With years of experience in the industry, he specializes in deep-dive technical analysis of next-generation CPU and GPU architectures, motherboards, and cooling solutions. His work involves not only breaking news on upcoming technologies but also extensive hands-on reviews and benchmarking.
