This New AI Chipmaker, Taalas, Hard-Wires AI Models Into Silicon to Make Them Faster and Cheaper; Early Results Crush Modern Solutions

Muhammad Zuhair • Feb 20, 2026 at 01:21pm EST

The image shows a Taalas HCI Technology Demonstrator featuring the Llama 3.1 8B model, TSMC 6nm technology, 815mm² area, 53 — Image Credits: Taalas

Chip startup Taalas claims to have tackled LLM response latency and throughput by building dedicated hardware that "hardwires" AI models into silicon.

Taalas Achieves 10x Higher TPS With Meta's Llama 3.1 8B LLM, at 20x Lower Production Costs

In today's AI compute landscape, latency is emerging as a major constraint for compute providers, mainly because in agentic workloads the competitive edge lies in tokens-per-second (TPS) figures and how quickly a task gets done. One approach the industry is pursuing is to integrate SRAM into accelerators, and companies like Cerebras and Groq are already exploring it. Taalas, however, has taken a different route: pivoting away from general-purpose computing towards ASICs built for specific LLMs.

Founded 2.5 years ago, Taalas developed a platform for transforming any AI model into custom silicon. From the moment a previously unseen model is received, it can be realized in hardware in only two months. The resulting Hardcore Models are an order of magnitude faster, cheaper, and lower power than software-based implementations.

- Taalas

According to the company, its approach rests on two fundamentals. The first is specializing hardware for AI workloads, which here literally means mapping a specific LLM's neural network onto the silicon itself, so that the hardware is optimized for each model. The second is what the company calls "merging storage and computation", which aims to overcome the memory wall and the data-movement overhead of general-purpose systems.

In Taalas's design, all computation happens at "DRAM-level" density, which speeds up on-chip communication and is one of the reasons the company claims to have addressed the latency problem with LLMs. The design does not rely on advanced cooling, HBM, advanced packaging, or complex integration; instead, the innovation happens in the silicon design itself. Taalas has also showcased its first product, the HC1, which hardwires Meta's Llama 3.1 8B LLM. The performance results are striking, to say the least.

According to Taalas, the HC1 delivers 10x the TPS of today's "high-end" infrastructure at 20x lower production cost. That might suggest latency and performance constraints are solved, but the HC1's specifications tell a more nuanced story. The chip is built on TSMC's 6nm node with a die size of up to 815 mm², roughly the size of NVIDIA's H100 die. It hosts an eight-billion-parameter model, while today's frontier LLMs scale up to one trillion parameters. To serve models of that size, Taalas would need to rework its silicon strategy.
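To put the gap between the HC1 and frontier models in perspective, here is a back-of-the-envelope sketch in Python. It assumes, purely for illustration, that parameter capacity scales linearly with the number of HC1-class chips; Taalas has not stated this, and the function name `chips_needed` is hypothetical.

```python
import math

HC1_PARAMS = 8_000_000_000           # parameters hosted by one HC1 (from the article)
HC1_AREA_MM2 = 815                   # HC1 die area in mm^2 (from the article)
FRONTIER_PARAMS = 1_000_000_000_000  # frontier model size cited in the article

def chips_needed(model_params, params_per_chip=HC1_PARAMS):
    """Naive estimate: ASSUMES capacity scales linearly with chip count."""
    return math.ceil(model_params / params_per_chip)

chips = chips_needed(FRONTIER_PARAMS)
total_area = chips * HC1_AREA_MM2

print(f"{chips} HC1-class chips, about {total_area:,} mm^2 of silicon")
```

Under that linear assumption, a one-trillion-parameter model would need on the order of a hundred or more HC1-sized dies. Real designs could differ substantially, for example through a denser process node or lower-precision weights.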

The only way to scale beyond a single chip is a cluster-based approach, and Taalas says it has already done this with DeepSeek's R1, achieving 12,000 TPS per user in a 30-chip configuration. The primary constraints therefore now lie in market adoption and the business model. With this hardwired approach, each chip is specific to one LLM, with no option to change the model weights, but given the startup's speed figures, it isn't a bad bet.
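For a sense of what a per-user TPS figure means in practice, the following Python sketch converts throughput into generation time. It is a simplification under stated assumptions: it ignores time to first token, prompt processing, and network overhead, and the function name `generation_time_s` and the sample response lengths are illustrative only.

```python
CLUSTER_TPS_PER_USER = 12_000  # Taalas's claimed DeepSeek R1 figure (from the article)

def generation_time_s(num_tokens, tps_per_user):
    """Seconds to stream num_tokens at a steady per-user token rate.

    ASSUMPTION: ignores time to first token and any network overhead.
    """
    if tps_per_user <= 0:
        raise ValueError("tps_per_user must be positive")
    return num_tokens / tps_per_user

# Illustrative response lengths, not figures from Taalas.
for tokens in (100, 1_000, 10_000):
    ms = generation_time_s(tokens, CLUSTER_TPS_PER_USER) * 1000
    print(f"{tokens:>6} tokens -> {ms:.1f} ms")
```

At the claimed rate, even a multi-thousand-token response would stream in under a second, which is why TPS matters so much for agentic workloads that chain many model calls.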

About the author: Muhammad Zuhair is a hardware and technology reporter for Wccftech, specializing in the semiconductor industry and the complex interplay between technology, manufacturing, and geopolitics. His coverage focuses on the corporate strategies and technological roadmaps of industry giants like TSMC, NVIDIA, Samsung, and Intel. Zuhair's expertise lies in deconstructing complex topics such as fabrication nodes (e.g., 2nm process), the economic impact of policies like the CHIPS Act, and the strategic development of AI infrastructure from NVIDIA, AMD and Intel.
