Unigen's latest AI module fits a standard M.2 slot, offers up to 60 TOPS and 32 GB of memory, and can run LLMs with up to 20B parameters.
Unigen Amaretti AI Module Packs a 60 TOPS NPU, 32 GB Memory, & Consumes Just 10W Power In An M.2 or E1.S Slot
With the rise of local AI agents, many companies are putting out unique AI products. Unigen is one such manufacturer, and it has announced the Amaretti E1.S AI module, a tiny M.2-compatible module that looks like a regular SSD but houses some strong AI capabilities.
The Unigen Amaretti E1.S is based on the SAKURA-II AI accelerator from EdgeCortix, which was initially designed for low-power AI platforms and brings AI acceleration to the Raspberry Pi 5 & other ARM-based products. The accelerator chip features an NPU with 60 TOPS of INT8 and 30 TFLOPS of BF16 compute. It features a dual 64-bit LPDDR4x memory controller and packs 20 MB of on-chip SRAM cache. The 19x19 BGA package consumes roughly 8-10W.
What Unigen has done is take the SAKURA-II AI accelerator and put it on an E1.S board, along with an impressive memory capacity of up to 32 GB. The module is available in both 16 GB and 32 GB flavors, offering up to 68 GB/s of memory bandwidth. The Amaretti module is rated at 10W, so with 60 TOPS on tap you are getting 6 TOPS per watt.
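The efficiency and bandwidth figures above can be reproduced with simple arithmetic. The sketch below is only a sanity check: it assumes the LPDDR4x runs at 4266 MT/s, a data rate the article does not state but which is a common top speed for LPDDR4x, and the function names are illustrative.

```python
def tops_per_watt(tops: float, watts: float) -> float:
    """Compute efficiency as throughput divided by rated power."""
    return tops / watts


def peak_bandwidth_gb_s(channels: int, bus_width_bits: int, mega_transfers_per_s: float) -> float:
    """Theoretical peak bandwidth of a memory interface in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = channels * bus_width_bits / 8
    # bytes per transfer * mega-transfers per second gives MB/s; divide by 1000 for GB/s
    return bytes_per_transfer * mega_transfers_per_s / 1000


# Figures from the article: 60 TOPS (INT8), 10W rating, dual 64-bit LPDDR4x.
efficiency = tops_per_watt(60, 10)
# Assumption: 4266 MT/s data rate (not stated in the article).
bandwidth = peak_bandwidth_gb_s(channels=2, bus_width_bits=64, mega_transfers_per_s=4266)

print(f"{efficiency:.1f} TOPS/W")
print(f"{bandwidth:.1f} GB/s")
```

Under that assumed data rate, the result lands at roughly 68 GB/s, which lines up with the quoted bandwidth.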
Now, in terms of performance, the 32 GB memory capacity enables the module to run LLMs with up to 20B parameters. This is ideal for low-power AI solutions that need to run GenAI & Agentic AI workflows. Furthermore, multiple modules can be installed across several M.2 slots, further increasing overall capabilities. EdgeCortix already offers a higher-end PCIe configuration which features two of these chips & additional capabilities, but the M.2 solution is definitely an interesting choice.
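To see why 32 GB lines up with a 20B parameter ceiling, here is a rough weights-only estimate. It is a sketch under stated assumptions: it ignores the KV cache, activations, and runtime overhead, treats 1 GB as 10^9 bytes, and the precisions listed are illustrative, since the article does not say which precision Unigen's 20B figure assumes.

```python
# Bytes needed to store one parameter at each precision (illustrative set).
BYTES_PER_PARAM = {"BF16": 2.0, "INT8": 1.0, "INT4": 0.5}


def weight_footprint_gb(params_billions: float, precision: str) -> float:
    """Weights-only memory footprint in GB (1 GB = 1e9 bytes)."""
    return params_billions * BYTES_PER_PARAM[precision]


def fits(params_billions: float, precision: str, memory_gb: float) -> bool:
    """True if the weights alone fit in the given memory capacity."""
    return weight_footprint_gb(params_billions, precision) <= memory_gb


# 20B parameters against the 32 GB module from the article.
for precision in BYTES_PER_PARAM:
    size = weight_footprint_gb(20, precision)
    verdict = "fits" if fits(20, precision, 32) else "does not fit"
    print(f"20B @ {precision}: {size:.0f} GB, {verdict} in 32 GB")
```

By this estimate, a 20B model fits in 32 GB once its weights are quantized to 8 bits or fewer, while unquantized BF16 weights alone would need about 40 GB.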
Many PCs, both desktops & laptops, have M.2 slots sitting idle. If you are looking for local AI and want to speed up AI workloads on your system, then these modules make a lot of sense.
According to Unigen, the AI module supports popular AI frameworks and model formats such as TensorFlow, PyTorch, ONNX, and Hugging Face. The main highlights of the module include:
- E1.S AI Module
- AI Accelerator: SAKURA-II
- Up to 1920 TOPS of inference performance with air-cooled dual-CPU servers
- TPUs use 20% of the watts of training GPUs
- GenAI LLMs with up to 20B parameters
- 14-week lead times, significantly shorter than is typical for GPU servers
- Up to 32 GB per module
Unigen ships the Amaretti E1.S AI module with a pre-installed heatsink. There's no information regarding the price, but the memory capacity should give a hint of what to expect.
