Compact AI devices have become increasingly mainstream, but a new startup has broken the barriers by introducing the world's smallest AI supercomputer, which appears highly capable on paper.
Tiiny AI's New AI Pocket Lab Features the Newest ARMv9.2 Cores Onboard, Bringing the Power to Deploy 120B LLMs
Edge AI has become an emerging segment of the computing industry, primarily because deploying open-source models on local machines allows for a more personalized workload. However, it also requires expensive hardware. Devices like NVIDIA's DGX Spark can cost up to $4,000, which isn't feasible for a general consumer. A startup called Tiiny AI plans to bridge this gap, not only by introducing a cost-effective solution, but also by introducing a device that is claimed to be the 'world's smallest' supercomputer, called the Tiiny AI Pocket Lab.
Interestingly, the device measures just 14.2 × 8 × 2.53 cm, weighing 300g, yet Tiiny AI claims that the supercomputer can successfully deploy a 120-billion-parameter model, a one-of-a-kind achievement. LLMs usable with this machine are said to be perfect for "PhD-level reasoning, multi-step analysis, and deep contextual understanding." With on-device capabilities, the AI Pocket Lab is ideal not only for consumers but also for those seeking to experiment with local LLM deployment.
| Category | Specification |
|---|---|
| Processor | ARMv9.2 12-core CPU |
| AI Compute Power | Custom heterogeneous module (SoC + dNPU), ≈ 190 TOPS |
| Memory & Storage | 80GB LPDDR5X RAM + 1TB SSD |
| Model Capacity | Runs up to 120B-parameter LLMs fully on-device |
| Power Efficiency | 30W TDP, ~65W typical system power |
| Dimensions & Weight | 14.2 × 8 × 2.53 cm, ~300g (pocket-sized) |
| Ecosystem | One-click deployment for dozens of open-source LLMs & agent frameworks |
| Connectivity | Fully offline operation — no internet or cloud required |
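
The memory and model-capacity rows above invite a quick sanity check. The sketch below is a back-of-the-envelope estimate, not anything Tiiny AI has published: it counts only the model weights (ignoring the KV cache, activations, and the operating system), and the helper name `weight_footprint_gb` is made up for illustration.

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in decimal gigabytes."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


RAM_GB = 80      # "Memory & Storage" row of the spec table
PARAMS_B = 120   # "Model Capacity" row of the spec table

# Assumption: weights dominate memory use; runtime overhead is ignored here.
for bits in (16, 8, 4):
    size = weight_footprint_gb(PARAMS_B, bits)
    verdict = "fits" if size < RAM_GB else "does not fit"
    print(f"{bits:>2}-bit weights: {size:.0f} GB -> {verdict} in {RAM_GB} GB")
```

Under these assumptions, a 120B model only squeezes into 80 GB at around 4 bits per weight, which is why quantization matters for a device of this size.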
Based on what Tiiny AI has disclosed, the AI Pocket Lab supports models from the GPT-OSS, Llama, Qwen, DeepSeek, Mistral, and Phi families. One of the most impressive aspects of the AI Pocket Lab is that it is rated at roughly 190 TOPS, helped by a discrete NPU onboard. With 80 GB of LPDDR5X RAM, an aggressively quantized 120B model can fit entirely in memory and run in a local environment. Moreover, Tiiny AI says it has employed two techniques that make 120B inference practical:
TurboSparse, a neuron-level sparse activation technique, significantly improves inference efficiency while maintaining full model intelligence.
PowerInfer, an open-source heterogeneous inference engine with more than 8,000 GitHub stars, accelerates heavy LLM workloads by dynamically distributing computation across the CPU and NPU, enabling server-grade performance at a fraction of traditional power consumption.
Together, these technologies allow the Tiiny AI Pocket Lab to deliver capabilities that previously required professional GPUs costing thousands of dollars.
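To give a rough sense of what "neuron-level sparse activation" means, here is a toy sketch in plain Python. It is not Tiiny AI's, TurboSparse's, or PowerInfer's code; the function names and layer sizes are invented for illustration. It shows only the basic idea: in a ReLU feed-forward layer, neurons that output zero contribute nothing, so skipping them leaves the result unchanged. This toy still computes every neuron's activation before deciding which ones to skip, so it illustrates the principle rather than the full savings.

```python
import random


def relu(value):
    return value if value > 0.0 else 0.0


def hidden_activations(x, w_up):
    # One ReLU neuron per row of w_up.
    return [relu(sum(w * xi for w, xi in zip(row, x))) for row in w_up]


def dense_ffn(x, w_up, w_down):
    hidden = hidden_activations(x, w_up)
    out_dim = len(w_down[0])
    # Every neuron is multiplied through, including those that output zero.
    return [sum(h * w_down[j][k] for j, h in enumerate(hidden))
            for k in range(out_dim)]


def sparse_ffn(x, w_up, w_down):
    hidden = hidden_activations(x, w_up)
    active = [j for j, h in enumerate(hidden) if h > 0.0]
    out_dim = len(w_down[0])
    # Only active neurons touch the down projection.
    out = [sum(hidden[j] * w_down[j][k] for j in active)
           for k in range(out_dim)]
    return out, len(active)


# Hypothetical toy sizes and random weights, purely for demonstration.
random.seed(0)
DIM, HIDDEN = 8, 64
x = [random.gauss(0, 1) for _ in range(DIM)]
w_up = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(HIDDEN)]
w_down = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(HIDDEN)]

dense_out = dense_ffn(x, w_up, w_down)
sparse_out, n_active = sparse_ffn(x, w_up, w_down)
print(f"active neurons: {n_active}/{HIDDEN}")
print("outputs match:",
      all(abs(a - b) < 1e-9 for a, b in zip(dense_out, sparse_out)))
```

With random weights, about half the neurons would be expected to fire. The fewer neurons that are active for a given input, the more work a sparsity-aware engine can skip.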
The device is set to be showcased at CES 2026. Although the firm hasn't disclosed details about the release date and retail availability, the AI Pocket Lab certainly appears to be a promising device. It will be interesting to see how its industry debut turns out.
