Samsung’s Tiny AI Model Outperforms Huge LLMs Like Gemini 2.5 Pro On ARC-AGI Puzzles

Rohail Saleem
Samsung's Tiny Recursive Model beats competitors.

Samsung's camera division might be short on meaningful innovation at the moment, but the same can't be said of its AI efforts: its latest AI model just beat several Large Language Models (LLMs) that are around 10,000x larger!

Samsung's New Tiny Recursive Model

In a paper titled "Less is More: Recursive Reasoning with Tiny Networks," Samsung has just detailed the novel architecture of its new Tiny Recursive Model (TRM), which relies on a single, 2-layered model:

  1. The TRM is small, at just 7 million parameters vs. the billions that populate large LLMs.
  2. The model uses its own output to decide its next steps, forming a self-improving feedback loop.
  3. By passing each output through iterative reasoning, the model can simulate a much deeper architecture without the associated memory or computational costs.
  4. With each recursive cycle, the model produces progressively better predictions.
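
The feedback loop described above can be illustrated with a toy sketch. This is not Samsung's code or the TRM architecture: the `refine` function below is a hypothetical, hand-written stand-in (a Newton update toward a square root) for the small learned network, chosen only because it shows one small step being reused on its own output, with the error shrinking on each cycle.

```python
def refine(x: float, y: float) -> float:
    """One refinement step.

    Hypothetical stand-in for the small learned network: a Newton update
    that nudges the current answer y toward sqrt(x).
    """
    return 0.5 * (y + x / y)


def recursive_solve(x: float, y0: float, cycles: int) -> list:
    """Apply the same step `cycles` times, feeding each output back in."""
    answers = [y0]
    for _ in range(cycles):
        answers.append(refine(x, answers[-1]))
    return answers


# Start from a poor guess (y0 = x) and refine it four times.
answers = recursive_solve(2.0, 2.0, cycles=4)
errors = [abs(a - 2.0 ** 0.5) for a in answers]

for cycle, (answer, error) in enumerate(zip(answers, errors)):
    print(f"cycle {cycle}: answer={answer:.10f} error={error:.2e}")
```

The step itself never changes; only the number of passes does, which is roughly the sense in which recursion can stand in for extra depth.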

Samsung's approach is akin to a person re-reading their own draft and fixing mistakes with each read-through. It holds up better than the more conventional approach, in which an LLM's entire chain of reasoning can collapse if a single step goes wrong. Chain-of-thought prompting helps, but remains quite brittle.

The takeaway: Keep it simple

Samsung tried increasing the number of layers in the model but found that doing so hurt generalization because of overfitting. Using fewer layers while increasing the number of recursions actually improved the TRM's overall performance.
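
The trade-off can be made concrete with some back-of-the-envelope arithmetic. The figures below are illustrative assumptions, not numbers from the paper: the per-layer size simply splits the 7 million parameters evenly across two layers (ignoring embeddings and other components), and the recursion count of 8 is arbitrary.

```python
def layer_applications(layers: int, recursions: int) -> int:
    """Layer passes performed per answer when the stack is reused."""
    return layers * recursions


def weight_count(layers: int, params_per_layer: int) -> int:
    """Parameters that must be stored, regardless of how often they run."""
    return layers * params_per_layer


# Assumption: 7 million parameters split evenly over 2 layers.
PARAMS_PER_LAYER = 3_500_000
# Assumption: an arbitrary recursion count, for illustration only.
RECURSIONS = 8

# A 2-layer model reused 8 times vs. a 16-layer model run once.
recursive_depth = layer_applications(layers=2, recursions=RECURSIONS)
recursive_weights = weight_count(layers=2, params_per_layer=PARAMS_PER_LAYER)
stacked_depth = layer_applications(layers=16, recursions=1)
stacked_weights = weight_count(layers=16, params_per_layer=PARAMS_PER_LAYER)

print(f"recursive: {recursive_depth} passes, {recursive_weights:,} weights")
print(f"stacked:   {stacked_depth} passes, {stacked_weights:,} weights")
```

Under these assumptions both models perform the same number of layer passes, but the recursive one stores one-eighth of the weights, leaving far fewer parameters available to overfit.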

Results:

  1. 87.4 percent accuracy on Sudoku-Extreme (vs. just 55 percent for Hierarchical Reasoning Models).
  2. 85 percent accuracy on Maze-Hard puzzles.
  3. 45 percent accuracy on ARC-AGI-1.
  4. 8 percent accuracy on ARC-AGI-2.

Critically, Samsung's TRM either surpasses or closely matches the performance of various LLMs, including DeepSeek R1, Google's Gemini 2.5 Pro, and OpenAI's o3-mini, despite using only a tiny fraction of their parameters.


About the author: Writing is my one incontrovertible passion. Over the past six years, I have authored over 2,200 distinct articles on financial and tech-related topics, spanning nearly 1 million words, and I have been a member of the Wccftech mobile team since 2025. As an alumnus of the University of Toronto's Rotman Commerce Program, I bring nuance, in-depth knowledge, and a unique perspective to every topic that I cover. When I'm not writing, I'm traveling the world, exploring hidden confectioneries and restaurants as an aspiring food connoisseur.
