xAI Claims Grok 3 Is The “World’s Smartest AI,” Betting Markets Agree, But Experts Remain Split

Feb 18, 2025 at 11:12am EST
This is not investment advice. The author has no position in any of the stocks mentioned. Wccftech.com has a disclosure and ethics policy.

After days of building hype, xAI officially released its Grok 3 LLM on Monday in a live stream hosted by Elon Musk himself. While the AI company continues to tout the new LLM's capabilities as best-in-class, some experts are pointing out critical shortcomings in the released benchmarks.

To wit, as per xAI's post on X, the Grok 3 LLM is the "world's smartest AI."

You can watch the entire demo video by opening the above X post. In what we can characterize as the DeepSeek effect, Elon Musk has announced that the older Grok 2 LLM will be open-sourced in a few months.

xAI has taken pains to note that the Grok 3 LLM beats the publicly released versions of all other foundation models, including DeepSeek-V3 and GPT-4o, on math, science, and coding benchmarks. What's more, the LLM has achieved an unprecedented score of 1,402 on the Arena benchmark.

Meanwhile, Manifold Markets' betting contract on Grok 3 being the most powerful AI in the world is now expected to close with a "yes" resolution. We note, though, that the probability of the ayes winning has declined from 91 percent late Monday night to just 78 percent at the time of writing.

We can theorize that the emerging critical commentary around xAI's Grok 3, though sparse, is likely playing a role in this development.

For instance, Zihan Wang, who previously worked at DeepSeek, showed Grok 3 a picture of two iron balls of different sizes hanging from the Leaning Tower of Pisa at different heights, and then asked which ball would land first. The logical answer is ball A, the heavier of the two, because it hangs closer to the ground and therefore has less distance to fall. However, the LLM answered that both balls would land at the same time.

What's more, many others are questioning why xAI did not release Grok 3's scores on the FrontierMath, ARC-AGI, or HLE benchmarks.

Of course, we point to these shortcomings not to denigrate Grok 3, which we are sure is a very capable AI model, but to question the veracity of xAI's best-in-class claims.

In other news, Bloomberg recently reported that xAI was in talks with existing investors to raise as much as $10 billion in a new funding round that would value the startup at $75 billion. In the last such funding round, xAI had raised $6 billion at a $40 billion valuation.

Finally, we note that xAI's Guodong Zhang recently disclosed that Grok 3 was trained on 100,000 GPUs, with "more to come." It is hardly a surprise, therefore, that BESI thinks the revenue from selling AI chips will climb to $227 billion by 2032.

About the author: Writing is my one incontrovertible passion. Over the past six years, I have authored over 2,200 distinct articles on financial and tech-related topics, spanning nearly 1 million words. I have been a member of the Wccftech mobile team since 2025. As an alumnus of the University of Toronto, Rotman Commerce Program, I bring nuance, in-depth knowledge, and a unique perspective to every topic that I cover. When I'm not writing, I'm traveling the world, exploring hidden confectioneries and restaurants as an aspiring food connoisseur.
