xAI Claims Grok 3 Is The “World’s Smartest AI,” Betting Markets Agree, But Experts Remain Split

Feb 18, 2025 at 11:12am EST

This is not investment advice. The author has no position in any of the stocks mentioned. Wccftech.com has a disclosure and ethics policy.

After days of building up hype, xAI officially released its Grok 3 LLM on Monday in a live stream hosted by Elon Musk himself. While the AI company continues to tout the new LLM's capabilities as best-in-class, some experts are pointing out critical shortcomings in the released benchmarks.

grok 3 is the world’s smartest AI

now available to all Premium+ subscribers

— Grok (@grok) February 18, 2025

To wit, as per xAI's post on X, the Grok 3 LLM is the "world's smartest AI."

GROK 3: SOLVING PHYSICS, GAMES, AND THE UNIVERSE

Full presentation and demo of xAI's latest model

0:00 xAI's mission: Understand the universe
1:20 Team presentation
2:01 Grok means to profoundly understand
2:29 From Grok 2 to Grok 3
6:30 Grok 3 benchmarks
9:07 Grok 3 improves… https://t.co/7qbB6O16Yb pic.twitter.com/BomGwAOa1I

— Mario Nawfal (@MarioNawfal) February 18, 2025

You can watch the entire demo video by opening the above X post. In what we can characterize as the DeepSeek effect, Elon Musk has announced that the older Grok 2 LLM will be open-sourced in a few months.

xAI's new 'Grok 3' model (released last night) beats all other publicly-released foundational models (including DeepSeek-V3 & GPT-4o) in math, science & coding benchmarks. pic.twitter.com/iB6KuDPsdc

— Stock Talk (@stocktalkweekly) February 18, 2025

xAI has taken pains to note that the Grok 3 LLM beats the publicly released versions of all other foundational models, including DeepSeek-V3 and GPT-4o, on math, science, and coding benchmarks. What's more, the LLM has posted an unprecedented score of 1,402 on the Arena benchmark.

xAI beat expectations

seems like Grok 3 is the most powerful AI in the world pic.twitter.com/OtO6rGD22e

— Manifold (@ManifoldMarkets) February 18, 2025

Meanwhile, Manifold Markets' betting contract on Grok 3 being the most powerful AI in the world is now expected to close with a "yes" resolution. We note, though, that the probability of the ayes winning has declined from 91 percent late on Monday night to just 78 percent at the time of writing.

We can theorize that the emerging critical commentary around xAI's Grok 3, though sparse, is likely playing a role in this development.

I guess Grok3 is a genius who doesn't bother to spend time on these simple questions pic.twitter.com/DhBDBYXw3X

— Zihan Wang - on RAGEN (@wzihanw) February 18, 2025

For instance, Zihan Wang, who previously worked at DeepSeek, showed Grok 3 a picture of two iron balls of different sizes hanging from the Leaning Tower of Pisa at different heights, and then asked which ball would land first. The logical answer is ball A, the heavier one, because it starts closer to the ground; if air resistance is ignored, weight plays no role in fall time. However, the LLM answered that both balls would land at the same time.
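The physics behind the expected answer can be sketched in a few lines of Python. This is only an illustration: it assumes both balls are released from rest at the same instant and that air resistance is negligible, and the two heights are hypothetical, since the post does not give any figures.

```python
import math

G = 9.81  # standard gravitational acceleration in m/s^2


def fall_time(height_m: float) -> float:
    """Return the time in seconds to fall height_m from rest, ignoring air resistance."""
    if height_m < 0:
        raise ValueError("height must be non-negative")
    # From h = (1/2) * g * t^2, solving for t gives t = sqrt(2h / g).
    return math.sqrt(2.0 * height_m / G)


# Hypothetical heights, chosen only for illustration (not taken from the post).
height_ball_a = 10.0  # the ball hanging closer to the ground
height_ball_b = 40.0  # the ball hanging higher up

time_a = fall_time(height_ball_a)
time_b = fall_time(height_ball_b)

first = "A" if time_a < time_b else "B"
print(f"Ball A lands after {time_a:.2f} s, ball B after {time_b:.2f} s")
print(f"Ball {first} lands first")
```

Under these assumptions, mass does not appear in the formula at all, so the starting height alone decides which ball lands first.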

You can tell influencer vs real folks. Even @Teknium1 kissing the ring. There is reason they didn’t talked about FrontierMath, Arc-AGI or HLE while hyping this as “smartest model”. My initial testing has same vibe as @karpathy: approaching o1-pro but not even close to o3-mini.

— relletreknit (@relletreknit) February 18, 2025

What's more, many others are questioning why xAI did not release Grok 3's scores on FrontierMath, Arc-AGI, or HLE benchmarks.

Of course, we point to these shortcomings not to denigrate Grok 3, which we are sure is a very capable AI model, but to question the veracity of xAI's best-in-class claims.

In other news, Bloomberg recently reported that xAI was in talks with existing investors to raise as much as $10 billion in a new funding round that would value the startup at $75 billion. In the last such funding round, xAI had raised $6 billion at a $40 billion valuation.

We were barely able to train at 10k early last year, but we got 100k training non-stop for Grok3. So proud, more to come!

— Guodong Zhang (@Guodzh) February 18, 2025

Finally, we note that xAI's Guodong Zhang recently disclosed that Grok 3 was trained on 100,000 GPUs, with "more to come." It is hardly a surprise, therefore, that BESI thinks the revenue from selling AI chips will climb to $227 billion by 2032.

About the author: Writing is my one incontrovertible passion. Over the past six years, I have authored over 2,200 distinct articles on financial and tech-related topics, spanning nearly 1 million words, and I have been a member of the Wccftech mobile team since 2025. As an alumnus of the University of Toronto, Rotman Commerce Program, I bring nuance, in-depth knowledge, and a unique perspective to every topic that I cover. When I'm not writing, I'm traveling the world, exploring hidden confectionaries and restaurants as an aspiring food connoisseur.
