Elon Musk's AI company, xAI, has released its latest-generation AI model, Grok 2. Musk's ownership stakes in Tesla and X have helped xAI generate demand for its products and secure the expensive computing resources needed to train artificial intelligence models. Today's announcement follows Musk's comments earlier this year, which promised an update on the model soon.
Along with Grok, Amazon-backed Anthropic's Claude, Microsoft-backed OpenAI's ChatGPT, Facebook owner Meta's Llama, and Google's Gemini are among the leading-edge AI software products in the world. All of these offer AI features for general consumer and enterprise use cases, and Grok 2's release covers both, too.
xAI Releases Grok 2 & Grok 2 Mini With Claims Of Major Performance Edge Over OpenAI's GPT 4 & Anthropic's Claude
xAI's latest Grok release includes an early preview of Grok 2 and a smaller Grok 2 mini model, both of which will be available to users on Musk's X social media platform. Grok 2 has been tested on the AI benchmark run by UC Berkeley's Large Model Systems Organization (LMSYS), which shows that it has nearly matched OpenAI's GPT-4o.
According to LMSYS, Grok 2 ranked second in math and coding and third in the ability to respond to hard prompts, which led to a third-place position on the overall leaderboard. Ahead of Grok 2 are OpenAI's GPT-4o and Google's Gemini 1.5 Pro.
xAI's own data shows that Grok 2 outperforms GPT-4 Turbo and lags GPT-4o by a small margin. GPT-4o remains the top performer even in xAI's data, with an overall LMSYS Elo rating of 1,314. xAI's early version of Grok 2, on the other hand, has a rating of 1,281, while Gemini 1.5 Pro has a median score of 1,297.
When it comes to chatbot performance, Grok 2 lags Gemini 1.5 Pro in 'win rate,' which measures the percentage of head-to-head comparisons in which a model's response was rated better. Its win rate against Google's product is 48%. xAI's data does not show a comparable figure for OpenAI's GPT-4o, a model that allows users to upload images and ask the AI to generate responses based on them.
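The relationship between a rating gap and a win rate can be sketched with the standard Elo expected-score formula. This is only an illustration: it assumes the textbook logistic form with base 10 and a scale of 400, and it is not confirmed to be how xAI computed its 48% figure. The function name `expected_win_rate` and the constants are hypothetical; the ratings are the ones quoted above.

```python
def expected_win_rate(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected share of head-to-head wins for A over B under the standard Elo model.

    Assumption: textbook Elo logistic curve (base 10, scale 400).
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# Ratings quoted in the article (taken from xAI's data).
GROK_2 = 1281
GEMINI_1_5_PRO = 1297
GPT_4O = 1314

if __name__ == "__main__":
    for name, rating in [("Gemini 1.5 Pro", GEMINI_1_5_PRO), ("GPT-4o", GPT_4O)]:
        rate = expected_win_rate(GROK_2, rating)
        print(f"Grok 2 vs {name}: {rate:.1%}")
```

Under these assumptions, the 16-point gap to Gemini 1.5 Pro implies an expected win rate of roughly 47.7% for Grok 2, which is in line with the 48% reported. The GPT-4o value the script prints (about 45.3%) is purely a model-based estimate, not a number xAI published.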
Factual correctness is another key area where xAI claims to have improved Grok 2's performance. Early AI models have been criticized for being factually incorrect, and the firm's internal 'AI Tutors' gave Grok 2 and Grok 2 mini win rates of 62.9% and 59.6% in factuality, both major improvements over the previous iteration's 50% win rate.
Grok 2 comes with "advanced capabilities in both text and vision understanding," says xAI, adding that the model uses data available on X. Like other AI products, Grok 2 mini appears to be geared towards general consumer use, supporting features such as writing, coding, or generating text responses to prompts.
xAI says that Grok 2 and Grok 2 mini will be available to developers through its enterprise API by the end of this month. The API offers "multi-region inference deployments for low-latency access across the world" as well as compulsory multi-factor authentication, data analytics for billing, traffic analysis, and integration with in-house business systems.
Woah, another exciting update from Chatbot Arena❤️🔥
The results for @xAI’s sus-column-r (Grok 2 early version) are now public!
With over 12,000 community votes, sus-column-r has secured the #3 spot on the overall leaderboard, even matching GPT-4o! It excels in Coding (#2),… https://t.co/gqSWSwYN0z pic.twitter.com/j9UYDBYNt4
— lmsys.org (@lmsysorg) August 14, 2024
