OpenAI finally released open-weight models a few days ago, marking its entry into a segment currently heavily dominated by Chinese AI models.
OpenAI's First Mainstream Open-Weight Models Manage to Beat Chinese Alternatives in Several Areas
Well, it seems like American AI companies have now started to follow what their Chinese counterparts have been doing for several years, which is building an open-source ecosystem around their LLMs. Interestingly, President Trump's 'AI Action Plan' gives priority to open-source AI models, and OpenAI has acted in line with this by releasing the gpt-oss models. For those unaware, they are its first open-weight models since GPT-2, and they come in two configurations: gpt-oss-20b and gpt-oss-120b.
Diving into the specifics of OpenAI's latest open-weight models, gpt-oss-20b is an MoE (Mixture of Experts) transformer with 21 billion parameters. It offers a context window of up to 131,072 tokens and can run on platforms with 16GB of VRAM, so consumer GPUs with that much memory can run it locally. On the other hand, gpt-oss-120b is the larger model, with 117 billion parameters and stronger reasoning performance, and running it requires at least a single NVIDIA H100.
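As a rough back-of-the-envelope check on those hardware figures, the sketch below estimates how much memory the weights alone would occupy for a given parameter count and precision. The parameter counts come from the paragraph above; the bits-per-weight values (16 for half precision, 4.25 for a 4-bit block format with shared scales) are assumptions made for illustration, and the estimate ignores the KV cache and activations.

```python
def weight_memory_gib(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory needed to hold the weights alone, in GiB."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# Parameter counts are from the article; bits per weight are assumed values.
for name, params in [("gpt-oss-20b", 21), ("gpt-oss-120b", 117)]:
    for bits in (16, 4.25):
        gib = weight_memory_gib(params, bits)
        print(f"{name} @ {bits} bits/weight: {gib:.1f} GiB")
```

Under these assumptions, the 20B model's weights come to a little over 10 GiB at roughly 4 bits per weight, which is consistent with the 16GB claim, while the 120B model's come to just under 58 GiB, which would fit on a single 80GB H100.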
More importantly, these models are released under the Apache 2.0 license, a permissive license allowing commercial use, modification, and redistribution. In licensing terms, this makes them as open as their Chinese counterparts. For OpenAI, this release is one of a kind and is probably a response to Chinese developments. Chinese AI firms like DeepSeek, Alibaba, and many others have maintained an open-source ecosystem for several years now, while in the US, apart from Meta's Llama, few mainstream models have made their way into such an ecosystem.
So, now that OpenAI has finally decided to release open-weight models, we can expect more releases from the company, but for now, let's compare gpt-oss with Chinese alternatives. By parameter count, Chinese alternatives beat OpenAI's options by a wide margin, with models like DeepSeek-V2, Qwen3, and many others having higher figures, even when counting only active parameters. Considering China's top AI models from DeepSeek and Alibaba, here's how things pan out:
| Category | GPT‑OSS 120B / 20B | DeepSeek-V2 / R1 | Qwen3 / Qwen2.5 / QwQ |
|---|---|---|---|
| Organization | OpenAI | DeepSeek (China) | Alibaba (China) |
| Model Type | Sparse MoE (Mixture of Experts) | Sparse MoE | Dense & MoE hybrids |
| Total Parameters | ~117B / ~21B | 236B / 67B | 235B / 72B / 32B / others |
| Active Parameters | ~5.1B / ~3.6B | ~21B / ~6.7B | ~22B (Qwen3-235B) / ~3B (Qwen3-30B-A3B) |
| Context Window | 128K tokens | 128K tokens | 128K (Qwen3), 32K (Qwen2.5) |
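
To make the total-versus-active comparison in the table concrete, here is a small sketch that computes what share of each model's parameters is active per token. The active figures are the approximate ones from the table; the gpt-oss totals use the 117B and 21B counts quoted earlier in the article, and only the models whose total and active figures can be paired unambiguously are included.

```python
# (total parameters, active parameters) in billions, approximate values
# taken from the article. Which rows to include is a choice made for
# this illustration.
models = {
    "gpt-oss-120b": (117, 5.1),
    "gpt-oss-20b": (21, 3.6),
    "DeepSeek-V2": (236, 21),
    "Qwen3-235B": (235, 22),
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Share of parameters used per token in a sparse MoE model."""
    return active_b / total_b


for name, (total_b, active_b) in models.items():
    share = active_fraction(total_b, active_b) * 100
    print(f"{name}: {active_b}B of {total_b}B active ({share:.1f}%)")
```

A lower share means less compute per generated token relative to the model's total size, which is the sense in which a sparse MoE model can be cheaper to run than its headline parameter count suggests.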
The total and active parameter counts aren't the only factors in determining whether a model is superior, but on paper, Chinese models do have a considerable edge over OpenAI right now, mainly because they have been in the game for several years. Now, let's factor in the performance of these AI models across several well-known benchmarks such as MMLU (Massive Multitask Language Understanding), AIME (American Invitational Mathematics Examination), and many others, with results taken from testing by Clarifai.
| Benchmark Task | GPT‑OSS‑120B | GLM‑4.5 | Qwen‑3 Thinking | DeepSeek R1 | Kimi K2 |
|---|---|---|---|---|---|
| MMLU‑Pro (Reasoning) | ~90.0% | 84.6% | 84.4% | 85.0% | 81.1% |
| AIME Math (w/tools) | ~96.6–97.9% | ~91% | ~92.3% | ~87.5% | ~49–69% |
| GPQA (PhD Science) | ~80.9% | 79.1% | 81.1% | 81.0% | 75.1% |
| SWE‑bench (Coding) | 62.4% | 64.2% | — | ~65.8% | ~65.8% |
| TAU‑bench (Agents) | ~67.8% | 79.7% | ~67.8% | ~63.9% | ~70.6% |
| BFCL‑v3 (Function Calling) | ~67–68% | 77.8% | 71.9% | 37% | — |
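
One way to read the table is to ask how far gpt-oss-120b sits from the best-scoring rival on each benchmark. The sketch below does that for three rows that have a single figure for every model; rows with ranges or missing entries are left out to avoid guessing. The helper name `margin_over_best_rival` is hypothetical, and the scores are the approximate ones from the table.

```python
# Scores in percent, copied from the table above (approximate values).
scores = {
    "MMLU-Pro": {"GPT-OSS-120B": 90.0, "GLM-4.5": 84.6,
                 "Qwen-3 Thinking": 84.4, "DeepSeek R1": 85.0, "Kimi K2": 81.1},
    "GPQA": {"GPT-OSS-120B": 80.9, "GLM-4.5": 79.1,
             "Qwen-3 Thinking": 81.1, "DeepSeek R1": 81.0, "Kimi K2": 75.1},
    "TAU-bench": {"GPT-OSS-120B": 67.8, "GLM-4.5": 79.7,
                  "Qwen-3 Thinking": 67.8, "DeepSeek R1": 63.9, "Kimi K2": 70.6},
}


def margin_over_best_rival(row: dict, model: str = "GPT-OSS-120B"):
    """Return (best rival, model score minus that rival's score)."""
    rivals = {name: score for name, score in row.items() if name != model}
    best = max(rivals, key=rivals.get)
    return best, round(row[model] - rivals[best], 1)


for benchmark, row in scores.items():
    rival, margin = margin_over_best_rival(row)
    print(f"{benchmark}: {margin:+.1f} points vs {rival}")
```

A positive margin means gpt-oss-120b leads the strongest rival on that benchmark; a negative one means it trails.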
This shows that gpt-oss-120b leads the competition on MMLU-Pro by about five points and on AIME by roughly four to six points, while GPQA is effectively a tie. Moreover, it has a smaller active parameter footprint than many dense models, making it a more cost-effective option for those who want to run the model locally. However, the benchmarks do indicate that gpt-oss-120b lags behind Chinese alternatives in agentic workloads, function calling, and multilingual capability, though it is still a top-tier choice for this ecosystem.
Open-weight models are the way to go in the AI industry since they bring several benefits to the general ecosystem. OpenAI's efforts should strengthen the position of the US in this segment, which had previously been dominated by Chinese AI companies. Sam Altman and his team will likely be happy with the results.
