NVIDIA has announced its latest open models under the Nemotron 3 family, which comes in three sizes and promises higher throughput and lower inference costs.
NVIDIA Nemotron 3 Open AI Models In Nano "30B", Super "100B", Ultra "500B" Sizes Announced
Press Release: NVIDIA today announced the NVIDIA Nemotron 3 family of open models, data, and libraries designed to power transparent, efficient, & specialized agentic AI development across industries.
The Nemotron 3 models — with Nano, Super, and Ultra sizes — introduce a breakthrough hybrid latent mixture-of-experts (MoE) architecture that helps developers build and deploy reliable multi-agent systems at scale.
NVIDIA Nemotron supports NVIDIA’s broader sovereign AI efforts, with organizations from Europe to South Korea adopting open, transparent, and efficient models that allow them to build AI systems aligned to their own data, regulations, and values.
Early adopters, including Accenture, Cadence, CrowdStrike, Cursor, Deloitte, EY, Oracle Cloud Infrastructure, Palantir, Perplexity, ServiceNow, Siemens, and Zoom, are integrating models from the Nemotron family to power AI workflows across manufacturing, cybersecurity, software development, media, communications, and other industries.
Open Nemotron 3 models enable startups to build and iterate faster on AI agents and accelerate innovation from prototype to enterprise deployment. Portfolio companies from Mayfield are exploring Nemotron 3 to build AI teammates that support human-AI collaboration.
Nemotron 3 Reinvents Multi-Agent AI With Efficiency and Accuracy
The Nemotron 3 family of MoE models includes three sizes:
- Nemotron 3 Nano, a small 30-billion-parameter model with 3 billion active, for targeted, highly efficient tasks.
- Nemotron 3 Super, a high-accuracy reasoning model with approximately 100 billion parameters and 10 billion active, for multi-agent applications.
- Nemotron 3 Ultra, a large reasoning engine with about 500 billion parameters and 50 billion active, for complex AI applications.
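As a quick illustration of what "active" means here: in a mixture-of-experts model, only a subset of the parameters is used for each token. The sketch below takes the total and active counts quoted in the list above and computes the active share. The dictionary and function names are made up for this example, and the Super and Ultra figures are approximate in the announcement.

```python
# Total and active parameter counts (in billions) as quoted in the announcement.
# The Super and Ultra figures are approximate in the source.
NEMOTRON_3_SIZES = {
    "Nano": (30, 3),
    "Super": (100, 10),
    "Ultra": (500, 50),
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Return the share of parameters used per token in an MoE model."""
    if total_b <= 0:
        raise ValueError("total parameter count must be positive")
    return active_b / total_b


for name, (total_b, active_b) in NEMOTRON_3_SIZES.items():
    share = active_fraction(total_b, active_b)
    print(f"Nemotron 3 {name}: {share:.0%} of parameters active per token")
```

On the quoted figures, all three sizes activate roughly one-tenth of their parameters per token.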
Available today, Nemotron 3 Nano is the most compute-cost-efficient model, optimized for targeted tasks such as software debugging, content summarization, AI assistants, and information retrieval at low inference costs. The model uses a unique hybrid MoE architecture, delivering gains in efficiency and scalability.
This design achieves up to 4x higher token throughput compared with Nemotron 2 Nano and reduces reasoning-token generation by up to 60%, significantly lowering inference costs. With a 1-million-token context window, Nemotron 3 Nano remembers more, making it more accurate and better able to connect information over long, multistep tasks.
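To get a feel for how these two figures could combine, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that both "up to" numbers hold at the same time and against the same baseline, which the announcement does not state. The function name is hypothetical, and real-world results will depend on hardware, batch size, and workload.

```python
def relative_reasoning_time(throughput_gain: float, token_reduction: float) -> float:
    """Time spent generating reasoning tokens, relative to a baseline of 1.0.

    throughput_gain: e.g. 4.0 for 4x more tokens per second.
    token_reduction: e.g. 0.60 for 60% fewer reasoning tokens.
    """
    if throughput_gain <= 0:
        raise ValueError("throughput_gain must be positive")
    if not 0 <= token_reduction < 1:
        raise ValueError("token_reduction must be in [0, 1)")
    # Fewer tokens to generate, each produced faster.
    return (1.0 - token_reduction) / throughput_gain


# Best case only: both "up to" figures from the announcement applied together.
best_case = relative_reasoning_time(throughput_gain=4.0, token_reduction=0.60)
print(f"Best-case relative reasoning time: {best_case:.2f}")
```

Under those assumptions the result is about 0.10, meaning the reasoning portion of a response would take roughly one-tenth of the baseline time. Treat this as an upper bound on the benefit rather than an expected figure.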
Artificial Analysis, an independent organization that benchmarks AI, ranked the model as the most open and efficient among models of the same size, with leading accuracy.
Nemotron 3 Super excels at applications that require many collaborating agents to achieve complex tasks with low latency. Nemotron 3 Ultra serves as an advanced reasoning engine for AI workflows that demand deep research and strategic planning.
Nemotron 3 Super and Ultra use NVIDIA’s ultra-efficient 4-bit NVFP4 training format on the NVIDIA Blackwell architecture, significantly cutting memory requirements and speeding up training. This efficiency allows larger models to be trained on existing infrastructure without compromising accuracy relative to higher-precision formats.
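A rough sketch of why a 4-bit format cuts memory: storing each weight in 4 bits instead of 16 takes a quarter of the space. The calculation below covers raw weight storage only. It ignores the scaling factors that block-scaled formats carry alongside the values, as well as gradients, optimizer state, and activations, so actual training footprints will be considerably larger. The 16-bit comparison point is our assumption, since the announcement only refers to "higher-precision formats", and the function name is hypothetical.

```python
def weight_memory_gb(params_billions: float, bits_per_param: float) -> float:
    """Raw weight storage in gigabytes (1 GB = 1e9 bytes), excluding any overhead."""
    total_bits = params_billions * 1e9 * bits_per_param
    return total_bits / 8 / 1e9


# Approximate parameter counts from the announcement (in billions).
for name, params_b in {"Super": 100, "Ultra": 500}.items():
    four_bit = weight_memory_gb(params_b, 4)
    sixteen_bit = weight_memory_gb(params_b, 16)  # assumed comparison format
    print(f"Nemotron 3 {name}: ~{four_bit:.0f} GB at 4-bit vs ~{sixteen_bit:.0f} GB at 16-bit")
```

For the roughly 100-billion-parameter Super model, that works out to about 50 GB of weights at 4 bits versus about 200 GB at 16 bits.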
With the Nemotron 3 family of models, developers can choose the open model that is right-sized for their specific workloads, scaling from dozens to hundreds of agents while benefiting from faster, more accurate long-horizon reasoning for complex workflows.
Get Started With NVIDIA Open Models
Nemotron 3 Nano is available today on Hugging Face and through inference service providers, including Baseten, DeepInfra, Fireworks, FriendliAI, OpenRouter, and Together AI.
Nemotron is offered on enterprise AI and data infrastructure platforms, including Couchbase, DataRobot, H2O.ai, JFrog, Lambda, and UiPath. For customers on public clouds, Nemotron 3 Nano will soon be available on AWS via Amazon Bedrock (serverless) and supported on Google Cloud, CoreWeave, Nebius, Nscale, and Yotta.
Nemotron 3 Nano is available as an NVIDIA NIM™ microservice, enabling secure, scalable deployment anywhere on NVIDIA-accelerated infrastructure with maximum privacy and control. NVIDIA Nemotron 3 Super and Ultra are expected to be available in the first half of 2026.
