NVIDIA Nemotron 3 Models Announced: Open AI Models In Nano, Super, Ultra Sizes, 4x Faster Vs Nemotron 2

Dec 15, 2025 at 09:00am EST

NVIDIA has announced its latest open models under the Nemotron 3 family, which comes in three sizes and promises up to 4x higher token throughput than the previous generation.

NVIDIA Nemotron 3 Open AI Models In Nano "30B", Super "100B", Ultra "500B" Sizes Announced

Press Release: NVIDIA today announced the NVIDIA Nemotron 3 family of open models, data, and libraries designed to power transparent, efficient, & specialized agentic AI development across industries.


The Nemotron 3 models — with Nano, Super, and Ultra sizes — introduce a breakthrough hybrid latent mixture-of-experts (MoE) architecture that helps developers build and deploy reliable multi-agent systems at scale.

NVIDIA Nemotron supports NVIDIA’s broader sovereign AI efforts, with organizations from Europe to South Korea adopting open, transparent, and efficient models that allow them to build AI systems aligned to their own data, regulations, and values.

Early adopters, including Accenture, Cadence, CrowdStrike, Cursor, Deloitte, EY, Oracle Cloud Infrastructure, Palantir, Perplexity, ServiceNow, Siemens, and Zoom, are integrating models from the Nemotron family to power AI workflows across manufacturing, cybersecurity, software development, media, communications, and other industries.

Open Nemotron 3 models enable startups to build and iterate faster on AI agents and accelerate innovation from prototype to enterprise deployment. Portfolio companies from Mayfield are exploring Nemotron 3 to build AI teammates that support human-AI collaboration.

Nemotron 3 Reinvents Multi-Agent AI With Efficiency and Accuracy

The Nemotron 3 family of MoE models includes three sizes: Nano (30B parameters), Super (100B), and Ultra (500B).

Available today, Nemotron 3 Nano is the most compute-cost-efficient model, optimized for targeted tasks such as software debugging, content summarization, AI assistants, and information retrieval at low inference costs. The model uses a unique hybrid MoE architecture, delivering gains in efficiency and scalability.

This design achieves up to 4x higher token throughput compared with Nemotron 2 Nano and reduces reasoning-token generation by up to 60%, significantly lowering inference costs. With a 1-million-token context window, Nemotron 3 Nano can keep more information in context, making it more accurate and better able to connect information across long, multistep tasks.
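To see how those two figures could compound into lower inference cost, here is a rough back-of-the-envelope sketch in Python. The baseline throughput, GPU price, and token counts are hypothetical placeholders, not NVIDIA figures. Only the 4x and 60% upper bounds come from the announcement, and real-world gains will depend on the workload.

```python
def cost_per_request(answer_tokens, reasoning_tokens, tokens_per_second, gpu_cost_per_hour):
    """Approximate GPU cost (in dollars) of generating one response.

    Cost is modeled as generated tokens divided by aggregate throughput,
    multiplied by the hourly GPU price converted to a per-second price.
    """
    total_tokens = answer_tokens + reasoning_tokens
    gpu_seconds = total_tokens / tokens_per_second
    return gpu_seconds * gpu_cost_per_hour / 3600.0


# Hypothetical baseline workload (illustrative values, not from the announcement).
BASE_TOKENS_PER_SECOND = 1000.0
GPU_COST_PER_HOUR = 3.0
ANSWER_TOKENS = 500
REASONING_TOKENS = 2000

# Upper bounds quoted in the announcement ("up to" figures, so a best case).
THROUGHPUT_GAIN = 4.0
REASONING_TOKEN_REDUCTION = 0.60

baseline = cost_per_request(
    ANSWER_TOKENS, REASONING_TOKENS, BASE_TOKENS_PER_SECOND, GPU_COST_PER_HOUR
)
best_case = cost_per_request(
    ANSWER_TOKENS,
    REASONING_TOKENS * (1 - REASONING_TOKEN_REDUCTION),
    BASE_TOKENS_PER_SECOND * THROUGHPUT_GAIN,
    GPU_COST_PER_HOUR,
)

print(f"baseline cost per request:  ${baseline:.6f}")
print(f"best-case cost per request: ${best_case:.6f}")
print(f"best-case cost ratio:       {baseline / best_case:.2f}x cheaper")
```

The two effects multiply: fewer reasoning tokens shrink the work per request, while higher throughput shrinks the cost of each token. How much the token reduction matters depends on how reasoning-heavy the workload is.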

Artificial Analysis, an independent organization that benchmarks AI, ranked the model as the most open and efficient among models of the same size, with leading accuracy.

Nemotron 3 Super excels at applications that require many collaborating agents to achieve complex tasks with low latency. Nemotron 3 Ultra serves as an advanced reasoning engine for AI workflows that demand deep research and strategic planning.

Nemotron 3 Super and Ultra use NVIDIA’s ultra-efficient 4-bit NVFP4 training format on the NVIDIA Blackwell architecture, significantly cutting memory requirements and speeding up training. This efficiency allows larger models to be trained on existing infrastructure without compromising accuracy relative to higher-precision formats.
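As a rough illustration of why a 4-bit format cuts memory, the sketch below estimates raw weight storage for the Super and Ultra parameter counts quoted in the headline (100B and 500B), comparing 16-bit precision with a 4-bit block-scaled format. The layout used here (blocks of 16 values sharing one 8-bit scale) is an assumption about NVFP4 that should be checked against NVIDIA's documentation. The estimate also ignores activations, gradients, optimizer state, and any tensors kept in higher precision, so it does not model actual training memory.

```python
def weight_gigabytes(num_params, bits_per_value, block_size=None, scale_bits=0):
    """Estimate raw weight storage in gigabytes (1 GB = 1e9 bytes).

    If block_size is given, each block of that many values carries one
    extra scale factor of scale_bits bits (assumed layout, see lead-in).
    """
    total_bits = num_params * bits_per_value
    if block_size:
        total_bits += (num_params / block_size) * scale_bits
    return total_bits / 8 / 1e9


# Parameter counts taken from the article's headline sizes.
MODELS = {"Nemotron 3 Super": 100e9, "Nemotron 3 Ultra": 500e9}

for name, params in MODELS.items():
    sixteen_bit = weight_gigabytes(params, 16)
    # Assumed 4-bit layout: 16 values per block, one 8-bit scale per block.
    four_bit = weight_gigabytes(params, 4, block_size=16, scale_bits=8)
    print(
        f"{name}: {sixteen_bit:.1f} GB at 16-bit vs "
        f"{four_bit:.1f} GB at 4-bit ({sixteen_bit / four_bit:.2f}x smaller)"
    )
```

Under these assumptions the per-block scale adds about half a bit per value, so the saving comes out somewhat under the nominal 4x.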

With the Nemotron 3 family of models, developers can choose the open model that is right-sized for their specific workloads, scaling from dozens to hundreds of agents while benefiting from faster, more accurate long-horizon reasoning for complex workflows.

Get Started With NVIDIA Open Models

Nemotron 3 Nano is available today on Hugging Face and through inference service providers, including Baseten, DeepInfra, Fireworks, FriendliAI, OpenRouter, and Together AI.

Nemotron is offered on enterprise AI and data infrastructure platforms, including Couchbase, DataRobot, H2O.ai, JFrog, Lambda, and UiPath. For customers on public clouds, Nemotron 3 Nano will soon be available on AWS via Amazon Bedrock (serverless) and supported on Google Cloud, CoreWeave, Nebius, Nscale, and Yotta.

Nemotron 3 Nano is available as an NVIDIA NIM™ microservice for secure, scalable deployment anywhere on NVIDIA-accelerated infrastructure for maximum privacy and control. NVIDIA Nemotron 3 Super and Ultra are expected to be available in the first half of 2026.

About the author: A software engineer by training and a PC enthusiast by passion, Hassan Mujtaba serves as Wccftech's Senior Editor for the hardware section. With years of experience in the industry, he specializes in deep-dive technical analysis of next-generation CPU and GPU architectures, motherboards, and cooling solutions. His work covers breaking news on upcoming technologies as well as extensive hands-on reviews and benchmarking.
