    Nvidia’s new open weights Nemotron 3 super combines three different architectures to beat gpt-oss and Qwen in throughput

    By admin | March 12, 2026 | 5 Mins Read

    Multi-agent systems, designed to handle long-horizon tasks like software engineering or cybersecurity triaging, can generate up to 15 times the token volume of standard chats — threatening their cost-effectiveness in handling enterprise tasks.

    But today, Nvidia sought to help solve this problem with the release of Nemotron 3 Super, a 120-billion-parameter hybrid model, with weights posted on Hugging Face.

    By merging disparate architectural philosophies (state-space models, transformers, and a novel “Latent” mixture-of-experts design), Nvidia is attempting to provide the specialized depth required for agentic workflows without the bloat typical of dense reasoning models, with the weights available for commercial use under a mostly open license.

    Triple hybrid architecture

    At the core of Nemotron 3 Super is a sophisticated architectural triad that balances memory efficiency with precision reasoning. The model utilizes a Hybrid Mamba-Transformer backbone, which interleaves Mamba-2 layers with strategic Transformer attention layers.

    To understand the implications for enterprise production, consider the “needle in a haystack” problem. Mamba-2 layers act like a “fast-travel” highway system, handling the vast majority of sequence processing with linear-time complexity. This allows the model to maintain a massive 1-million-token context window without the memory footprint of the KV cache exploding. However, pure state-space models often struggle with associative recall. 

    To fix this, Nvidia strategically inserts Transformer attention layers as “global anchors,” ensuring the model can precisely retrieve specific facts buried deep within a codebase or a stack of financial reports.
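To make the memory argument concrete, here is a back-of-the-envelope sketch of how the KV cache scales with the number of attention layers. The layer counts, head counts, and head size are hypothetical values picked for illustration, not Nemotron 3 Super's published configuration, and the fixed-size recurrent state that Mamba-style layers keep is left out because it does not grow with sequence length.

```python
def kv_cache_bytes(seq_len, attn_layers, kv_heads, head_dim, bytes_per_value=2):
    """Bytes needed to cache keys and values for one sequence."""
    # The leading 2 counts one key tensor plus one value tensor per attention layer.
    return 2 * attn_layers * kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical shapes for illustration only (not the model's real configuration).
TOTAL_LAYERS = 48
ATTENTION_LAYERS_IN_HYBRID = 6
KV_HEADS = 8
HEAD_DIM = 128
SEQ_LEN = 1_000_000

# Every layer is an attention layer, so every layer caches keys and values.
full_attention = kv_cache_bytes(SEQ_LEN, TOTAL_LAYERS, KV_HEADS, HEAD_DIM)

# Only the few interleaved attention layers keep a cache that grows with length.
hybrid = kv_cache_bytes(SEQ_LEN, ATTENTION_LAYERS_IN_HYBRID, KV_HEADS, HEAD_DIM)

print(f"all-attention stack: {full_attention / 2**30:.1f} GiB")
print(f"hybrid stack:        {hybrid / 2**30:.1f} GiB")
print(f"reduction:           {full_attention // hybrid}x")
```

Under these assumed numbers the cache shrinks in direct proportion to the share of layers that use attention, which is the intuition behind keeping only a handful of attention "anchors" in a long-context model.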

    Beyond the backbone, the model introduces Latent Mixture-of-Experts (LatentMoE). Traditional Mixture-of-Experts (MoE) designs route tokens to experts in their full hidden dimension, which creates a computational bottleneck as models scale. LatentMoE solves this by projecting tokens into a compressed space before routing them to specialists. 

    This “expert compression” allows the model to consult four times as many specialists for the exact same computational cost. This granularity is vital for agents that must switch between Python syntax, SQL logic, and conversational reasoning within a single turn.
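The "four times as many specialists" claim can be checked with rough arithmetic. The sketch below assumes each expert is a simple two-matrix feed-forward block and that the latent space is one quarter of the model width; the function names and all dimensions are hypothetical, since the article does not give the real sizes.

```python
def expert_flops(width, d_ff):
    """Approximate FLOPs for one token through one two-matrix feed-forward expert."""
    # Up-projection (width x d_ff) plus down-projection (d_ff x width),
    # counting 2 FLOPs per multiply-add.
    return 2 * (width * d_ff) + 2 * (d_ff * width)


def standard_moe_flops(d_model, d_ff, active_experts):
    """Experts operate directly on the full hidden dimension."""
    return active_experts * expert_flops(d_model, d_ff)


def latent_moe_flops(d_model, d_latent, d_ff, active_experts):
    """Experts operate on a compressed representation of each token."""
    # One shared projection into the latent space and one back out.
    projection = 2 * (d_model * d_latent) + 2 * (d_latent * d_model)
    return projection + active_experts * expert_flops(d_latent, d_ff)


# Hypothetical sizes for illustration; the article does not publish these.
D_MODEL, D_FF = 4096, 2048
D_LATENT = D_MODEL // 4

baseline = standard_moe_flops(D_MODEL, D_FF, active_experts=2)
expert_only = 8 * expert_flops(D_LATENT, D_FF)
with_projections = latent_moe_flops(D_MODEL, D_LATENT, D_FF, active_experts=8)

print(f"standard MoE, 2 experts:               {baseline:,} FLOPs/token")
print(f"latent MoE, 8 experts (experts only):  {expert_only:,} FLOPs/token")
print(f"latent MoE, 8 experts (+ projections): {with_projections:,} FLOPs/token")
```

With a 4x narrower input, eight latent experts cost the same as two full-width experts. The shared down- and up-projections add an overhead in this toy model that is paid once per token, not once per expert; how the real design accounts for it is not described in the article.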

    Further accelerating the model is Multi-Token Prediction (MTP). While standard models predict a single next token, MTP predicts several future tokens simultaneously. This serves as a “built-in draft model,” enabling native speculative decoding that can deliver up to 3x wall-clock speedups for structured generation tasks like code or tool calls.
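The draft-then-verify loop behind speculative decoding can be sketched with two toy functions. Everything here is illustrative: `target_next` stands in for the full model, `draft_next` stands in for a cheap draft head that is sometimes wrong, and neither reflects how Nemotron 3 Super's MTP heads are actually implemented.

```python
PATTERN = ["def", "f", "(", "x", ")", ":", "return", "x"]


def target_next(prefix):
    """Toy 'full' model: deterministically continues a repeating pattern."""
    return PATTERN[len(prefix) % len(PATTERN)]


def draft_next(prefix):
    """Toy draft head: agrees with the target except at every fifth position."""
    return "?" if len(prefix) % 5 == 4 else target_next(prefix)


def speculative_step(prefix, k):
    """Propose k draft tokens, keep the verified run, then add one target token."""
    draft = []
    for _ in range(k):
        draft.append(draft_next(prefix + draft))

    accepted = []
    for token in draft:
        # In a real system these k checks come from one batched forward pass.
        if target_next(prefix + accepted) != token:
            break
        accepted.append(token)

    # The target always contributes one token: a correction or a bonus token.
    accepted.append(target_next(prefix + accepted))
    return accepted


def generate(n_tokens, k):
    """Generate n_tokens and count how many verification steps were needed."""
    out, steps = [], 0
    while len(out) < n_tokens:
        out.extend(speculative_step(out, k))
        steps += 1
    return out[:n_tokens], steps


tokens, steps = generate(16, k=3)
print(" ".join(tokens))
print(f"{len(tokens)} tokens in {steps} verification steps")
```

The output is identical to what the target would produce one token at a time, because every drafted token is checked before it is kept. The saving comes from needing fewer verification steps than tokens, and it grows with how often the draft agrees with the target, which is why predictable output such as code or tool calls benefits most.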

    The Blackwell advantage

    For enterprises, the most significant technical leap in Nemotron 3 Super is its optimization for the Nvidia Blackwell GPU platform. By pre-training natively in NVFP4 (4-bit floating point), Nvidia has achieved a breakthrough in production efficiency.

    On Blackwell, the model delivers 4x faster inference than 8-bit models running on the previous Hopper architecture, with no loss in accuracy.
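For a sense of what 4-bit floating point does to individual values, the sketch below rounds numbers onto the eight magnitudes that a 2-exponent-bit, 1-mantissa-bit float (E2M1) can represent, using one scale factor per block. The block size of 16 and the use of an ordinary Python float for the scale are simplifying assumptions, the weights are made up, and this is not a description of Nvidia's training recipe.

```python
# Magnitudes representable by a 4-bit float with 2 exponent bits and 1 mantissa bit.
E2M1_GRID = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
BLOCK_SIZE = 16  # assumed block size for this sketch


def quantize_block(values):
    """Return (scale, codes), where each code is a signed value on the E2M1 grid."""
    peak = max(abs(v) for v in values)
    # Map the largest magnitude in the block onto the largest grid value.
    scale = peak / E2M1_GRID[-1] if peak > 0 else 1.0
    codes = []
    for v in values:
        magnitude = min(E2M1_GRID, key=lambda g: abs(abs(v) / scale - g))
        codes.append(magnitude if v >= 0 else -magnitude)
    return scale, codes


def dequantize_block(scale, codes):
    """Recover approximate values from a block's scale and 4-bit codes."""
    return [scale * c for c in codes]


def quantize(values):
    """Split a flat list into blocks and quantize each block with its own scale."""
    return [
        quantize_block(values[start:start + BLOCK_SIZE])
        for start in range(0, len(values), BLOCK_SIZE)
    ]


weights = [0.02 * i - 0.3 for i in range(32)]  # made-up weights
blocks = quantize(weights)
restored = [x for scale, codes in blocks for x in dequantize_block(scale, codes)]
worst = max(abs(a - b) for a, b in zip(weights, restored))
print(f"{len(blocks)} blocks, worst absolute error {worst:.4f}")
```

Giving each small block its own scale keeps the coarse 4-bit grid centered on the values actually present, which is what lets so few bits track the original numbers at all.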

    In practical performance, Nemotron 3 Super is a specialized tool for agentic reasoning.

    It currently holds the No. 1 position on the DeepResearch Bench, a benchmark measuring an AI’s ability to conduct thorough, multi-step research across large document sets.

    | Benchmark | Nemotron 3 Super | Qwen3.5-122B-A10B | GPT-OSS-120B |
    |---|---|---|---|
    | General Knowledge | | | |
    | MMLU-Pro | 83.73 | 86.70 | 81.00 |
    | Reasoning | | | |
    | AIME25 (no tools) | 90.21 | 90.36 | 92.50 |
    | HMMT Feb25 (no tools) | 93.67 | 91.40 | 90.00 |
    | HMMT Feb25 (with tools) | 94.73 | 89.55 | — |
    | GPQA (no tools) | 79.23 | 86.60 | 80.10 |
    | GPQA (with tools) | 82.70 | — | 80.09 |
    | LiveCodeBench (v5 2024-07↔2024-12) | 81.19 | 78.93 | 88.00 |
    | SciCode (subtask) | 42.05 | 42.00 | 39.00 |
    | HLE (no tools) | 18.26 | 25.30 | 14.90 |
    | HLE (with tools) | 22.82 | — | 19.0 |
    | Agentic | | | |
    | Terminal Bench (hard subset) | 25.78 | 26.80 | 24.00 |
    | Terminal Bench Core 2.0 | 31.00 | 37.50 | 18.70 |
    | SWE-Bench (OpenHands) | 60.47 | 66.40 | 41.9 |
    | SWE-Bench (OpenCode) | 59.20 | 67.40 | — |
    | SWE-Bench (Codex) | 53.73 | 61.20 | — |
    | SWE-Bench Multilingual (OpenHands) | 45.78 | — | 30.80 |
    | TauBench V2 (Airline) | 56.25 | 66.0 | 49.2 |
    | TauBench V2 (Retail) | 62.83 | 62.6 | 67.80 |
    | TauBench V2 (Telecom) | 64.36 | 95.00 | 66.00 |
    | TauBench V2 (Average) | 61.15 | 74.53 | 61.0 |
    | BrowseComp with Search | 31.28 | — | 33.89 |
    | BIRD Bench | 41.80 | — | 38.25 |
    | Chat & Instruction Following | | | |
    | IFBench (prompt) | 72.56 | 73.77 | 68.32 |
    | Scale AI Multi-Challenge | 55.23 | 61.50 | 58.29 |
    | Arena-Hard-V2 | 73.88 | 75.15 | 90.26 |
    | Long Context | | | |
    | AA-LCR | 58.31 | 66.90 | 51.00 |
    | RULER @ 256k | 96.30 | 96.74 | 52.30 |
    | RULER @ 512k | 95.67 | 95.95 | 46.70 |
    | RULER @ 1M | 91.75 | 91.33 | 22.30 |
    | Multilingual | | | |
    | MMLU-ProX (avg over langs) | 79.36 | 85.06 | 76.59 |
    | WMT24++ (en→xx) | 86.67 | 87.84 | 88.89 |

    It also demonstrates significant throughput advantages, achieving up to 2.2x higher throughput than gpt-oss-120B and 7.5x higher than Qwen3.5-122B in high-volume settings.

    Nvidia Nemotron 3 Super key benchmarks chart. Nvidia

    Custom ‘open’ license — commercial usage but with important caveats 

    The release of Nemotron 3 Super under the Nvidia Open Model License Agreement (updated October 2025) provides a permissive framework for enterprise adoption, though it carries distinct “safeguard” clauses that differentiate it from pure open-source licenses like MIT or Apache 2.0.

    Key Provisions for Enterprise Users:

    • Commercial Usability: The license explicitly states that models are “commercially usable” and grants a perpetual, worldwide, royalty-free license to sell and distribute products built on the model.

    • Ownership of Output: Nvidia makes no claim to the outputs generated by the model; the responsibility for those outputs—and the ownership of them—rests entirely with the user.

    • Derivative Works: Enterprises are free to create and own “Derivative Models” (fine-tuned versions), provided they include the required attribution notice: “Licensed by Nvidia Corporation under the Nvidia Open Model License.”

    The “Red Lines”:

    The license includes two critical termination triggers that production teams must monitor:

    1. Safety Guardrails: The license automatically terminates if a user bypasses or circumvents the model’s “Guardrails” (technical limitations or safety hyperparameters) without implementing a “substantially similar” replacement appropriate for the use case.

    2. Litigation Trigger: If a user institutes copyright or patent litigation against Nvidia alleging that the model infringes on their IP, their license to use the model terminates immediately.

    This structure allows Nvidia to foster a commercial ecosystem while protecting itself from “IP trolling” and ensuring that the model isn’t stripped of its safety features for malicious use.

    ‘The team really cooked’

    The release has generated significant buzz within the developer community. Chris Alexiuk, a Senior Product Research Engineer at Nvidia, heralded the launch on X under his handle @llm_wizard as a “SUPER DAY,” emphasizing the model’s speed and transparency. “Model is: FAST. Model is: SMART. Model is: THE MOST OPEN MODEL WE’VE DONE YET,” Alexiuk posted, highlighting the release of not just weights, but also 10 trillion tokens of training data and recipes.

    The industry adoption reflects this enthusiasm:

    • Cloud and Hardware: The model is being deployed as an Nvidia NIM microservice, allowing it to run on-premises via the Dell AI Factory or HPE, as well as across Google Cloud, Oracle, and shortly, AWS and Azure.

    • Production Agents: Companies like CodeRabbit (software development) and Greptile are integrating the model to handle large-scale codebase analysis, while industrial leaders like Siemens and Palantir are deploying it to automate complex workflows in manufacturing and cybersecurity.

    As Kari Briski, Nvidia VP of AI Software, noted: “As companies move beyond chatbots and into multi-agent applications, they encounter… context explosion.”

    Nemotron 3 Super is Nvidia’s answer to that explosion—a model that provides the “brainpower” of a 120B parameter system with the operational efficiency of a much smaller specialist. For the enterprise, the message is clear: the “thinking tax” is finally coming down.
