    Teaching AI models to say “I’m not sure” | MIT News

    By admin, April 22, 2026

    Confidence is persuasive. In artificial intelligence systems, it is often misleading.

    Today’s most capable reasoning models share a trait with the loudest voice in the room: They deliver every answer with the same unshakable certainty, whether they’re right or guessing. Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have now traced that overconfidence to a specific flaw in how these models are trained, and developed a method that fixes it without giving up any accuracy.

    The technique, called RLCR (Reinforcement Learning with Calibration Rewards), trains language models to produce calibrated confidence estimates alongside their answers. In addition to coming up with an answer, the model thinks about its uncertainty in that answer, and outputs a confidence score. In experiments across multiple benchmarks, RLCR reduced calibration error by up to 90 percent while maintaining or improving accuracy, both on the tasks the model was trained on and on entirely new ones it had never seen. The work will be presented at the International Conference on Learning Representations later this month.

    The problem traces to a surprisingly simple source. The reinforcement learning (RL) methods behind recent breakthroughs in AI reasoning, including the training approach used in systems like OpenAI’s o1, reward models for getting the right answer, and penalize them for getting it wrong. Nothing in between. A model that arrives at the correct answer through careful reasoning receives the same reward as one that guesses correctly by chance. Over time, this trains models to confidently answer every question they are asked, whether they have strong evidence or are effectively flipping a coin.

    That overconfidence has consequences. When models are deployed in medicine, law, finance, or any setting where users make decisions based on AI outputs, a system that expresses high confidence regardless of its actual certainty becomes unreliable in ways that are difficult to detect from the outside. A model that says “I’m 95 percent sure” when it is right only half the time is more dangerous than one that simply gets the answer wrong, because users have no signal to seek a second opinion.
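To make the "95 percent sure but right half the time" case concrete, here is a minimal sketch of one common way to measure miscalibration, expected calibration error (ECE). The article does not say which calibration metric the researchers report, so the choice of ECE, the function name `expected_calibration_error`, and the ten equal-width bins are all assumptions made for illustration.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Weighted average of |accuracy - mean confidence| over equal-width bins.

    Illustrative helper, not taken from the RLCR paper.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("need at least one prediction")

    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Clamp so that a confidence of exactly 1.0 lands in the last bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(accuracy - mean_conf)
    return ece


# A model that always says 0.95 but is right only half the time.
overconfident = expected_calibration_error([0.95] * 10, [True, False] * 5)
# A model that says 0.5 on the same answers.
calibrated = expected_calibration_error([0.5] * 10, [True, False] * 5)
print(f"overconfident: {overconfident:.2f}, calibrated: {calibrated:.2f}")
# prints "overconfident: 0.45, calibrated: 0.00"
```

Both hypothetical models have the same accuracy. Only the stated confidence differs, and that difference is the signal a user would need in order to know when to seek a second opinion.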

    “The standard training approach is simple and powerful, but it gives the model no incentive to express uncertainty or say I don’t know,” says Mehul Damani, an MIT PhD student and co-lead author on the paper. “So the model naturally learns to guess when it is unsure.” 

    RLCR addresses this by adding a single term to the reward function: a Brier score, a well-established measure that penalizes the squared gap between a model’s stated confidence and whether its answer actually turned out to be correct. During training, models learn to reason about both the problem and their own uncertainty, producing an answer and a confidence estimate together. Confidently wrong answers are penalized. So are unnecessarily uncertain correct ones.
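The difference between the two reward schemes can be sketched in a few lines. The article does not give the exact formula, so the form used here (a 0/1 correctness reward minus the squared gap between confidence and outcome) is an assumption, and the names `binary_reward` and `calibration_reward` are hypothetical.

```python
def binary_reward(is_correct: bool) -> float:
    """Correctness-only reward: the stated confidence never enters."""
    return 1.0 if is_correct else 0.0


def calibration_reward(is_correct: bool, confidence: float) -> float:
    """Correctness reward minus a Brier penalty (confidence - outcome)^2.

    Assumed form for illustration; the paper's exact reward may differ.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    outcome = 1.0 if is_correct else 0.0
    return outcome - (confidence - outcome) ** 2


for is_correct in (True, False):
    for confidence in (0.95, 0.5):
        print(
            f"correct={is_correct!s:<5} confidence={confidence:.2f} "
            f"binary={binary_reward(is_correct):.1f} "
            f"calibrated={calibration_reward(is_correct, confidence):+.4f}"
        )
```

Under the binary scheme a lucky guess and a careful derivation score the same. Under the assumed calibration reward, a wrong answer at 0.95 confidence scores about -0.90 while the same wrong answer at 0.5 scores -0.25, and a correct answer at 0.5 (0.75) scores below a correct answer at 0.95 (about 1.00), which matches the two penalties the paragraph above describes.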

    The math backs it up: the team proved formally that this type of reward structure guarantees models that are both accurate and well-calibrated. They then tested the approach on a 7-billion-parameter model across a range of question-answering and math benchmarks, including six datasets the model had never been trained on.
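A short worked derivation shows why a Brier-style term pushes stated confidence toward the truth. This is an illustration under assumptions, not the paper's proof: suppose the reward is $R = c - (q - c)^2$, where $c \in \{0, 1\}$ marks whether the answer is correct, $q \in [0, 1]$ is the stated confidence, and $p$ is the true probability that the answer is correct. None of these symbols appear in the article.

```latex
\begin{align*}
\mathbb{E}[R \mid q]
  &= p\,\bigl(1 - (1 - q)^2\bigr) + (1 - p)\,\bigl(-q^2\bigr) \\
  &= p - p\,(1 - q)^2 - (1 - p)\,q^2, \\[4pt]
\frac{\partial}{\partial q}\,\mathbb{E}[R \mid q]
  &= 2p\,(1 - q) - 2\,(1 - p)\,q \;=\; 2\,(p - q), \\[4pt]
\frac{\partial^2}{\partial q^2}\,\mathbb{E}[R \mid q]
  &= -2 \;<\; 0
  \quad\Longrightarrow\quad q^\star = p, \\[4pt]
\mathbb{E}[R \mid q^\star]
  &= p - p\,(1 - p)^2 - (1 - p)\,p^2 \;=\; p - p\,(1 - p) \;=\; p^2 .
\end{align*}
```

Under this assumed reward, the expected payoff is maximized by reporting $q = p$, and the best achievable value $p^2$ still grows with $p$, so the model has no incentive to trade accuracy for calibration.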

    The results showed a consistent pattern. Standard RL training actively degraded calibration compared to the base model, making models worse at estimating their own uncertainty. RLCR reversed that effect, substantially improving calibration with no loss in accuracy. The method also outperformed post-hoc approaches, in which a separate classifier is trained to assign confidence scores after the fact. “What’s striking is that ordinary RL training doesn’t just fail to help calibration. It actively hurts it,” says Isha Puri, an MIT PhD student and co-lead author. “The models become more capable and more overconfident at the same time.”

    The team also demonstrated that the confidence estimates produced by RLCR are practically useful at inference time. When models generate multiple candidate answers, selecting the one with the highest self-reported confidence, or weighting votes by confidence in a majority-voting scheme, improves both accuracy and calibration as compute scales.
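The two inference-time strategies mentioned above can be sketched as follows. The data layout (a list of answer and confidence pairs), the function names, and the sample values are assumptions for illustration, and answers are compared as exact strings, which a real system would likely replace with some normalization step.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

# (answer, self-reported confidence in [0, 1]); a hypothetical layout.
Candidate = Tuple[str, float]


def plain_majority_vote(candidates: List[Candidate]) -> str:
    """Baseline: every sampled answer counts as one vote."""
    if not candidates:
        raise ValueError("need at least one candidate")
    counts = Counter(answer for answer, _ in candidates)
    return counts.most_common(1)[0][0]


def pick_most_confident(candidates: List[Candidate]) -> str:
    """Return the answer from the single most confident sample."""
    if not candidates:
        raise ValueError("need at least one candidate")
    answer, _ = max(candidates, key=lambda item: item[1])
    return answer


def confidence_weighted_vote(candidates: List[Candidate]) -> str:
    """Each sample votes for its answer with weight equal to its confidence."""
    if not candidates:
        raise ValueError("need at least one candidate")
    totals: Dict[str, float] = defaultdict(float)
    for answer, confidence in candidates:
        totals[answer] += confidence
    # Ties resolve to the answer that was seen first.
    return max(totals, key=lambda answer: totals[answer])


samples = [("42", 0.9), ("41", 0.4), ("41", 0.35), ("42", 0.2), ("41", 0.3)]
print(plain_majority_vote(samples))       # prints "41" (3 votes to 2)
print(pick_most_confident(samples))       # prints "42" (confidence 0.9)
print(confidence_weighted_vote(samples))  # prints "42" (1.10 vs 1.05)
```

In this made-up sample, three low-confidence samples outvote two others under plain majority voting, while both confidence-aware strategies choose the other answer. Whether that helps in practice depends on the confidences being well calibrated, which is the property RLCR is trained for.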

    An additional finding suggests that the act of reasoning about uncertainty itself has value. The researchers trained classifiers on model outputs and found that including the model’s explicit uncertainty reasoning in the input improved the classifier’s performance, particularly for smaller models. The model’s self-reflective reasoning about what it does and doesn’t know contains real information, not just decoration.

    In addition to Damani and Puri, other authors on the paper are Stewart Slocum, Idan Shenfeld, Leshem Choshen, and senior authors Jacob Andreas and Yoon Kim.
