
Adding uncertainty phrasing can help

By Simon Osuji
January 23, 2025
in Artificial Intelligence


Study finds mismatch between human perception and reliability of AI-assisted language tools
“There’s a disconnect between what LLMs know and what people think they know,” says Mark Steyvers. Credit: Steve Zylius/UCI

As AI tools like ChatGPT become more mainstream in day-to-day tasks and decision-making processes, the ability to trust and decipher errors in their responses is critical. A new study by cognitive and computer scientists at the University of California, Irvine finds people generally overestimate the accuracy of large language model (LLM) outputs.


But with some tweaks, says lead author Mark Steyvers, cognitive sciences professor and department chair, these tools can be trained to provide explanations that enable users to gauge uncertainty and better distinguish fact from fiction.

“There’s a disconnect between what LLMs know and what people think they know,” said Steyvers. “We call this the calibration gap. At the same time, there’s also a discrimination gap—how well humans and models can distinguish between correct and incorrect answers. Our study looks at how we can narrow these gaps.”
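The two gaps Steyvers describes can be made concrete with a toy calculation. The formulations below are illustrative assumptions for exposition, not the paper's exact metrics:

```python
# Illustrative sketch of the "calibration gap" and "discrimination gap".
# These definitions are assumptions made for this example, not the
# study's precise formulations.

def calibration_gap(human_confidence, model_confidence):
    """Average gap between what people think the model knows and the
    model's own internal confidence, per question."""
    pairs = zip(human_confidence, model_confidence)
    return sum(abs(h - m) for h, m in pairs) / len(human_confidence)

def discrimination_gap(confidences, correct):
    """Mean confidence on correct answers minus mean confidence on
    incorrect ones: larger values mean confidence better separates
    right answers from wrong ones."""
    right = [c for c, ok in zip(confidences, correct) if ok]
    wrong = [c for c, ok in zip(confidences, correct) if not ok]
    return sum(right) / len(right) - sum(wrong) / len(wrong)

# Toy data: people are uniformly overconfident relative to the model.
human = [0.90, 0.85, 0.80, 0.90]
model = [0.60, 0.70, 0.40, 0.90]
correct = [True, True, False, True]

print(calibration_gap(human, model))
print(discrimination_gap(human, correct))
```

In this toy example, human ratings sit well above the model's own confidence (a calibration gap) and barely distinguish the wrong answer from the right ones (a small discrimination gap), which is the pattern the study reports.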

The findings, published online in Nature Machine Intelligence, are some of the first to explore how LLMs communicate uncertainty. The research team included cognitive sciences graduate students Heliodoro Tejeda, Xinyue Hu and Lukas Mayer; Aakriti Kumar, ’24 Ph.D.; and Sheer Karny, junior specialist. They were joined by Catarina Belem, graduate student, and Padhraic Smyth, Distinguished Professor of computer science and director of the Data Science Initiative.

Currently, LLMs, including ChatGPT, don’t automatically include language in their responses indicating the tool’s level of confidence in its accuracy. This can mislead users, says Steyvers, as responses often appear confident even when they are wrong.

With this in mind, researchers created a set of online experiments to provide insight into how humans and LLMs perceive AI-assisted responses. They recruited 301 native English-speaking participants in the U.S.; the 284 who provided demographic data were 51% female and 49% male, with a median age of 34.

Participants were randomly assigned sets of 40 multiple choice and short-answer questions from the Massive Multitask Language Understanding dataset—a comprehensive question bank ranging in difficulty from high school to professional level, covering topics in STEM, humanities, social sciences and other fields.

For the first experiment, participants were shown default LLM-generated answers to each question and asked to rate the likelihood that the responses were correct. The research team found that participants consistently overestimated the reliability of LLM outputs; standard explanations did not enable them to judge the likelihood of correctness, leading to a misalignment between the perception and the reality of the LLM’s accuracy.

“This tendency toward overconfidence in LLM capabilities is a significant concern, particularly in scenarios where critical decisions rely on LLM-generated information,” he said. “The inability of users to discern the reliability of LLM responses not only undermines the utility of these models, but also poses risks in situations where user understanding of model accuracy is critical.”

The next experiment used the same 40-question/LLM-provided answer format, but instead of a singular, default LLM response to each question, the research team manipulated the prompts so that each answer choice included uncertainty language that was linked to the LLM’s internal confidence.

Phrasing indicated the LLM’s level of confidence in accuracy—low (“I am not sure the answer is A”), medium (“I am somewhat sure the answer is A”) and high (“I am sure the answer is A”)—alongside explanations of varying lengths.
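A minimal sketch of how such confidence-linked phrasing could be generated is shown below. The study tied the wording to the LLM’s internal confidence; the numeric thresholds here are illustrative assumptions, and only the three phrasings themselves come from the article:

```python
# Sketch: map a model's internal confidence score to the low/medium/high
# uncertainty phrasing described in the study. The 0.5 and 0.8 cutoffs
# are assumptions for illustration, not values from the paper.

def uncertainty_phrase(answer: str, confidence: float) -> str:
    """Prefix an answer choice with confidence-linked language."""
    if confidence < 0.5:
        return f"I am not sure the answer is {answer}"
    elif confidence < 0.8:
        return f"I am somewhat sure the answer is {answer}"
    else:
        return f"I am sure the answer is {answer}"

print(uncertainty_phrase("A", 0.35))  # low confidence
print(uncertainty_phrase("A", 0.65))  # medium confidence
print(uncertainty_phrase("A", 0.95))  # high confidence
```

In practice the confidence score would come from the model itself (for example, the probability it assigns to its chosen answer), which is what lets the phrasing track what the model actually "knows."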

Researchers found that providing uncertainty language strongly influenced human confidence. Low-confidence LLM explanations produced significantly lower human confidence in accuracy than those marked by the LLM as medium, with a similar pattern emerging for medium vs. high confidence explanations.

The length of the explanations also affected human confidence in the LLM answers. Participants had higher confidence in longer explanations than shorter ones, even when the extra length didn’t improve answer accuracy.

Taken together, the findings underscore the importance of uncertainty communication and the effect of explanation length in influencing user trust in AI-assisted decision-making environments, said Steyvers.

“By modifying the language of LLM responses to better reflect model confidence, users can improve calibration in their assessment of LLMs’ reliability and are better able to discriminate between correct and incorrect answers,” he said. “This highlights the need for transparent communication from LLMs, suggesting a need for more research on how model explanations affect user perception.”

More information:
Mark Steyvers et al, What large language models know and what people think they know, Nature Machine Intelligence (2025). DOI: 10.1038/s42256-024-00976-7

Provided by
University of California, Irvine

Citation:
People overestimate reliability of AI-assisted language tools: Adding uncertainty phrasing can help (2025, January 23)
retrieved 23 January 2025
from https://techxplore.com/news/2025-01-people-overestimate-reliability-ai-language.html





