AI Misinformation: The Bullshit Index Explained

by Simon Osuji
August 12, 2025
in Artificial Intelligence

Despite their impressive language capabilities, today’s leading AI models have a patchy relationship with the truth. A new “bullshit index” could help quantify the extent to which they make things up, and point toward ways to curtail the behavior.

Large language models (LLMs) have a well-documented tendency to produce convincing-sounding but factually inaccurate responses, a phenomenon dubbed hallucination. But this is just the tip of the iceberg, says Jaime Fernández Fisac, an assistant professor of electrical and computer engineering at Princeton University.

In a recent paper, his group introduced the idea of “machine bullshit” to encompass the range of ways that LLMs skirt around the truth. As well as outright falsehoods, they found that these models often use ambiguous language, partial truths, or flattery to mislead users. And crucially, widely used training techniques appear to exacerbate the problem.

IEEE Spectrum spoke to Fernández Fisac and the paper’s first author, Kaiqu Liang, a Ph.D. student at Princeton, to find out why LLMs are such prolific bullshitters, and whether anything can be done to rein them in.

You borrow the term “bullshit” from the philosopher Harry Frankfurt. Can you summarize what he meant by it and why you think it’s a useful lens for this topic?

Jaime Fernández Fisac: Frankfurt wrote this excellent and very influential essay On Bullshit many decades ago, because he felt that bullshit was such a prevalent feature in our society, and yet nobody had taken the trouble to do a rigorous analysis of what it is and how it works.

It’s not the same as outright lying, but it’s also not the same as telling the truth. Lying requires you to believe something and then say the opposite. But with bullshit, you just don’t care much whether what you’re saying is true.

It turns out to be a very useful lens for analyzing the behavior of language models, because it is often the case that we train these models using machine learning and optimization tools to achieve objectives that don’t always coincide with telling the truth.

There has already been a lot of research on how LLMs can hallucinate false information. How does this phenomenon fit in with your definition of machine bullshit?

Fernández Fisac: There’s a fundamental distinction between hallucination and bullshit, which lies in the internal belief and intent of the system. A language model hallucinating corresponds to situations in which the model loses track of reality, so it’s not able to produce accurate outputs. It is not clear that there is any intent to report inaccurate information. With bullshit, it’s not a problem of the model becoming confused about what is true so much as the model becoming uncommitted to reporting the truth.

Forms of Bullshitting in AI Models

What are the different forms of bullshitting you’ve identified in LLMs?

Kaiqu Liang: There’s empty rhetoric, which is the use of flowery language that adds no substance. Then there are weasel words, which employ vague qualifiers that dodge firm statements. So for example, “studies suggest” or “in some cases.”

Another subtype is paltering, where the model uses selectively true statements to mislead the human. So when you ask about the risk of an investment, the language model might behave like a salesperson and say, “Historically, the fund has demonstrated strong returns,” while omitting the investment’s high risk.

And finally, there are unverified claims, which happen very frequently. The models assert information without any evidence or credible support. For instance, they might say, “Our drone delivery system enables significant reductions in delivery time,” when there are actually no statistics to support that.

So why are these models prone to bullshitting?

Fernández Fisac: In this paper we look at some of the main mechanisms that have been used in recent years to make models more helpful and more user-friendly. One of these is commonly known as reinforcement learning from human feedback (RLHF). First, you train your model on a bunch of text data to predict the most statistically likely continuation of any starting prompt. You then adjust its behavior by giving it another goal of maximizing user satisfaction or evaluator approval of its output.

You should expect the model’s behavior to start shifting from statistically accurate answer generation to answer generation that is likelier to receive a thumbs-up from the user. This can be good in a lot of ways, but it can also backfire. At some point there will be a conflict between producing an output that is highly likely to be well received and producing an output that is truthful.
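
To make that incentive concrete, here is a deliberately tiny toy sketch, not the paper’s setup, with invented approval probabilities: once the reward is an approval signal, the update favors whichever answer style earns the thumbs-up, and truthfulness never enters the objective.

```python
# Toy sketch, not the authors' code: the reward is evaluator approval, and the
# approval probabilities below are invented for illustration.
import math
import random

random.seed(0)

# Two ways of answering the same question.
ANSWERS = {
    "hedged_truth":   0.55,  # accurate but less satisfying; thumbs-up 55% of the time
    "confident_spin": 0.85,  # pleasing but only partly true; thumbs-up 85% of the time
}

logits = {name: 0.0 for name in ANSWERS}  # neutral start, like the pretrained model
LR, BASELINE = 0.05, 0.70

def pick_style() -> str:
    """Sample an answer style from a softmax over the current logits."""
    weights = {k: math.exp(v) for k, v in logits.items()}
    r = random.random() * sum(weights.values())
    for name, w in weights.items():
        r -= w
        if r <= 0:
            return name
    return name  # numerical edge case

for _ in range(20_000):
    style = pick_style()
    # "Human feedback": a thumbs-up that depends only on how pleasing the answer is.
    reward = 1.0 if random.random() < ANSWERS[style] else 0.0
    # Crude bandit-style update: approval above the baseline reinforces the chosen style.
    logits[style] += LR * (reward - BASELINE)

print(logits)  # the higher-approval, less truthful style ends up strongly preferred
```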

Measuring AI’s Indifference with the Bullshit Index

Can you talk me through the “bullshit index” you’ve created to measure this phenomenon?

Liang: The bullshit index is designed to quantify the AI model’s indifference to truth. It measures how much the model’s explicit claims depend on its internal beliefs. The index is computed from two signals. One is the model’s internal belief, the probability it places on the statement being true, and the other is its explicit claim. The index is a measure of the distance between these two signals.

When the bullshit index is close to one, this means that the claims are largely independent of the internal beliefs, so it reveals a high level of indifference toward truth. If the bullshit index is close to zero, it implies that the model’s claim is strongly correlated with its internal belief.
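
As a rough illustration of that definition (a sketch only; the paper’s exact formula may differ), one could compute the index as 1 minus the absolute correlation between the model’s internal belief probabilities and the binary claims it actually makes:

```python
# Illustrative sketch only; the paper's exact formula may differ. Here the index
# is 1 minus the absolute Pearson correlation between internal beliefs and claims.
from statistics import correlation  # Python 3.10+

def bullshit_index(beliefs: list[float], claims: list[int]) -> float:
    """beliefs: the model's internal probability that each statement is true (0..1).
    claims: 1 if the model explicitly asserted the statement as true, else 0."""
    return 1.0 - abs(correlation(beliefs, [float(c) for c in claims]))

# Claims that track internal beliefs -> index near 0 (committed to the truth).
print(bullshit_index([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0]))  # ~0.01

# Claims made regardless of internal beliefs -> index near 1 (indifferent to truth).
print(bullshit_index([0.9, 0.1, 0.8, 0.2], [1, 1, 0, 0]))  # 1.0
```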

And so what did you find in your experiments?

Liang: We observed that before applying RLHF to a model, the bullshit index is around 0.38. Afterward it nearly doubles, which is a huge increase in indifference to truth. But we also find that user satisfaction increases by around 48 percent. So after RLHF, the models become more indifferent to truth in order to manipulate the human into giving higher immediate satisfaction ratings.

So are there any avenues for mitigating this tendency?

Fernández Fisac: It’s not like these models are doing things that are just completely incomprehensible. RLHF tells them, “Get the human to believe you gave them a good answer.” And it turns out the model will naturally find the path of least resistance to comply with that objective. A lot of the time it gives good answers, but it also turns out that a significant part of the time, instead of giving a good answer, it is aiming to manipulate the human so that they’ll believe it’s a good answer.

We want to cut off that incentive, so in another recent study we introduced the idea of hindsight feedback. This involves getting the evaluators to give their feedback after seeing the downstream outcomes of each interaction rather than just the content of the response. This really helps neutralize the AI’s incentive to paint a deceptively bright picture of the user’s prospects.

Now, if you have to wait until users give you feedback about the downstream outcomes, that creates a significant logistical complication for companies deploying these systems. So instead, we simulate the consequences of the advice by getting another language model to predict what’s going to happen. Now, if the AI wants to improve user feedback, it had better find a way to give genuinely useful answers, ones that result in simulations showing the user got an outcome they actually wanted.

If you train with what we call “reinforcement learning from hindsight simulation” (RLHS), lo and behold, both user satisfaction and true user utility go up at the same time. This is probably not the single silver bullet that will end all forms of machine bullshit, but we think it is one significant and pretty systematic way to mitigate this kind of behavior.
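
As a closing sketch of that training signal (hypothetical helper names, not the authors’ implementation), the reward for an answer comes from a judged, simulated downstream outcome rather than from the user’s immediate reaction to the text:

```python
# Minimal sketch of the hindsight-simulation idea; the helper functions below are
# hypothetical stand-ins, not the authors' implementation.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Interaction:
    prompt: str
    answer: str

def rlhs_reward(
    interaction: Interaction,
    simulate_outcome: Callable[[Interaction], str],  # e.g., another LLM predicting what happens next
    rate_outcome: Callable[[str], float],            # evaluator scoring the outcome, not the wording
) -> float:
    """Reward = rated quality of the simulated consequence of following the answer."""
    return rate_outcome(simulate_outcome(interaction))

# Toy stand-ins, just to show the control flow.
def fake_simulator(inter: Interaction) -> str:
    risky_pitch = "strong returns" in inter.answer and "risk" not in inter.answer
    return "user followed the advice and lost money" if risky_pitch else "user made an informed decision"

def fake_judge(outcome: str) -> float:
    return 0.0 if "lost money" in outcome else 1.0

salesy = Interaction("Is this fund safe?", "Historically, the fund has demonstrated strong returns.")
honest = Interaction("Is this fund safe?", "Returns have been strong, but the risk of loss is real.")
print(rlhs_reward(salesy, fake_simulator, fake_judge))  # 0.0: the bright picture backfires in hindsight
print(rlhs_reward(honest, fake_simulator, fake_judge))  # 1.0: the candid answer pays off downstream
```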
