
OpenAI, DeepSeek, and Google vary widely in identifying hate speech

by Simon Osuji
September 11, 2025
in Artificial Intelligence


Credit: Pixabay/CC0 Public Domain

With the proliferation of online hate speech—which, research shows, can increase political polarization and damage mental health—leading artificial intelligence companies have released large language models that promise automatic content filtering.


“Private technology companies have become the de facto arbiters of what speech is permissible in the digital public square, yet they do so without any consistent standard,” says Yphtach Lelkes, associate professor in the Annenberg School for Communication.

He and Annenberg doctoral student Neil Fasching have produced the first large-scale comparative analysis of AI content moderation systems—which social media platforms employ—and tackled the question of how consistent they are in evaluating hate speech. Their study is published in the Findings of the Association for Computational Linguistics: ACL 2025.

Lelkes and Fasching analyzed seven models, some designed specifically for content classification and others more general: two from OpenAI and two from Mistral, along with Claude 3.5 Sonnet, DeepSeek V3, and Google Perspective API. Their analysis covers 1.3 million synthetic sentences making statements about 125 groups, referred to with both neutral terms and slurs, in categories ranging from religion to disability to age. Each sentence combines “all” or “some,” a group term, and a hate speech phrase.
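The template construction described above (a quantifier, a group term, and a phrase) can be sketched as follows. This is an illustrative reconstruction only: the group terms and phrases below are invented placeholders, not the study's actual 125 groups or its hate speech phrases.

```python
# Sketch of template-based sentence generation, with placeholder terms.
from itertools import product

quantifiers = ["All", "Some"]
# Placeholder group terms; the study used 125 groups spanning categories
# such as religion, disability, and age, with both neutral terms and slurs.
groups = ["teachers", "retirees", "immigrants"]
# Placeholder phrases; the study paired groups with hate speech phrases
# (and, for a minority of sentences, neutral or positive ones).
phrases = ["are wonderful people", "should be ignored"]

sentences = [
    f"{q} {g} {p}."
    for q, g, p in product(quantifiers, groups, phrases)
]

print(len(sentences))  # 2 quantifiers x 3 groups x 2 phrases = 12
```

Crossing every quantifier, group, and phrase in this way is how a modest set of templates scales to over a million test sentences.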

Here are three takeaways from their research:

The models make different decisions about the same content

“The research shows that content moderation systems have dramatic inconsistencies when evaluating identical hate speech content, with some systems flagging content as harmful while others deem it acceptable,” Fasching says. This is a critical issue for the public, Lelkes says, because inconsistent moderation can erode trust and create perceptions of bias.

Fasching and Lelkes also found variation in the internal consistency of models: One demonstrated high predictability for how it would classify similar content, another produced different results for similar content, and others showed a more measured approach, neither over-flagging nor under-detecting content as hate speech. “These differences highlight the challenge of balancing detection accuracy with avoiding over-moderation,” the researchers write.
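One simple way to quantify cross-model disagreement of this kind (an illustrative measure, not necessarily the metric the paper uses) is the pairwise agreement rate over binary flagged/not-flagged labels. The model names and labels below are toy data.

```python
# Pairwise agreement between classifiers on the same set of sentences.
from itertools import combinations

# Toy labels: 1 = flagged as hate speech, 0 = deemed acceptable.
labels = {
    "model_a": [1, 1, 0, 1, 0],
    "model_b": [1, 0, 0, 1, 0],
    "model_c": [0, 1, 0, 1, 1],
}

def agreement(x, y):
    """Fraction of items on which two models assign the same label."""
    return sum(a == b for a, b in zip(x, y)) / len(x)

for m1, m2 in combinations(labels, 2):
    print(m1, m2, agreement(labels[m1], labels[m2]))
```

Low pairwise agreement across systems on identical inputs is exactly the inconsistency the researchers describe.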

The variations are especially pronounced for certain groups

“These inconsistencies are especially pronounced for specific demographic groups, leaving some communities more vulnerable to online harm than others,” Fasching says.

He and Lelkes found that hate speech evaluations across the seven systems were more similar for statements about groups based on sexual orientation, race, and gender, while inconsistencies intensified for groups based on education level, personal interest, and economic class. This suggests “that systems generally recognize hate speech targeting traditional protected classes more readily than content targeting other groups,” the authors write.
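A minimal sketch of how per-category inconsistency might be tallied: for each target-group category, count the sentences on which the models fail to agree unanimously. The categories and labels here are toy data, not the study's.

```python
# Disagreement rate by target-group category, on toy labels.
from collections import defaultdict

# (category, label-from-each-model) for a handful of toy sentences.
rows = [
    ("race",      [1, 1, 1]),
    ("race",      [1, 1, 1]),
    ("economic",  [1, 0, 1]),
    ("economic",  [0, 1, 0]),
    ("education", [1, 0, 0]),
]

disagree = defaultdict(lambda: [0, 0])  # category -> [disagreements, total]
for cat, labels in rows:
    disagree[cat][1] += 1
    if len(set(labels)) > 1:  # models did not all give the same label
        disagree[cat][0] += 1

for cat, (d, n) in disagree.items():
    print(cat, d / n)
```

In the study's terms, a pattern like this (unanimity on race, frequent splits on economic class or education) would indicate that systems recognize hate speech against traditional protected classes more readily than against other groups.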

Models handle neutral and positive sentences differently

A minority of the 1.3 million synthetic sentences were neutral or positive, included to assess false identification of hate speech and to test how models handle pejorative terms in non-hateful contexts, such as “All [slur] are great people.”

The researchers found that Claude 3.5 Sonnet and Mistral’s specialized content classification system treat slurs as harmful across the board, whereas the other systems prioritize context and intent. The authors say they were surprised to find that each model consistently fell into one camp or the other, with little middle ground.

More information:
Neil Fasching et al, Model-Dependent Moderation: Inconsistencies in Hate Speech Detection Across LLM-based Systems, Findings of the Association for Computational Linguistics: ACL 2025 (2025). DOI: 10.18653/v1/2025.findings-acl.1144

Provided by
University of Pennsylvania

Citation:
OpenAI, DeepSeek, and Google vary widely in identifying hate speech (2025, September 11)
retrieved 11 September 2025
from https://techxplore.com/news/2025-09-openai-deepseek-google-vary-widely.html
