Building reliable AI models requires understanding the people behind the datasets

by Simon Osuji
August 8, 2023
in Artificial Intelligence


Figure: Correlation with the original Ruddit offensiveness score by race. Annotations by White participants have the highest correlation with the Ruddit score, while annotations by Asian and Black participants are significantly less correlated. Credit: arXiv (2023). DOI: 10.48550/arxiv.2306.06826

Social media companies are increasingly using complex algorithms and artificial intelligence to detect offensive behavior online.


These algorithms and AI systems all rely on data to learn what is offensive. But who’s behind the data, and how do their backgrounds influence their decisions?

In a new study, University of Michigan School of Information assistant professor David Jurgens and doctoral candidate Jiaxin Pei found that the backgrounds of data annotators—the people labeling texts, videos and online media—matter a lot.

“Annotators are not fungible,” Jurgens said. “Their demographics, life experiences and backgrounds all contribute to how they label data. Our study suggests that understanding the background of annotators and collecting labels from a demographically balanced pool of crowdworkers is important to reduce the bias of datasets.”
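To make the "demographically balanced pool" idea concrete, the sketch below shows one way labels could be aggregated so that every demographic group carries equal weight in a comment's consensus score. It is a minimal illustration, not the authors' procedure, and the column names are hypothetical.

```python
# Sketch: aggregate offensiveness labels so each demographic group carries
# equal weight in the consensus score. Column names ("comment_id", "group",
# "offensiveness") are hypothetical and not taken from POPQUORN.
import pandas as pd

def balanced_consensus(annotations: pd.DataFrame) -> pd.Series:
    # 1. Average within each (comment, group) cell so a large group
    #    cannot dominate the label for a comment.
    per_group = (annotations
                 .groupby(["comment_id", "group"])["offensiveness"]
                 .mean())
    # 2. Average the group means, giving every group one "vote" per comment.
    return per_group.groupby(level="comment_id").mean()

# Toy usage: comment 1 -> (mean(4, 5) + mean(1)) / 2 = 2.75, comment 2 -> 2.0
df = pd.DataFrame({
    "comment_id":    [1, 1, 1, 2, 2],
    "group":         ["A", "A", "B", "A", "B"],
    "offensiveness": [4, 5, 1, 2, 2],
})
print(balanced_consensus(df))
```

An unweighted average of the same toy ratings would give comment 1 a score of about 3.3 rather than 2.75, which is the kind of skew a balanced pool is meant to avoid.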

Through an analysis of 6,000 Reddit comments, the study shows that annotators' beliefs and decisions about politeness and offensiveness shape the machine-learning models used to flag the online content we see each day. What one part of the population rates as polite can be rated as much less polite by another.
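The figure accompanying this article reports exactly this kind of gap as a correlation between each group's ratings and the original Ruddit offensiveness score. The sketch below shows how such a per-group correlation could be computed; it is an illustrative approximation with hypothetical column names, not the paper's actual analysis code.

```python
# Sketch: per-group correlation between annotators' mean offensiveness ratings
# and a reference score (e.g., the original Ruddit score). Column names are
# hypothetical placeholders, not the actual POPQUORN schema.
import pandas as pd

def per_group_correlation(annotations: pd.DataFrame,
                          reference: pd.Series) -> pd.Series:
    """annotations: one row per (comment, annotator) with columns
    'comment_id', 'race', 'offensiveness'.
    reference: Ruddit-style score indexed by comment_id."""
    results = {}
    for race, subset in annotations.groupby("race"):
        # Average this group's ratings per comment, then correlate with
        # the reference score over the comments the group actually rated.
        group_means = subset.groupby("comment_id")["offensiveness"].mean()
        aligned_ref = reference.reindex(group_means.index)
        results[race] = group_means.corr(aligned_ref)  # Pearson r
    return pd.Series(results, name="correlation_with_reference")
```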

“AI systems all use this kind of data and our study helps highlight the importance of knowing who is labeling the data,” Pei said. “When people from only one part of the population label the data, the resulting AI system may not represent the average viewpoint.”

Through their research, Jurgens and Pei set out to better understand how annotators' identities and experiences affect their labeling decisions. Previous studies have looked at only one aspect of identity, such as gender. Their hope is to help AI models better represent the beliefs and opinions of all people.

The results demonstrate:

  • While some existing studies suggest that men and women may rate toxic language differently, their research found no statistically significant difference between men and women. However, participants with nonbinary gender identities tended to rate messages as less offensive than those identifying as men or women.
  • Participants older than 60 tended to assign higher offensiveness scores than middle-aged participants.
  • The study found significant racial differences in offensiveness ratings. Black participants tended to rate the same comments as significantly more offensive than participants from all other racial groups did. Classifiers trained on data annotated by White people may therefore systematically underestimate how offensive a comment is to Black and Asian people (see the sketch after this list for one way such a group comparison can be run).
  • No significant differences were found with respect to annotator education.
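Each of these findings is a claim about a statistically significant difference between groups of raters. As a rough illustration of how such a comparison can be run, the sketch below applies a Mann-Whitney U test, a common choice for ordinal rating scales, to two demographic groups; the test and the column names are assumptions made here for illustration, not the procedure reported in the paper.

```python
# Sketch: test whether two annotator groups rate comments differently.
# Mann-Whitney U is used here as one reasonable choice for ordinal ratings;
# it is an illustration, not the statistical test used in the study.
import pandas as pd
from scipy.stats import mannwhitneyu

def compare_groups(annotations: pd.DataFrame,
                   column: str, group_a: str, group_b: str,
                   rating_col: str = "offensiveness"):
    a = annotations.loc[annotations[column] == group_a, rating_col]
    b = annotations.loc[annotations[column] == group_b, rating_col]
    stat, p_value = mannwhitneyu(a, b, alternative="two-sided")
    return stat, p_value

# Example (hypothetical column and group values): are ratings by participants
# over 60 distributed differently from ratings by middle-aged participants?
# stat, p = compare_groups(df, column="age_band", group_a="60+", group_b="35-59")
# print(f"U = {stat:.1f}, p = {p:.4f}")
```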

Using these results, Jurgens and Pei created POPQUORN, the Potato-Prolific dataset for Question Answering, Offensiveness, text Rewriting and politeness rating with demographic Nuance. The dataset offers social media and AI companies an opportunity to explore a model that accounts for intersectional perspectives and beliefs.
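One simple way a downstream system could account for such perspectives, offered here only as a sketch of a possible approach and not as the method proposed with POPQUORN, is to predict each annotator's rating from the text together with that annotator's demographic attributes, rather than collapsing everything into a single consensus label. The feature and column names below are hypothetical.

```python
# Sketch: a demographics-aware rating model. Instead of one label per comment,
# predict each annotator's rating from the comment text plus the annotator's
# demographic attributes. Column names are hypothetical.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

def build_model() -> Pipeline:
    features = ColumnTransformer([
        # Bag-of-words features for the comment itself.
        ("text", TfidfVectorizer(min_df=2), "comment_text"),
        # One-hot encoding of the annotator's demographic attributes.
        ("demo", OneHotEncoder(handle_unknown="ignore"),
         ["race", "gender", "age_band"]),
    ])
    # Ridge regression over per-annotator offensiveness ratings.
    return Pipeline([("features", features), ("reg", Ridge(alpha=1.0))])

# Usage sketch:
# model = build_model()
# model.fit(annotations[["comment_text", "race", "gender", "age_band"]],
#           annotations["offensiveness"])
```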

“Systems like ChatGPT are increasingly used by people for everyday tasks,” Jurgens said. “But whose values are we instilling in the trained model? If we keep taking a representative sample without accounting for differences, we continue marginalizing certain groups of people.”

Pei said that POPQUORN helps ensure everyone has equitable systems that match their beliefs and backgrounds.

The study is published on the arXiv preprint server.

More information:
Jiaxin Pei et al, When Do Annotator Demographics Matter? Measuring the Influence of Annotator Demographics with the POPQUORN Dataset, arXiv (2023). DOI: 10.48550/arxiv.2306.06826

Journal information: arXiv

Provided by: University of Michigan

Citation: Building reliable AI models requires understanding the people behind the datasets (2023, August 8), retrieved 8 August 2023 from https://techxplore.com/news/2023-08-reliable-ai-requires-people-datasets.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without written permission. The content is provided for information purposes only.




