
Using a multilingual dataset to enhance hateful video detection on YouTube and Bilibili

By Simon Osuji
October 21, 2024, in Artificial Intelligence


Figure: Zero crossing rate of English YouTube videos (y-axis: zero crossing indicator; x-axis: time in seconds).

Social media has revolutionized the way information is shared throughout communities, but it can also be a cesspit of hateful content. Current research on hateful content detection focuses on text-based analysis, while hateful video detection remains underexplored.


“Hate speech in videos can be conveyed through body language, tone, and imagery, which traditional text analysis misses. As platforms like YouTube and TikTok reach large audiences, hateful content in video form can be more persuasive and emotionally engaging, increasing the risk of influencing or radicalizing viewers,” explained Roy Lee, Assistant Professor at the Singapore University of Technology and Design (SUTD).

In his paper “MultiHateClip: A multilingual benchmark dataset for hateful video detection on YouTube and Bilibili,” Asst Prof Lee led a team to develop MultiHateClip, a novel multilingual dataset that aims to enhance hateful video detection on social media platforms. He previously created SGHateCheck, a novel functional test evaluating hate speech in multilingual environments. The study is published on the arXiv preprint server.

Using hate lexicons and human annotations focused on gender-based hate, MultiHateClip classifies videos into three categories: hateful, offensive, and normal. Hateful content discriminates against a specific group of people based on attributes such as sexual orientation.

Offensive content is distressing, but lacks the targeted harm of hate speech and does not incite hatred. Normal content is neither hateful nor offensive. Compared to a binary classification (hateful versus non-hateful), this three-category system allows for a more nuanced approach to content moderation.
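As a rough sketch of why the three-way scheme is strictly more informative than a binary one, the labels can be modeled as an enum with a projection back to the coarser hateful/non-hateful split (the identifier names here are illustrative, not taken from the dataset):

```python
from enum import Enum

class VideoLabel(Enum):
    """Three-way scheme described in the article (names are illustrative)."""
    HATEFUL = "hateful"      # targeted discrimination against a group
    OFFENSIVE = "offensive"  # distressing, but without targeted harm
    NORMAL = "normal"        # neither hateful nor offensive

def to_binary(label: VideoLabel) -> str:
    """Collapse the three-way label into the coarser binary scheme.

    Offensive and normal both map to "non-hateful", which is exactly the
    distinction a binary classifier throws away.
    """
    return "hateful" if label is VideoLabel.HATEFUL else "non-hateful"
```

The projection is lossy in one direction only: a moderator can always recover the binary decision from the three-way label, but not vice versa, which is why the finer scheme supports more nuanced moderation policies.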

After reviewing over 10,000 videos, the team curated 1,000 annotated short clips each from YouTube and Bilibili to represent the English and Chinese languages, respectively, for MultiHateClip. Among these clips, a consistent pattern of gender-based hate against women emerged. Most of these videos used a combination of text, visual, and auditory elements to convey hate, underscoring the need for a multimodal approach to understanding hate speech.

Compared to existing datasets that are simpler and lack detail, MultiHateClip is enriched with fine-grained, comprehensive annotations. It distinguishes between hateful and offensive videos, and outlines which segments of the video are hateful, who the targeted victims are, and what modalities portray hatefulness (i.e., text, visual, auditory). It also provides a strong cross-cultural perspective as it includes videos from both Western (YouTube) and Chinese (Bilibili) contexts, highlighting how hate is expressed differently across cultures.
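The fine-grained annotations described above could be represented as a record like the following. This is a hypothetical sketch: the field names and types are assumptions for illustration, not MultiHateClip's actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class VideoAnnotation:
    """Illustrative annotation record; field names are assumptions,
    not the dataset's published schema."""
    video_id: str
    platform: str                 # "YouTube" (English) or "Bilibili" (Chinese)
    label: str                    # "hateful" | "offensive" | "normal"
    # Which segments of the video are hateful, as (start_sec, end_sec) pairs
    hateful_segments: list = field(default_factory=list)
    # Who the targeted victims are, e.g. ["women"]
    target_groups: list = field(default_factory=list)
    # Which modalities portray hatefulness: subset of {"text", "visual", "auditory"}
    modalities: list = field(default_factory=list)

# Example record for a hypothetical clip
ann = VideoAnnotation(
    video_id="yt_0001",
    platform="YouTube",
    label="hateful",
    hateful_segments=[(12.0, 18.5)],
    target_groups=["women"],
    modalities=["text", "auditory"],
)
```

Carrying segment spans, target groups, and modalities alongside the top-level label is what lets downstream models localize hate within a video rather than only classifying the clip as a whole.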

The team expected distinguishing hateful videos from offensive ones to be difficult as both share similarities, such as inflammatory language and controversial topics. Hateful speech targets specific groups, while offensive content causes discomfort without intending to discriminate. The subtle differences in tone, context, and intent make it challenging for human annotators and machine learning models to draw the line between hateful and offensive content.

“Additionally, cultural and linguistic nuances further complicate the distinction, particularly in multilingual contexts like English and Chinese, where expressions of hate or offense may vary significantly. This complexity underscores the need for more sophisticated detection models that can capture subtle distinctions,” emphasized Asst Prof Lee.

The study also tested state-of-the-art hate video detection models with MultiHateClip. Results highlighted three critical limitations in current models: the difficulty in distinguishing between hateful and offensive content, the limitations of pre-trained models in handling non-Western cultural data, and the insufficient understanding of implicit hate. These gaps emphasize the need for culturally sensitive and multimodal approaches to hate speech detection.

MultiHateClip reflects the value of intersecting design, artificial intelligence, and technology. Its real-world significance is clear: detecting hate speech and preventing its dissemination. Built for video content with a cross-cultural focus, the dataset is especially useful on social media platforms where video is the primary form of communication, such as YouTube, TikTok, and Bilibili. Content moderators, policymakers, and educational organizations can use MultiHateClip to understand and mitigate the spread of hate speech.

“Overall, MultiHateClip plays a crucial role in creating safer, more inclusive online environments,” said Asst Prof Lee, who shared the possibility of collaborating with social media platforms to deploy the model in real-world settings. In addition, the team could potentially look into broadening the dataset to include more languages and cultural contexts, improving model performance by creating better algorithms that can distinguish between hateful and offensive content, and developing real-time hate speech detection tools.

More information:
Han Wang et al, MultiHateClip: A Multilingual Benchmark Dataset for Hateful Video Detection on YouTube and Bilibili, arXiv (2024). DOI: 10.48550/arxiv.2408.03468

Journal information:
arXiv

Provided by
Singapore University of Technology and Design

Citation:
Using a multilingual dataset to enhance hateful video detection on YouTube and Bilibili (2024, October 21)
retrieved 21 October 2024
from https://techxplore.com/news/2024-10-multilingual-dataset-video-youtube-bilibili.html

