Meta unveils five AI models for multi-modal processing, music generation, and more

by Simon Osuji
June 19, 2024
in Artificial Intelligence


Meta has unveiled five major AI models and research projects, including multi-modal systems that process both text and images, next-generation language models, music generation, AI speech detection, and efforts to improve diversity in AI systems.

The releases come from Meta’s Fundamental AI Research (FAIR) team, which has focused on advancing AI through open research and collaboration for over a decade. As AI advances rapidly, Meta believes working with the global community is crucial.

“By publicly sharing this research, we hope to inspire iterations and ultimately help advance AI in a responsible way,” said Meta.

Chameleon: Multi-modal text and image processing

Among the releases are key components of Meta’s ‘Chameleon’ models, made available under a research license. Chameleon is a family of multi-modal models that can understand and generate both text and images simultaneously, unlike most large language models, which are typically unimodal.

“Just as humans can process the words and images simultaneously, Chameleon can process and deliver both image and text at the same time,” explained Meta. “Chameleon can take any combination of text and images as input and also output any combination of text and images.”

Potential use cases range from generating creative captions to prompting new scenes with a mix of text and images.

Multi-token prediction for faster language model training

Meta has also released pretrained models for code completion that use ‘multi-token prediction’ under a non-commercial research license. Traditional language model training predicts only the next word at each step, which is inefficient. Multi-token models instead predict several future words at once, which can make training faster.

“While [the one-word] approach is simple and scalable, it’s also inefficient. It requires several orders of magnitude more text than what children need to learn the same degree of language fluency,” said Meta.
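To make the contrast concrete, here is a minimal sketch of how the training targets differ between the two approaches. It is not Meta’s implementation: the function names, the `n_future` parameter, and the toy token list are assumptions chosen for illustration, and a real model would work on token IDs and learn from these targets rather than just enumerate them.

```python
def next_token_targets(tokens):
    """Build (context, target) pairs where each target is the single next token."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


def multi_token_targets(tokens, n_future=4):
    """Build (context, targets) pairs where each target is the next n_future tokens.

    n_future is a hypothetical parameter name; positions without a full
    window of n_future tokens ahead are skipped in this sketch.
    """
    pairs = []
    for i in range(1, len(tokens) - n_future + 1):
        pairs.append((tokens[:i], tokens[i:i + n_future]))
    return pairs


# Toy code-completion example (tokenization is illustrative only).
tokens = ["def", "add", "(", "a", ",", "b", ")"]

single = next_token_targets(tokens)
multi = multi_token_targets(tokens, n_future=4)

print(single[0])  # context is ['def'], target is the one token that follows
print(multi[0])   # same context, but the target is the next four tokens
```

In this sketch each context in the multi-token setup is paired with four target tokens instead of one, which is the sense in which the model is asked to predict multiple future words simultaneously.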

JASCO: Enhanced text-to-music model

On the creative side, Meta’s JASCO generates music clips from text and offers more control by accepting additional inputs such as chords and beats.

“While existing text-to-music models like MusicGen rely mainly on text inputs for music generation, our new model, JASCO, is capable of accepting various inputs, such as chords or beat, to improve control over generated music outputs,” explained Meta.

AudioSeal: Detecting AI-generated speech

Meta claims AudioSeal is the first audio watermarking system designed specifically to detect AI-generated speech. It can pinpoint the specific AI-generated segments within larger audio clips, with detection running up to 485x faster than previous methods.

“AudioSeal is being released under a commercial license. It’s just one of several lines of responsible research we have shared to help prevent the misuse of generative AI tools,” said Meta.
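As a rough illustration of what pinpointing segments within a longer clip can mean, the sketch below groups consecutive audio frames that a detector has flagged into spans. It does not use AudioSeal’s API: the per-frame scores, the `threshold` value, and the function name are all hypothetical, and the scores are assumed to come from some detector that rates each frame between 0 and 1.

```python
def flagged_segments(scores, threshold=0.5):
    """Group consecutive frames scoring above threshold into (start, end) spans.

    Spans are half-open frame index ranges: start is inclusive, end is exclusive.
    """
    segments = []
    start = None
    for i, score in enumerate(scores):
        if score > threshold and start is None:
            start = i  # a flagged run begins at this frame
        elif score <= threshold and start is not None:
            segments.append((start, i))  # the run ended just before this frame
            start = None
    if start is not None:
        # The clip ended while a flagged run was still open.
        segments.append((start, len(scores)))
    return segments


# Hypothetical per-frame scores for a short clip.
scores = [0.1, 0.2, 0.9, 0.95, 0.8, 0.1, 0.7, 0.9]
segments = flagged_segments(scores)
print(segments)
```

Reporting spans rather than a single yes/no verdict for the whole clip is what allows a partly synthetic recording to be flagged only where the generated speech occurs.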

Improving text-to-image diversity

Another important release aims to improve the diversity of text-to-image models, which can often exhibit geographical and cultural biases.

Meta developed automatic indicators to evaluate potential geographical disparities and conducted a large study, with more than 65,000 annotations, to understand how people around the world perceive geographic representation.

“This enables more diversity and better representation in AI-generated images,” said Meta. The relevant code and annotations have been released to help improve diversity across generative models.

By publicly sharing these groundbreaking models, Meta says it hopes to foster collaboration and drive innovation within the AI community.

(Photo by Dima Solomin)



Tags: ai, artificial intelligence, audioseal, chameleon, fair, jasco, meta, meta ai, models, music generation, open source, text-to-image




© 2026 LBNN – All rights reserved.
