Nearly 200 years after Beethoven’s death, a team of musicians and computer scientists created a generative artificial intelligence (AI) that completed his Tenth Symphony so convincingly that music scholars could not tell which passages came from the AI and which from the composer’s handwritten notes.
Before such AI tools can generate new types of data, including songs, they need to be trained on huge libraries of that same kind of data. Companies that create generative AI models typically gather this training data from across the internet, often from websites where artists themselves have made their art available.
“Most of the high-quality artworks online are copyrighted, but these companies can get the copyrighted versions very easily,” said Jian Liu, an assistant professor in the Min H. Kao Department of Electrical Engineering and Computer Science (EECS) who specializes in cybersecurity and machine learning.
“Maybe they pay $5 for a song, like a normal user, and they have the full version. But that purchase only gives them a personal license; they are not authorized to use the song for commercialization.”
Companies will often ignore that restriction and train their AI models on the copyrighted work. Unsuspecting users paying for the generative tool may then generate new songs that sound suspiciously similar to the human-made, copyrighted originals.
This summer, Tennessee became the first state in the US to legally protect musical artists’ voices from unauthorized generative AI use. While he applauded that first step, Liu saw the need to go further—protecting not just vocal tracks, but entire songs.
In collaboration with his Ph.D. student Syed Irfan Ali Meerza and Lehigh University’s Lichao Sun, Liu has developed HarmonyCloak, a new program that makes musical files essentially unlearnable to generative AI models without changing how they sound to human listeners. They will present their research at the 46th IEEE Symposium on Security and Privacy (S&P) in May 2025.
“Our research not only addresses the pressing concerns of the creative community but also presents a tangible solution to preserving the integrity of artistic expression in the age of AI,” he said.
Giving AIs déjà vu
Liu, Meerza, and Sun were committed to protecting music without compromising listeners’ experiences. They decided to find a way to trick generative AIs using their own core learning systems.
Like humans, generative AI models can tell whether a piece of data they encounter is new information or something that matches their existing knowledge. Generative AIs are programmed to minimize that knowledge gap by learning as much as possible from each new piece of data.
“Our idea is to minimize the knowledge gap ourselves so that the model mistakenly recognizes a new song as something it has already learned,” Liu explained. “That way, even if an AI company can still feed your music into their model, the AI ‘thinks’ there is nothing to learn from it.”
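The idea Liu describes resembles what the machine-learning literature calls "error-minimizing" or unlearnable noise. The toy sketch below is a hypothetical illustration of that general principle, not the paper's actual HarmonyCloak objective: a small, bounded perturbation is optimized so that a surrogate model's training loss on the perturbed example is already low, leaving the model almost nothing to learn from it.

```python
import numpy as np

# Hypothetical sketch of "error-minimizing" noise (not the authors'
# exact method). A frozen linear model stands in for the generative AI;
# we craft a bounded perturbation `delta` so that the model's loss on
# (x + delta) is lower than on x alone -- the perturbed song looks
# "already learned" and yields little training signal.

rng = np.random.default_rng(0)
w = rng.normal(size=8)     # frozen surrogate model weights
x = rng.normal(size=8)     # feature vector standing in for one song
y = 1.0                    # its training target

def loss(delta):
    pred = w @ (x + delta)
    return (pred - y) ** 2

def grad(delta):
    pred = w @ (x + delta)
    return 2 * (pred - y) * w

delta = np.zeros_like(x)
eps = 0.1                  # "imperceptibility" budget (L-infinity bound)
for _ in range(200):       # projected gradient descent on the model's loss
    delta -= 0.01 * grad(delta)
    delta = np.clip(delta, -eps, eps)   # keep the perturbation small

print("loss on clean input:    ", loss(np.zeros_like(x)))
print("loss on perturbed input:", loss(delta))
```

In a real audio setting the budget would be set by a perceptual model rather than a simple clip, and the surrogate would be a music-generation network, but the optimization pattern is the same: minimize the model's loss, subject to the perturbation staying small.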
Liu’s team also had to contend with the dynamic nature of music. Songs often mix multiple instrumental channels with human voices, each channel spanning its own frequency spectrum, and channels can fade from the foreground to the background and change tempo as time goes on.
Fortunately, just as there are ways to trick an AI model, there are ways to trick the human ear.
Undetectable perturbations
Human perception of sound depends on several factors. Humans cannot hear sounds that are very quiet (like music being played a mile away) or outside certain frequencies (like the pitch of a dog whistle). There are also ways to trick the ear into ignoring a sound that is technically audible. For example, a quiet noise played immediately after a louder one will go unnoticed, especially if the notes have similar frequencies.
Liu’s team built HarmonyCloak to introduce new notes, or perturbations, that can trick AI models but are masked enough by the song’s original notes that they evade human detection.
“Our system preserves the quality of music because we only add imperceptible noises,” Liu said. “We want humans to be unable to tell the difference between this perturbed music and the original.”
To test HarmonyCloak’s effectiveness, Liu, Meerza, and Sun recruited 31 human volunteers along with three state-of-the-art music-generative AI models.
The human volunteers gave the original and unlearnable songs similarly high ratings for pleasantness. (The two versions can be compared on the team’s website.) Meanwhile, the AI models’ outputs rapidly deteriorated, earning far worse scores from both humans and statistical metrics as more songs in their training libraries were protected by HarmonyCloak.
“These findings underscore the substantial impact of unlearnable music on the quality and perception of AI-generated music,” Liu said. “From the music composer’s perspective, this is the perfect solution; AI models can’t be trained on their work, but they can still make their music available to the public.”
More information:
HarmonyCloak: mosis.eecs.utk.edu/harmonycloak.html
University of Tennessee at Knoxville
Citation:
New tool makes songs unlearnable to generative AI (2024, October 23)
retrieved 23 October 2024
from https://techxplore.com/news/2024-10-tool-songs-unlearnable-generative-ai.html