I want to tell you about something so absurd that it evokes the “turtles all the way down” expression relating to infinite regress. This story highlights the brazen theft of copyrighted material by AI companies and the mind-boggling ignorance of users who believe these AI tools somehow endow them with the same talents and rights as the original artists.
On April 8, 2024, x.com user Julien Blanchon tweeted this message to their followers:
“Want to train a text to song model like Suno? Here is 20k public songs from Suno with the prompt, lyrics, number of likes …”
Attached to the tweet was a link to an AI training dataset hosted on Huggingface, titled ‘suno-20k-LAION’. This dataset (long since deleted, along with all discussions related to it) contained a large trove of music that had presumably been created by users around the world using the Suno AI service.
Suno, a startup with a $500M valuation, is the “AI song generator” that launched in December 2023 and soon went viral because of how easy it was to create short snippets of music and lyrics that you could share with friends on social media. Shortly thereafter, Microsoft announced a partnership where Suno would be embedded into their Copilot tool, giving it far greater reach.
Huggingface is a community-driven platform that has become the de facto repository where data scientists, researchers, and hobbyists upload and share open-source datasets and collaborate on various machine learning applications.
This is where the turtle inception comes in: users who downloaded the suno-20k-LAION dataset from Huggingface could use those AI-generated songs to train their own music generation model to produce further bastardizations. Ultimately, users would be creating synthetic music based on existing synthetic data, which in turn was based on original copyrighted works scraped without permission by Suno.
The fun doesn’t stop there: soon after this was shared on x.com, certain Suno users started objecting to finding their Suno creations in the dataset, claiming that they owned the copyright to those AI-generated songs! One user with the handle The HepCat even went as far as filing a takedown notice, stating: “I am writing to notify you of the copyright infringement and unlawful use of my copyrighted material that appear on the service for which you are the designated agent.”
The two songs included in the takedown are “I believe in the sun”, which according to the prompt is a mashup of “psychedelic rock”, “progressive rock”, “symphonic rock”, “jazz rock”, and a “catchy hook”, and “Garden of love”, created with the prompt: “male vocal jazz, warm soulful vocals, deep velvet vocals, catchy hook, romantic.” I don’t want to give these plagiarized songs any more airtime, so I’ll just tell you that “I believe in the sun” is supposed to feature a Hawaiian steel guitar, but the AI seems to have ignored that part of the prompt completely. Probably a good thing, since I can’t imagine how it would possibly mesh with the weirdly distorted Bee Gees ripoff vocals, meandering Mellotron, and the off-key, out-of-context electric guitar slides and solos that punctuate the entire song at random, uncanny intervals. As for “Garden of love”, it sounds like someone took the sound of Boyz II Men, submerged it in a tone-distorting vat of pure New Kids On The Block, and mixed it with completely nonsensical reverb and strangled trumpet sounds from some nightmare world. The lyrics feature such gems as “We're digging deeeeep–into each other's hearts like a gardener preparing the earth.”
This is a horrible tragicomedy: there’s no way that songs created with Suno fall under copyright protection, a point that even Suno's FAQ very awkwardly attempts to clarify. Frankly, when Suno users cry foul that their embarrassing prompt-engineered “songs” are used in ways that threaten their ability to commercialize and monetize their work, they are receiving nothing less than poetic justice. This case serves as a demonstration of how AI companies create complex data chains that obscure the sources of their theft and how they use marketing language to convince users that they’re creating new and original works of art. In reality, the only livelihoods threatened by this insanely circular process are those of the original artists!
It doesn’t matter that the kind of music “produced” by Suno is, in the words of one Politico journalist, “excruciating.” The service has gone viral because of the way it promotes itself as “democratizing” music in the same way that Midjourney and DALL-E profess to “democratize” visual art. It behooves me to point out that there can be true barriers to pursuing or appreciating art, but the ability to simply “prompt” a piece of music into existence completely misses the point and is purely extractive rather than additive to human expression. I, for one, hope that the various lawsuits against Suno and other generative AI music thieves play out in favor of real artists and that the general public comes to see AI-generated music as the unnecessary and uncreative slop that it is.