08: The World of Music Tagging and AI with Hazel Savage

This is an automated AI transcript. Please forgive the mistakes!

This is "The Iliac Suite", a podcast on AI -driven music. Join me as we dive into
the ever -evolving world of AI -generated music, where algorithms become the composers
and machines become the virtuosos. Yes, this music and text was written by a
computer and I am not real, but... I am and my name is Dennis Kastrup.
Hello, humans. New year, here I am, AI in music, such a grand slam. 2024, that's the year, can you feel it? The new episode is near, artificial intelligence. I love this trend, a true love story with no end. The Illiac Suite is so sweet, sweet, sweet. The Illiac Suite keeps the beat, beat, beat. You are probably asking
yourself, now what is he talking about, right? Or how is he talking, or what is he
doing? Well, what you did not know, these are the lyrics of a song I created with
the new Generative AI Suno. I love it, it's a game changer. You put in text
prompts and it creates a whole song. Yes, a whole song with a singing voice.
That is totally new in the space. So what you just heard are my lyrics I made up
with the help of the page RhymeZone. A little bit cheap, I know, because I just put in some words and it put out some rhymes, but it is just for fun. And I asked Suno to write me a rap song, so here we go. Yo!
Isn't that amazing? I mean, yes, of course, this is not even close to a good rap,
but I am impressed. And the best thing about it is, it is so funny. I don't even
know when I had so much fun the last time with an AI. Maybe I will dig into that
application in one of the next episodes a bit deeper here on The Illiac Suite, because I think it's worth it. But today we will talk about some other application called Musiio, founded by Aron Pettersson and Hazel Savage in 2018 and later acquired by SoundCloud. On their website it says: "We automate tagging, supercharge search and provide highly detailed music analysis and reports for customers around the world." What that has to do with artificial intelligence and how exactly it works: that is what I wanted to know from Hazel. Let me introduce her to you. Okay, I
can't stop it. Suno AI does it.
Those stacking shelves at H &B To running teams and businesses At the forefront of
music listening And if recommendations Hazel understands the need Of the industry From
musician through to large multinational
This was the third time I met Hazel. Some years ago I did a report about her for German radio, and I saw her at South by Southwest last year. Hazel is such a great example of people working in the music and AI field. She was a punk rock girl. She came from the music scene and then kind of connected herself with technology. In 2022 she got the Women in Music Award.
Here is her story. My very, very first job when I started in the industry was I
used to work at a record store in London called HMV on Oxford Street, and that was when I was like 21, so like immediately out of university. That was the first thing that I did, and then after HMV very quickly I was working for Shazam, and then had a career at Shazam, Universal, Pandora, BandLab and a lot of the music tech
companies and moving up from there so and I was a musician as well so I got my
first guitar when I was 13 and I used to play in a band, used to play in an all-girl punk band in London. So yeah, I very much came at it from: I was a
musician, I worked in the music industry and just had a passion for all things
related to the space so it was very natural that I would also build a company in
this space. Hazel is talking about her musical background in an all-girl punk band. So I searched for that all-girl punk band and I found them: Ginkinta, if I pronounced that right. Three girls, one mission: punk. Here they are from their 2015
EP, the song is called "Of Course". Oi!
♪ Yeah, yeah ♪
(upbeat music)
Hazel Savage with her band Ginkinta and her punk rock song. Oi! And I add another oi, oi, oi. By the way, her co-founder and partner at Musiio is a heavy metal lover. What a wonderful combination, metal and punk. Wonderful. These days she is not in a punk band anymore, I think, but maybe what she did back then with Musiio was a bit punk rock too, as the two founders were kind of the first in that field to let an AI run over tags to understand patterns. Musiio by SoundCloud, as we're now officially called, because Musiio was acquired in 2022 by SoundCloud. We describe ourselves as an
artificial intelligence for the music industry, but what we are is we're a
descriptive AI. So we do metadata tagging. So if you want to know the key, the
BPM, the mood, the genre, whether there's a vocal, whether auto tune's been used,
the quality of the recording, we can assign all of that metadata to tracks. And we
do it to the tune of about 5 million tracks a day globally via our API service.
So we're a B2B company. We have descriptive metadata that we sell to other music
companies. And SoundCloud was one of our biggest clients
before we sold to them. So that's kind of what we do. That's the space that we
play in. So being able to analyze each individual audio file to be able to, you
know, listen to music using AI being able to understand what we have and then,
you know, adding that metadata, creating playlists, searching databases. So that's what
Musiio does. We call it descriptive AI. So they started this company around six years
ago now. During that time, I had a weekly column on my favorite radio station,
Radio Eins in Berlin. If you are ever in that city and you are still into linear radio, which I love, by the way, tune into Radio Eins. They are the best in Berlin, but you can also listen to it online. Radio Eins. Well, I was talking on that station about music and technology. A lot of robots were on the show, musical programs that wrote soundtracks to daily life, not an AI, and also a lot of wearables. It was that time, but I also remember I talked about LANDR, the first music production software that used AI for the output of a song.
And I also remember that people were backing away back then when they heard AI in the context of art in general. How was it with Musiio? When I first started Musiio back in 2018, you know, we'd just come off the back of like five or six
companies that were in the generative AI space that had all built companies and then
kind of all run out of money because the demand wasn't there for what they were
doing. So even when we first came out in 2018, we kind of came out in opposition to generative AI, as in: AI for good, look at all the cool stuff we can do to help artists be discovered and help artists be found. And we're not here
to replace musicians. That was kind of how we set the stall out. And I still do
believe that to a large extent these days, I think it's really important that you
only train AI on legally acquired or legally licensed data. That was something we
did from the beginning as well, because I do believe content has a value and,
you know, if big tech just take whatever they want, it's the music industry that
will suffer. And also, kind of over the years, I've seen there not
really be much demand for like a wholly created AI artist. People still want that
little bit of connection. I do think the companies that are coming out now with
generative AI, and you look at maybe Boomy or Soundful, companies like this,
they're doing much more interesting things in the generative AI space. And so the
market's kind of flipped and gone full circle. But we very much just positioned
ourselves in a, this is not what we're doing. We're descriptive AI, we're assistive,
we do play listing, we do curation, we do search, we do discovery. We're helping
existing music. What we're not doing is generating more content. And I do still
think as an industry, we haven't really got there around what we're going to do
with all of this massive influx of content. Because, you know, even sort of pre
-generative AI, there's like 100,000 songs getting uploaded online every day.
So if you could 10x that with generative AI, that's okay, that's fine. But we
haven't really addressed what the true benefit is or what the true requirement is or
why we need that much music or why we don't need that much music. And if we are
gonna gatekeep what gets uploaded and where, who's gonna do that and why?
Like those are the new questions I feel that are coming. With generative AI, there are a lot of songs floating around the internet right now, a lot. And yes, I have played some of these too and there are a lot of bad songs, but it was fun. Nevertheless, we see now more and more songs that use AI, and they stand out from the mass,
like the ones from the user Glorb. I have been following him for a while, and that
is some amazing production here. The beats are insane, it all sounds really well
produced, and the rap is good too. What this guy does: he just runs an AI over his voice with the characters of SpongeBob. That is accompanied by videos of that underwater world, mostly AI-generated. And his disclaimer for all of this goes like
this. Fair use is a legal doctrine that promotes freedom of expression by permitting
the unlicensed use of copyright protected works in certain circumstances. Section 107
of the Copyright Act provides the statutory framework for determining whether something
is a fair use and identifies certain types of uses, such as criticism, comment,
news reporting, teaching, scholarship and research as examples of activities that may
qualify as fair use. This is parody. Glorb has clicks on YouTube and TikTok which
are in the millions. On Reddit, people are discussing which well-known rapper is behind that name. The opinion there is: this is so good that it must be a side project of someone famous. If it is some famous person, that's how you use your skills in a fun way with AI. If it is not, same thing, but maybe you will get big someday. So let's listen to that Glorb. The name of the song is "STD" and the name of the performer is "Dangton", in reference to the SpongeBob character Plankton.
It's like it's Biggie vs. Toolbag, good It's no Biggie, I smell toolpacks,
uh The city's told it's a new track, boo I own a city and they knew that I meet
them burgers made of light, yeah This my buzz and it's my light, yeah Fuck this
bitch out with my money on the kitchens like I'm cutting Mr. Krabs, you know I'm
right here Fuck, he's so raisey, I'm crazy, I got the mag,
I press delete, I press the trigger and the crab Man, I'll mit you for the curse
like I had an ST You not a crab, you a rap boy, don't get too police,
huh, fat boy I can't do it, he lacking anti -traffic in my chopper automatic So you
know it's gonna let boy I got pearls on my neck, I'm in the tongue bucket I got
pearl, give it neck, I'll see a calm bucket You already saw these dipping, you
can't on soggy Your daddy Krabbit, I'm a demon, he's your own ruby You're gonna
have to die in origin of the foot Yeah, cave from the bottom, baby, I'm not
getting stuck We said lame by the leader, like bo -wo -wo -wo -wo We said lame by
the leader, like bo -wo -wo -wo You're gonna have to die in origin of the foot
Let's come back to Hazel and Musiio. So how do they label the music in their data
set? That is one of the biggest challenges and also as well there are certain
things I always say that are objective and there are certain things that are
subjective so for example the BPM you know the how many beats per minute are in a
song that's not really up for debate that's a that's almost like a mathematical fact
you count along and you know how many beats are in a minute of a song but then
there are other things that are much more down to the individual the individual
person the company, the musician. So, and that can be something like, you know,
genres. And I always say, even from the perspective, you know, say we take a Bon
Jovi song, you know, I might say, oh, it's a rock song, it's hair metal. But an
AI could tag it as classic rock, or just straight up rock. But then also, likewise,
if I say the genre rock to you, I might be thinking of, you know, bands and
artists like, you know, AC/DC and Bon Jovi, someone else could be thinking Coldplay,
someone else could be thinking The Beatles, you know, the original rock band. And so
the way that we use these words is also challenging. And it is challenging to build
a taxonomy that works for as many people as possible. But the way that we do that,
or the way that we built it at Musiio is, we would just, we would start with, say,
a genre like rock. We would have our musicologists and we had our own music team
in house define the genre and write a description of what rock means in this AI
model. So write down the characteristics, write down the examples, couple of links to
tracks that would typify this. And then we would go, okay, this is our solid
definition of rock. It's been through a couple of different people. We're confident.
So now what we do is we're looking for two to 5,000 examples that typify the thing that we're trying to explain. So we're looking for two to 5,000 rock songs that anyone listening to them would hopefully agree: yes, that's a rock song. And then we basically give those 2,000 songs to the AI and we say, this is what rock looks like. These are the examples. This is what we're trying to get you to understand, so that the next time the AI sees a song that has similar characteristics and features to the songs that we've shown it in the 2,000, then it will be able to say this is a rock song. And then we also let it give us sort of a percentage accuracy. So it might say rock (40 percent). And what that means is it's 40
percent sure that rock is a good tag for this song. Or it might say rock 90
percent. It's very sure that this is a good tag. And so we offer between one and
four genres per track that we're tagging and really what we're trying to do is
we're trying to get it as close as possible. Of course there will always be you
know some disagreements and some you know different opinions on how it works but
that's been our methodology and our approach and you know it's worked really well
for the majority of our clients. To sum it up, we still all need music experts, journalists, and lovers to label the music that is already out there. Back to Hazel: the musicologists would then vet and review if the AI has achieved the accuracy. And you really need
experts in a lot of different genres. If somebody's gonna review the category of
Afrobeat, you need someone who's familiar with Afrobeat music. And so we were really
lucky. We had a team of usually around four when we were based in Singapore. And
obviously the sales team would jump in as well. I'd be happy to cover the rock
music, the classical, the Americana, the bluegrass, you know, that's my area, and, you know, be able to judge whether I thought those tags were accurate. And we came up with our own sort of QA system, our quality assurance, that the model had achieved an accuracy level with which most of us musicians agreed.
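For readers who like to see the idea in code: what Hazel describes is, at its core, a supervised classifier trained on a few thousand curated examples per genre, which then reports a confidence score for new tracks and only keeps tags above a threshold. Below is a minimal sketch of that pattern in Python. To be clear, this is not Musiio's actual code or API; the feature vectors, the toy genre list and the thresholds are assumptions purely for illustration.

```python
# Minimal sketch of genre tagging with per-genre confidence, loosely following
# the described workflow: train on curated example songs per genre, then report
# between one and four genre tags whose confidence clears a threshold.
# NOT Musiio's code: the features here are random stand-ins for real audio
# features (e.g. spectral statistics), purely so the sketch runs end to end.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
GENRES = ["rock", "pop", "classical", "afrobeats"]  # assumed toy taxonomy

# Pretend each curated song is a 20-dimensional feature vector. In a real
# system these would come from audio analysis of the 2,000-5,000 hand-picked
# examples per genre that the musicologists signed off on.
X_train = np.vstack([rng.normal(loc=i, scale=1.0, size=(2000, 20))
                     for i in range(len(GENRES))])
y_train = np.repeat(np.arange(len(GENRES)), 2000)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

def tag_track(features, min_conf=0.10, max_tags=4):
    """Return up to `max_tags` (genre, confidence) pairs above `min_conf`."""
    probs = model.predict_proba(features.reshape(1, -1))[0]
    ranked = sorted(zip(GENRES, probs), key=lambda p: p[1], reverse=True)
    return [(g, round(float(p), 2)) for g, p in ranked[:max_tags] if p >= min_conf]

# A new, unseen track whose features resemble the "rock" examples:
new_track = rng.normal(loc=0, scale=1.0, size=20)
print(tag_track(new_track))  # e.g. [('rock', 0.93), ('pop', 0.05)] -- "rock (93 percent)"
```

The `min_conf` and `max_tags` knobs mirror the idea of reporting something like "rock (40 percent)" and offering between one and four genres per track.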
(upbeat music)
Sometimes developers give their artificial intelligence names, as if they were human beings, which is, by the way, a trend that I do not agree with, as it gives a false impression to people who don't really know a lot about artificial intelligence. And if you make a machine human, it's more impressive and also a little bit more scary. And the humans behind it could also hide, because they could always put the human name of a machine out front. So I think it's not the right thing to do. Musiio chose not to do that at all. It's not generalized intelligence, you know,
it can't then go back and check its own work and decide if it's accurate enough or
not, all it's really able to do is look at the features of those 2000 songs,
compare them to something new we've showed it, and give a number percentage accuracy
on the similarity. That's all it's doing. It's not truly listening, it's not truly,
you know, hearing the music, it's just a pattern recognition. And it's so good that
it comes across as an intelligence, but it isn't really, it's not sentient, it's not
making its own decisions, it's not an independent being. There's no android, there's no sort of human appearance. In fact, we never even, you know,
I think this was not so much an active decision, but we never gave the AI a name
and we never talked about it as if it was its own person. I think that now we're
getting more into a sort of a branding methodology, because we could have easily said, oh, you know, it's called Rachel and it's got its own thoughts and opinions. And I could see how, you know, you might do that from a branding perspective. Chatbots do it, right? They have names. But for us it was always: no, this is a technology and a tool. Musiio is the company; the AI, we're not sort of personifying it, it doesn't have a name. One thing we have not talked about
so far: where did the music come from that is in their database? Talking about copyrights, Hazel says everything they used was agreed upon. When we built the
company originally, we started with a database of about a million tracks that I did
direct deals with three different companies to license that music so that we were
training legally from day one back in 2018. And we basically built from there,
which is, you know, the very first version of our models were trained on those
original million tracks or subsets of. And now we provide our services and API.
Yes, SoundCloud use our technology. So everything that's uploaded to SoundCloud gets
the genres, the key, the mood added, all that kind of stuff. But we also still
sell our technology to other third parties. So, you know, BMG, trying to think of
other clients off the top of my head, Downtown Publishing, Sony Music,
all use our technology to tag various databases of music that they have as well.
Let's hear another song from her and her band, Ginkinta, from the EP I found on Bandcamp: "Bleeding Heart". And I think it is the most human music I've ever played here.
♪ Walking about me, but then there's a haze ♪ ♪ And my demands were low,
perfect ♪ ♪ But I am waiting ♪ I'm trying my best to do it ♪ ♪ And I can't
make it ♪ ♪ And I've got nothing to hide ♪ ♪ I'm not too high ♪
♪ So why are you here? ♪
♪ Oh, why you think of if you could take us in a minute ♪
Come and sit and lie, then you turn on as I have Put in a lie,
and you're not gonna try 'Cause it's for me,
or yourself, or for anyone else I call it selfish,
you call it selfish Why do you try ♪ Too high, don't suffer too high ♪
♪ The white earth is silent ♪
(upbeat music)
♪ Don't care about me, my friend ♪
♪ So why did he hurt him, darling? ♪ ♪ But I can't... ♪
Love it, reminds me of the times when I was young and running around with t-shirts that had "I like techno unplugged" written on them. Times are changing and that is a good thing. I am very open now to new music, and one reason is the huge amount of music that is available from all over the world right now. And the question remains though: are the AIs also fed with different music from different places, and if so, from which parts of the world? When we first started as a company, our first
classifier could only identify like 21 genres, and it was like rock, pop, classical, all the sort of the big hero genres. But we realized quickly that it
wasn't super efficient, because if you then showed the AI, you know,
say some Bollywood music, it had no categorization for Bollywood music at all.
And therefore it would give one of the other tags incorrectly, but with a very low
similarity. So we realized that we needed to expand that. And one of the things we
did early in the company was launch, sort of, versions two and three of the genre classifier. And I think now we're up to 84 distinct genres. And we do have
some ones in there, like I said, Afrobeats, we've got J-pop, we've got C-pop, Mando-pop. So we are trying to recognize music from around the
world. In an ideal world, I would love to be more granular on some of these
cultural nuances. You know, so for example, we can identify Indian music, but what
we don't currently do is identify, say, the top 20 sub-genres of Indian music.
And I would love to be able to. We did a research project on Indian music and we
identified the 20 that we would do and that we were interested in. But as a small
startup, we only ever really built what our customers were asking for and I think
maybe to date we still don't actually have a client in India and so therefore the
decision for us is really around: has anyone asked us for this who's willing to pay for it? If they are, we'll build it; if they're not, we won't. So, you know, and same with Latin America as well, we don't have a huge foothold there, and of course there are lots of... I'm a big fan of music out of Latin America myself, but there isn't a big sort of appetite for our product in that region. The biggest
appetite for our product has been in the US, has been in Europe. We do have a few
in Africa, Australia, Southeast Asia. And so really we kind of, we don't build to
spec, but we respond in kind. So this is what works in these markets. And if the
sort of the commercial opportunities come in other markets, then we'll expand there
as well. So we've done our best to kind of be as, you know, encompassing as
possible and include as many global genres, but there probably are some local nuances that we're just not capturing with only 84 genres. I think, you know, the Apple Music streaming service has something like 800; when you upload to Apple Music as a distributor, you choose from one of those genres. And so obviously, that's 10x more, we're in no way that granular yet, but technically the technology could get there if the commercial demand is also there for it. Let's come back one more time to the music AI generator Suno, which
I mentioned in the beginning. It is amazing, as it creates full songs, meaning you do not have to create melodies, rhythms, or voices separately and then put them together.
No, just put in your prompts and it will generate all this together. You can also
write your own lyrics if you want to. So I thought as Hazel was talking about
Indian music, I prompted lyrics about a podcast that talks about artificial
intelligence and music in the musical style of Indian Bollywood music. To be honest,
Indian music is not the strength of Suno so far. You're ready for a podcast like
never before Talking about music, get your mind ready to explore From innovation to
creation, it's a fusion of two worlds Let's dive into the future where the magic
confers
To make to the podcast, let's ignite a spark Artificial intelligence,
music
So, let's imagine this is a generative AI song that might end up in the database
of Musiio and SoundCloud in the future. How do they deal with AI music?
We don't just take anything that's uploaded to SoundCloud for data sets. So our data
sets are still done with permission, with sign off, and they're also static to an
extent. So it's not just that every single thing that gets uploaded instantly gets
evolved into the model. And so there would be a huge amount of data set cleaning
that was needed to be done. because I think, yeah, the worst case scenario is you
end up training an AI on other types of AI and you just end up with a really
weird, circular, poorly performing product. So if we ever did want to, you know,
say increase from 84 genres to 800, we'd be back with the musicologist. We'd be
back at, you know, looking at legally acquired data sets and doing it properly. It
wouldn't just be a case of absorbing everything that's uploaded to SoundCloud. I absolutely think everyone's in agreement that nobody should use content that they don't have the rights to use. But it's an evolving space, and I think
SoundCloud is at the forefront of that. One thing I will say, and this is Hazel Savage's personal opinion as opposed to the company's, is that I think the other thing people haven't understood yet is that there can be good and bad AI-generated music. Like, there's no pressing one big button and instantly an absolutely
perfect song appears. A lot of what's generated using AI is not pleasant to listen
to or not enjoyable. And, you know, companies, I'm really excited by a generative AI
company called Make It, which basically is generative AI, but the first thing they've
identified is that just because you give people AI tools and you give them AI
voices and you give them, you know, the ability to create music without having
learned an instrument does not mean that the output is great. So I think this is
an evolving space. I like to think that in the same way that, you know, we've seen
the advent of, you know, from wax cylinder to recorded music and the advent from
the piano to the synthesizer, I'm hoping this is just another step in the evolution
of music and it can be tools that bring out the creativity and create more equality
across the industry when it comes to access. But I would say even from my own
opinion it's still very much an evolving space. And in the future I believe that
AI used as a helper will create beautiful music and that we will listen to new
styles which are amazing, but this will never happen without the help of the
musician who will take it and transform the AI inspiration into something new. And
thinking like this is not something new, it has been around. In the history of television, someone already thought about this. From an experimental perspective, these are the types of technologists who will push the boundaries, and I too would be interested to hear that. And it's really throwing back now, there's an
episode of Star Trek Voyager where the doctor in the program who is a hologram
learns to sing and then he goes to a planet that doesn't have music and they're
all obsessed with him they think he's like you know the next god and then what
they do is they duplicate his hologram and they give it songs to sing that are
outside of the human range and they instantly just love this new version better and
suddenly the original is worthless. And so, I think, I love it. I think sci-fi can illuminate the way of where we're going, and I certainly love a lot of this stuff from a thought experiment perspective. I say it like it is: I wanted to play you that scene here, the audio, but then I got cold feet because of copyright.
So feel free to watch it by yourself. It is the Star Trek Voyager Virtuoso episode,
season 6, episode 13. Thank you so much for talking to me, Hazel Savage, wishing you all the best for your next projects, and maybe your old all-girl punk rock band Ginkinta will get late fame now after this episode. You never know: the world of music may become more predictable with all of your metadata-making and artificial intelligence, but in my opinion it stays, even with all of that, an unpredictable, magical and creative place. That was The Illiac Suite. If you have feedback, you can find all the contacts on my homepage, and also all the details and links of this episode in the liner notes of the show. Thanks for listening, humans, take care and behave.
(upbeat music)

Creators and Guests

Dennis Kastrup
Host
Dennis has been a radio journalist in the music business for over 20 years. He has conducted over 1,000 interviews with artists from all over the world and works for major public radio stations in Germany and Canada. His focus these days is on "music and technology": Artificial Intelligence, Robotics, Wearables, VR/AR, Prosthetics and so on. He produces the podcast "The Illiac Suite - Music And Artificial Intelligence". This interest also made him start "Wicked Artists": a booking agency for creative tech and new media art.