07: Get cloned and get paid with Voice Swap!
This is an automated AI transcript. Please forgive the mistakes!
This is "The Iliac Suite", a podcast on AI-driven music. Join me as we dive into
the ever-evolving world of AI-generated music, where algorithms become the composers
and machines become the virtuosos. Yes, this music and text was written by a
computer and I am not real, but... I am and my name is Dennis Kastrup.
Hello humans! Welcome to a new episode of the Iliac Suite, the last one of this
crazy year for AI and music. So much has happened that I'm sure it will go down
in music history as the official beginning of artificial intelligence taking over
the music industry. You are not convinced? Let me introduce you to Anna Indiana,
the self-proclaimed new, fully AI-generated upcoming superstar. Her digital avatar
appeared online some days ago. Before we get into voice cloning in this episode, and
how musicians at Voice Swap embrace it because it actually has a lot of advantages
for them, I leave you alone with Anna Indiana. She introduces herself. Hello world,
I'm Anna Indiana and I'm an AI singer-songwriter. I'm excited to perform my new
song, which I actually wrote in collaboration with humans. I started by asking my
followers to pick a mood for the song. They picked Feel Good. I then asked GPT-4 to
generate some potential Feel
I then passed in this context to generate some relevant words of inspiration for the
chorus. I had my followers vote again and they picked "Truck, Dream, and Road." I
then used all this to generate the chorus lyrics and then subsequently the verse and
bridge lyrics. I'm using the OpenAI API along with these prompt templates from the
LangChain library, but you can also just use ChatGPT directly. To get the best
results, it's important that your initial prompt has very specific guidelines and
inspiration. Then you can build off that by chaining additional prompts, each with a
growing context window, until you have a full set of lyrics. In a future video,
I'll talk about how I create the chord progressions by programming MIDI files, and
also how I construct the melody and map it to the lyrics. But here's how this one
turned out. This is Miles of Smiles.
Sun shining bright Everything feels Just right Wind in my hair And the world on
display I'll write my story I'll make my own way There's no looking back Only
forward I'll go On this boundless adventure I'm ready to go Underneath the sky,
so vast and blue Don't know where I'm heading to My heart beats in rhythm with
life's sweet song Right here on this road is where I belong Cruisin' in my truck
with nothin' but a dream Drivin' through the heartland, the fields are evergreen
These roads keep giving me Miles of smiles In the end It will all be worthwhile In
the rearview mirror The past is a blur I drive to the future It's uncertain but pure
Every mile of smile Every turn of chance It's an endless adventure I'll dance my
own dance I faced the fire and I've found the pain I've never surrendered I've
danced in the rain Through the darkest of days I've learned to thrive And on this
road I'm truly alive Underneath the sky So vast and blue Don't know where I'm
heading to My heart beats in rhythm with life's sweet song Right here on this road
is where I belong Cruisin' in my truck with nothin' but a dream Drivin' through the
heartland, the fields are evergreen These roads keep giving me miles of smiles In
the end, it will all be worthwhile When the storms roll in and the skies turn grey
I'll keep driving, come what may Don't know what's ahead,
just here for the ride And on this highway, I don't need a guide Cruisin' in my
truck With nothin' but a dream Drivin' through the heartland The fields are evergreen
These roads keep givin' me Miles of smiles In the end It will all be worthwhile
Cruisin' in my truck With nothin' but a dream Driving through the heartland,
the fields are ever green These roads keep giving me miles of smiles In the end,
it will all be worthwhile
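A side note on the workflow Anna Indiana describes before the song: she chains prompts, each one carrying a growing context, until a full set of lyrics exists. The following is only a minimal sketch of that idea, not her actual code. The function `chain_prompts`, the stand-in `fake_model` and the prompt wording are all hypothetical, and no real model or library is called.

```python
from typing import Callable, List


def chain_prompts(generate: Callable[[str], str], steps: List[str]) -> List[str]:
    """Run the steps in order; each prompt carries everything generated so far."""
    context = ""
    outputs = []
    for step in steps:
        prompt = (context + "\n\n" + step).strip()
        result = generate(prompt)
        outputs.append(result)
        # Grow the context: the next step sees this prompt and its answer.
        context = prompt + "\n" + result
    return outputs


def fake_model(prompt: str) -> str:
    # Hypothetical stand-in for a real model call; swap in your own client.
    return "<text generated from " + str(len(prompt)) + " characters of context>"


# Hypothetical prompts, loosely following the steps described in the episode.
steps = [
    "Mood: feel good. Suggest a song concept.",
    "Inspiration words: truck, dream, road. Write the chorus lyrics.",
    "Now write the verse and bridge lyrics to match the chorus.",
]

for number, text in enumerate(chain_prompts(fake_model, steps), start=1):
    print(number, text)
```

Passing the model in as a plain function keeps the sketch independent of any particular API; the only point it illustrates is that every later prompt contains the earlier prompts and answers.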
Welcome to the future of music! I believe this will become normal: fully AI-generated
singing. Some people will stay resistant, because they might say in the future that it
is not real music made by real humans and there are no emotions in there. Yes, but all
these AIs that generate music are based on humans who put in real emotions before.
So maybe an AI is just recycling human emotions. And aren't emotions in the
end also just chemical reactions? So, mathematical orders. But I'm drifting away here.
Let's focus on today's episode. Some weeks ago YouTube, so Google, in collaboration
with a number of musicians, released a software program that can copy voices with
the artists' consent. These artists include Alec Benjamin, Charli XCX,
John Legend, Sia and T-Pain. The service is called Dream Track. The songs can then be
used with a license. So far only selected participants can test the application,
mostly producers. The idea had been floating around for some time, though,
and YouTube was not the first one to think about it. Voice Swap existed before.
I talked with Daniel Stein, also known under the name DJ Fresh. Check him out if
you have not heard his drum and bass and dubstep hits from Great Britain. He started
Voice Swap together with his partner Nico Pellerin some months ago. Here is their
story. So over the last five or six years I've been working for companies in the
tech space, mostly in and tech -figured science and education.
Most recently, head of engineering at a company called General Bioinformatics in the
UK, who builds genomics data solutions. And I found myself doing some consulting for
a company called Stability AI at the end of last year. And I was working with
those guys and they were one of the forerunners with text to image technology and I
just started to become aware of this disconnect between AI and this incredible
technology that I was really passionate about and my beloved
you know, in which lived all of my friends and everyone that I care about making
music. And realizing that, unfortunately, that this disconnect meant that whilst music
and art were starting to power some incredible new AI technologies,
that there didn't really seem to be a mechanism that was focused on rewarding the
creators themselves for the technologies that were being created off the back of their
work. So I became quite, quite concerned because wearing both of these hats and
being equally interested in AI and technology and music, I found myself thinking,
you know, as somebody who's working in the AI and tech industry, am I now at odds
with all of my friends in the music business and the music industry because it
seems like these technologies could potentially, you know, take away a lot of jobs
and affect their livelihoods, but there doesn't seem to be a focus on,
you know, how are these people going to be rewarded from their work being used to
create these AI models.
And so, yeah, I became very, very concerned for a while, for sort of five
or six months. I spent a lot of time following people like Gary Marcus on the
internet and investigating what was being done,
what was happening with legislation, how were creators going to be brought into this
picture as AI starts to lean on their work to generate new content.
And the answers that I found were generally misleading, confusing.
There seemed to be a lack of clarity around what legislation there was going to be
to protect creators. And I met a guy called Nico Pellerin,
who's an incredible engineer who also has a background in music and he'd been
experimenting with voice transfer models, and we got together and we said,
well, could we maybe try and build the first AI platform that puts artists at the
forefront, with the artist payment model as the number one concern of the
company, as opposed to the inconvenient blocker that I think some of these AI
companies see paying artists as? So we started to reach out to
some amazing artists that we were aware of, that we worked with, and we built models
of their voices, and we've created a platform where people can effectively change
their voice to be like one of our partner artists. And then if they want to go on
and use that recording commercially, they can license the recording legally with the
artist's permission and consent, or they can use the platform to connect with the
artists and maybe move from an AI recording to a real recording and arrange a
studio session and go the traditional route. So there's also an element of,
you know, networking that Voice Swap allows between artists and producers.
And we've had some really amazing producers and some really amazing singers involved
in the platform, producers like Diplo and Skream and Rob Swire from Pendulum and
Knife Party, Beardyman. I think Todd Terry is one of the users of the platform as
well. So lots and lots of people have been really finding the platform quite
exciting and using it to try out new ideas. And we're just about to launch some
really exciting new technology and singers, and we've been expanding our team over
the recent weeks. We've started working with the former music tech editor of Rolling
Stone, Declan McGlynn, the former VP strategy at SoundCloud,
Michael Pelczynski, the YouTube influencer, music tech influencer Benn Jordan,
who's very outspoken on artists.
So, how does this sound? I have some examples for you. This one is from a singer
called Jonathan Muriel. He sang the following lines under the name "White Wake".
One of the singers of the Voice Swap archive is Liam Bailey. His AI voice version
sounds like this.
And another one in the tone of the singer Ayah Marar.
Quite good, isn't it? I am really impressed. Yes, there is still room to make
it even better, but it is an amazing start. Here's a little story. I have done many
interviews so far in my life, with many musicians. What I really like is when someone
tells me something I have not thought of. Sure, as a journalist I'm always trying to be
smart and prepared, but sometimes the view on a subject changes and I have a new
perspective on something, and that's what happened while listening to Daniel Stein.
Sure, I thought it is a great idea to change your voice and play around, but he
told me why it actually also totally makes sense. So one of the things, as a
producer myself, I have a lot of what I think are good ideas. Maybe they're not, you
know, and often I rely on other people I work with to be in the room with me, and
I sing them the idea and they tell me whether they think it's a good idea. But my
voice is awful, and I've always had this problem working with, you know, some really big
artists like Ellie Goulding and Rita Ora and Kylie Minogue I've worked with, and I
really sound awful when I'm singing. It's almost like a disability. So in my head I
can hear Kylie Minogue singing this incredible perk, and if I'm in the room with
Kylie Minogue and I'm trying to explain this idea to her using my voice, I'm really
relying on her,
basically her patience and how much respect she has for me, to look past how
terrible it sounds through my voice. Or maybe it's some other talent that, you know,
assuming that it is a good idea, because a lot of my ideas aren't. But this is a
big problem for me, so for me, I can use Voice Swap to be able to assume the voice
of somebody whose voice sounds much cooler than mine, and I can use that to create
a demo, to demo my song idea and make it much more believable and compelling for
record labels or singers or other people that I'm working with. Daniel talked about
his awful voice, which I don't agree with, actually. You don't want to hear me sing.
So let's hear that voice. You can find it online on the SoundCloud account
of Voice Swap.
And here comes the music and the changed voice, the AI voice of Jamie Cole.
Don't want the name, don't want the name Don't want the name,
don't want the name Sing
And another one, the Liam Bailey version.
Don't want the name, love sticker The way that you change
my world The way that you do something to The way that you move your mind
You're what the man wants you girl The way that you change my world
You're really something to me The way that you move your mind You're what
the man You're what the man You're what the man You're what the man
And one last one, Nicky Ambers.
So, what do you think? I personally would choose Niki Ambers, so the last one. But
what did we do just right now? We listened to singers singing lines without moving
at all. They did not work at all for this, which is in the end helping everyone
in the process. It is easy, right? There's lots and lots of different uses, some of
which are, you know, if we're talking about people using Voice Swap, the platform,
you know, just people that say are listening to your show right now that make
music, and I'm talking about one sort of use case here.
I mean, a typical use case for somebody like that at the moment is: I'm a male
songwriter and I'm trying to create a song for a female singer. So using one of our
models you can transform your male voice into a female voice, and obviously, you know,
for any songwriter that's just like a dream come true. Suddenly it's like you can
not only just do your own vocal sound, but you can sing like a girl.
If you're a girl, you can sing like a guy, or you could maybe take somebody like
Liam Bailey, who's one of the singers on our platform. He has a very deep, sort of
very rootsy kind of voice that suits the kind of ska and reggae music that he does.
And often as a songwriter my voice might sound good with certain genres of music. So
my voice tends to sound good with kind of like rock music a little bit, but not so
much with that style. So I can basically use Liam's voice to explore singing in
different styles that would never suit my voice normally. I told you
in the beginning that Voice Swap really enables artists to be paid for their AI
voices. But how does the payment model work for Voice Swap? In terms of payments,
what we do is we charge our users a subscription, or they can buy, effectively,
credits. It's a bit like phone credits for making phone calls, and you get charged
for the amount of seconds of use that you have of our models. And then we split
the revenue that we make for each model with the artist whose model it is, 50/50.
We take all of our costs from our 50 and then we pay 50 over to them, and we do
that twice yearly, which is a typical music industry royalty accounting structure. And
then with the licensing, we pay 80% of any fee that we secure for the artist
through to the artist and we take a 20% commission, which we think is incredibly
fair if you look at, you know, the DSPs and the way that people are generally paid
for recordings. This is a really fair model, and for us this is also an opportunity
to reframe AI and to basically create a new model, so that even if Voice Swap doesn't
become the biggest platform for this, people will be aware that these deals are
there and artists will say, I want a deal that's as good as Voice Swap is offering.
And we think this is an amazing opportunity to set a precedent for the future of
these deals. In order to make this work, we have tried to come up with the
simplest model that services, like, the most common use cases.
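To make the two splits Daniel just described concrete, here is a small worked example. It is only an illustrative sketch: the percentages (50/50 on usage revenue, 80/20 on licensing fees) come from the interview, while the function name `split` and the 1000 dollar usage figure are assumptions made up for the arithmetic.

```python
from decimal import Decimal
from typing import Tuple

USAGE_ARTIST_SHARE = Decimal("0.50")    # 50/50 split of per-model usage revenue
LICENSE_ARTIST_SHARE = Decimal("0.80")  # artist receives 80% of a licensing fee


def split(amount: Decimal, artist_share: Decimal) -> Tuple[Decimal, Decimal]:
    """Return (artist_payout, platform_share) for a given amount."""
    artist = amount * artist_share
    return artist, amount - artist


# Hypothetical: a model earned 1000 dollars in usage credits in one period.
# Per the interview, the platform's costs come out of its own half.
usage_artist, usage_platform = split(Decimal("1000"), USAGE_ARTIST_SHARE)

# A 50 dollar licensing fee, the cheapest band mentioned in the episode.
license_artist, license_platform = split(Decimal("50"), LICENSE_ARTIST_SHARE)

print("usage:", usage_artist, usage_platform)
print("license:", license_artist, license_platform)
```

With these inputs the artist receives 500 of the 1000 dollars in usage revenue and 40 of the 50 dollar licensing fee. `Decimal` is used instead of floats so that money amounts do not pick up rounding noise.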
So if you use Voice Swap and you create a song with one of our artists' voices and
you want to go on to release it: we've agreed with each artist price bands for the
amount of seconds of audio that you have on your track using their models. So let's
say that you have like a little hook, like "just put your hands up in the air" or
something like that, and it's five seconds long. So we have three bands,
one of them is less than 10 seconds. So that's going to be in the less than 10
second band. And the price for that is indicated when you go and click on the
model: you can see what the prices are for the seconds of audio that you might
want to use. And some of the models can be as little as $50,
I think it is, for the cheapest band, the cheapest rate at
Voice Swap at the moment. So if you're using one of those samples and you've created
10 or 20 different versions of this track and finally you end up with the one that
you want to use, you go through this process: you select less than 10 seconds,
you see that it's going to cost $50, you put in the information about the track and
where it's going to be released, and you upload your song. That gets sent to the artist
or their manager, and our artists have committed to respond within 48 hours. And
really this process is just to give the artist peace of mind, because they want to
make sure that, you know, their voice isn't going to be used for something politically
sensitive or inappropriate. So they get sent this information,
they review it, if they're happy, they approve, they get paid, we generate an
agreement which goes to both sides, and then this producer can now legally use that
recording on their track. In this episode so far we heard Daniel Stein, aka
DJ Fresh, talking about Voice Swap as a representative from the producer side.
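The licensing steps Daniel walked through (pick a price band by seconds of audio, submit the track, wait up to 48 hours for the artist) can be sketched in a few lines. This is an assumption-heavy illustration, not Voice Swap's real system: the episode only confirms three bands, a band for less than 10 seconds, a price that can be as low as $50, and the 48-hour response commitment. The other two band boundaries and prices, and the names `quote` and `response_deadline`, are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Tuple

# (upper limit in seconds, label, price in dollars); None means no upper limit.
BANDS = [
    (10, "less than 10 seconds", 50),   # from the episode (cheapest example)
    (30, "10 to 30 seconds", 120),      # hypothetical boundary and price
    (None, "more than 30 seconds", 250) # hypothetical boundary and price
]


def quote(seconds_used: float) -> Tuple[str, int]:
    """Return the (band label, price) for the seconds of model audio on a track."""
    if seconds_used <= 0:
        raise ValueError("seconds_used must be positive")
    for upper, label, price in BANDS:
        if upper is None or seconds_used < upper:
            return label, price
    raise AssertionError("unreachable: the last band has no upper limit")


def response_deadline(submitted_at: datetime) -> datetime:
    """Artists have committed to respond within 48 hours of submission."""
    return submitted_at + timedelta(hours=48)


# The five-second hook from the interview lands in the cheapest band.
print(quote(5))
print(response_deadline(datetime(2023, 12, 1, 9, 0)))
```

In this sketch the band is chosen by the first upper limit the clip stays under, so a clip of exactly 10 seconds falls into the second band; how the real platform treats that boundary is not stated in the episode.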
For him, cloning is very useful, but we have not heard from the singers. What do
they think about cloning their voices and why do they do that? I reached out to
Angie Brown, who is on the roster of Voice Swap. Her bio says: Angie is one of the
undisputed legendary diva champions of house music. Her legacy is unquestionable, her
voice instantly recognizable. She is the distinctive voice on Return of the Mack by
Mark Morrison and has performed backing vocals for The Dirty Strangers, The Rolling
Stones, The Happy Mondays, Grace Jones, Heaven 17, Beverley Knight,
Lisa Stansfield, Fatboy Slim, Kate Bush, Stereophonics and many more.
So why did she, with all of her amazing history, decide to clone her voice? I
decided to let my voice be a part of the AI Voice Swap platform because, in all
honesty, being a recording artist, my voice has already been immortalised.
My voice, my image, and my brand, due to the internet and the digital age.
There are different ways of recording the voice.
There have been different ways of recording the voice over many years, but at the
moment, in this digital age, it's very clean and very easy to be produced and to
actually be creative. You can do so many things with the AI voice:
you can change your voice to a country voice, a reggae voice, you can even sing in
languages. And I do think that computers have come into our lives a lot,
like we all rely on our phones to live our daily lives.
So of course at some point there's going to be some kind of progression where the
computer and the digital age would actually work together with the human voice and I
find it creative and really challenging. Nothing stays still forever creatively.
There's always something new just around the corner, so really for me it was a
golden opportunity and I didn't want to miss out on it. Angie talks about a golden
opportunity, so let's get into that. What exactly did she have to do to take this
opportunity, so to say, to train the AI? Well, I just went to Dan Stein's studio, and
he has an amazing, really fantastic studio with the best microphones, and I must have
sung solidly for about an hour, maybe an hour and a quarter, and I sang different
shapes, high and low, using all the sides of my voice, all the
sounds within, you know, the capacity that my voice can make,
the range and also the delivery and a certain amount of projection, as well as the
feel, the feel that I've got in my voice and the timbre in my voice.
So we wanted to capture all that within the hour, and I sang all kinds of words as
well, and that was all recorded and added to my voice model, so that should someone
want to use my voice, I will fit nicely into whatever they may need. Whatever they
might need. So, Christmas is approaching and I could not help it: I needed a Christmas
song here. Freddie Mercury singing "All I Want for Christmas Is You" was uploaded on
YouTube by a user called Matt. I think this is a good clone of the Mariah Carey
song. Although it is still not clear if that is really her song: songwriter Andy
Stone sued her over alleged copyright infringement again last November. It is the
second time. We will see where this will lead us in the... And what strikes me about
this version: Mercury's voice is quite good, which means maybe also that the Mariah
Carey voice is quite close to the Freddie Mercury voice. I never saw it that way
before. Thanks AI.
I don't care about the presents Underneath the Christmas tree I just want you for
my home For the new could ever know Make my wish come true
All I want For Christmas, it's you,
yeah I don't want a lot for Christmas There is just one thing I need Don't care
about the presents Underneath the Christmas tree I don't need to end my stuckin'
Bear up on the fireplace Santa Claus won't make me happy With a toy on Christmas
Day I just want you for my own More than you could ever know Make my wish come
true All But for a Christmas is you,
you Oh, I won't ask for much this Christmas I won't even wish for snow I'm just
gonna keep on waiting I don't need to miss a toe I will make a List and send it,
to the knock -ball for send it I won't even stay away to hear those magic reindeer
clays 'Cause I just want to hear tonight, calling on to me so tight What more can
I do? Baby, all I want for Christmas is you
You, you baby
Okay, enough Christmas mood here. Let's hear Angie one more time, because her words
are really in line with my opinion. They are good, strong last words for this
episode. Literally, you cannot hold back actual creativity and how much we progress
artistically, you can't guess what's going to happen creatively.
You can't buy it or trade it in the stock market.
It's not something that you can do. So when it comes to AI, it's a natural
progression of art and creativity, and nothing's going to get in the way of that.
So really we have to accept that there's going to be AI voices, there's going to
be AI instrumentation and it can take away work from the actual musician or from
that actual singer but it's the way forward. At some point we have to accept that
especially when it comes to working in the studio and getting things done.
AI has been here for ages. It really has. Computerized drum sounds and percussion
and guitar and piano have been here for ages. So what makes us singers think that
we're untouchable, and that, you know, there's no way you can change the
human voice with AI? No, I think that that was going to come.
I think that the feel, the human feel, is always going to be there, but there is
going to be AI developed at some point that can replicate a human voice,
you know, 95%. Definitely. And who are we to stand in the way of progress? Word.
Thanks, Angie Brown, for this nice final statement. And thanks, Daniel Stein, for taking
the time to talk to me. This was the last episode of the Iliac Suite in the year
2023, the year that will go down in history as the year when music made with the help
of artificial intelligence went mainstream. Thanks for listening, humans. Take
care and behave, talk to you in 2024.