Attack of the Voice Clones? Voice cloning — using a computer program to create a synthetic, adaptable version of a real person’s voice — has become notably more sophisticated in the last few years. As a result, a handful of tech companies are racing to develop new products that let people replicate — and often monetize — their voices. But as voice cloning software becomes more widely used, we’re likely to see more Black Mirror-type questions arise about consent and the appropriation of personal data.
Recently, voice-cloning tech has come on in leaps and bounds: There’s a stark difference between the tech used just four or five years ago and that used today, digital voice company VocaliD’s founder Rupal Patel tells Radio New Zealand (listen, runtime: 07:23). “The main thing is the cloned voices are actually modelling phrasing and intonation — these things are key to the believability of a voice.”
Ever dreamed about just sending a clone to do your work for you? “While celebrity Y is sleeping, their voice might be out and about, recording radio spots, reading audiobooks, and much more,” the president of AI company Veritone tells tech news outlet The Verge. One American voiceover artist and actor sees voice cloning as a way to “future proof” his career, he tells the BBC. He imagines generating passive income by sending the voice clone off to do jobs for him — especially if he finds himself double booked, he says.
Celebrities are also gaining back lost voices: Hollywood actor Val Kilmer, who lost his voice after surgery for throat cancer in 2015, had it digitally restored last year by UK-based software firm Sonantic. “Val’s team wanted to give him his voice back so that he could continue creating,” says Sonantic’s CEO Zeena Qureshi.
So how exactly do you clone a voice? Recorded audio material of the person is fed into the voice cloning program, along with a transcript of what they’re saying, says Patel. The program’s neural network can then map the way the person pronounces different sounds, learning how that particular person speaks, she adds. Programs like the one used by VocaliD usually use a “base model” of speech production, which can then be adapted using the audio data fed to them. The more data the program is fed, the more the cloned voice will sound like the original subject, says Patel. “You can build a voice with as little as a few minutes or seconds of audio,” she adds.
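Commercial systems like VocaliD’s are proprietary, but the adaptation idea Patel describes — a generic “base model” nudged toward a target speaker, getting closer the more audio it sees — can be sketched as a toy example. Everything below (the function name, the vector sizes, the interpolation rule) is a hypothetical illustration, not VocaliD’s actual method:

```python
import numpy as np

def adapt_base_model(base_embedding, speaker_samples, learning_rate=0.5):
    """Toy sketch: pull a generic 'base model' voice embedding toward
    a target speaker, using feature vectors derived from their audio.

    The more samples supplied, the more weight the target speaker
    gets -- mirroring Patel's point that more data yields a clone
    that sounds more like the original subject.
    """
    target = np.mean(speaker_samples, axis=0)
    # Interpolation weight grows with the number of audio samples.
    weight = learning_rate * (1 - 1 / (1 + len(speaker_samples)))
    return (1 - weight) * base_embedding + weight * target

rng = np.random.default_rng(0)
base = np.zeros(8)                       # generic "average voice"
speaker = rng.normal(1.0, 0.1, (20, 8))  # features from target's audio

few = adapt_base_model(base, speaker[:2])  # "a few seconds" of audio
many = adapt_base_model(base, speaker)     # "a few minutes" of audio
target_mean = speaker.mean(axis=0)
```

In this sketch, `many` lands measurably closer to the target speaker’s average than `few` does, which is the only property the toy is meant to demonstrate; a real system would be adapting neural-network weights over pronunciation, phrasing, and intonation, not averaging vectors.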
And how’s it being used? Cloning specific voices goes a step beyond the speech synthesis — or artificial production of human speech — responsible for creating Siri or Alexa. Veritone’s recently launched platform, Marvel.AI, lets people generate voice clones, which they can then license as they like, The Verge reports. The idea is for cloned voices to be stored on Veritone’s systems, available to generate audio material whenever the customer wants. One day, people might even be able to submit requests to buy and use the cloned voices. Veritone anticipates a huge market for this service — especially among digital influencers, athletes, celebrities, and actors.
But these new developments come with their own set of ethical questions, like whether it’s okay (or a bit too much like a Black Mirror episode) to clone the voice of someone who’s passed away. Earlier this year, news that a documentary about the late chef Anthony Bourdain used voice cloning tech to create a snippet of dialogue prompted widespread criticism, AP reports. Director Morgan Neville appeared to dismiss ethical concerns about cloning Bourdain’s voice without his consent. And Bourdain’s widow flatly denied Neville’s claim that she had approved the move.
And they could open the door to more cybercrime: the rise of voice clones could also make certain kinds of cybercrime easier to commit and harder to detect, the BBC notes. Verbal conversations have long been seen as an almost foolproof way to verify a person’s identity, in contrast to email or text messages. Voice cloning now poses a substantial threat to this: in one high-profile 2019 fraud case, a UK manager was tricked into transferring EUR 220k to fraudsters using a cloned copy of his boss’ voice. Increasingly, companies are being advised to monitor audio to assess whether it could be fake, while Europol is urging EU states to invest significantly in tech designed to detect deepfakes.