AI Voiceovers Are Getting Better, but There's No Replacing Real Humans
For decades, voiceover has been a constant in television ads and audio commercials. A good voiceover can change the tone and elevate the impact of a commercial, engaging audiences and getting the point across. A good voiceover actor can convey enthusiasm, gravity, calmness, excitement, or all of the above.
But as AI technology continues to advance, we’re seeing it seep into writing and art, mathematics and conversation. It’s no secret that AI voices have been gaining ground in voiceover too, though most of them still have the robotic intonation (or lack thereof) familiar to anyone who uses Google Maps or Apple Maps for spoken directions, or who asks Siri a question only to hear her robotically respond, “I’m sorry, I didn’t quite catch that.” AI voiceover has also spread on social media apps like Instagram and TikTok, where short videos read captions aloud in a clearly non-human voice, complete with odd inflections and off-kilter pacing. For most of us not embedded in the high levels of tech, AI voices have a ways to go.
However, AI voices continue to advance. In the past, AI voiceovers could only sound mechanical and unnatural, but thanks to improving algorithms and voice training software, text-to-speech voices now sound more realistic and natural. This also means a growing concern for the real live humans who make their living as voice actors, whether in movies, television shows, or advertising.
In early 2023, Microsoft unveiled its AI language model VALL-E, which the company said can emulate any voice, including tone, emotion, and cadence, from just a few seconds of source audio. This is an extreme example, but with the rise of natural-sounding AI, more and more voice actors are losing work to AI voices.
While there are pros and cons to both live actors and AI, there are some things an AI can’t replace. An AI voiceover is built from synthetic voices generated by machine learning. The AI voice generator converts text to speech and, depending on the sophistication of the software, may be modeled on real voice actors to help preserve the humanity in the words. The fact remains that an AI will never have the range and emotional weight of a real actor.
No matter how much AI voices advance, a robotic voiceover will always lack the sophistication and nuance of real actors, making it difficult to convey the real message behind the advertisement or recorded text. While human voiceover is more expensive and time-consuming, if we continue to allow AI to take over the majority of our work, there will be a certain element of humanity lost along the way.
AI voiceover cannot do accurate dialects and accents, and will only use the languages built into its system. An AI voice is also inflexible: you’re limited to the preprogrammed emotions and timbre in the AI system. Timing in a voiceover script is critically important to getting the point across in an effective and attention-grabbing manner, and a human actor can adjust it during recording to better fit the needs of the advertisement or script. AI programming can’t make those adjustments with the same level of sophistication.
People still prefer to hear human voices, and no matter how close AI gets to sounding human, there will always be a certain element missing. That could mean losing the audience’s trust or attention, and leaving listeners with less connection to what is being said. A message delivered by a real human comes across as more genuine and effective than one from a robotic voice, with a uniqueness and an ability to convey the emotions each project needs.
While human voice actors do require more money and time, as well as a recording session, audio recorded professionally in a studio will generally sound better than the audio coming from an AI program. The sound will be richer, and the human voice will have an authenticity that connects better with audiences, increasing trust in the message, which in turn helps increase interest in the product, service, or event.
As AI technology continues to advance, there’s no denying some of the work will end up going to AI instead of voice actors. But as we’ve seen with pushback against the tech industry, there is no replacing a human artist, just as an e-reader can’t replace a paper book and a robot can’t replace a real human connection. For those reasons, we know that despite the higher cost of a human voiceover, the extra resources that real actors require are worth the price.