
Tacotron 2 or Human; Can You Tell the Difference?

When Apple introduced the new Siri back in June of 2017, I was pretty impressed. In fact, I was so impressed that I wrote a blog post on the subject and noted that voice actors had better pay attention. (Have You Heard The New Siri? As A Voice Actor, You Better Listen!)

Text-to-speech technology is something I’ve been keeping an eye on because I believe, one day, it could have a profound impact on my income as a voice actor. I’ve already seen it happen in certain areas of my business. In fact, just the other day I had a conversation with an eLearning company that said they prefer computer-generated voices over voice actors because it’s more convenient for them.

Are you familiar with Moore’s Law? Around 1970 it became a common term in computing. Strictly speaking, it says the number of transistors on a chip doubles about every two years, which is popularly read as computer processing power doubling every two years. For the most part, this has held true.

If it applies to computer processing speed, one can’t help but wonder if it could be applied to the advancement of other related computer technologies, such as A.I.

Tacotron 2 or Human; Can You Tell the Difference?

Google recently released some audio samples that are ear-opening.

The engineers at Google have been working very hard on a new text-to-speech system currently called “Tacotron 2.”

Here’s what they have to say about how it compares to the human voice: “Our model achieves a mean opinion score (MOS) of 4.53 comparable to a MOS of 4.58 for professionally recorded speech.”

Mean Opinion Score is a fancy term from telecommunications for a listening test: people rate how good or natural the audio sounds on a scale of 1 to 5, and the ratings are averaged. Based on the results Google is sharing, Tacotron 2 sounds darn near as real as it gets right now.
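
For the curious, here is roughly how a score like that gets calculated. This is only a minimal Python sketch, not Google’s actual evaluation code: the function name and the listener ratings are made up for illustration, and I’m assuming the usual 1-to-5 rating scale.

```python
from statistics import mean


def mean_opinion_score(ratings):
    """Average a list of listener ratings, each on the 1-to-5 scale."""
    if not ratings:
        raise ValueError("need at least one rating")
    for rating in ratings:
        if not 1 <= rating <= 5:
            raise ValueError(f"rating {rating} is outside the 1-5 scale")
    return mean(ratings)


# Hypothetical ratings from eight listeners (invented for this example,
# not taken from Google's study).
synthetic_ratings = [5, 4.5, 4, 5, 4.5, 4.5, 4, 5]
recorded_ratings = [5, 4.5, 4.5, 5, 4.5, 4.5, 4, 5]

synthetic_mos = mean_opinion_score(synthetic_ratings)
recorded_mos = mean_opinion_score(recorded_ratings)

print(f"Synthetic voice MOS: {synthetic_mos:.2f}")
print(f"Recorded voice MOS:  {recorded_mos:.2f}")
print(f"Gap:                 {recorded_mos - synthetic_mos:.2f}")
```

The takeaway is that a MOS is just an average of human opinions, so a gap as small as 4.53 versus 4.58 means listeners rated the two almost the same.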

Not only does it sound real in quality of voice, it can also detect context! For example, it can tell the difference between the noun “present” and the verb “present.”


The Future is Coming Fast

If you thought the new Siri sounded great back in June, the samples of Tacotron 2 are going to blow your mind and make you nervous all at the same time.

This technology isn’t all that far away from becoming legitimate competition in a field like eLearning, in my opinion.

I don’t know about you, but that’s a concerning prospect for me, since eLearning represents a significant chunk of my voice-over income.

Hear the samples for yourself here – Tacotron 2 Audio Samples

What do you think?