
Google Play’s AI Narration

With my Corgi baby in the hospital I really needed a distraction today, one that didn’t involve writing because I can’t concentrate anyway, so I decided to delve into something I’ve been putting off–Google Play’s AI narration.

So I uploaded Dominion and it took about a day for it to process (this is normal for all new uploads) before I could proceed with the audiobook narration option.

Now, if you’re asking why I wouldn’t just go the “traditional” route and have it narrated, it’s probably because you don’t understand what goes into making an audiobook. If you want it narrated your way and you don’t want to sell your soul and your rights to ACX/Amazon, it’s going to take a heap of specialized knowledge, equipment, software, and most of all, something I don’t have: peace and quiet. I do have a Corgi after all.

I’m protective of my IP and don’t want to have it butchered by some guy doing falsetto for the females or some woman straining an ovary to drop her voice for the males. It makes me cringe. That’s the “doing it my way” portion that’s proven problematic. Enter AI-generated audio narration, something that is up-and-coming. There are some paid versions that are quite good, but they are expensive.

For the time being, Google Play is not charging for the AI-generated audiobook. I know going in that it won’t be accepted elsewhere, but so be it. I am still working on narrating Pretending to Sleep myself. More on that later.

The first thing I noticed is that Google Play limits you to one narrator voice. I can’t have the female viewpoints done by a female voice and the male viewpoints by a male voice. This is one feature that would be really easy for Google to fix. The more difficult issues have to do with pauses and intonation. While the voices are good, they definitely fall short, especially when it comes to simple phrases like “No,” where the context of the piece would allow a human narrator to intone the word properly.

It does allow you to edit the pronunciation, but not always. Sometimes it just decides that yours is wrong and won’t accept it. For example, the word “data.” To quote Star Trek, “One is my name. The other is not.” Well, I want the not-name. I want the proper way to say the plural of datum. Nope, not an option. Someone decided that the Star Trek way is the right way.

But it’s not at all like the text-to-speech that comes from Scrivener or from other software that will read to you. It’s also definitely not as good as a human narrator who isn’t doing falsettos and setting my teeth on edge.

It remains to be seen if Google Play will allow me to switch narrators and then have a second option where perhaps I can take the tracks for each scene and edit them myself so that the voices alternate. It’s a bit of a PIA but worth it to me to at least get some semblance of what I want.

Another thing I noticed is that audio narration forces additional clarification on dialogue. Whereas the written page definitely tells you that it’s a new speaker, that element is lost via the AI-narrator. Unlike a human narrator who filters the change for the listener, no such nuance exists here. It’s one of the things I’ve been worried about with audio, because frankly, I’m not about to trade my beloved close viewpoint for the distance of third limited where I have to explain everything with needless tagging (she said sternly, he said ruefully, etc.).

One of the nice things about it was that Google Play did allow me to go in and add simple dialogue tags where they were needed (since the visual cues are gone). I suspect this is not the case for other narration, especially the kind where they are synced up. I myself am not a consumer of audio, since most of the time the attempts at voice acting, particularly with the falsetto, are enough to make me want to set my hair on fire.

I have said it before and I’ll say it again. Writing for the page and writing for audio are two different skillsets. The page allows for the character-narrator to shine as there is nothing to get between your character and your reader. Audio, however, is all filtered via the voice actor. And if you don’t understand the difference, then I doubt you understand the difference between author-narrator and character-narrator either and we really have nothing in common as a starting point for any meaningful discussion.