Computer voices not quite human but getting closer

Mock me if you must, but I'm intrigued by Peter, the computer voice. Mary and Eddie sound more mechanical, so Peter, who's slightly more articulate, is my favorite.

Still, his voice is a persistent monotone that lacks variation or emotion.

However, he's competent at what he does, which can be incredibly useful. Peter, you see, reads to me — almost any text I give him, from word-processed documents to Web pages and e-mail.

Peter, Mary and the others are voice "fonts" in a software program, NextUp's TextAloud MP3, which converts written words to spoken language. You simply open a document on your computer, copy it to the clipboard, click a button, and listen to it read the text out loud. You can also save it as an MP3 or WAV file to listen to later, or transfer to a portable digital player or CD.

OK, so you disdain the very idea of listening to a computer voice, and I did, too. But think about those times when you're tired of reading text on a computer screen, but could handle listening to a few more reports, perhaps while driving home or relaxing in an easy chair. Certainly, anyone who is sight-impaired, or can't read, will value this program.

If you're interested enough to listen to what text sounds like when read by one of the computer voices, go to www.PortableVoice.com. This Web site is in the process of converting literary classics into audio books, read by computer voices. To listen to a sample, click on one of the offerings, which all come from the Bible.

The one I listened to didn't sound great, but it wasn't awful.

Portablevoice.com offers a downloadable free trial version of TextAloud conversion software ($20, for PC only), and provides complete information and a tutorial to get started. You can download up to 27 voices and translate text into seven languages. I decided to try it.

After a couple of attempts to download the software, one of the alternative downloads worked and the program loaded. Following the instructions, I easily converted an article I had saved on my hard drive to a computer voice reading of it.

The result sounded passable, so the next test was to convert my latest unpublished novel. I figured that hearing certain scenes would help me identify problems with my characters' dialogue.

The program handily converted my first 10 chapters to WAV files. I opened up my CD-burning software and made a CD of Part One as an audio book. Listening to Peter read my novel was rather painful. His dull and dreary voice is OK for reports, but it's pitiful as the voice of my characters, who are anything but deadpan. The speed of his computer voice can be adjusted, but expression cannot. Alternate voices can read different chapters, but assigning characters their own voices is beyond this economically priced program. At this point, anyway.

In my opinion, Peter and the other voices aren't the right voices for reading great literature, or even my literature. At least not anything with dialogue. Still, they are able to deliver straightforward information in audio form. I believe that's the main purpose of this software — or it ought to be — and that it does pretty well.

So what about converting the Bible and other literature to audio books? The sample I heard contained no dialogue. As for converting other classics, Portablevoice.com may continue doing that when its software, TextAloud MP3, has an improved Text to Speech (TTS) engine.

Meanwhile, AT&T Labs is developing a new TTS engine with more natural-sounding computer voice fonts to be used in TextAloud and other products. To listen to AT&T Labs' more advanced voices, go to its Web site at www.naturalvoices.com.

There you can try out its TTS engine by creating your own demo. Simply type in up to 30 words, choose a male or female voice and listen to the results.

When I tried it, the voice sounded better than Peter's in TextAloud, but you won't mistake it for a human voice. Not yet anyway.

Besides developing more natural voices, AT&T Labs' latest engine allows users to include control tags to instruct the program to change the voice, enabling different voices to read various characters in a dialogue. That's what's needed for converting stories, plays and novels to audio formats. Of course, embedding tags won't be as simple as pushing a button, but the results will sound better.
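
For the curious, here is a rough idea of what embedding voice tags in a dialogue could look like. This is only a sketch: it assumes the engine accepts markup in the style of the W3C's Speech Synthesis Markup Language (SSML), where a voice element switches speakers. The column does not say exactly which tags the AT&T engine uses, and the character names, voice names and the dialogue_to_ssml helper below are all made up for illustration.

```python
from xml.sax.saxutils import escape, quoteattr


def dialogue_to_ssml(lines, voices, default_voice="narrator"):
    """Wrap each spoken line in an SSML-style <voice> tag.

    lines  -- list of (speaker, text) pairs, in reading order
    voices -- dict mapping a speaker to a voice name (names are hypothetical)
    """
    parts = [
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">'
    ]
    for speaker, text in lines:
        # Speakers without an assigned voice fall back to the default one.
        voice = voices.get(speaker, default_voice)
        # escape() protects characters such as & and < inside the text.
        parts.append(f"  <voice name={quoteattr(voice)}>{escape(text)}</voice>")
    parts.append("</speak>")
    return "\n".join(parts)


# Hypothetical scene and voice assignments, for illustration only.
scene = [
    ("Anna", "Where were you last night?"),
    ("Ben", "At the lab, and then home."),
]
print(dialogue_to_ssml(scene, {"Anna": "voice_f1", "Ben": "voice_m1"}))
```

The printed markup is what you would hand to a tag-aware engine in place of plain text. Someone still has to decide which character speaks each line, which is why this won't be as simple as pushing a button.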

The technology is advancing fast, and we'll see what the new year brings.

In the meantime, check out TextAloud MP3; you might find it useful.

Linda Knapp can be reached by e-mail at lknapp@seattletimes.com. You'll find more columns at: www.seattletimes.com/gettingstarted.