Text-To-Speech

Posted by Levi on May 5th, 2005

As I’ve written about many times before, I am constantly seeking ways to digest various forms of “content”: radio, blogs, news, TV shows, books, you name it. Anything that makes this process easier, faster, and more efficient is something I’m always on the lookout for.

So I was surprised when I recently found Nextup.com, which makes a variety of text-to-speech software products. I read at a pretty slow rate, and I’d rather be out enjoying fresh air than sitting in front of a computer screen. With audio versions of books, like the ones I get from Audible.com, I’m able to “read” a lot of stuff that I normally would not. It’s not just for when I take walks, but for whenever I’m doing something where I can listen in the background. Commuting, exercising at the gym, cleaning, and so forth all become activities I can use for hearing all this content.

RSS, the mechanism by which I can subscribe to feeds from blogs and other types of content, is a great invention, but it has hugely increased the amount of stuff that I want to read. Most of it goes unread because I only have so much time, and given my slow reading speed, I don’t get through much even if I devote hours per day to reading these feeds.

These text-to-speech programs could really fill a huge niche here by letting me listen to stuff that I could otherwise only read because it doesn’t exist in audio format. I know what some of you may be thinking. Computer voices, blech! Well, yes and no. Certainly they don’t convey the richness of intonation that a human voice does, but they are not nearly as robotic-sounding as they used to be, and it appears that they are being improved constantly. Some words still get mispronounced, but not many.

The main program that I’ve been playing with is called TextAloud. You can cut and paste text from any source, or open text documents in the application, and it will read the text aloud in a selection of different voices, or alternatively save the audio as wav, mp3, or wma. You can specify the bitrate (quality) of the recording, and you can control the voice’s pitch, volume, and even speed. So I can speed a voice up to make things go even faster, “speed listening” similar to what you can do with the 4th generation and later iPods, but with much more control and the ability to speed things up to a ridiculous level. TextAloud also integrates with your clipboard, so whenever you hit cut or copy in a Windows application, a dialog box asks whether you want to add this text and whether it should overwrite or be appended to what’s already in TextAloud.

I think the best voice that comes with TextAloud is “Jennifer RealSpeak,” but it looks like there are some even better ones if you are willing to spend the money. AT&T Natural Voices, NeoSpeech, and Cepstral all make additional voices you can plug in. Unfortunately they are pretty pricey! TextAloud itself (which includes a bunch of voices) costs $30, but these additional voices are $30-35 each. Actually NeoSpeech offers two voices for $35, but you can’t buy just one for less. Some of these voices seem very good based on the samples on the TextAloud voices page, but while there’s a fully-functional trial version of TextAloud itself, there’s no such trial for any of these voices. The best you can do is listen to these samples or use a form that lets you paste in a very short string of words and hear it read by the AT&T Natural Voices. If you don’t mind paying for these extra voices, there really is a nice selection in English (with regional accents and different types of voices, both male and female) as well as in lots of other languages. In fact, you can paste English into the online Natural Voices form for one of the other languages and laugh as the non-English voice engine tries to pronounce it. It doesn’t even sound like accented English, but rather a weird mixture of that and the native language.

As an example, I took a story and copied it into TextAloud. I converted this to a wav file in only a minute or so, even though the resulting audio was about 20 minutes long. The wav file was only about 30MB. Creating an MP3 took a similar amount of time, and at 64 kbps it only took up around 9MB. Unfortunately I can’t share these with you here because of licensing issues. Apparently only some of the voices that come with TextAloud are open for free distribution. Audio made with the third-party voices can be distributed, but a license must be purchased, and while some of these licenses are somewhat reasonable, others are astronomical! This is the only gotcha I’ve found with the product so far. I was thinking of actually providing an audio version of each of my blog entries à la podcasting, and I still could if I used some of the freely distributable voices, but these are generally of poorer quality and sound much more robotic, so at that point, it’s not worth it, right?
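
If you want to sanity-check those file sizes, the arithmetic is simple: for a constant-bitrate file, size is roughly bitrate times duration. Here is a rough sketch in Python. The function names are my own and have nothing to do with TextAloud, and it assumes a constant bitrate and ignores file headers and tags. The “about 20 minutes” and “about 30MB” inputs are only my approximate figures from above.

```python
def estimated_size_bytes(bitrate_kbps: float, minutes: float) -> float:
    """Approximate size of a constant-bitrate audio file, ignoring headers and tags."""
    bits_per_second = bitrate_kbps * 1000  # audio bitrates use 1 kbps = 1000 bits/s
    seconds = minutes * 60
    return bits_per_second * seconds / 8  # 8 bits per byte


def implied_bitrate_kbps(size_bytes: float, minutes: float) -> float:
    """Average bitrate implied by a given file size and duration."""
    return size_bytes * 8 / (minutes * 60) / 1000


# A 20-minute MP3 at 64 kbps
mp3_bytes = estimated_size_bytes(64, 20)
print(f"{mp3_bytes / 1_000_000:.1f} MB")  # 9.6 MB

# Average bitrate implied by a 30MB wav of the same length
wav_kbps = implied_bitrate_kbps(30_000_000, 20)
print(f"{wav_kbps:.0f} kbps")  # 200 kbps
```

So 64 kbps for 20 minutes works out to about 9.6 MB (roughly 9.2 MiB), which lines up with the size I saw, and the wav file averages out to about 200 kbps, making the MP3 roughly a third of its size.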

Nextup lists some uses for TextAloud (or any such text-to-speech application) that aren’t immediately obvious. For example, you can “proofread” your written work. There’s definitely something to this. It’s easy to miss a typo or misused word visually, but when you hear the text read out loud you can catch some of these more quickly. Another is to create a message for your answering machine. I kind of like this one! There are lots of people who like to have the default computer-generated message on their answering machine for privacy reasons, but this allows you to make any message you want and not just the default “please leave a message.”

There’s another obvious use for these types of applications – for those who cannot speak, either temporarily or permanently, but Nextup has another application specifically designed for them called Nextup Talker.

I personally haven’t purchased the application yet, but it looks like Nextup is a decent company, and they have one of the things I look for when deciding whether I should buy something from a company: an active discussion forum where users talk about the applications and Nextup answers specific questions. Companies that participate in open public dialogs with their customers, I think, really have the right idea.


One Response

  1. JT Duxbury Says:

    Hello,

    I have adapted my three novels, and some of the classic books from the Gutenberg Library, into multiple-voiced audio books, using the Cepstral voices as characters in the stories. The idea hasn’t caught on with the public, but it was a fun experiment.

    JT Duxbury

