NextUp.com- Text to Speech Software with AT&T Natural Voices™ NextUp Technologies

Getting the Most From Natural Voices™ with TextAloud
This page explains how to use Natural Voices™ with TextAloud: how the voices work, how they differ from the free voices that were available previously, and how to get the best performance and sound from TextAloud.
How Are Natural Voices™ Different from Previous Voices?
AT&T Labs Natural Voices™ are the most realistic sounding voices ever available for use on consumer PCs. Quality synthesized speech like this was previously available only to large companies running telephone menu systems on very large servers. Engineers at AT&T Labs have adapted this technology for use on personal computers. Natural Voices™ differ from previously available voices in several ways:
 • More Disk Space: Older voices take about 2 MB of disk space. Each 8 kHz Natural Voice requires about 200 MB, and each 16 kHz voice requires about 600 MB. This is because a much larger number of high-quality sound files is used to splice together the words you hear.
 
 • Client/Server Communications: Natural Voices™ versions 1.3 and earlier sometimes have issues related to their use of TCP/IP on the computer. Version 1.4 does not have this problem. If you have version 1.3 or earlier, please read on. Natural Voices™ was adapted from server technology, where a computer contacts a large server for speech conversion. As a result, if you run ZoneAlarm™ or other PC-based firewall software, you may notice TextAloud and a process called TTSDeskTopProxy asking for authorization to use the internet. You must answer Yes for Natural Voices™ to work. Although ZoneAlarm™ believes the programs are going to access the internet, they are really just setting up a TCP/IP communications channel within your computer, so you do not have to be connected to the internet for them to work. You must, however, tell ZoneAlarm™ to give both TextAloud and TTSDESKTOPPROXY.EXE permission to communicate.
 • Slower Performance: Because Natural Voices™ uses a large number of sound files to create smooth speech, TextAloud performs more slowly than it does with older voices. This is primarily due to loading the huge sound files into memory. There are two kinds of performance issues you may see. First, the voices take longer to initialize when you start speaking. This delay, which ranges from 2 to 30 seconds depending on your PC, is the first portion of the sound files being loaded into memory. Second, some PCs pause during speaking, between phrases or sentences, typically early in the article. These pauses occur while additional sounds are loaded into memory. Below we'll give some tips on improving performance.
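To illustrate the Client/Server point above, here is a minimal Python sketch of two endpoints talking over TCP/IP entirely inside one computer, using the loopback address 127.0.0.1. It is not how TextAloud or TTSDeskTopProxy is implemented; the function name and the upper-casing "server" are made up for the example. It only shows why a firewall can see TCP/IP traffic even though nothing leaves the machine.

```python
import socket
import threading


def loopback_echo(message: bytes) -> bytes:
    """Send a message to a local TCP server and return its reply.

    Hypothetical illustration only: both ends live on 127.0.0.1,
    so no internet connection is involved.
    """
    # Bind to the loopback address; port 0 lets the OS pick a free port.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    port = server.getsockname()[1]

    def serve_once() -> None:
        # Accept one connection and reply with the upper-cased input.
        conn, _ = server.accept()
        with conn:
            data = conn.recv(1024)
            conn.sendall(data.upper())

    worker = threading.Thread(target=serve_once)
    worker.start()

    # The "client" connects to the same machine over TCP/IP.
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(message)
        reply = client.recv(1024)

    worker.join()
    server.close()
    return reply


if __name__ == "__main__":
    print(loopback_echo(b"hello"))
```

A personal firewall typically sees the `bind` and `connect` calls and asks for permission, which matches the prompts described above.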
Improving TextAloud performance with Natural Voices™
Voice Initialization Delays and Pauses During Speech:
When you first start speaking an article with a Natural Voices™ voice, there is a delay before speech starts, and there may be pauses during speech. These delays are less noticeable with faster processors. Probably more significantly, the more memory you have, the less often the problem occurs. The minimum system for Natural Voices™ has 128 MB of RAM. We recommend 256 MB, and the system performs better still with more. In addition, because memory is swapped out to disk, defragmenting your hard disk via Start->Accessories->System Tools->Disk Defragmenter may help as well.
One thing you'll notice, though, is that if you stay with a single voice, the delays almost completely disappear. For example, if you speak an article with Crystal and then re-read it, or read another article soon after, using Crystal again, the second reading starts almost immediately. This is because the first reading loads everything required into memory, so later readings find the files already loaded.
But, if you speak with Crystal, then with Mike, all of the Crystal files are unloaded from memory, requiring the entire initialization process to start from scratch. So, the most important thing you can do to improve performance is stick with one voice. If you speak everything with Crystal, after the first article, the following ones will work great. So, instead of using Random or Round-Robin voice selection, you'll get better performance by using Default Voice, and making either Crystal or Mike your default.
If you still like variety in voices and use multi-article mode, try speaking the first several articles with one voice, then the next several with the other. This gives you a little variety while still giving you pretty good performance.
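The effect of voice selection on reloading can be sketched with a small Python model. It assumes, as described above, that only one voice stays in memory at a time, so every change of voice costs a full load. The function names are hypothetical and are not TextAloud settings or APIs.

```python
from itertools import cycle, islice


def count_voice_loads(voices):
    """Count full voice loads, assuming one voice in memory at a time."""
    loads = 0
    current = None
    for voice in voices:
        if voice != current:  # switching voices forces a reload
            loads += 1
            current = voice
    return loads


def round_robin(voice_names, article_count):
    """Alternate voices article by article."""
    return list(islice(cycle(voice_names), article_count))


def grouped(voice_names, article_count):
    """Speak contiguous groups of articles with the same voice."""
    group_size = -(-article_count // len(voice_names))  # ceiling division
    return [voice_names[i // group_size] for i in range(article_count)]


if __name__ == "__main__":
    names = ["Crystal", "Mike"]
    print(count_voice_loads(round_robin(names, 6)))  # 6 loads
    print(count_voice_loads(grouped(names, 6)))      # 2 loads
    print(count_voice_loads(["Crystal"] * 6))        # 1 load
```

For six articles, alternating voices reloads on every article, grouping reloads twice, and a single default voice loads only once.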
Note that while speaking to file (creating WAV or MP3 files) these performance issues still exist, but the delays won't be heard in the audio files so they have much less impact.
Creating good sounding MP3 Files
The bitrate and sample rate of the Natural Voices™ audio files differ from those of the older voices, so different MP3 bitrate and sample rate settings may be required. The bitrate settings for TextAloud are on the File Options tab, labeled Audio Quality. With music files, higher bitrates generally sound better, but this isn't really true with voice files. The audio samples within the voice engines have lower bitrates, and encoding them at higher bitrates can add extra noise to the recordings.

Specifically for Natural Voices, the best sounding bitrates are as follows:

8 kHz Natural Voices: use 16 kbps, 11 kHz

16 kHz Natural Voices: use 24 kbps, 16 kHz
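As a rough illustration, the settings listed above can be written as a lookup table, together with an estimate of the resulting file size. This Python sketch assumes a constant bitrate, takes 1 kbps as 1,000 bits per second, and ignores MP3 headers and tags. The names are hypothetical, not TextAloud options.

```python
# Recommended MP3 settings from the list above, keyed by voice sample rate (kHz).
RECOMMENDED_MP3 = {
    8: {"bitrate_kbps": 16, "sample_rate_khz": 11},
    16: {"bitrate_kbps": 24, "sample_rate_khz": 16},
}


def mp3_settings(voice_khz: int) -> dict:
    """Return the recommended MP3 settings for an 8 or 16 kHz voice."""
    try:
        return RECOMMENDED_MP3[voice_khz]
    except KeyError:
        raise ValueError(f"no recommendation for {voice_khz} kHz voices") from None


def estimated_size_bytes(bitrate_kbps: int, seconds: int) -> int:
    """Estimate MP3 size: bits per second times duration, divided by 8."""
    return bitrate_kbps * 1000 * seconds // 8


if __name__ == "__main__":
    for khz in (8, 16):
        settings = mp3_settings(khz)
        size = estimated_size_bytes(settings["bitrate_kbps"], 3600)
        print(khz, settings, size)
```

Under these assumptions, one hour of speech comes to about 7.2 MB at 16 kbps and about 10.8 MB at 24 kbps.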

Need More Help?
We are still learning and improving our support for Natural Voices™, so if you are having a problem or have a suggestion, please send email to support@nextup.com and don't forget to mention AT&T Labs Natural Voices™ somewhere in the email.



Copyright © 2000-, NextUp Technologies, LLC. All Rights Reserved.