This topic contains troubleshooting tips for text-to-speech (TTS).
Use the following checklist to troubleshoot this problem:
The language you are trying to use should be installed.
[1.0] is for American English, [1.1] is for British English, [2.1] is for Latin American Spanish, [3.1] is for Canadian French, and [4.0] is for German.
Also, if the SSML processor cannot find an audio file, it continues with the files that it can find. As a result, no error message is reported, but you will not hear the missing file played. To troubleshoot this problem, turn on tracing to see why the file was not fetched.
If you are unable to hear the audio specified in the SSML <audio> tag, verify that the server where WebSphere Voice Server is installed does not have Windows Internet Information Services (IIS) or any other Web server running or installed. Disable any Web server software that you find running or installed, because additional Web servers conflict with the WebSphere Application Server HTTP server.
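When checking why an audio file is not played, it can help to list every URL that the SSML document actually references. The following is a minimal Python sketch and is not part of WebSphere Voice Server; the function name audio_sources and the example.com URL are hypothetical and used only for illustration.

```python
import xml.etree.ElementTree as ET


def audio_sources(ssml_text):
    """Return the src attribute of every <audio> element, in document order."""
    root = ET.fromstring(ssml_text)
    sources = []
    for element in root.iter():
        # Drop a namespace prefix such as "{http://www.w3.org/2001/10/synthesis}".
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "audio" and "src" in element.attrib:
            sources.append(element.attrib["src"])
    return sources


if __name__ == "__main__":
    # Hypothetical SSML document; replace it with the document being debugged.
    sample = (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis">'
        'Welcome. <audio src="http://example.com/prompts/greeting.wav">Hello</audio>'
        "</speak>"
    )
    for src in audio_sources(sample):
        print(src)
```

Each URL in the list can then be requested by hand from the Voice Server machine to confirm that the file is reachable.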
Certain <voice> tag attributes, such as gender "neutral" and age, are not supported for any concatenative voices, including US English voices. Also note that the "variant" attribute is not supported for the US English voices.
The amount of text included in one speak request is limited to the amount of text that can be spoken in five minutes. The synthesized audio for larger amounts of text may be truncated.
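One way to stay under the five-minute limit is to split long text into several speak requests. The following Python sketch estimates spoken length from a word count. The speaking rate of 150 words per minute and the 0.8 safety factor are assumptions chosen for illustration, not product values, and the function name split_for_speak_requests is hypothetical.

```python
def split_for_speak_requests(text, words_per_minute=150, max_minutes=5.0, safety=0.8):
    """Split text into chunks estimated to be speakable within max_minutes.

    words_per_minute and safety are illustrative assumptions; measure the
    actual rate of the configured voice before relying on these numbers.
    """
    max_words = int(words_per_minute * max_minutes * safety)
    if max_words < 1:
        raise ValueError("the estimated word limit must be at least 1")
    words = text.split()
    # Slice the word list into consecutive groups of at most max_words words.
    return [
        " ".join(words[start:start + max_words])
        for start in range(0, len(words), max_words)
    ]
```

The sketch splits on word boundaries only. Splitting at sentence boundaries would sound more natural but needs language-specific rules.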
The prosody volume and range attributes do not have the intended effect on the generated TTS audio and are currently unsupported.
The TTS supports only adult voices. Some age and variant voice settings may not affect the TTS.
The Australian TTS uses the UK English TTS. As a result, some of the names of Australian cities will sound as if pronounced in UK English. You may insert lexicons into the VoiceXML script to correct the pronunciations.
The <emphasis> element, included in VoiceXML 2.0, is currently not supported. A fix is planned for a future release.
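The markup limitations described above can be checked before a document is sent to the TTS engine. The following is a minimal Python sketch under stated assumptions: it is not a product tool, the function name find_unsupported_markup and the warning texts are hypothetical, and it flags the attributes for every voice because it cannot tell which voice will be used.

```python
import xml.etree.ElementTree as ET


def find_unsupported_markup(ssml_text):
    """Return warnings for SSML markup that this topic lists as unsupported."""
    warnings = []
    root = ET.fromstring(ssml_text)
    for element in root.iter():
        # Ignore any XML namespace prefix on the element name.
        name = element.tag.rsplit("}", 1)[-1]
        attrs = element.attrib
        if name == "voice":
            if attrs.get("gender") == "neutral":
                warnings.append('voice: gender="neutral" is not supported for concatenative voices')
            for attr in ("age", "variant"):
                if attr in attrs:
                    warnings.append("voice: the %s attribute may have no effect" % attr)
        elif name == "prosody":
            for attr in ("volume", "range"):
                if attr in attrs:
                    warnings.append("prosody: the %s attribute is unsupported" % attr)
        elif name == "emphasis":
            warnings.append("emphasis: the element is not supported")
    return warnings


if __name__ == "__main__":
    sample = '<speak><prosody volume="loud">Hello</prosody> <emphasis>world</emphasis></speak>'
    for warning in find_unsupported_markup(sample):
        print(warning)
```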
Windows
When loading more than three different TTS voices (for example, the US English male and female voices and the UK English male and female voices), you may need to boot Windows Server 2003 with a configuration that allocates more contiguous memory to application programs. To do this, use the following instructions:
C:
cd \
You should see a response similar to the following:
A SH C:\boot.ini
attrib -h -s boot.ini
notepad boot.ini
You should see something similar to the following in Notepad:
[boot loader]
timeout=30
default=multi(0)disk(0)rdisk(0)partition(1)\WINDOWS
[operating systems]
multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows Server 2003, Enterprise" /fastdetect
Add an entry with the /3GB option so that the file looks similar to the following:
[boot loader]
timeout=30
default=multi(0)disk(0)rdisk(0)partition(1)\WINDOWS
[operating systems]
multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows Server 2003, Enterprise /3GB" /fastdetect /3GB /userva=3030
multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows Server 2003, Enterprise" /fastdetect
This configuration allows you to run additional voices, given that you have sufficient real memory and/or pagespace. If there is any problem with the /3GB configuration, you can boot from the original configuration (the second line on the boot screen) and remove the /3GB line from boot.ini.
To hide the boot.ini file again, type: attrib +h +s boot.ini
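The boot.ini change described in this procedure can also be previewed as a text transformation. The following Python sketch only builds the new file contents from the old ones. It does not read or write boot.ini, the function name add_3gb_entry is hypothetical, and it assumes that the first entry under [operating systems] has a quoted description, as in the example in this topic.

```python
def add_3gb_entry(boot_ini_text, userva=3030):
    """Return boot.ini text with a /3GB copy of the first OS entry placed before it."""
    output = []
    in_os_section = False
    done = False
    for line in boot_ini_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            in_os_section = stripped.lower() == "[operating systems]"
        elif in_os_section and not done and "=" in stripped:
            done = True  # only the first operating system entry is considered
            path, _, rest = stripped.partition("=")
            closing_quote = rest.find('"', 1)
            # Skip entries that already use /3GB or that have no quoted description.
            if "/3gb" not in rest.lower() and rest.startswith('"') and closing_quote > 0:
                label = rest[1:closing_quote]
                options = rest[closing_quote + 1:]
                output.append('%s="%s /3GB"%s /3GB /userva=%d' % (path, label, options, userva))
        output.append(line)
    return "\n".join(output)
```

Because the original entry is kept as the second line, the original configuration remains selectable on the boot screen. Compare the output with the file by hand before saving any change.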
Linux
Because WebSphere Voice Server uses a single process to load voices, the memory available for loading languages is limited by the amount of memory that Linux makes available to a single process, even if the machine has more memory installed.
If you have antivirus software installed or enabled on a WebSphere Voice Server Windows Server 2003 machine, you may encounter the following:
If the operating system stops, the Windows Server 2003 Online Crash Analysis will indicate that the antivirus software has had a critical failure and request that you update the antivirus software to correct the problem.
Also, it is possible that the antivirus scanning software is attempting to scan some or all of the WebSphere Voice Server files; therefore, it is recommended that you disable scanning for the following directories (where installation_drive is the letter designation of the drive where the software is installed, such as C:):
If you have configured WebSphere Voice Server to point the General (first level) Cache and Grammar (second level) Cache to directories other than the default, then you must disable scanning for those directories as well.
If scanning of the above directories is disabled and problems still occur, it may be necessary to uninstall or disable the antivirus software. Updating the antivirus software may not correct the problems. If it does not, contact your antivirus software vendor for a solution.
Make certain that the list of available TTS voices does not contain both 22 KHz voices and 8 KHz voices. If it contains both, the following log message is displayed:
CWVSB0001E: Failed to launch TTS engine process. Reason: Failed to load TTS voice voice name due to error status = 407, error cause = mixing sample rates not supported
Configure the list of available TTS voices so that it includes only 22 KHz voices or only 8 KHz voices. Configure the default TTS voice accordingly, and restart the server (see Configuring default and available TTS voices).
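The sample-rate rule can be checked against a planned voice list before the server is restarted. The following Python sketch assumes that each voice is described by a (name, sample rate in Hz) pair. The voice names and both function names are hypothetical, and the sketch does not read the product configuration.

```python
from collections import defaultdict


def group_voices_by_rate(voices):
    """Group voice names by sample rate; voices is a list of (name, rate_hz) pairs."""
    groups = defaultdict(list)
    for name, rate_hz in voices:
        groups[rate_hz].append(name)
    return dict(groups)


def has_mixed_rates(voices):
    """Return True when the list mixes sample rates, which the TTS engine rejects."""
    return len(group_voices_by_rate(voices)) > 1


if __name__ == "__main__":
    # Hypothetical voice list for illustration only.
    planned = [("en_US_female", 22050), ("en_US_male", 22050), ("en_GB_female", 8000)]
    if has_mixed_rates(planned):
        for rate_hz, names in sorted(group_voices_by_rate(planned).items()):
            print(rate_hz, ", ".join(names))
```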