TTS troubleshooting tips

This topic contains troubleshooting tips for text-to-speech (TTS).

The TTS engine(s) will not start (Linux platforms)
Speech Synthesis Markup Language or lexicon-related troubleshooting tips
The embedded audio or lexicon document cannot be fetched, or the embedded audio file within the SSML document will not play
Audio specified in the SSML <audio> tag not heard
The SSML document encounters a parse error
When the <voice> tag has a gender attribute set to "neutral", the voice is not changed
A synthesized audio output is truncated
Prosody volume and range attributes have no effect
Age and variant voice settings unsupported in the TTS
Australian TTS and city name pronunciations
Speech markup <emphasis> element limitation
Errors when trying to load multiple voices
Reliability issues when antivirus software installed
Errors when trying to load 22 KHz voices

The TTS engine(s) will not start (Linux platforms)

Use the following checklist to troubleshoot this problem:

Run the following command: rpm -qa | grep ibmtts
The output should list the package for the language you are trying to use.
Make sure you have installed the correct language. To do this, open /var/opt/IBM/ibmtts/cfg/eci.ini and look for the following entries: [1.0], [1.1], [8.0], [6.0], [3.1], [4.0], [2.1].
For example, [1.0] is American English, [1.1] is British English, [3.1] is Canadian French, [4.0] is German, and [2.1] is Latin American Spanish.
Check the $WAS_ROOT/profiles/AppSrv01/logs/server1/SystemOut.log for errors.
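The eci.ini check in the list above can be scripted. The following is a minimal sketch, not part of the product: the function names list_eci_sections and has_eci_section are hypothetical, and the sketch assumes that each language entry appears on its own line in the form [major.minor].

```shell
#!/bin/sh
# Print the language entries found in an eci.ini file, one per line,
# without the surrounding brackets (for example: 1.0).
# Assumption: each entry starts a line and looks like [1.0].
list_eci_sections() {
    sed -n 's/^\[\([0-9][0-9]*\.[0-9][0-9]*\)\].*/\1/p' "$1"
}

# Succeed if the given entry (for example 1.0) is present in the file.
has_eci_section() {
    list_eci_sections "$1" | grep -qx -F "$2"
}

# Example usage (the path is the one named in the checklist):
#   rpm -qa | grep ibmtts
#   has_eci_section /var/opt/IBM/ibmtts/cfg/eci.ini 1.0 \
#       && echo "American English entry found"
```

Passing the file path as an argument keeps the functions usable on a copy of eci.ini, so the check can be run without touching the installed configuration.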

Speech Synthesis Markup Language or lexicon-related troubleshooting tips

Is your Speech Synthesis Markup Language (SSML) compliant with SSML 1.0 standards?
Have you validated your lexicon file using the Voice Toolkit Lexicon Editor? For more information on the Lexicon Editor provided in WebSphere® Voice Toolkit, see the Voice Toolkit online help.
Check the $WAS_ROOT/profiles/AppSrv01/logs/server1/SystemOut.log to see if any errors have been logged.
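Scanning SystemOut.log by hand can be slow on a busy server. The following sketch filters the log for error-level messages. It assumes that message IDs consist of four or five uppercase letters, four digits, and a severity letter, with E marking an error (as in CWVSB0001E); the function name find_log_errors is hypothetical.

```shell
#!/bin/sh
# Print the lines of a log file that contain an error-severity message ID.
# Assumption: IDs look like CWVSB0001E, where the trailing E means "error".
find_log_errors() {
    grep -E '[A-Z]{4,5}[0-9]{4}E:' "$1"
}

# Example usage (the path is the one named above):
#   find_log_errors "$WAS_ROOT/profiles/AppSrv01/logs/server1/SystemOut.log"
```

If the pattern is too narrow for your log, searching for the component prefix alone (for example CWVSB) is a broader starting point.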

The embedded audio or lexicon document cannot be fetched, or the embedded audio file within the SSML document will not play

Make certain that the IBM® HTTP Server is up and running on the host system:
1. On Linux®, change to the /opt/IBM/HttpServer/bin directory and type ./apachectl start.
2. Check that the server has started by running the following command: ps auxw | grep httpd.
3. On a Windows® platform, view your services panel to check if the IBM HTTP Server is running.
Is the path you specified correct for the file you want?
Check the $WAS_ROOT/profiles/AppSrv01/logs/server1/SystemOut.log to see if any errors have been logged.
Does the file exist in the location specified?
Verify that the file is in the correct audio format. VoiceXML specifications require that a platform support the following playing and recording audio formats:
- Raw (headerless) 8kHz 8-bit mono mu-law [PCM] single channel.
- Raw (headerless) 8kHz 8-bit mono A-law [PCM] single channel.
- WAV (RIFF header) 8kHz 8-bit mono mu-law [PCM] single channel.
- WAV (RIFF header) 8kHz 8-bit mono A-law [PCM] single channel.

Also, if the SSML processor cannot find the audio file, it continues and synthesizes the content that it can find. No error message is reported, but you do not hear the file played. To troubleshoot this problem, turn on tracing to see why the file was not fetched.
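For the WAV formats in the list above, the header can be inspected from the command line. The following is a minimal sketch under stated assumptions: it expects the canonical RIFF layout in which the "fmt " chunk starts at byte offset 12, it relies on the standard WAVE format tags (6 for A-law, 7 for mu-law), and the function names wav_fields and is_voicexml_wav are hypothetical. Raw (headerless) files carry no format information and cannot be checked this way.

```shell
#!/bin/sh
# Print "<format_tag> <channels> <sample_rate> <bits_per_sample>" for a WAV file.
# Assumes the canonical layout: "fmt " chunk at byte offset 12.
wav_fields() {
    file=$1
    # Bytes 0-3 must read "RIFF" and bytes 8-11 must read "WAVE".
    [ "$(dd if="$file" bs=1 count=4 2>/dev/null)" = "RIFF" ] || return 1
    [ "$(dd if="$file" bs=1 skip=8 count=4 2>/dev/null)" = "WAVE" ] || return 1
    # Read bytes 20-35 as unsigned decimals into $1..$16.
    set -- $(od -An -tu1 -j20 -N16 "$file")
    [ $# -eq 16 ] || return 1
    # Multi-byte header fields are little-endian.
    echo "$(( $1 + $2 * 256 )) $(( $3 + $4 * 256 ))" \
         "$(( $5 + $6 * 256 + $7 * 65536 + $8 * 16777216 ))" \
         "$(( ${15} + ${16} * 256 ))"
}

# Succeed only for 8 kHz, 8-bit, mono A-law (tag 6) or mu-law (tag 7) audio.
is_voicexml_wav() {
    case "$(wav_fields "$1")" in
        "6 1 8000 8" | "7 1 8000 8") return 0 ;;
        *) return 1 ;;
    esac
}

# Example usage (prompt.wav is a placeholder file name):
#   is_voicexml_wav prompt.wav || echo "prompt.wav is not 8 kHz 8-bit mono A-law/mu-law"
```

A file with extra chunks before "fmt " is reported as unsupported by this sketch even if its audio is valid, so treat a failure as a prompt to inspect the file with an audio tool rather than as proof of a bad format.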

Audio specified in the SSML <audio> tag not heard

If you are unable to hear the audio specified in the SSML <audio> tag, verify that the server where WebSphere Voice Server is installed does not have Windows Internet Information Services (IIS) or any other Web server running or installed. Disable any such Web server software that you find, because additional Web servers conflict with the WebSphere Application Server HTTP server.

The SSML document encounters a parse error

Is your SSML compliant with SSML 1.0 standards?
Have you validated your lexicon file using the Voice Toolkit Lexicon Editor? For more information on the Lexicon Editor provided in WebSphere Voice Toolkit, see the Voice Toolkit online help.
Check the $WAS_ROOT/profiles/AppSrv01/logs/server1/SystemOut.log to see if any errors have been logged.

When the <voice> tag has a gender attribute set to "neutral", the voice is not changed

Certain <voice> tag attributes, such as gender "neutral" and age, are not supported for any concatenative voices, including US English voices. Also note that the "variant" attribute is not supported for the US English voices.

A synthesized audio output is truncated

The amount of text included in one speak request is limited to the amount of text that can be spoken in five minutes. The synthesized audio for larger amounts of text may be truncated.

Prosody volume and range attributes have no effect

The prosody volume and range attributes are currently unsupported and have no effect on the generated TTS output.

Age and variant voice settings unsupported in the TTS

The TTS supports only adult voices. Some age and variant voice settings may not affect the TTS.

Australian TTS and city name pronunciations

The Australian TTS uses the UK English TTS. As a result, some of the names of Australian cities will sound as if pronounced in UK English. You may insert lexicons into the VoiceXML script to correct the pronunciations.

Speech markup <emphasis> element limitation

The <emphasis> element, included in VoiceXML 2.0, is currently not supported. A fix is planned for a future release.

Errors when trying to load multiple voices

Windows platforms

When loading more than three different TTS voices, for example the U.S. English male and female voices, as well as the U.K. English male and female voices, you may need to boot Windows Server 2003 so that it allocates more contiguous memory to application programs. To do this, use the following instructions:

Important: Allocating more contiguous memory may degrade the performance of some systems, so only use this procedure if necessary. See Memory Support and Windows Operating Systems for more information.

Restriction: This procedure only works on machines with at least 1 GB of memory running Windows Server 2003 Standard or Enterprise editions.

Change directory to your Windows boot partition.
```
C:
cd \
```
At the command line prompt, type: attrib boot.ini
You should see a response similar to the following: A SH C:\boot.ini

Next, type:

attrib -h -s boot.ini
notepad boot.ini

You should see something similar to the following in Notepad:

[boot loader]
timeout=30
default=multi(0)disk(0)rdisk(0)partition(1)\WINDOWS
[operating systems]
multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows Server 2003, Enterprise" /fastdetect

Carefully duplicate and edit the last line, adding /3GB /userva=3030 as in the following, and save the file:

[boot loader]
timeout=30
default=multi(0)disk(0)rdisk(0)partition(1)\WINDOWS
[operating systems]
multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows Server 2003, Enterprise /3GB" /fastdetect /3GB /userva=3030
multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows Server 2003, Enterprise" /fastdetect

Reboot your machine, and when given the choice, select Windows Server 2003, Enterprise /3GB, or wait until the machine automatically boots the new default /3GB configuration.

This configuration allows you to run additional voices, given that you have sufficient real memory and/or pagespace. If there is any problem with the /3GB configuration, you can boot from the original configuration (the second line on the boot screen) and remove the /3GB line from boot.ini.

To hide the boot.ini file again, type: attrib +h +s boot.ini

Linux platforms

Because WebSphere Voice Server uses a single process to load voices, the memory available for loading languages is capped by the Linux per-process memory limit, even if the machine has more memory installed.

Reliability issues when antivirus software installed

If you have antivirus software installed or enabled on a WebSphere Voice Server Windows Server 2003 machine, you may encounter the following:

Performance issues
ASR and TTS engines time out
Windows Server 2003 stops

If the operating system stops, the Windows Server 2003 Online Crash Analysis will indicate that the antivirus software has had a critical failure and request that you update the antivirus software to correct the problem.

Also, it is possible that the antivirus scanning software is attempting to scan some or all of the WebSphere Voice Server files; therefore, it is recommended that you disable scanning for the following directories (where installation_drive is the letter designation of the drive where the software is installed, such as C:):

%WVS_ROOT% (the directory where WebSphere Voice Server is installed)
%WAS_ROOT% (the directory where WebSphere Application Server is installed)
installation_drive\Program Files\IBM\HTTPServer (the directory where IBM HTTP Server is installed)
The directory defined by the TEMP environmental variable (%TEMP%)
The directory defined by the TMP environmental variable (%TMP%)

If you have configured WebSphere Voice Server to point the General (first level) Cache and Grammar (second level) Cache to directories other than the default, then you must disable scanning for those directories as well.

If scanning on the above directories is disabled, and problems are still occurring, it may be necessary to uninstall or disable the antivirus software. Updating the antivirus software may not correct the problems. If not, contact your antivirus software vendor for a solution.

Errors when trying to load 22 KHz voices

Make certain that the list of available TTS voices does not contain both 22 KHz voices and 8 KHz voices. If it does, a message like the following is logged, where voice name is the name of the voice that failed to load:

CWVSB0001E: Failed to launch TTS engine process. Reason: Failed to load TTS voice voice name due to error status = 407, error cause = mixing sample rates not supported

Configure the list of available TTS voices so that it includes only 22 KHz voices, or only 8 KHz voices. Configure the default TTS voice accordingly, and restart the server (see Configuring default and available TTS voices).