Creating a pronunciation

Add to your grammars any words that the IBM Speech Recognition engine must recognize when users speak them. The speech engine already knows the pronunciations of thousands of words, so when you include those words in your grammars, the spoken words are recognized.

If you add words to your grammars that are not in the vocabulary of the speech engine, the IBM Speech Recognition engine automatically creates a default pronunciation based on the spelling of the word. Testing your grammars with Test Grammar on MRCP improves the usability of your application.

Note: You must be connected to an installation of WebSphere® Voice Server version 5.1 or higher if you want to play pronunciations.

You can use the Pronunciation Builder in the toolkit to:

Create pronunciations for unknown words (that is, words that are not in the vocabulary of the speech recognition engine and appear in the Unknown Word list in the toolkit).
Create alternative pronunciations for known words or words that are being misspoken by the CTTS engine. With IBM Text-to-Speech (TTS), you can explicitly specify pronunciations for words, abbreviations, acronyms and other sequences, preventing the normal pronunciation rules from applying. The TTS engine uses conventional spelling patterns to produce spoken text.
Revise those pronunciations.
Create a lexicon file with the added pronunciations for use by the grammar or VoiceXML file.

You might want to create your own pronunciations when:

The default pronunciation is not correct, or
You want to improve the performance of the application.

Creating a pronunciation consists of the following steps:

Generate a default pronunciation and play the pronunciation using the Pronunciation Builder.
Choose a method to create or edit a pronunciation:
- Use a sounds-like spelling for the word
- Use the International Phonetic Alphabet (IPA) composer
Optional: Create additional pronunciations for the same word
For special circumstances: Use a sounds-like pronunciation

Using the Pronunciation Builder

Open the VoiceXML, grammar, or lexicon file containing words that you want the speech recognition engine to recognize. All application files must reside in a project.
Optional: If the voice project already exists, you can create pronunciations for a lexicon file without first selecting the project or opening a file, and then select the project and create the file to contain the new pronunciations:
1. Select File > New > Other > Voice Tools > Pronunciation. On the Create New Pronunciation dialog box, select the target for the pronunciation: Recognition or Text-to-Speech (Synthesis).
2. Click Finish and follow the instructions starting with Step 4.
3. Later, when you select OK or Apply to save the pronunciation, select the voice project and then select or create the lexicon file in which to save the pronunciation.
Play the default pronunciation of words in your file by selecting a word and then selecting Pronunciation > Play Pronunciation, or right-click to open the pop-up menu, and select Play Pronunciation. If the pronunciation is not correct, or if you want to create alternative pronunciations for the word, continue with these steps. You might also want to find words that need pronunciations before you continue.
Open the Pronunciation Builder window using one of the following methods:
- In a grammar or VoiceXML file in the source editor:
  1. From the Pronunciation menu, click Compose Pronunciations.
  2. In the Unknown Pronunciation view (tab on lower left-hand panel of grammar or VoiceXML files), click a word to select it, right-click to open the pop-up menu, and click Compose Pronunciations.
  3. Select a word in the editor, right-click to open the pop-up menu, and click Compose Pronunciations.
- In a lexicon file:
  1. From the Pronunciation menu, click Compose Pronunciations.
  2. Select a word in the editor, right-click to open the pop-up menu, and click Compose Pronunciations.
If you started by selecting a word in a source editor, the selected word appears in the Word text box. Type new words or spellings in the Word text box, and click Get Default Pronunciation. The phonology of the pronunciation will be appropriate for the file format that is open. For example, if you opened the Pronunciation Builder from a VoiceXML, grammar, or lexicon file for the speech recognition engine, the Pronunciation Builder displays the pronunciation using the IBM recognition engine phonology (base forms).
Note: In a few circumstances, the Pronunciation Builder window has a Sounds-Like Pronunciation field, but not the other options (such as Get Default Pronunciation and Show IPA Composer) mentioned above. For more information, see Using a sounds-like pronunciation.

The phonetic representation of the default or current pronunciation appears in the Recognition or Synthesizer Pronunciation textbox. If alternative pronunciations are available, the other phonetic representations appear in a drop-down list. The list includes pronunciations that you have created using the IPA Composer or sounds-like dialog box in the current session of the Pronunciation Builder.

Note: Developers might encounter issues when trying to compose pronunciations on a computer with a Chinese operating system, due to a different Windows character set.
Test the pronunciation anytime by clicking Play.
Note: If the playback reads a string of characters (such as "back quote left bracket dot one...") rather than pronouncing the word, your pronunciation is invalid, possibly for one of these reasons:
- You are missing a vowel sound in one or more syllables in the Pronunciation text box.
- A word has a secondary stress without a primary stress.
If the default pronunciation is not correct, use one of the following options in the dialog box to create or tune the pronunciation:
- To create a new pronunciation using a sounds-like spelling, click Create Pronunciation with Sounds-Like, and follow the instructions in the section below.
- To edit a pronunciation, such as tuning the pronunciation produced from the spelling, sounds-like spelling, or recorded speech, click Show IPA Composer, and follow the instructions in the section below.
- To delete all manually created pronunciations, click Get Default Pronunciation. The list shows only the default pronunciations generated by the TTS or speech recognition engine.
Click OK to save the pronunciation in a lexicon file. The Choose a Lexicon File Resource dialog box opens so that you can select or create a lexicon file in which to save the pronunciation.
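
The two validity conditions listed in the playback note above (every syllable needs a vowel sound, and a word with secondary stress also needs primary stress) can be approximated with a quick check before you click Play. The following is a minimal sketch only, not part of the toolkit. It assumes the pronunciation is written in IPA with "." as the syllable break, a space as the word break, and the standard IPA stress marks. The vowel set is a small illustrative assumption and is not the complete inventory used by the engine, and the function names are hypothetical.

```python
# Assumed, deliberately incomplete set of IPA vowel symbols (illustrative only).
VOWELS = set("aeiouæɑɒɔəɛɜɪʊʌ")
PRIMARY = "ˈ"    # IPA primary stress mark
SECONDARY = "ˌ"  # IPA secondary stress mark


def split_syllables(word):
    """Split one word into syllables; '.' and stress marks each start a new syllable."""
    syllables, current = [], ""
    for ch in word:
        if ch in (".", PRIMARY, SECONDARY):
            if current:
                syllables.append(current)
            current = ""
        else:
            current += ch
    if current:
        syllables.append(current)
    return syllables


def check_pronunciation(ipa):
    """Return a list of problems found; an empty list means both checks passed."""
    problems = []
    for word in ipa.split():  # spaces are treated as word breaks
        if SECONDARY in word and PRIMARY not in word:
            problems.append(f"{word!r}: secondary stress without primary stress")
        for syllable in split_syllables(word):
            if not any(ch in VOWELS for ch in syllable):
                problems.append(f"{word!r}: syllable {syllable!r} has no vowel sound")
    return problems
```

For example, check_pronunciation("ˈsmɪθ") returns an empty list, while a string with a vowel-less syllable or with only a secondary stress mark returns one problem for each rule that is broken. Passing this check does not guarantee that the engine accepts the pronunciation; playback in the Pronunciation Builder remains the real test.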

Creating a pronunciation using a sounds-like spelling

If the default pronunciation is inaccurate, it is sometimes easier to generate a pronunciation from a sounds-like spelling for the word.

From the Pronunciation Builder window described above, with the selected word showing in the Word text field, click Create Pronunciation from Sounds-Like.
The Sounds-Like Pronunciation dialog box appears with the target word filled in.
In the Sounds-Like Pronunciation text box, type a word or phrase that is spelled the way you want the target word pronounced.
Test the pronunciation anytime by clicking Play.
When you are satisfied with the pronunciation, click OK, and it is converted into the appropriate phonology in the Pronunciation Builder window, where you can continue to tune the pronunciation using the IPA Composer.

Creating or tuning a pronunciation using the IPA Composer

Use the IPA Composer to edit a pronunciation using phonetic symbols. When you select the pronunciation, the tool converts the pronunciation into the phonology appropriate for the target file format.

From the Pronunciation Builder window described above, with the selected word showing in the Word text field, click Show IPA Composer.
When the IPA Composer dialog box opens, the pronunciation in the Pronunciation Builder is converted to the default IPA phonology. The phonetic symbol buttons on the dialog box are separated into three groups: vowel sounds, consonant sounds, and separators (word/syllable breaks and stress marks).
Test the pronunciation anytime by clicking Play.
Note: If the playback reads a string of characters (such as "back quote left bracket dot one...") instead of pronouncing the word, your pronunciation is invalid, possibly for one of the following reasons:
- You are missing a vowel sound in one or more syllables in the Pronunciation text box.
- A word has a secondary stress without a primary stress.
Edit the pronunciation. To do this, delete the existing symbols for mispronounced sounds and syllables, and use the buttons to insert phonetic symbols into the pronunciation:
- Select the symbol buttons for each sound in the word. Study the IPA phonology symbols with the representative words. The symbols are pronounced like the underlined sound in the representative word.
- Required: Select at least one vowel sound for each syllable. If the playback reads a description of the characters instead of pronouncing the word, you are probably missing a vowel sound in one or more syllables.
- Click Syllable or press the Period key to start a new syllable that does not have primary or secondary stress.
- Click Primary or Secondary (instead of Syllable) immediately before a syllable that carries voiced emphasis, that is, primary or secondary stress.
- Click Word or press the Space bar (instead of Syllable) to insert a brief pause, such as when the word is a collection of individual sounds. Spaces indicate slight pauses between sounds, similar to a pause between words.
Tip: For best results in the IPA Composer, click the symbol buttons instead of typing letters or characters into the text box. Although the symbols often resemble keyboard keys, the character needed to produce the sound might be produced only when you click the symbol button.
When you are satisfied with the pronunciation, click OK, and it is converted into the appropriate phonology in the Pronunciation Builder window, where you can continue to tune the pronunciation using other options in the window.
When you view the recognition pronunciation, certain syntax rules apply:
- When you enter more than one word, such as "New York," the lexicon file entry has apostrophes before and after the words so that they are saved in the lexicon file as a unit, for example 'New York'.
- If the word contains an apostrophe, such as "don't," or starts with an apostrophe, such as "'60s," the apostrophe is duplicated and the whole word is enclosed in a set of apostrophes in the lexicon file entry, for example 'don''t' and '''60s'.
If you opened the Pronunciation Builder from a lexicon file, the pronunciation is stored in the source file. Otherwise, the Choose a Lexicon File Resource dialog box opens so that you can select or create a lexicon file in which to save the pronunciation.

When you save the pronunciation, the word and pronunciation string are stored in the lexicon file.
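
The apostrophe rules for recognition entries described above can be expressed as a short function. This is a sketch for illustration, not the toolkit's own code. The function name is hypothetical, and leaving a single word without apostrophes unquoted is an assumption, because the rules above state only when apostrophes are added.

```python
def lexicon_word(word):
    """Format a word or phrase the way the syntax rules above describe."""
    # Assumption: a single word with no apostrophe is written unchanged.
    if " " not in word and "'" not in word:
        return word
    # Duplicate every apostrophe, then enclose the whole entry in apostrophes.
    return "'" + word.replace("'", "''") + "'"


for entry in ("Smith", "New York", "don't", "'60s"):
    print(entry, "->", lexicon_word(entry))
```

Running the loop prints Smith unchanged and produces 'New York', 'don''t', and '''60s' for the other three entries, matching the examples given in the rules.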

Creating additional pronunciations for the same word

You can create alternative pronunciations for a word in lexicon files for the speech recognition engine. When that lexicon file is loaded, the user can say any of the defined pronunciations and the word is recognized. For example, you can save the recognized text "Smith" with the pronunciations of both "smith" and "smyth."

After you finish one pronunciation, click Apply to save it, and continue with the next pronunciation.
If you click OK, you can click the word again in the file and open the Pronunciation Builder to create and save the next pronunciation.

Note: Lexicon files for the text-to-speech engine should contain only one pronunciation for each word.
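
The difference between the two lexicon targets can be pictured with a small in-memory model. This sketch is an assumption for illustration only: the Lexicon class, its target names, and the pronunciation strings are hypothetical and do not reflect the toolkit's file format or the engine's phonology. It shows a recognition lexicon accepting several pronunciations for one word, and a text-to-speech lexicon rejecting a second one.

```python
class Lexicon:
    """Hypothetical in-memory lexicon: word -> list of pronunciation strings."""

    def __init__(self, target):
        if target not in ("recognition", "synthesis"):
            raise ValueError("target must be 'recognition' or 'synthesis'")
        self.target = target
        self.entries = {}

    def add(self, word, pronunciation):
        existing = self.entries.setdefault(word, [])
        if pronunciation in existing:
            return  # already stored; nothing to do
        if self.target == "synthesis" and existing:
            # A text-to-speech lexicon keeps only one pronunciation per word.
            raise ValueError(f"{word!r} already has a pronunciation")
        existing.append(pronunciation)


asr = Lexicon("recognition")
asr.add("Smith", "smɪθ")   # placeholder strings, not engine base forms
asr.add("Smith", "smaɪθ")  # a second, alternative pronunciation is allowed
```

With a Lexicon("synthesis") instance, the second add call for the same word raises ValueError, which mirrors the one-pronunciation rule in the note above.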

Using a sounds-like pronunciation

The Pronunciation Builder window includes only the Sounds-Like Pronunciation text field, without the other pronunciation-creation buttons, in the following situations:

The IPA mappings cannot be found for the language.
The language is Japanese, Chinese, or Cantonese.
Note: Pronunciations for Japanese lexicon files for the text-to-speech engine have an additional drop-down list for part of speech.

To create or edit a sounds-like pronunciation, do the following:

If the Sounds-Like Pronunciation text field is blank, click in the field and type letters or words that sound the way you want the word to be pronounced. Add a space where a natural pause is sounded.
Tip: The pronunciation follows the natural reading rules for the language. For example, if you type a string of consonants only, the letters are read.

Note: For Japanese, use a Hiragana spelling for the sounds-like. For Simplified Chinese, Pin Yin can be used. For Cantonese, Yue Pin can be used.
Test the pronunciation anytime by clicking Play.
Keep testing and editing until the pronunciation is satisfactory. Experiment with combinations of letters and spaces until you achieve the closest pronunciation.
Save the pronunciation by clicking Apply to continue working in the dialog box or OK to close the dialog box. The pronunciation is converted to the appropriate phonology in the Pronunciation Builder window.
When the pronunciation is saved to a lexicon file, the word and the sounds-like pronunciation are added to the lexicon file. If the lexicon file is for the speech recognition engine, the recognition pronunciations for the sounds-like spelling are added to the lexicon file as well.

Note: While using the Pronunciation Builder, if the Play button stops responding, simply close and restart the Pronunciation Builder.