Add to your grammars the words that the IBM Speech Recognition engine
must recognize when users speak them. The speech engine knows the
pronunciations of thousands of words, so any of those words that you
include in your grammars are recognized when spoken.
If you add words to your grammars that are not in the vocabulary
of the speech engine, the IBM Speech Recognition engine automatically
creates a default pronunciation based on the spelling of the word.
Testing your grammars with Test Grammar on MRCP helps improve the
usability of your application.
Note: You must be connected to an installation of WebSphere® Voice
Server version 5.1 or higher if you want to play pronunciations.
You can use the Pronunciation Builder in the toolkit to:
- Create pronunciations for unknown words (that is, words that are
not in the vocabulary of the speech recognition engine and appear
in the Unknown Word list in the toolkit).
- Create alternative pronunciations for known words or words that
are being misspoken by the CTTS engine. With IBM Text-to-Speech (TTS),
you can explicitly specify pronunciations for words, abbreviations,
acronyms and other sequences, preventing the normal pronunciation
rules from applying. The TTS engine uses conventional spelling patterns
to produce spoken text.
- Revise those pronunciations.
- Create a lexicon file with the added pronunciations for use by
the grammar or VoiceXML file.
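As a rough illustration of what "unknown word" means in the first item above, the following Python sketch compares the words in a grammar against a vocabulary. The function name, the sample vocabulary, and the case-insensitive comparison are all assumptions made for illustration. The toolkit builds its Unknown Word list from the speech engine's real vocabulary, which this sketch does not access.

```python
def find_unknown_words(grammar_words, engine_vocabulary):
    """Return grammar words missing from the vocabulary, in first-seen order."""
    # Assumption: words are compared without regard to case.
    known = {w.lower() for w in engine_vocabulary}
    unknown = []
    seen = set()
    for word in grammar_words:
        key = word.lower()
        if key not in known and key not in seen:
            seen.add(key)
            unknown.append(word)
    return unknown


# Hypothetical sample data, not taken from any real engine vocabulary.
vocabulary = {"call", "home", "office"}
print(find_unknown_words(["call", "Zyxel", "home", "zyxel"], vocabulary))
```

Words returned by a check like this are the ones for which a default pronunciation would be generated from the spelling, and so are the ones most worth playing back and tuning.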
You might want to create your own pronunciations when:
- The default pronunciation is not correct, or
- You want to improve the performance of the application.
Creating a pronunciation consists of the following steps:
Using the Pronunciation Builder
- Open the VoiceXML, grammar, or lexicon file containing words that
you want the speech recognition engine to recognize. All application
files must reside in a project.
- (Optional) Alternatively, without selecting a voice project or
opening a file (but with the voice project already created), you can
create pronunciations for a lexicon file, and then select the project
and create the file to contain the new pronunciations.
- Select . On the Create
New Pronunciation dialog box, select the target for the pronunciation: Recognition or Text-to-Speech
(Synthesis).
- Click Finish and follow the instructions
starting with Step 4.
- Later, when you select OK or Apply to
save the pronunciation, select the voice project and then select or
create the lexicon file in which to save the pronunciation.
- Play the default pronunciation of words in your file by selecting
a word and then selecting , or right-click
to open the pop-up menu, and select Play Pronunciation.
If the pronunciation is not correct, or if you want to create alternative
pronunciations for the word, continue with these steps. You might
also want to find words that need pronunciations before you continue.
- Open the Pronunciation Builder window using one of the following
methods:
- In a grammar or VoiceXML file in the source editor:
- From the Pronunciation menu, click Compose
Pronunciations.
- In the Unknown Pronunciation view (the tab on the lower left panel
for grammar or VoiceXML files), click a word to select it, right-click
to open the pop-up menu, and click Compose Pronunciations.
- Select a word in the editor, right-click to open the pop-up menu,
and click Compose Pronunciations.
- In a lexicon file:
- From the Pronunciation menu, click Compose
Pronunciations.
- Select a word in the editor, right-click to open the pop-up menu,
and click Compose Pronunciations.
- If you started by selecting a word in a source editor, the selected
word appears in the Word text box. Type new words or spellings in
the Word text box, and click Get Default Pronunciation.
The phonology of the pronunciation will be appropriate for the file
format that is open. For example, if you opened the Pronunciation
Builder from a VoiceXML, grammar, or lexicon file for the speech recognition
engine, the Pronunciation Builder displays the pronunciation using
the IBM recognition engine phonology (base forms).
Note: In a few circumstances, the Pronunciation Builder window has a
Sounds-Like Pronunciation field, but not the other options (such as
Get Default Pronunciation and Show IPA Composer) mentioned above.
For more information, see Using a sounds-like pronunciation.
The phonetic representation of the
default or current pronunciation appears in the Recognition or Synthesizer
Pronunciation text box. If alternative pronunciations are available,
the other phonetic representations appear in a drop-down list. The
list includes pronunciations that you have created using the IPA Composer
or sounds-like dialog box in the current session of the Pronunciation
Builder.
Note: Developers might encounter issues when trying to
compose pronunciations on a computer with a Chinese operating
system, due to a different Windows character set.
- Test the pronunciation anytime by clicking Play.
Note: If the playback reads a string of characters (such as "back quote
left bracket dot one...") rather than pronouncing the word, your
pronunciation is invalid, possibly for one of these reasons:
- You are missing a vowel sound in one or more syllables in the
Pronunciation text box.
- A word has a secondary stress without a primary stress.
- If the default pronunciation is not correct, use one of the
options on the dialog box to create or tune the pronunciation:
- To create a new pronunciation using a sounds-like spelling, click Create
Pronunciation with Sounds-Like, and follow the instructions
in the section below.
- To edit a pronunciation, such as tuning the pronunciation produced
from the spelling, sounds-like spelling, or recorded speech, click Show
IPA Composer, and follow the instructions in the section
below.
- To delete all manually created pronunciations, click Get
Default Pronunciation. The list shows only the default
pronunciations generated by the TTS or speech recognition engine.
- Click OK to save the pronunciation in
a lexicon file. The Choose a Lexicon File Resource dialog box opens
so that you can select or create a lexicon file in which to save the
pronunciation.
Creating a pronunciation using a sounds-like spelling
If the default pronunciation is inaccurate,
sometimes it is easier to generate a pronunciation from a sounds-like
spelling for the word.
- From the Pronunciation Builder window
described above, with the selected word showing in the Word text field,
click Create Pronunciation from Sounds-Like.
- The Sounds-Like Pronunciation dialog box
appears with the target word filled in.
- In the Sounds-Like Pronunciation text
box, type a word or phrase that is spelled the way you want the target
word pronounced.
- Test the pronunciation anytime by clicking Play.
- When you are satisfied with the pronunciation, click OK,
and it is converted into the appropriate phonology in the Pronunciation
Builder window, where you can continue to tune the pronunciation using
the IPA Composer.
Creating or tuning a pronunciation using the IPA Composer
Use the IPA Composer to edit a pronunciation
using phonetic symbols. When you select the pronunciation, the tool
converts the pronunciation into the phonology appropriate for the
target file format.
- From the Pronunciation Builder window described
above, with the selected word showing in the Word text field, click Show
IPA Composer.
When the IPA Composer dialog box
opens, the pronunciation in the Pronunciation Builder is converted
to the default IPA phonology. The phonetic symbol buttons on the dialog
box are separated into three groups: vowel sounds, consonant sounds,
and separators (word/syllable breaks and stress marks).
- Test the pronunciation anytime by clicking Play.
Note: If the playback reads a string of characters, such as "back quote
left bracket dot one...", instead of pronouncing the word, your
pronunciation is invalid, possibly for one of the following reasons:
- You are missing a vowel sound in one or more syllables in the
Pronunciation text box.
- A word has a secondary stress without a primary stress.
- Edit the pronunciation. To do this, delete the existing symbols
for mispronounced sounds and syllables, and use the buttons to insert
phonetic symbols into the pronunciation:
- Select the symbol buttons for each sound in the word. Study the
IPA phonology symbols with the representative words. The symbols are
pronounced like the underlined sound in the representative word.
- Required: Select at least one vowel sound for each syllable. If
the playback reads a description of the characters instead of pronouncing
the word, you are probably missing a vowel sound in one or more syllables.
- Click Syllable or press the Period key
to start a new syllable that does not have primary or secondary stress.
- Click Primary or Secondary (instead of Syllable) to insert a stress
mark immediately before a syllable that has voiced emphasis.
- Click Word or press the Space bar
(instead of Syllable) to insert a brief pause,
such as when the word is a collection of individual sounds. Spaces
indicate slight pauses between sounds, similar to a pause between
words.
Tip: For best results in the IPA Composer, click the symbol buttons
instead of typing letters or characters into the text box. Although
the symbols often resemble keyboard keys, the character needed to
produce the sound might be produced only when you click the symbol
button.
- When you are satisfied with the pronunciation, click OK,
and it is converted into the appropriate phonology in the Pronunciation
Builder window, where you can continue to tune the pronunciation using
other options in the window.
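The two validity conditions noted in the steps above (every syllable needs a vowel sound, and a word cannot have a secondary stress without a primary stress) can be pictured with a small Python sketch. The data layout, the function name, and the vowel set here are hypothetical and chosen only for illustration. The IPA Composer defines its own symbols and performs its own checks.

```python
# Hypothetical vowel symbols; the real IPA Composer defines its own set.
VOWELS = {"a", "e", "i", "o", "u", "ə", "ɪ", "æ"}


def check_pronunciation(words):
    """Return a list of problems found in a pronunciation.

    `words` is a list of words. Each word is a list of syllables, and each
    syllable is a (stress, symbols) pair, where stress is "primary",
    "secondary", or None, and symbols is a list of phonetic symbols.
    """
    problems = []
    for w_index, syllables in enumerate(words, start=1):
        stresses = {stress for stress, _ in syllables}
        # Rule: a secondary stress requires a primary stress in the same word.
        if "secondary" in stresses and "primary" not in stresses:
            problems.append(
                f"word {w_index}: secondary stress without primary stress")
        # Rule: every syllable needs at least one vowel sound.
        for s_index, (_, symbols) in enumerate(syllables, start=1):
            if not any(sym in VOWELS for sym in symbols):
                problems.append(
                    f"word {w_index}, syllable {s_index}: no vowel sound")
    return problems


good = [[("primary", ["s", "m", "ɪ", "θ"])]]
bad = [[("secondary", ["s", "m"]), (None, ["ɪ", "θ"])]]
print(check_pronunciation(good))
print(check_pronunciation(bad))
```

In this sketch the first pronunciation produces an empty problem list, while the second reports both a missing primary stress and a syllable without a vowel, the two situations in which playback reads out the characters instead of the word.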
When you view the recognition pronunciation,
certain syntax rules apply:
- When you enter more than one word, such as "New York," the lexicon
file entry has apostrophes before and after the words so that the
words are saved in the lexicon file as a unit, such as 'New York'.
- If the word contains an apostrophe, such as "don't," or if the
word starts with an apostrophe, such as "'60s," the apostrophe is
duplicated, and the whole word is enclosed in a set of apostrophes
in the lexicon file entry, such as 'don''t' and '''60s'.
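The apostrophe rules above can be expressed as a short Python sketch. The function name is hypothetical, and the sketch assumes that a single word without an apostrophe is stored unchanged, which the rules imply but do not state. The toolkit applies these rules for you when it saves the entry.

```python
def quote_lexicon_word(word):
    """Apply the apostrophe rules described above to a word or phrase."""
    needs_quotes = " " in word or "'" in word
    if not needs_quotes:
        # Assumption: a plain single word is stored without apostrophes.
        return word
    # Double every apostrophe, then enclose the whole entry in apostrophes.
    return "'" + word.replace("'", "''") + "'"


for text in ["New York", "don't", "'60s", "Smith"]:
    print(quote_lexicon_word(text))
```

For the three examples in the text, this yields 'New York', 'don''t', and '''60s', matching the lexicon entries shown above.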
If you opened the Pronunciation Builder from a lexicon
file, the pronunciation is stored in the source file. Otherwise, the
Choose a Lexicon File Resource dialog box opens so that you can select
or create a lexicon file in which to save the pronunciation.
When
you save the pronunciation, the word and pronunciation string are
stored in the lexicon file.
Creating additional pronunciations for the same word
You can create alternative pronunciations
for a word in lexicon files for the speech recognition engine. Any
of the defined pronunciations would be valid for the user to say when
that lexicon file is loaded. For example, you can save the recognized
text "Smith" with the pronunciations of both "smith" and "smyth."
- After you finish one pronunciation, click Apply to
save it, and continue with the next pronunciation.
- If you click OK, you can click the word
again in the file and open the Pronunciation Builder to create and
save the next pronunciation.
Note: Lexicon files for the text-to-speech engine should contain only
one pronunciation for each word.
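The difference between the two kinds of lexicon can be modeled with a toy in-memory structure, sketched below in Python. The class name, the target labels, and the placeholder pronunciation strings are assumptions for illustration only and do not reflect the toolkit's lexicon file format or phonology. Rejecting a second text-to-speech pronunciation with an error is a design choice of this sketch; the guidance above is a recommendation.

```python
class LexiconSketch:
    """Toy in-memory model; not the toolkit's lexicon file format."""

    def __init__(self, target):
        if target not in ("recognition", "synthesis"):
            raise ValueError("target must be 'recognition' or 'synthesis'")
        self.target = target
        self.entries = {}  # word -> list of pronunciation strings

    def add(self, word, pronunciation):
        prons = self.entries.setdefault(word, [])
        if pronunciation in prons:
            return  # already stored, nothing to do
        if self.target == "synthesis" and prons:
            # Sketch-only policy: keep one pronunciation per word for TTS.
            raise ValueError(
                f"{word!r} already has a pronunciation in this TTS lexicon")
        prons.append(pronunciation)


# Placeholder spellings stand in for real phonetic strings.
reco = LexiconSketch("recognition")
reco.add("Smith", "smith")
reco.add("Smith", "smyth")
print(reco.entries)
```

A recognition lexicon built this way holds both alternatives for "Smith", so either spoken form maps to the same recognized text, while the same two calls against a "synthesis" lexicon would raise an error on the second one.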
Using a sounds-like pronunciation
The
Pronunciation Builder window includes only the Sounds-Like Pronunciation
text field, without the other pronunciation-creation buttons, in the
following situations:
To create or edit a sounds-like pronunciation, do the following:
- If the Sounds-Like Pronunciation text field is blank, click in
the field and type letters or words that sound the way you want the
word to be pronounced. Add a space where a natural pause is sounded.
Tip: The pronunciation follows the natural reading rules for
the language. For example, if you type a string of consonants only,
the letters are read.
Note: For Japanese, use a Hiragana spelling
for the sounds-like. For Simplified Chinese, Pin Yin can be used.
For Cantonese, Yue Pin can be used.
- Test the pronunciation anytime by clicking Play.
- Keep testing and editing until the pronunciation is satisfactory.
Experiment with combinations of letters and spaces until you achieve
the closest pronunciation.
- Save the pronunciation by clicking Apply to
continue working in the dialog box or OK to
close the dialog box. The pronunciation is converted to the appropriate
phonology in the Pronunciation Builder window.
When the pronunciation
is saved to a lexicon file, the word and the sounds-like pronunciation
are added to the lexicon file. If the lexicon file is for the speech
recognition engine, the recognition pronunciation(s) for the sounds-like
are added to the lexicon file as well.
Note: While using the Pronunciation Builder, if the Play button
stops responding, simply close and restart the Pronunciation Builder.