Use the Lexicon editor to create lexicon files for voice applications
that use the IBM speech recognition and text-to-speech (TTS) engines.
For each application that you want to create using the toolkit,
create a voice project in which you place all the files for the application.
Besides lexicon files, these might include VoiceXML files, grammar
files, ECMAScript Object files, and pre-recorded audio files.
This topic includes the following information:
Creating a new lexicon file
Use the following steps to create a new lexicon file:
- If you do not have a project where your lexicon file resides,
create a voice project by navigating to . Otherwise, select the desired voice project in
the Navigator.
- Optional: If your project will be used for web deployment, you
can create a subfolder within the WebContent folder in which to place
your lexicon file. To do this, select and type the folder name. Folders
are optional containers for organizing files within a voice project.
- In the Navigator view, select (highlight) the folder you created
for your lexicon file, and from the File menu,
select .
- The New File wizard appears and guides you through the steps for
creating a new lexicon file. Select how you want to create the file:
from scratch, which results in an empty file, or with DTD assistance.
- Click Next.
- In the File Type box, select the type of
file:
- Recognition
- Creates pronunciations for the speech recognition engine using
International Phonetic Alphabet (IPA) phonologies. Select this option
when creating grammars with words that you expect users to say when
interacting with the application. This is the default selection.
- Synthesizer (Text to Speech)
- Creates pronunciations for the synthesized speech (TTS engine)
using Symbolic Phonetic Representations (SPR) phonologies. Select
this option when creating grammars with words that the synthesized
voice will say to users.
- Type your new file name. The default extension for a lexicon file
is .lxml; however, you can also use the extensions .lexml and .pls.
Tip: Filenames are case-sensitive. The lexicon browser URLs
should be case-consistent to avoid duplicate loads. For example,
do not name one file A.lxml and another a.lxml.
- Optional: Click Advanced (available in
some installed products) to reveal or hide a section of the wizard
used to create a linked file. Check the Link to file in
the file system check box if you want the new file to
reference a file in the file system. In the field below the check
box, enter a file path or the name of a path variable. Click Browse to
locate a file in the file system. Click Variables if you want
to use a path variable to reference a file system file.
- Click Finish. The Lexicon editor opens your new file, which
contains the basic <lexicon> tag.
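The tip about case-consistent file names can be checked outside the toolkit with a few lines of script. The following is a minimal sketch, not part of the toolkit: the function name find_case_collisions is hypothetical, and the sketch assumes you pass it a plain list of file names from your voice project.

```python
from collections import defaultdict

# Lexicon file extensions named in the steps above.
LEXICON_EXTENSIONS = (".lxml", ".lexml", ".pls")


def find_case_collisions(file_names):
    """Return groups of lexicon file names that differ only by letter case."""
    groups = defaultdict(list)
    for name in file_names:
        # Ignore files that are not lexicon files (hypothetical filter).
        if name.lower().endswith(LEXICON_EXTENSIONS):
            groups[name.lower()].append(name)
    # Keep only the groups in which two or more distinct spellings collide.
    return [sorted(set(names)) for names in groups.values() if len(set(names)) > 1]


if __name__ == "__main__":
    print(find_case_collisions(["A.lxml", "a.lxml", "menu.pls", "prompts.vxml"]))
```

Any group that the function returns contains names, such as A.lxml and a.lxml, that the tip advises against using together.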
Adding words and pronunciations
In the lexicon file, add all the words that users could say
while interacting with the application. Each word must also include
a phonetic pronunciation that the speech recognition engines understand.
The toolkit provides a built-in Pronunciation Builder that
automatically creates a phonology for the words you add. To use it:
- In the Source tab of the Lexicon editor, right-click,
and click Compose Pronunciation.
- In the Pronunciation Builder, type a word that you want to add
to the lexicon, and click Get Default Pronunciation.
Tip: For details on using the Pronunciation Builder window, see the
Pronunciation Builder information in the online help.
- When you click Apply or OK,
the word and phonology are added to the lexicon file with the correct
tags.
- Continue to add the pronunciations, one word at a time. You can
add multiple pronunciations for the same word, if needed.
- When you finish, save your file.
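The Lexicon editor writes the tags for you, but it can help to see what one added entry amounts to. The following is a minimal sketch that uses only the Python standard library and the element names shown in the samples under Phonology examples (lexeme, spelling, phoneme). The helper name add_lexeme is hypothetical, and the sketch makes no claim about how the Pronunciation Builder itself generates phonologies.

```python
import xml.etree.ElementTree as ET


def add_lexeme(lexicon, spellings, phonemes):
    """Append one <lexeme> with the given spellings and phone strings."""
    lexeme = ET.SubElement(lexicon, "lexeme")
    for spelling in spellings:
        ET.SubElement(lexeme, "spelling").text = spelling
    # Several <phoneme> children give several pronunciations for the word.
    for phoneme in phonemes:
        ET.SubElement(lexeme, "phoneme").text = phoneme
    return lexeme


if __name__ == "__main__":
    # Attribute values are copied from the speech recognition sample.
    lexicon = ET.Element("lexicon", {"version": "1.0", "alphabet": "x-ibmasr"})
    add_lexeme(lexicon, ["preferred"], ["P R AX F ER DD", "P R IX F ER DD"])
    print(ET.tostring(lexicon, encoding="unicode"))
```

The phone strings must already be valid for the alphabet that the lexicon declares; this sketch does not validate them.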
Importing or converting existing pronunciation files
If you have existing pronunciation pool files (.pbs)
for speech recognition or exception dictionary files (.eci) for speech
synthesis, you can import them into a lexicon file.
- With a lexicon file open in the editor, right-click, and click Import
pronunciation file.
- Follow the instructions to import the words and phonologies in
the pool file (if you are importing into a Recognition lexicon) or
exception dictionary (if you are importing into a Synthesizer lexicon)
into the lexicon file.
- Continue to add or modify pronunciations, as needed, and save
the file.
Alternatively, you can convert an existing pronunciation file
into a lexicon file:
- Right-click a .pbs or .eci file
in the Navigator, and click Convert to Lexicon File.
Note: To convert multiple files (of the same file type) in one
operation, hold down the Ctrl key, click the files, and then
right-click the group.
- The new file is created with the same file name and the extension .lxml.
If the name already exists in the voice project, you can choose to
overwrite the file or cancel the operation.
- The words and pronunciations in the new file are automatically
converted to the lexicon markup language, and the file is added to
the voice project. Words that had errors in the original file are
ignored in the converted file.
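The naming rule for converted files (same file name, extension .lxml) can be previewed before you run the conversion. The following is a minimal sketch under stated assumptions: the function name plan_conversion is hypothetical, it works on file names only, and it matches the .pbs and .eci extensions exactly as written. It does not read or convert any file content.

```python
from pathlib import Path

# Source extensions named in this topic: pronunciation pool files and
# exception dictionary files.
SOURCE_EXTENSIONS = {".pbs", ".eci"}


def plan_conversion(source_name, existing_names):
    """Return (target_name, name_already_exists) for one source file name."""
    source = Path(source_name)
    if source.suffix not in SOURCE_EXTENSIONS:
        raise ValueError(f"not a pronunciation file: {source_name}")
    # Same file name, with the extension replaced by .lxml.
    target = source.with_suffix(".lxml").name
    return target, target in existing_names


if __name__ == "__main__":
    print(plan_conversion("names.pbs", {"names.lxml", "menu.vxml"}))
```

When the second value is true, the toolkit asks whether to overwrite the existing file or cancel the operation, as described above.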
Phonology examples
Although the speech recognition and TTS engines use a common lexicon
format, the lexicons themselves are not shared between the engines,
and the phonetic alphabets used by the two systems differ significantly.
Speech recognition phonology
The following sample code demonstrates the format and the features
that are supported, such as multiple pronunciations per word,
pronunciations specified as phone strings or sounds-like spellings,
alternative spellings, case-sensitive or case-insensitive spellings,
and proprietary phonetic alphabet support. The Lexicon editor adds the
tag content (words and pronunciations) to the labels displayed in the
Outline view, and displays the first spelling within the lexeme at the
lexeme level of the tree structure, for easier traversal through the file.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE lexicon PUBLIC "-//com/ibm/speech/grammar/lexicon//DTD Lexicon 1.0//EN"
    "ibmlexiconml.dtd">
<lexicon version="1.0" xml:lang="en-US" alphabet="x-ibmasr" case-sensitive="false">
  <import uri="sourcelexicon.lxml"/>
  <lexeme>
    <spelling>preferred</spelling>
    <phoneme>P R AX F ER DD</phoneme>
    <phoneme>P R IX F ER DD</phoneme>
  </lexeme>
  <lexeme>
    <spelling>colour</spelling>
    <spelling>color</spelling>
    <phoneme>K AH L AXR</phoneme>
  </lexeme>
  <lexeme>
    <spelling>IEEE</spelling>
    <sounds-like>I triple E</sounds-like>
  </lexeme>
</lexicon>
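To review a lexicon outside the Outline view, you can list each lexeme with its spellings and pronunciations. The following is a minimal sketch that uses only the Python standard library. The function name summarize is hypothetical, and the embedded sample repeats the entries above without the XML declaration, the DOCTYPE, and the import element to keep the sketch self-contained.

```python
import xml.etree.ElementTree as ET

SAMPLE = """\
<lexicon version="1.0" xml:lang="en-US" alphabet="x-ibmasr" case-sensitive="false">
  <lexeme>
    <spelling>preferred</spelling>
    <phoneme>P R AX F ER DD</phoneme>
    <phoneme>P R IX F ER DD</phoneme>
  </lexeme>
  <lexeme>
    <spelling>colour</spelling>
    <spelling>color</spelling>
    <phoneme>K AH L AXR</phoneme>
  </lexeme>
  <lexeme>
    <spelling>IEEE</spelling>
    <sounds-like>I triple E</sounds-like>
  </lexeme>
</lexicon>
"""


def child_texts(lexeme, tag):
    """Return the text of every direct child of lexeme with the given tag."""
    return [child.text for child in lexeme if child.tag == tag]


def summarize(lexicon_xml):
    """Return one dict per lexeme with its spellings and pronunciations."""
    root = ET.fromstring(lexicon_xml)
    return [
        {
            "spellings": child_texts(lexeme, "spelling"),
            "phonemes": child_texts(lexeme, "phoneme"),
            "sounds_like": child_texts(lexeme, "sounds-like"),
        }
        for lexeme in root.iter("lexeme")
    ]


if __name__ == "__main__":
    for entry in summarize(SAMPLE):
        print(entry)
```

The first spelling in each entry corresponds to the label that the editor shows at the lexeme level of the tree.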
TTS phonologies
The text-to-speech (TTS) engine supports the IPA alphabet and the SPR
alphabet. The SPR alphabet is referred to as x-ibmtts<lang> (for
example, "x-ibmttsus", where "us" is US English).
For details on the SPR alphabet and the TTS phonologies, see the IBM
Text-to-Speech SSML Programming Guide (tts_ssml.pdf) in the online help.
The following example demonstrates the format and the features that
are supported:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE lexicon PUBLIC "-//com/ibm/speech/grammar/lexicon//DTD Lexicon 1.0//EN"
    "ibmlexiconml.dtd">
<lexicon version="1.0" xml:lang="en-US" alphabet="x-ibmtts" case-sensitive="true">
  <import uri="sourcelexicon.lxml"/>
  <lexeme>
    <spelling>preferred</spelling>
    <phoneme>.0prI.1fxrd</phoneme>
    <phoneme>.0pri.1fxrd</phoneme>
  </lexeme>
  <lexeme>
    <spelling>colour</spelling>
    <spelling>color</spelling>
    <phoneme> .1kH.0lxr </phoneme>
  </lexeme>
  <lexeme>
    <spelling>IEEE</spelling>
    <sounds-like>I triple E</sounds-like>
  </lexeme>
</lexicon>
The TTS engine uses the first pronunciation when multiple
pronunciations for a given word are specified. If the lexicon is not
case-sensitive, add an entry for each spelling: the capitalized and
the lower-case version.
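The first-pronunciation rule can be illustrated with a small lookup table built from a lexicon. The following is a minimal sketch, not a description of how the TTS engine works internally. The function name first_pronunciations is hypothetical, keys are matched with their exact case, and stripping the white space around a phoneme value is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET


def first_pronunciations(lexicon_xml):
    """Map each spelling to the first <phoneme> listed in its lexeme."""
    root = ET.fromstring(lexicon_xml)
    table = {}
    for lexeme in root.iter("lexeme"):
        phonemes = [
            child.text.strip()
            for child in lexeme
            if child.tag == "phoneme" and child.text
        ]
        if not phonemes:
            continue  # for example, an entry that has only <sounds-like>
        for child in lexeme:
            if child.tag == "spelling" and child.text:
                # Keys keep their exact case; only the first phoneme is used.
                table.setdefault(child.text, phonemes[0])
    return table


if __name__ == "__main__":
    sample = (
        "<lexicon><lexeme><spelling>preferred</spelling>"
        "<phoneme>.0prI.1fxrd</phoneme><phoneme>.0pri.1fxrd</phoneme>"
        "</lexeme></lexicon>"
    )
    print(first_pronunciations(sample))
```

Because the second phoneme never reaches the table, list the pronunciation that you want the synthesized voice to use first.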