Substitution/Synonym dictionaries
A substitution dictionary is a collection of terms that help to group similar terms under one target term. Substitution dictionaries are managed in the bottom pane of the Library Resources tab. If you are in an interactive workbench session, you can access this view by choosing View > Resource Editor from the menus. Otherwise, you can edit dictionaries for a specific template in the Template Editor.
You can define two forms of substitutions in this dictionary: synonyms and optional elements. You can click the tabs in this pane to switch between them.
After you run an extraction on your text data, you may find several concepts that are synonyms or inflected forms of other concepts. By identifying optional elements and synonyms, you can force the extraction engine to map these to one single target term.
Substituting with synonyms and optional elements reduces the number of concepts in the Extraction Results pane by combining them into more significant, representative concepts with higher frequency Doc. counts.
Synonyms
Synonyms associate two or more words that have the same meaning. You can also use synonyms to group terms with their abbreviations or to group commonly misspelled words with the correct spelling. You can define these synonyms on the Synonyms tab.
A synonym definition is made up of two parts. The first is a Target term, which is the term under which you want the extraction engine to group all synonym terms. Unless this target term is used as a synonym of another target term or unless it is excluded, it is likely to become the concept that appears in the Extraction Results pane. The second is the list of synonyms that will be grouped under the target term.
For example, if you want automobile to be replaced by vehicle, then automobile is the synonym and vehicle is the target term.
You can enter any words into the Synonym column, but if a word is not found during extraction and the term has the Entire match option, then no substitution can take place. However, the target term does not need to be extracted for the synonyms to be grouped under it.
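The grouping effect of a synonym definition can be illustrated with a minimal Python sketch. This is not the extraction engine's implementation. The dictionary layout, the function name group_by_target, and the extra synonym car are hypothetical, and match options such as Entire are not modeled.

```python
from collections import Counter

# Hypothetical synonym dictionary: each synonym maps to its target term.
# "car" is an added illustration; only automobile -> vehicle is from the text.
SYNONYMS = {"automobile": "vehicle", "car": "vehicle"}


def group_by_target(extracted_terms, synonyms):
    """Count extracted terms after replacing each synonym with its target."""
    counts = Counter()
    for term in extracted_terms:
        # Terms without a synonym entry are kept under their own name.
        counts[synonyms.get(term, term)] += 1
    return counts


extracted = ["automobile", "car", "automobile", "truck"]
grouped = group_by_target(extracted, SYNONYMS)
```

In this sketch, vehicle never occurs in the extracted list, yet all three synonym occurrences are counted under it, which mirrors the point that the target term does not need to be extracted.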
Optional elements
Optional elements identify optional words in a compound term that can be ignored during extraction in order to keep similar terms together even if they appear slightly different in the text. Optional elements are single words that, if removed from a compound, could create a match with another term. These single words can appear anywhere within the compound: at the beginning, middle, or end. You can define optional elements on the Optional tab.
For example, to group the terms ibm and ibm corp together, you should declare corp to be treated as an optional element in this case. In another example, if you designate the term access to be an optional element and during extraction both internet access speed and internet speed are found, they will be grouped together under the term that occurs most frequently.
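The following Python sketch illustrates one way the optional-element grouping described above could behave. It is an assumption-laden illustration, not the engine's actual algorithm. The function name group_optional and the sample counts are hypothetical, and the sketch assumes that every group is labeled with its most frequent variant.

```python
from collections import Counter

# Optional elements taken from the two examples in the text.
OPTIONAL_ELEMENTS = {"access", "corp"}


def group_optional(extracted_terms, optional_elements):
    """Group terms that become identical once optional words are removed."""
    counts = Counter(extracted_terms)
    groups = {}
    for term, n in counts.items():
        words = term.split()
        stripped = tuple(w for w in words if w not in optional_elements)
        # Assumption: a term made up only of optional words is left unchanged.
        key = stripped or tuple(words)
        groups.setdefault(key, []).append((term, n))

    result = {}
    for variants in groups.values():
        # Label the group with the variant that occurs most frequently.
        label = max(variants, key=lambda v: v[1])[0]
        result[label] = sum(n for _, n in variants)
    return result


extracted = [
    "internet access speed", "internet speed", "internet speed",
    "ibm corp", "ibm", "ibm corp",
]
grouped_optional = group_optional(extracted, OPTIONAL_ELEMENTS)
```

With this hypothetical input, internet access speed is folded into internet speed because the latter occurs more often, and ibm is folded into ibm corp for the same reason.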