Creating a synonym dictionary for enterprise search collections

After you create or update a list of synonyms in an XML file, you must convert the XML file to a binary synonym dictionary.

About this task

To create a synonym dictionary, use the essyndictbuilder command-line tool, which is provided with Watson Content Analytics. The tool is in the ES_INSTALL_ROOT/bin directory.

The input to the tool is the XML file that lists your synonyms, and the output from the tool is a case-sensitive synonym dictionary. The dictionary file name must have the suffix DIC, for example, c:\mydictionaries\products.dic.

The default location for both files is the directory where the script is invoked. If a dictionary with the same name exists, the script produces an error.
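The naming and overwrite rules above can be checked before the tool is run. The following is a minimal sketch, not part of the product: check_dic_path is a hypothetical helper name, and accepting both .dic and .DIC is an assumption, because the text states the suffix in uppercase but shows a lowercase example.

```shell
# Hypothetical pre-flight check (not shipped with the product).
# Returns 0 if the dictionary path has the expected suffix and does not
# exist yet, 1 otherwise.
check_dic_path() {
    dic_file=$1
    case "$dic_file" in
        *.dic|*.DIC) ;;   # assumption: either case of the suffix is accepted
        *)
            echo "error: '$dic_file' does not have the DIC suffix" >&2
            return 1
            ;;
    esac
    # The tool reports an error if the dictionary exists, so catch that early.
    if [ -e "$dic_file" ]; then
        echo "error: '$dic_file' already exists" >&2
        return 1
    fi
    return 0
}
```

A call such as check_dic_path /home/esadmin/products.dic (an illustrative path) returns a nonzero status when either rule is violated, so it can guard the build command in a script.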

The maximum size of a DIC file is 8 MB.
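A generated dictionary can be checked against this limit before you upload it. The sketch below is illustrative only: dic_size_ok is a hypothetical helper name, and it assumes that 8 MB means 8 * 1024 * 1024 bytes, which the text does not specify.

```shell
# Assumption: "8 MB" is interpreted as 8 * 1024 * 1024 bytes.
MAX_DIC_BYTES=$((8 * 1024 * 1024))

# Hypothetical helper: returns 0 if the file is within the limit,
# 1 if it is too large, and 2 if the file does not exist.
dic_size_ok() {
    [ -f "$1" ] || return 2
    # Arithmetic expansion strips any padding that wc adds on some systems.
    size=$(( $(wc -c < "$1") ))
    [ "$size" -le "$MAX_DIC_BYTES" ]
}
```

For example, running dic_size_ok products.dic || echo "dictionary is too large or missing" flags a file that the system might reject.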

Procedure

To create a synonym dictionary for enterprise search collections:

  1. On the master server, log in as the Watson Content Analytics default administrator.
  2. Enter the following command, where XML_file is the fully qualified path to the XML file that contains the list of synonyms and DIC_file is the fully qualified path to the synonym dictionary.
    AIX® or Linux
    essyndictbuilder.sh XML_file DIC_file
    Windows
    essyndictbuilder.bat XML_file DIC_file
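On AIX or Linux, the command in step 2 can be wrapped in a small function so that obvious mistakes are caught before the tool starts. This is a sketch under stated assumptions: build_synonym_dictionary is a hypothetical name, the ES_INSTALL_ROOT environment variable is assumed to be set to the installation directory, and the function passes through whatever exit status the tool returns.

```shell
# Hypothetical wrapper around essyndictbuilder.sh (AIX or Linux).
# Usage: build_synonym_dictionary XML_file DIC_file
build_synonym_dictionary() {
    xml_file=$1
    dic_file=$2
    # Assumption: ES_INSTALL_ROOT is set in the administrator's environment.
    if [ -z "${ES_INSTALL_ROOT:-}" ]; then
        echo "error: ES_INSTALL_ROOT is not set" >&2
        return 1
    fi
    if [ ! -r "$xml_file" ]; then
        echo "error: cannot read XML file '$xml_file'" >&2
        return 1
    fi
    # Same argument order as in the procedure: XML file, then DIC file.
    "$ES_INSTALL_ROOT/bin/essyndictbuilder.sh" "$xml_file" "$dic_file"
}
```

A typical call, with illustrative paths, is build_synonym_dictionary /home/esadmin/synonyms.xml /home/esadmin/products.dic.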

What to do next

After you create a synonym dictionary, use the administration console to add the dictionary to the system and associate it with one or more enterprise search collections. You cannot associate a synonym dictionary with a content analytics collection.

Only the generated DIC file is uploaded to the system. Keep the source XML file in an access-controlled location and back it up regularly, because you need this XML file to update your synonym dictionary.
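One way to keep restricted, dated copies of the source XML file on AIX or Linux is sketched below. The backup_synonym_xml name, the timestamp format, and the use of file permissions as the access control are all assumptions for illustration; your site might use a different backup mechanism.

```shell
# Hypothetical helper: copy the XML source into a backup directory that
# only the current user can access, and print the path of the copy.
# Usage: backup_synonym_xml XML_file backup_dir
backup_synonym_xml() {
    xml_file=$1
    backup_dir=$2
    if [ ! -f "$xml_file" ]; then
        echo "error: '$xml_file' not found" >&2
        return 1
    fi
    mkdir -p "$backup_dir" || return 1
    chmod 700 "$backup_dir" || return 1      # owner-only directory
    stamp=$(date +%Y%m%d-%H%M%S)
    target="$backup_dir/$(basename "$xml_file").$stamp"
    cp -p "$xml_file" "$target" || return 1
    chmod 600 "$target" || return 1          # owner-only file
    echo "$target"
}
```

Running this helper from a scheduled job after each change to the synonym list is one possible way to keep the backups regular.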