You can use IBM®
Watson™ Explorer Content Analytics Studio, which is included
in the Analytical Components of Watson Explorer, to create
additional custom PEAR files for use in Lexical Analysis language streams.
About this task
These instructions walk you through the steps of creating a PEAR file that can be used in a
Lexical Analysis language stream. The PEAR file will use a built-in dictionary. If you need
to create a custom dictionary, see Creating a Custom PEAR File with a Custom Dictionary.
Note: PEAR files for 17 languages are already included with the
Watson Explorer Foundational Components. See Lexical Analysis Streams
for the list and additional information.
See UIMA Software Development Kit for additional information about creating PEAR files.
Procedure
-
Enable PEAR export in Content Analytics Studio
-
From the main menu, select .
-
In the Preferences tree view, select .
-
Click the Advanced button on the
Capabilities pane.
-
In the Advanced Capabilities Settings dialog, under
Miscellaneous, select the Export ICA Studio UIMA
Pipeline as UIMA Pear check box.
-
Click OK, then OK again.
-
Create a new Content Analytics Studio project
-
From the main menu, select . Name your project.
-
Enter the Default UIMA Type prefix, which will be the
package name for some of the PEAR file Java artifacts. It can be any Java package
name, but avoid the prefixes com.ibm and
org.apache.
-
Click Finish to create the project.
-
On the Studio Explorer tab, navigate to your project name
and open the Configuration folder. Right-click the
Annotators folder and select .
-
Enter a file name (typically
projectname.annoconfig), then click
Finish.
-
On the Studio Explorer tab, double-click the file you just created. The new project displays four UIMA Pipeline Stages. Two of
those stages, Lexical Analysis and Parsing
Rules, display an error because they are not yet configured.
-
Delete the Parsing Rules stage. It is not used by Watson Explorer Engine.
-
Select the Document Language stage. Select the
Manually specify the document language radio button, then choose
the language from the drop-down list.
-
Select the Lexical Analysis stage. Choose the same language
from the list and click the Built In button.
-
Save the changes you made to the UIMA pipeline configuration.
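The guidance on the Default UIMA Type prefix given earlier in this step (any Java package name, but not one beginning with com.ibm or org.apache) can be checked before you type the value into the wizard. The following is a minimal sketch, not part of Content Analytics Studio: the function name check_type_prefix is hypothetical, and the pattern is a simplified approximation of a Java package name that assumes ASCII identifiers and does not check for Java reserved words.

```python
import re
from typing import List

# Prefixes that the procedure advises against.
DISCOURAGED_PREFIXES = ("com.ibm", "org.apache")

# Simplified pattern for one dot-separated segment of a Java package name.
# Assumption: ASCII only; Java reserved words are not rejected.
_SEGMENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def check_type_prefix(prefix: str) -> List[str]:
    """Return a list of problems found in a candidate type prefix.

    An empty list means no problem was detected by this sketch.
    """
    problems = []
    segments = prefix.split(".")
    if not all(_SEGMENT.fullmatch(segment) for segment in segments):
        problems.append("not a valid dotted Java package name")
    for bad in DISCOURAGED_PREFIXES:
        # Match the whole prefix or a package nested under it,
        # so that "com.ibmx" is not flagged by mistake.
        if prefix == bad or prefix.startswith(bad + "."):
            problems.append(f"starts with discouraged prefix {bad}")
    return problems


if __name__ == "__main__":
    # "com.example.lexical" is a placeholder value for illustration.
    for candidate in ("com.example.lexical", "com.ibm.custom"):
        print(candidate, check_type_prefix(candidate))
```

A prefix that yields an empty list is only known to pass these two checks; Content Analytics Studio may apply further validation of its own.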
-
Export the PEAR file
-
On the Studio Explorer tab, right-click and select Export.
-
In the Export dialog, under ICA
Studio, select UIMA Pipeline as UIMA PEAR from
the list. Click Next.
-
Choose a folder and a name for your file and click
Save.
-
Click Finish. You do not need to specify index fields and
facets because this information is not used by Watson Explorer Engine.
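After the export completes, you may want a quick sanity check of the resulting file before you use it in a language stream. A UIMA PEAR package is a ZIP archive that carries its installation descriptor at metadata/install.xml, so a short script can confirm both points. This is a minimal sketch under that assumption, not an IBM-supplied tool; the function name inspect_pear and the file name my_language.pear are hypothetical, and the check does not validate the contents of the descriptor.

```python
import zipfile

# Location of the installation descriptor inside a UIMA PEAR package.
INSTALL_DESCRIPTOR = "metadata/install.xml"


def inspect_pear(path):
    """Return (is_zip, has_install_descriptor, entry_names) for a PEAR file."""
    if not zipfile.is_zipfile(path):
        # Not a ZIP archive, so it cannot be a PEAR package.
        return False, False, []
    with zipfile.ZipFile(path) as pear:
        names = pear.namelist()
    return True, INSTALL_DESCRIPTOR in names, names


if __name__ == "__main__":
    import sys

    # Pass the path chosen in the Export dialog; the default is a placeholder.
    pear_path = sys.argv[1] if len(sys.argv) > 1 else "my_language.pear"
    is_zip, has_descriptor, names = inspect_pear(pear_path)
    print("ZIP archive:", is_zip)
    print("Install descriptor present:", has_descriptor)
    print("Entries:", len(names))
```

If the script reports that the file is not a ZIP archive or that the descriptor is missing, repeating the export is a reasonable first step.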