After you use the Apache Unstructured Information Management
Architecture (UIMA) to build support for custom analysis, you can
integrate the analysis logic with IBM® Content Analytics with
Enterprise Search collections.

IBM Content Analytics with Enterprise Search provides several
annotators that you can associate with collections and use as
provided. To customize how documents are parsed and analyzed, you
can:
- Create custom annotators.
- Create mapping files for common analysis structures (CAS). For
example, you can create files to map specific analysis results to
the index or to a relational database.
- Create custom dictionaries. For example, you can create stop word
dictionaries to exclude common and enterprise-specific terms from
the search results and create boost word dictionaries to increase
the relevance of documents that contain certain words.
- For a content analytics collection, configure user-specific
dictionaries that map words and equivalent terms to facets, and
create rule files that extract patterns of text from documents
when they are added to the index.
- For an enterprise search collection, create user-specific
lexical dictionaries and synonym dictionaries, and apply enterprise-specific
terminology and synonyms when the content analysis processes run.
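
The effect of stop word and boost word dictionaries can be pictured with a small, self-contained sketch. It is a conceptual illustration only. It does not use the product's dictionary file format, APIs, or ranking algorithm, and every name in it (`STOP_WORDS`, `BOOST_WORDS`, `score`, `rank`) is hypothetical. The sketch assumes that stop words are dropped from the query and that a matched boost word contributes a larger weight to a document's score.

```python
from typing import Dict, List, Set

# Hypothetical dictionaries for illustration; a real deployment would
# define these through the product's dictionary tooling instead.
STOP_WORDS: Set[str] = {"the", "and", "acme"}    # common and enterprise-specific terms
BOOST_WORDS: Dict[str, float] = {"recall": 2.0}  # word -> assumed boost factor


def score(query: str, document: str) -> float:
    """Toy relevance score: sum the weights of query terms found in the document."""
    # Stop words are removed from the query, so they cannot match anything.
    terms = [t for t in query.lower().split() if t not in STOP_WORDS]
    doc_tokens = set(document.lower().split())
    total = 0.0
    for term in terms:
        if term in doc_tokens:
            # A boost word counts for more than an ordinary term (weight 1.0).
            total += BOOST_WORDS.get(term, 1.0)
    return total


def rank(query: str, documents: List[str]) -> List[str]:
    """Return the matching documents, highest score first."""
    scored = [(score(query, doc), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for value, doc in scored if value > 0]
```

In this sketch, a query that consists only of stop words matches nothing, and a document that contains the boost word `recall` ranks ahead of one that matches only ordinary terms.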