Indexing concepts in Db2 Text Search service

IBM Db2 Text Search is configured as the Db2 Text Search server on the IBM Content Manager server. Any item type on the Content Manager (CM8) server can use the associated Db2 Text Search server to create and search indexes.

About this task

All documents that IBM Content Collector archives in the IBM Content Manager repository are stored in item types, and an item type must be text searchable before its documents can be indexed. An indexing request for a document is added to the staging table the moment the document is stored in the item type.

When indexing is invoked on an item type that is enabled for Db2 Text Search, Db2 Text Search Support pre-processes each Content Collector document, other than file system documents, in the following way:

  • Creates an empty XML template for each document ID
  • Retrieves the archived document from the repository and adds items such as the subject and body to the XML template
  • Retrieves the document attribute values and adds these values to the XML template
  • Attachment data is processed by the IBM Db2 Accessories Suite that is configured with the Db2 Text Search server, and the extracted text is added to the XML template
  • The generated XML content is indexed by Db2 Text Search Services. If warnings or errors occur during this process, the messages are attached to the indexed document.
  • The indexing failure code (IccIDXRCString) for each processed document is updated in the Component Type table of the corresponding item type.
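
The template-building steps above can be pictured with a short sketch. This is only an illustration in Python, not the code that Db2 Text Search Support runs: the function name build_index_xml, the element names (document, attributes, attribute, attachments, attachment), and the sample values are all assumptions made for the example, and the actual XML layout used by Content Collector is not described here.

```python
import xml.etree.ElementTree as ET


def build_index_xml(doc_id, fields, attributes, attachment_texts):
    """Assemble one XML document for indexing (hypothetical layout)."""
    # Start from an empty template for the document ID.
    root = ET.Element("document", {"id": doc_id})

    # Add items taken from the archived document, such as subject and body.
    for name, value in fields.items():
        ET.SubElement(root, name).text = value

    # Add the document attribute values.
    attrs = ET.SubElement(root, "attributes")
    for name, value in attributes.items():
        ET.SubElement(attrs, "attribute", {"name": name}).text = value

    # Add text that was already extracted from the attachments.
    attachments = ET.SubElement(root, "attachments")
    for filename, text in attachment_texts.items():
        ET.SubElement(attachments, "attachment", {"name": filename}).text = text

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Sample values are invented for the illustration.
    print(build_index_xml(
        "doc-1",
        {"subject": "Quarterly report", "body": "Please see the attachment."},
        {"Sender": "user@example.com"},
        {"report.txt": "Revenue increased in the third quarter."},
    ))
```

In this sketch the text extraction itself is out of scope: the attachment text is passed in as plain strings, standing in for the output of the text extraction step.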

For file system documents, IBM Content Collector (ICC) uses the default constructor provided by Content Manager. ICC uses the CM8 ICMDCTOR constructor to index the documents that it archives in the CM8 repository.
  • The generated XML content is added to the default file document that Content Manager generates for the ICMDCTOR constructor.
  • The generated XML contains readable content only for the text documents being indexed.
  • The indexing failure code (IccIDXRCString) for each processed document is updated in the Component Type table of the corresponding item type.