Enabling search for email documents

Configure access to archived email so that users can search the repository for archived email. Take into account any restrictions and special setup, such as including custom metadata in the search scope.

Before you begin

The following prerequisites apply, depending on the type of repository that you use:
Table 1. Prerequisites for enabling search
Repository Prerequisites
IBM® Content Manager with Net Search Extender
  • The repository is configured and enabled for search.
  • You configured at least one repository connection for the IBM Content Manager Connector.
  • The configuration files for the text search indexer contain attribute and field definitions for all custom attributes that you want to make available for search.

    The result of searching for special characters depends on what is stored in the index. Most special characters are treated as token delimiters and are not stored in the index unless you redefine the tokenization rules. These rules are set up in the cteixcfg.ini file.

  • An email item type was configured during the initial configuration of Content Collector and is enabled for text search.
  • These components are configured properly:
    • Configuration Web Service
    • IBM Knowledge Center
    • Web Application
IBM Content Manager with IBM Db2 Text Search Services
  • The repository is configured and enabled for search.
  • You configured at least one repository connection for the IBM Content Manager Connector.
  • The configuration files for the IBM Content Manager with IBM Db2 Text Search Services Support contain attribute and field definitions for all custom attributes that you want to make available for search.
  • An email item type was configured during the initial configuration of Content Collector and is enabled for text search.
  • These components are configured properly:
    • Configuration Web Service
    • IBM Knowledge Center
    • Web Application
IBM FileNet® P8 with IBM Legacy Content Search Engine
  • The object store must be enabled for content-based retrieval with IBM Legacy Content Search Engine.
  • You configured at least one repository connection for the IBM FileNet P8 Connector.
  • The style.xml file contains zone definitions for all custom attributes that you want to make available for search, and the attributes must be included in the index for full-text search.

    The result of searching for special characters depends on what is stored in the index. Most special characters are treated as token delimiters and are not stored in the index unless you redefine the tokenization rules. These rules are set up in the uni.cfg file.

  • These components are configured properly:
    • Configuration Web Service
    • IBM Knowledge Center
    • Web Application
IBM FileNet P8 with IBM Content Search Services
  • The object store must be enabled for content-based retrieval with IBM Content Search Services.
  • Upgrade installations: Documents were previously archived to object stores that are enabled for content-based retrieval with IBM Legacy Content Search Engine (LCSE). After the upgrade, documents will be archived into object stores that are enabled for content-based retrieval with IBM Content Search Services.
  • You configured at least one repository connection for the IBM FileNet P8 Connector.
  • The configuration files for IBM Content Collector P8 Content Search Services Support contain attribute and field definitions for all custom attributes that you want to make available for search.

    In IBM Content Search Services 5.1, special characters in a document are indexed by default. Therefore, users can search for those characters without further configuration.

  • These components are configured properly:
    • Configuration Web Service
    • IBM Knowledge Center
    • Web Application
Tip: If your configuration includes both Lotus Notes and Microsoft Exchange collections, set the system environment variable AFU_DISABLE_URL_CHECK to ensure that all Content Collector clients can display the email search page. You can use any value for the variable. If the system environment variable is not set and your configuration files contain only Domino collections, all Content Collector clients can start the email search function. If your configuration files contain Exchange collections, only Exchange clients that have at least IBM Content Collector, Version 2.1.1, installed can start the email search function.
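
The conditions in this tip can be read as a small decision rule. The following Python sketch is one interpretation of the tip, not product code: the function names and parameters are hypothetical, and the version is modeled as a tuple purely for illustration. Because any value is accepted for AFU_DISABLE_URL_CHECK, the sketch tests only whether the variable is present.

```python
import os

# Minimum client level named in the tip for Exchange clients.
MIN_EXCHANGE_CLIENT = (2, 1, 1)


def url_check_disabled(env=os.environ):
    """Any value counts, so only the presence of the variable matters."""
    return "AFU_DISABLE_URL_CHECK" in env


def can_start_email_search(env, has_exchange_collections,
                           client_is_exchange, client_version):
    """Hypothetical helper that restates the tip as a decision rule."""
    if url_check_disabled(env):
        # Variable set: all clients can display the search page.
        return True
    if not has_exchange_collections:
        # Only Domino collections: all clients can start the search.
        return True
    # Exchange collections without the variable: only Exchange clients
    # at Version 2.1.1 or later can start the search.
    return client_is_exchange and client_version >= MIN_EXCHANGE_CLIENT


if __name__ == "__main__":
    print(can_start_email_search({}, True, False, (3, 0, 0)))
    print(can_start_email_search({"AFU_DISABLE_URL_CHECK": "1"},
                                 True, False, (3, 0, 0)))
```

In a mixed environment, the second call shows why setting the variable is the simpler option: it removes the dependency on client type and version.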

The number of users running the IBM Content Collector email search application at the same time cannot be higher than the value of DB2_APP_CONNECTION.

Procedure

To enable search for email documents:

  1. For new installations, check the definitions for the archived data access for email.
    1. In the Configuration Manager, select General Settings > Archived Data Access and select Archived Data Access for Email.
      On the General page, all defined collections and their associated storage templates are listed.
    2. Optional: Add collection definitions as required.
    3. Check the list of content server properties on the Properties page.
      Note that you can use only the fields that are defined here when you map collection fields to content server properties. If required, add, edit, or remove content server properties. You can add any field that is defined on the IBM Content Manager item type or the IBM FileNet P8 document class.
    4. Check the list of text index fields on the Text Index page.
      Note that you can use only the fields that are defined here when you map collection fields to text index fields. If required, add, edit, or remove text index fields. You can add any field that is defined in the text indexer model file (IBM Content Manager with Net Search Extender), the XIT (IBM FileNet P8 with IBM Legacy Content Search Engine), the configuration file of IBM Content Collector P8 Content Search Services Support (IBM FileNet P8 with IBM Content Search Services), or the configuration file of IBM Content Collector Db2 Text Search Services Support (IBM Content Manager with IBM Db2 Text Search Services).
      Important: The only date attribute for which date-range queries can be done in the full-text index in FileNet P8 is the system-defined attribute EMAIL_DATE. Starting with FileNet P8, Version 4.5.1, the system-defined received date as it is defined for EMAIL_DATE is the partition key that is used to organize indexes in FileNet P8. The FileNet P8 repository internally routes searches to the full-text index. Therefore, the default archive mapping does not contain a text index field for EMAIL_DATE. If you work with a previous version of FileNet P8 or if the object store is not set up with date-partitioned collections, you must map a text index field to the collection field EMAIL_DATE:
      <field nm="EMAIL_DATE" type="DATE" partitionkey="true">
        <search format="yyyyMMddHH">icc_received_date</search>
        <attr>ICCMailDate</attr> 
      </field>
    5. Save your settings.
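
The search format yyyyMMddHH in the EMAIL_DATE mapping gives the received date an hourly granularity. The following Python sketch illustrates what such a value looks like and why it suits date-range queries. It assumes that the pattern follows the usual Java date pattern conventions (four-digit year, two-digit month, day, and 24-hour hour), which this topic does not state explicitly. The function names are hypothetical.

```python
from datetime import datetime

# strftime equivalent of "yyyyMMddHH", assuming Java-style pattern letters.
INDEX_DATE_FORMAT = "%Y%m%d%H"


def to_index_date(received: datetime) -> str:
    """Render a received date in the hourly format used by the mapping."""
    return received.strftime(INDEX_DATE_FORMAT)


def in_date_range(received: datetime, start: datetime, end: datetime) -> bool:
    """Compare formatted values as strings.

    The values are fixed-width and zero-padded, so string order
    matches chronological order.
    """
    return to_index_date(start) <= to_index_date(received) <= to_index_date(end)


if __name__ == "__main__":
    print(to_index_date(datetime(2012, 3, 5, 14, 30)))  # 2012030514
```

Minutes and seconds are dropped, so two messages received in the same hour map to the same index value.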
  2. For upgrade installations, you can add the required new definitions in the Configuration Manager, or you can merge the definitions of the old and the new configuration files.
    To merge the newly created configuration files with the existing definitions:
    1. Export the existing configuration files from the configuration store to disk.
      In the Configuration Manager, select General Settings > Archived Data Access > Archived Data Access for Email > Advanced and export the files.
    2. Check the new configuration files; most likely you will have to update the collection ID and the collection name.
      Remember that the collection ID and the collection name must be identical and must be unique within the set of collections.
    3. Add the contents of the template files to the exported configuration files.
      Add the collection definition to the archive mapping file, either before or after any existing collection definition.
      Then, add the respective definitions to the search configuration file.
      • To offer users a selection of repositories which they can search, add the complete <search-template> section, either before or after any existing search template definition.
      • To have all searches run against the old and the new repositories, do not add the complete search template definition but add the new collection definition to the <collections> section in the existing search template.
        <collections>
          <collection name="oldRepository" ...>
          </collection>
          <collection name="newRepository" ...>
          </collection>
        </collections>

        In this case, however, users cannot sort the result list by date.

      Remember that every field name that this template uses to map its properties must match a field name that is defined in the archive mapping file for the given collection.
    4. Import the updated files into the configuration store and save the new configuration.
    With these steps, you enable searches across different repositories.
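
Two rules from the merge step are easy to break when you edit the files by hand: the collection ID and the collection name must be identical and unique, and the field names that a search template uses must be defined for the collection in the archive mapping file. The following Python sketch shows how such a check could look before you import the files. The XML layout in the sample is hypothetical: the root element name, the id attribute, and the nesting of field elements under collection elements are assumptions made for illustration, so adjust the lookups to the structure of your exported files.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical archive mapping layout, for illustration only.
SAMPLE_MAPPING = """
<archive-mapping>
  <collection id="oldRepository" name="oldRepository">
    <field nm="EMAIL_DATE"/>
    <field nm="SUBJECT"/>
  </collection>
  <collection id="newRepository" name="newRepository">
    <field nm="EMAIL_DATE"/>
  </collection>
</archive-mapping>
"""


def check_collections(mapping_xml):
    """Return a list of problems with collection IDs and names."""
    root = ET.fromstring(mapping_xml)
    collections = root.findall(".//collection")
    problems = []
    for coll in collections:
        if coll.get("id") != coll.get("name"):
            problems.append(
                f"ID {coll.get('id')!r} differs from name {coll.get('name')!r}")
    counts = Counter(coll.get("id") for coll in collections)
    for coll_id, count in counts.items():
        if count > 1:
            problems.append(f"ID {coll_id!r} is used {count} times")
    return problems


def unmapped_fields(mapping_xml, collection_name, template_fields):
    """Return template field names that the collection does not define."""
    root = ET.fromstring(mapping_xml)
    defined = set()
    for coll in root.findall(".//collection"):
        if coll.get("name") == collection_name:
            defined.update(field.get("nm") for field in coll.findall("field"))
    return sorted(set(template_fields) - defined)


if __name__ == "__main__":
    print(check_collections(SAMPLE_MAPPING))
    print(unmapped_fields(SAMPLE_MAPPING, "newRepository",
                          ["EMAIL_DATE", "SUBJECT"]))
```

With the sample data, the first call reports no problems, and the second call reports SUBJECT, because that field is defined only for oldRepository.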
  3. If necessary, adapt the layout of the search page by updating the search configuration file.
    You might have to update the <declaration> section, the <form> section, and the <result> section.
    1. On the Advanced page, export the search configuration file to disk.
    2. Update the file as required and save your changes.
    3. On the Advanced page, import the search configuration file from disk.
    4. Save your changes.
  4. If you added custom attributes to the search scope, define labels for the search fields.

What to do next

Restart the IBM Content Collector Web Application service for any changes to take effect.