Building an auto-classification model

Creating rules to find documents that fit different categories is time-consuming and requires constant, meticulous adjustment. Instead, you can import a classification model that is built from sets of training documents to find other, similar documents.

About this task

Using previously harvested data, you can create an auto-classification model.

Procedure

  1. Determine the categories into which you want the auto-classification model to classify documents.
  2. Using IBM® StoredIQ® Data Workbench, create a filter for each category to capture documents that are representative of the category.
  3. For each filter, create an infoset. The members of the resulting infoset become the "training corpus" for the category.
  4. For each infoset, use IBM StoredIQ Data Workbench to run a copy action to a folder that the IBM Content Classification application can access.
  5. Use the IBM Content Classification application to create a decision plan and knowledge base by importing the training corpora that you created.
    Note: The IBM StoredIQ auto-classification feature requires a classification model that consists of one decision plan and at least one knowledge base.
    For best practices, see Best practices for creating an Auto-classification model.
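
Before you import the copied documents in step 5, it can help to confirm that every category actually received training documents. The following Python sketch is an illustration only and is not part of IBM StoredIQ or IBM Content Classification. It assumes that the copy actions in step 4 placed each infoset in its own subfolder, named after the category, under one shared root folder. The function names and the minimum-document threshold are hypothetical.

```python
from pathlib import Path


def count_training_documents(corpus_root: Path) -> dict[str, int]:
    """Return the number of files found under each category subfolder.

    Assumption: each immediate subfolder of corpus_root holds the copied
    infoset (training corpus) for one category.
    """
    counts: dict[str, int] = {}
    for category_dir in sorted(corpus_root.iterdir()):
        if not category_dir.is_dir():
            continue  # ignore stray files at the top level
        # Count files at any depth, because a copy action may keep subfolders.
        counts[category_dir.name] = sum(
            1 for path in category_dir.rglob("*") if path.is_file()
        )
    return counts


def find_sparse_categories(counts: dict[str, int], min_documents: int) -> list[str]:
    """List categories whose training corpus has fewer than min_documents files."""
    return [name for name, n in counts.items() if n < min_documents]
```

For example, calling `find_sparse_categories(count_training_documents(Path(root)), min_documents=20)`, where `root` is the shared folder and 20 is an arbitrary threshold, returns the categories whose filters might need to be broadened before you build the knowledge base.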