Product overview

IBM® Content Classification helps organize unstructured content by analyzing the full text of documents and emails and applying rules that automate classification decisions.

Managing documents and email means making decisions about content every day. IBM Content Classification reduces the burden of manual decision making on employees by organizing information accurately and automatically.

IBM Content Classification can also determine whether a content item needs to be classified at all. By filtering out email about lunch appointments or documents that do not hold any business value, for example, you can reduce costs and ensure that only documents that need to be retained are classified and archived.

With embedded natural language processing and semantic analysis capabilities, IBM Content Classification determines the intended meaning of words in context and then uses that knowledge to automate decision making. Unlike other classification systems that are based on rules only, IBM Content Classification combines rules and contextual analysis to incorporate real-time learning that adapts to changing business needs. As a result, classification becomes even more accurate over time.

Learning by understanding

IBM Content Classification can organize information by policies or keywords, but it can also assign metadata that is based on the full context of the document. The classification process does not just search for a single word or phrase, but analyzes the entire document, distills the main point of the text, and assigns the text to a category. When analyzing content, IBM Content Classification can recognize misspellings, abbreviations, jargon, and technical terms.

Accuracy improves over time because the system adapts to the unique nature of your business by identifying different categories from examples that you provide. When you provide feedback, the system adjusts in real time and immediately implements any corrections that you make. The accuracy of the classification results keeps pace with changes in your business.
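
The idea of learning categories from examples and adjusting immediately on feedback can be illustrated with a small sketch. The code below is not the product's algorithm or API; it assumes a toy multinomial naive Bayes model over word counts, and the class name ExampleClassifier and its methods learn and classify are hypothetical names chosen for illustration.

```python
import math
import re
from collections import Counter, defaultdict


class ExampleClassifier:
    """Toy multinomial naive Bayes classifier (illustrative stand-in only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.doc_counts = Counter()              # category -> number of examples
        self.vocabulary = set()                  # all words seen so far

    @staticmethod
    def tokenize(text):
        # Lowercase the text and keep runs of letters and digits.
        return re.findall(r"[a-z0-9]+", text.lower())

    def learn(self, text, category):
        """Add one labeled example; also used to apply feedback corrections."""
        tokens = self.tokenize(text)
        self.word_counts[category].update(tokens)
        self.doc_counts[category] += 1
        self.vocabulary.update(tokens)

    def classify(self, text):
        """Return the most probable category, or None if nothing was learned."""
        tokens = self.tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best, best_score = None, -math.inf
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            total_words = sum(counts.values())
            # Log prior plus log likelihood with add-one (Laplace) smoothing.
            score = math.log(n_docs / total_docs)
            for token in tokens:
                score += math.log(
                    (counts[token] + 1) / (total_words + len(self.vocabulary))
                )
            if score > best_score:
                best, best_score = category, score
        return best


clf = ExampleClassifier()
clf.learn("invoice payment due for account", "Billing")
clf.learn("password reset login problem", "Support")

before = clf.classify("cannot access my account")
# Feedback: a reviewer corrects the category, and the model updates at once.
clf.learn("cannot access my account", "Support")
after = clf.classify("cannot access my account")
```

In this sketch a correction is simply another labeled example, so the next classification already reflects it without a separate retraining step.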

IBM Content Classification combines this context-based approach with a rule-based, decision-making approach. The system can identify keywords, patterns (such as account numbers, phone numbers, or case identifiers), and words within a certain proximity of each other (such as occurrences of the phrase "Attorney General" in the same sentence as the word "California"). When content that matches a condition in a rule is detected, the action defined for the rule is applied, and the document or email is classified accordingly.
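
The three kinds of rule conditions described above (keywords, patterns, and proximity within a sentence) can be sketched as follows. This is a minimal illustration, not the product's rule syntax: the Rule class, the helper functions, the ACCT-###### account number format, and the category names are all assumptions made for the example.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Rule:
    """Hypothetical rule: a condition on the text and a category to assign."""
    name: str
    condition: Callable[[str], bool]
    category: str


def has_keyword(word: str) -> Callable[[str], bool]:
    # Whole-word, case-insensitive keyword match.
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return lambda text: pattern.search(text) is not None


def matches_pattern(regex: str) -> Callable[[str], bool]:
    # Pattern match, for example an account number format.
    pattern = re.compile(regex)
    return lambda text: pattern.search(text) is not None


def same_sentence(phrase: str, word: str) -> Callable[[str], bool]:
    # Proximity match: both terms must occur in the same sentence.
    def check(text: str) -> bool:
        # Naive split after '.', '!' or '?' followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            lowered = sentence.lower()
            if phrase.lower() in lowered and word.lower() in lowered:
                return True
        return False
    return check


def classify(text: str, rules: List[Rule]) -> Optional[str]:
    """Apply the action (category) of the first rule whose condition matches."""
    for rule in rules:
        if rule.condition(text):
            return rule.category
    return None


# Illustrative rule set; the account number format is an assumption.
RULES = [
    Rule("ag-california", same_sentence("Attorney General", "California"), "Legal"),
    Rule("account-number", matches_pattern(r"\bACCT-\d{6}\b"), "Accounts"),
    Rule("lunch", has_keyword("lunch"), "No business value"),
]

category = classify("The Attorney General of California filed a brief.", RULES)
```

Rules are evaluated in order and the first match wins, which is one simple design choice; a real system might instead score or combine several matching rules.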

Examples of classification applications

You can use IBM Content Classification to achieve a variety of business goals.

Enterprise content standardization: To support document classification and taxonomy automation within your content management system, document properties or metadata can be automatically assigned when the content is classified, and documents can be automatically moved to the correct enterprise repository.
Compliance and records management: Documents and email can be declared as records when they are classified and placed under the control of record retention policies and compliance standards.
eDiscovery readiness: Documents and email can be filtered to ensure that only items with business value are classified and archived. You can quickly and cost-effectively prepare content for potential legal notices.
Business process optimization: Automated decision making ensures more consistent outcomes and reduced costs. For example, with content-based analysis, documents can be automatically inserted into the workflow of a business plan, email can be automatically routed, and agent responses in a customer support center can be automatically suggested or applied.

IBM Content Classification can be integrated with IBM FileNet® Content Manager, IBM Content Manager, and IBM Content Collector, which extends its ability to provide an automated classification solution. For example, IBM Content Classification can determine where items belong in an IBM FileNet Content Manager or IBM Content Manager repository or declare items as records under the control of IBM Enterprise Records. IBM Content Collector can use analysis results returned from IBM Content Classification to determine the appropriate disposition of documents and email, including email attachments.