IBM® Content Classification helps organize unstructured content by analyzing the full text of documents and emails and applying rules that automate classification decisions.
Managing documents and email means making decisions about the content every day. IBM Content Classification reduces the burden of manual decision making on employees by organizing information accurately and automatically.
Classification can also be used to determine whether a content item needs to be classified at all. By filtering out email about lunch appointments or documents that do not hold any business value, for example, you can reduce costs and ensure that only documents that need to be retained are classified and archived.
Embedded with natural language processing and semantic analysis capabilities, IBM Content Classification determines the true intent of words and then uses that knowledge to automate decision making. Unlike other classification systems that are based on rules only, IBM Content Classification combines rules and contextual analysis to incorporate real-time learning that adapts to changing business needs. As a result, classification becomes even more accurate over time.
IBM Content Classification can organize information by policies or key words, but it can also assign metadata that is based on the full context of the document. The classification process does not just search for a single word or phrase, but analyzes the entire document, distills the main point of the text, and assigns the text to a category. When analyzing content, IBM Content Classification can recognize misspellings, abbreviations, jargon, and technical terms.
Accuracy improves over time because the system adapts to the unique nature of your business by identifying different categories from examples that you provide. When you provide feedback, the system adjusts in real time and immediately implements any corrections that you make. The accuracy of the classification results keeps pace with changes in your business.
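The idea of learning categories from examples and adjusting immediately on feedback can be illustrated with a minimal sketch. This is not the algorithm or API of IBM Content Classification, which this overview does not describe. It is a generic word-frequency (naive Bayes style) classifier, and the class name `FeedbackClassifier`, its methods, and the sample categories are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


class FeedbackClassifier:
    """Hypothetical illustration only; not the product's real algorithm."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word frequencies
        self.doc_counts = Counter()              # category -> number of examples
        self.vocab = set()                       # all words seen so far

    @staticmethod
    def tokenize(text):
        # Lowercase and split into alphanumeric tokens.
        return re.findall(r"[a-z0-9]+", text.lower())

    def learn(self, text, category):
        # Used for initial examples and for later feedback alike:
        # counts are updated in place, so the next call to classify()
        # already reflects the correction.
        tokens = self.tokenize(text)
        self.word_counts[category].update(tokens)
        self.doc_counts[category] += 1
        self.vocab.update(tokens)

    def classify(self, text):
        if not self.doc_counts:
            return None
        tokens = self.tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best, best_score = None, -math.inf
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            total_words = sum(counts.values())
            # Log prior plus log likelihood with add-one smoothing.
            score = math.log(n_docs / total_docs)
            for token in tokens:
                score += math.log(
                    (counts[token] + 1) / (total_words + len(self.vocab))
                )
            if score > best_score:
                best, best_score = category, score
        return best


clf = FeedbackClassifier()
clf.learn("invoice payment due amount", "Billing")
clf.learn("contract lawsuit attorney court", "Legal")

# Feedback: a reviewer assigns a new category to a document,
# and the model reflects it on the very next classification.
clf.learn("refund request for order", "Support")
```

In this sketch there is no separate retraining step: providing an example and correcting a result are the same operation, which is one simple way to get the "adjusts in real time" behavior described above.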
IBM Content Classification combines this context-based approach with a rule-based, decision-making approach. The system can identify keywords, patterns (such as account numbers, phone numbers, or case identifiers), and words within a certain proximity of each other (such as occurrences of the phrase "Attorney General" in the same sentence as the word "California"). When content that matches a condition in a rule is detected, the action defined for the rule is applied, and the document or email is classified accordingly.
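The three kinds of rule conditions named above (keywords, patterns, and proximity) can be sketched as follows. This is an illustration under stated assumptions, not the product's rule syntax: the `Rule` type, the helper functions, the `ACCT-` account number format, and the category names are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Rule:
    name: str
    condition: Callable[[str], bool]  # returns True when the content matches
    category: str                     # the "action": assign this category


def has_keyword(word: str) -> Callable[[str], bool]:
    # Whole-word, case-insensitive keyword match.
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return lambda text: pattern.search(text) is not None


def matches_pattern(regex: str) -> Callable[[str], bool]:
    # Pattern match, e.g. for account numbers or case identifiers.
    pattern = re.compile(regex)
    return lambda text: pattern.search(text) is not None


def same_sentence(phrase: str, word: str) -> Callable[[str], bool]:
    # Proximity match: both terms must occur in the same sentence.
    # Sentences are split naively on ., ! or ? followed by whitespace.
    def check(text: str) -> bool:
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            lowered = sentence.lower()
            if phrase.lower() in lowered and word.lower() in lowered:
                return True
        return False
    return check


# Hypothetical rule set; the account number format is an assumption.
RULES: List[Rule] = [
    Rule("account-number", matches_pattern(r"\bACCT-\d{6}\b"), "Accounts"),
    Rule("ag-california", same_sentence("Attorney General", "California"), "Legal"),
    Rule("invoice-keyword", has_keyword("invoice"), "Billing"),
]


def classify(text: str, rules: List[Rule] = RULES) -> Optional[str]:
    # Apply the action of the first rule whose condition matches.
    for rule in rules:
        if rule.condition(text):
            return rule.category
    return None
```

Here the first matching rule wins, which is a simplifying design choice for the sketch; a real rule engine might instead apply every matching rule or weigh rule results against the context-based analysis.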
IBM Content Classification can be integrated with IBM FileNet® Content Manager, IBM Content Manager, and IBM Content Collector, which extends its ability to provide an automated classification solution. For example, IBM Content Classification can determine where items belong in an IBM FileNet Content Manager or IBM Content Manager repository or declare items as records under the control of IBM Enterprise Records. IBM Content Collector can use analysis results returned from IBM Content Classification to determine the appropriate disposition of documents and email, including email attachments.