Optical character recognition engines
The Java™ document viewer toolkit defines a pluggable interface for supporting an optical character recognition (OCR) engine. An OCR engine recognizes text in an image, which enables tasks such as copying and pasting text, searching, and working with screen readers.
About this task
A sample OCR engine implementation is provided for the Nuance OmniPage Capture Software Development Kit (SDK). If you have a license for the Nuance OmniPage Capture SDK, you can use this sample OCR engine to enable the Java document viewer for OCR technology. For more information, see the Java class file com.ibm.mm.viewer.samples.ocr.OmnipageOCREngine.java in the OCR samples. You can also write your own OCR engine by extending the Java class CMBOCREngine. For more information, see the Application Programming Reference.
OperationToolbar.tools=ocr_doc,select_text,find,separator,save_doc, ...
Procedure
Set the following OCR engine properties:
- OCRENGINE_CLASSNAME
- OCRENGINE_exePath
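Before handing these two properties to the viewer, it can help to check them. The following is a minimal sketch, not part of the toolkit: the class OcrSettingsCheck and its check method are hypothetical names, and the sketch assumes that OCRENGINE_exePath is expected to name an existing directory, which this section suggests but does not state.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not a toolkit class.
class OcrSettingsCheck {

    /** Returns the problems found; an empty list means both settings look usable. */
    static List<String> check(Properties props) {
        List<String> problems = new ArrayList<>();

        String className = props.getProperty("OCRENGINE_CLASSNAME");
        if (className == null || className.trim().isEmpty()) {
            problems.add("OCRENGINE_CLASSNAME is not set");
        } else {
            try {
                // Load without initializing: this only tests classpath visibility.
                Class.forName(className, false,
                        OcrSettingsCheck.class.getClassLoader());
            } catch (ClassNotFoundException | LinkageError e) {
                problems.add("OCR engine class cannot be loaded: " + className);
            }
        }

        String exePath = props.getProperty("OCRENGINE_exePath");
        if (exePath == null || exePath.trim().isEmpty()) {
            problems.add("OCRENGINE_exePath is not set");
        } else if (!new File(exePath).isDirectory()) {
            // Assumption: the property names a directory.
            problems.add("OCRENGINE_exePath is not an existing directory: " + exePath);
        }
        return problems;
    }
}
```

Calling OcrSettingsCheck.check(engineProperties) before the viewer is created reports a missing class or path early, instead of failing later when the OCR tool is first used.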
Example
OCRENGINE_CLASSNAME=com.ibm.mm.viewer.samples.ocr.OmnipageOCREngine
OCRENGINE_exePath=c:\cm8\windows\nativelib
The same properties can also be set in Java code when the document services are created:
Properties engineCMProperties = new Properties();
engineCMProperties.put("ENGINES", "2");
engineCMProperties.put("ENGINE1_CLASSNAME",
    "com.ibm.mm.viewer.CMBSnowboundEngine");
engineCMProperties.put("ENGINE2_CLASSNAME",
    "com.ibm.mm.viewer.CMBOutsideInExportEngine");
engineCMProperties.put("OCRENGINE_CLASSNAME",
    "com.ibm.mm.viewer.samples.ocr.OmnipageOCREngine");
engineCMProperties.put("OCRENGINE_exePath",
    "c:\\cm8\\windows\\nativelib");
// Create document services.
docServices = new CMBStreamingDocServices(
    new StreamingDocServicesCallbacks(), engineCMProperties);
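The two forms of the Windows path in the example differ in their backslashes, and that matters if your own code reads the settings with java.util.Properties.load. This section does not say how the toolkit itself reads its configuration, so the following sketch only illustrates the standard Java behavior. The class OcrEngineConfig and its parse method are hypothetical names used for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper, not a toolkit class.
class OcrEngineConfig {

    /** Parses text written in java.util.Properties syntax. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            // Not expected when reading from an in-memory string.
            throw new IllegalStateException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // File content with doubled backslashes: OCRENGINE_exePath=c:\\cm8\\windows\\nativelib
        Properties escaped = parse(
                "OCRENGINE_exePath=c:\\\\cm8\\\\windows\\\\nativelib\n");
        System.out.println(escaped.getProperty("OCRENGINE_exePath"));
        // prints c:\cm8\windows\nativelib

        // File content with single backslashes: OCRENGINE_exePath=c:\cm8\windows\nativelib
        // Properties.load treats "\n" as a line feed and silently drops the
        // backslash before "c" and "w", so the loaded value is not the path.
        Properties unescaped = parse(
                "OCRENGINE_exePath=c:\\cm8\\windows\\nativelib\n");
        System.out.println(
                unescaped.getProperty("OCRENGINE_exePath").contains("\n"));
        // prints true
    }
}
```

If the settings are loaded this way, doubling the backslashes or writing the path with forward slashes (c:/cm8/windows/nativelib) keeps the value intact.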