Creating multimedia annotators
You can create custom annotators to analyze multimedia content such as audio, video, and image files. For example, you might want to create a face recognition annotator that can analyze image files.
About this task
In UIMA, the common analysis structure (CAS) supports analysis of multiple views of a document. For example, one view of a video stream might be the video frames and another view might be the closed-caption text. Multiple CAS views are useful when different versions of a document are needed at different stages of the analysis. Each view of the document is associated with a different subject of analysis (Sofa). Watson Explorer Content Analytics supports multiple CAS views so that annotators can access the following Sofas:
- The document text that is extracted by the parsers, which custom text analysis annotators can analyze.
- The original document that is crawled by the crawlers, which multimedia annotators can analyze.
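The relationship between views and Sofas can be pictured with a small toy model. The sketch below is conceptual only: it is not the UIMA API (real UIMA annotators are typically Java classes), and the class names, the view names "parsed_text" and "original_content", and the annotation labels are all hypothetical. It shows one container that holds two views, with a text annotator reading the extracted text and a multimedia annotator reading the original bytes.

```python
from dataclasses import dataclass, field


@dataclass
class View:
    """One view of a document: a name, its Sofa, and the annotations on it."""
    name: str
    sofa: object  # subject of analysis: extracted text or original bytes
    annotations: list = field(default_factory=list)


class ToyCas:
    """Hypothetical stand-in for a CAS that holds several named views."""

    def __init__(self):
        self._views = {}

    def create_view(self, name, sofa):
        view = View(name, sofa)
        self._views[name] = view
        return view

    def get_view(self, name):
        return self._views[name]


def annotate_text(cas):
    """Text annotator: works only on the text extracted by the parsers."""
    view = cas.get_view("parsed_text")  # hypothetical view name
    for word in view.sofa.split():
        if word.istitle():
            view.annotations.append(("CapitalizedWord", word))


def annotate_image(cas):
    """Multimedia annotator: works only on the original crawled bytes."""
    view = cas.get_view("original_content")  # hypothetical view name
    # JPEG files begin with the bytes FF D8 FF.
    if view.sofa[:3] == b"\xff\xd8\xff":
        view.annotations.append(("ImageFormat", "JPEG"))


if __name__ == "__main__":
    cas = ToyCas()
    cas.create_view("parsed_text", "Photo of Alice")
    cas.create_view("original_content", b"\xff\xd8\xff\xe0" + b"\x00" * 16)
    annotate_text(cas)
    annotate_image(cas)
    print(cas.get_view("parsed_text").annotations)
    print(cas.get_view("original_content").annotations)
```

Each annotator touches only its own view, so the two kinds of analysis can run over the same document without interfering with each other.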
Procedure
To develop and deploy a multimedia annotator:
What to do next
<Parser>
<ParserName>empty</ParserName>
<ParserFactoryClass>com.ibm.ilel.parser.EmptyParserFactory</ParserFactoryClass>
</Parser>
You must then move the <Mimetype> element for the particular multimedia file type from the <ParserMapping> element for the terminator parser to the <ParserMapping> element for the empty parser. For example, move the <Mimetype>image/jpeg</Mimetype> element to configure the parsers to process JPEG image files.
<ParserMapping>
<ParserName>empty</ParserName>
<Mimetype>image/jpeg</Mimetype>
</ParserMapping>
- Add the following lines after the last <Parser> element in the ES_NODE_ROOT/master_config/collection_ID.indexservice/parser_config.xml file:
  <Parser>
  <ParserName>empty</ParserName>
  <ParserFactoryClass>com.ibm.ilel.parser.EmptyParserFactory</ParserFactoryClass>
  </Parser>
- Move the <Mimetype> element for the particular multimedia file type from the <ParserMapping> element for the terminator parser to the <ParserMapping> element for the empty parser. For example, in the ES_NODE_ROOT/master_config/collection_ID.indexservice/parser_config.xml file:
  <ParserMapping>
  <ParserName>empty</ParserName>
  <Mimetype>image/bmp</Mimetype>
  <Mimetype>image/gif</Mimetype>
  <Mimetype>image/jpeg</Mimetype>
  <Mimetype>image/tiff</Mimetype>
  <Mimetype>image/png</Mimetype>
  </ParserMapping>
- After you make these changes, open the administration console and restart the collection parser and indexer, and then recrawl the image files. When processing is complete, the image metadata is indexed.
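If the same changes must be made for several collections, the two edits to parser_config.xml could be scripted. The following is a minimal sketch under stated assumptions, not a supported tool: the function name route_to_empty_parser is hypothetical, the terminator mapping is assumed to carry the <ParserName> value "terminator", and the sample document (its <Config> root and the terminator factory class) is invented for illustration because the full layout of the real file is not shown here.

```python
import xml.etree.ElementTree as ET

EMPTY_FACTORY = "com.ibm.ilel.parser.EmptyParserFactory"

# Invented sample; the real parser_config.xml layout may differ.
SAMPLE = """<Config>
<Parser><ParserName>terminator</ParserName><ParserFactoryClass>example.TerminatorParserFactory</ParserFactoryClass></Parser>
<ParserMapping><ParserName>terminator</ParserName><Mimetype>image/jpeg</Mimetype><Mimetype>audio/mpeg</Mimetype></ParserMapping>
</Config>"""


def _name(element):
    """Return the stripped <ParserName> text of a Parser or ParserMapping."""
    return (element.findtext("ParserName") or "").strip()


def _insert_after(root, parent_map, anchors, new_element):
    """Insert new_element after the last anchor, or append it to root."""
    if anchors:
        parent = parent_map[anchors[-1]]
        parent.insert(list(parent).index(anchors[-1]) + 1, new_element)
    else:
        root.append(new_element)


def route_to_empty_parser(root, mimetypes):
    """Edit the tree in place so the given MIME types use the empty parser."""
    parent_map = {child: parent for parent in root.iter() for child in parent}

    # Step 1: add the empty <Parser> after the last <Parser>, if missing.
    parsers = list(root.iter("Parser"))
    if not any(_name(p) == "empty" for p in parsers):
        parser = ET.Element("Parser")
        ET.SubElement(parser, "ParserName").text = "empty"
        ET.SubElement(parser, "ParserFactoryClass").text = EMPTY_FACTORY
        _insert_after(root, parent_map, parsers, parser)

    # Step 2: find or create the <ParserMapping> for the empty parser.
    mappings = list(root.iter("ParserMapping"))
    empty_map = next((m for m in mappings if _name(m) == "empty"), None)
    if empty_map is None:
        empty_map = ET.Element("ParserMapping")
        ET.SubElement(empty_map, "ParserName").text = "empty"
        _insert_after(root, parent_map, mappings, empty_map)

    # Step 3: move matching <Mimetype> elements out of the terminator mapping.
    wanted = set(mimetypes)
    for mapping in mappings:
        if _name(mapping) != "terminator":  # assumed ParserName value
            continue
        for mimetype in list(mapping.findall("Mimetype")):
            if (mimetype.text or "").strip() in wanted:
                mapping.remove(mimetype)
                empty_map.append(mimetype)


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    route_to_empty_parser(root, ["image/jpeg"])
    print(ET.tostring(root, encoding="unicode"))
```

To apply this to a real file, the tree would be loaded with ET.parse and saved with its write method. Back up the file first, and note that ElementTree discards XML comments by default when parsing. The restart and recrawl steps in the administration console are still required afterward.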