EC Extract Metadata
This task extracts metadata from fields in a document and stores it in the corresponding fields in the repository.
The metadata fields in a repository provide search information for user queries. For example, the repository field that corresponds to the From or Sender field of an email allows users to search for email that was sent by a specific person. Certain email fields are selected by default, such as the Subject, To, or Sender (From) fields. You can select other fields to extract metadata from if you think that these fields add valuable search information to your repository.
For Exchange messages, this task also extracts managed folder information if this information is available.
Content Collector attempts to decrypt encrypted Lotus® Notes® documents before any metadata is extracted. In general, this is possible only for email archived from a journal because the journal is usually encrypted on behalf of the Content Collector user. However, even then, decryption succeeds only if the email was not encrypted originally. If Content Collector can decrypt an encrypted Lotus Notes document, the document is archived in decrypted format and is also restored as a decrypted copy.
Metadata can be extracted for both unencrypted and decrypted Lotus Notes documents. When email is encrypted in Lotus Notes, usually only the email body and the attachments are encrypted, while header data (like Subject, Sender, PostedDate, and so on) is usually not encrypted. This means that for encrypted Lotus Notes documents, Content Collector can extract meaningful data for the email fields, but cannot extract text from the email body or attachments. Therefore, an encrypted Lotus Notes document is not fully text searchable.
Because encrypted Microsoft Exchange messages consist of a container and an encrypted attachment, these messages can be archived in encrypted format. Therefore, the EC Extract Metadata task does not decrypt Microsoft Exchange messages.
Task summary
| Characteristic | Value |
|---|---|
| Task name | EC Extract Metadata |
| Main purpose | Extracts metadata from fields in a document to store this metadata in corresponding fields in the repository. |
| Usable with which source connectors? | Email Connector |
| Usable with which target connectors? | IBM® FileNet® P8 Connector, IBM Content Manager Connector |
| When needed? | Required in email or application processing task routes |
| Placement in task route | Must be the first task in a task route |
| Produces which metadata? | Email, Email Deduplication, Re-collection (only for documents that were archived before), Task Status |
Configuration options
Designated Email Date
If you are archiving email documents with dynamic retention, specify which email date metadata to use as the reference date when calculating the expiration date of a document. The default is Received Date.
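The role of the designated email date can be illustrated with a minimal sketch. It assumes that the expiration date is the reference date plus a retention period in days; the function name, the `Sent Date` key, and the 30-day period are hypothetical and are not taken from Content Collector.

```python
from datetime import date, timedelta

def expiration_date(email_dates, designated="Received Date", retention_days=30):
    """Return the designated reference date plus the retention period.

    Assumption: retention is a fixed number of days added to the
    reference date. The actual calculation in the product may differ.
    """
    reference = email_dates[designated]
    return reference + timedelta(days=retention_days)

# Hypothetical date metadata for one email document.
email_dates = {
    "Received Date": date(2024, 3, 1),
    "Sent Date": date(2024, 2, 28),
}

# Default reference date (Received Date) versus an alternative one.
print(expiration_date(email_dates))
print(expiration_date(email_dates, designated="Sent Date"))
```

Choosing a different designated date shifts the expiration date by the difference between the two date fields, which is why the setting matters for dynamic retention.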
Associate Metadata
To extract metadata from additional document fields, select the appropriate set of fields from the User defined metadata list. Such sets of fields must have been defined earlier, in the User Defined Metadata section of the Metadata and Lists configuration.
- MAPI property: Microsoft Messaging API (MAPI) properties are a standard set of predefined properties from Microsoft. Select a property name to refer to a specific MAPI property. For the selected property, the hexadecimal property identifier and the MAPI property type are displayed.
- Named property: Named properties are properties that were defined by a user or an application. They usually serve a purpose for which a MAPI property cannot be used. Named properties are referenced by a property ID, which is a hexadecimal value, or a name, which is a string value. Select MNID_ID for reference by property ID or MNID_STRING for reference by name and specify an appropriate value in the ID field. Also select the property set that contains the selected named property.
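The two ways of referencing a named property can be sketched as a small data structure. This is only an illustration of the MNID_ID and MNID_STRING distinction described above; the class, the function, and the sample values (`0x8000`, `ExamplePropertySet`, `ProjectCode`) are hypothetical and do not reflect how Content Collector stores its configuration.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class NamedPropertyRef:
    kind: str                 # "MNID_ID" or "MNID_STRING"
    value: Union[int, str]    # numeric ID or property name
    property_set: str         # property set that contains the property

def make_named_property_ref(kind, raw_value, property_set):
    """Build a reference from the kind and the text entered in the ID field."""
    if kind == "MNID_ID":
        # Reference by property ID: the entry is a hexadecimal value.
        value = int(raw_value, 16)
    elif kind == "MNID_STRING":
        # Reference by name: the entry is a non-empty string.
        if not raw_value:
            raise ValueError("a property name is required for MNID_STRING")
        value = raw_value
    else:
        raise ValueError(f"unknown reference kind: {kind}")
    return NamedPropertyRef(kind, value, property_set)

# Hypothetical entries, one per reference kind.
by_id = make_named_property_ref("MNID_ID", "0x8000", "ExamplePropertySet")
by_name = make_named_property_ref("MNID_STRING", "ProjectCode", "ExamplePropertySet")
```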
Additional Forms Definition (Notes only)
- To add a form for the calculation of the deduplication hash key, type its name in the Form name field and click Add. For example, to add the form for calendar entries in Lotus Notes® 8, type Appointment. The form name appears in the list under the Form name field.
- To add a property for the calculation of the deduplication hash key, type its name enclosed in backslash (\) characters in the Form name field and click Add. The backslash (\) characters are required to denote that the entry refers to a property, not to a form. For example, to add the property BCC, which is available only in email of BCC recipients and in the sent copy, type \BlindCopyTo\. The entry appears in the list under the Form name field.
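The backslash convention for entries in the Form name field can be sketched as follows. The function is a hypothetical illustration of how an entry might be classified as a form or a property; it is not Content Collector code, and the actual hash key calculation is not shown.

```python
def classify_entry(entry):
    """Classify a Form name entry as a form or a property.

    An entry enclosed in backslash characters denotes a property;
    any other entry is treated as a form name.
    """
    entry = entry.strip()
    if len(entry) > 2 and entry.startswith("\\") and entry.endswith("\\"):
        return ("property", entry[1:-1])
    return ("form", entry)

# The two examples from the text above.
print(classify_entry("Appointment"))
print(classify_entry("\\BlindCopyTo\\"))
```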