Archive mappings for IBM Content Manager

The contents of the archive mapping file that is required for accessing archived data in IBM Content Manager depend on the type of document and on the selected data model.

Archive mappings for the bundled email data model

With the bundled email data model, an email document is archived into one item type. In its root component, this item type contains all attributes that are common to all instances of the email document and the content of the email document including any attachments. In its child components, the item type contains the attributes that are unique to each instance of the email document, such as the mailbox from which the email document was archived.

In the following table, strings that are enclosed in percent signs (%) are placeholders for values that are defined in the repository or in the configuration store.

Entries in the archive mapping file IBM Content Manager data model
<doc_type_collection id="Default Mail" nm="Default Mail" 
     collectionType="%ICC_EMAIL_PROVIDER%">
  ...
</doc_type_collection>
This doc_type_collection section defines the root collection for the bundled email data model. The collectionType attribute of the doc_type_collection element can have these values:
  • ICC_Exchange_Email_Bundled
  • ICC_Domino_Email_Bundled
<repositories>
  <repository id="%ICC_UNIQUE_CONNECTION_NAME%">
    <doc_types>
      ...
    </doc_types>
  </repository>
</repositories>
The repositories section in the root collection defines the repositories that Content Collector can access. Each repository ID corresponds to a repository connection that is defined in the Content Collector configuration store.
<doc_types>
  <doc_type>
    <name>ICCEmail</name>
      <children>
       <child ref_coll="ICCEmailInstance">
         AFUEChild
       </child>
      </children>
  </doc_type>
</doc_types>					
The doc_type section in the root collection defines the email item type that contains the content of the email document. It also defines the reference to the child component that contains the varying attributes of each instance of the email document.
<fields>
  <field nm="field_nm" type="datatype">
    <attr>attr_name</attr>
    <search>index_field</search>
  </field>
    ...	
</fields>
The fields section in the root collection maps search fields to repository attributes that are common to all item types that are defined in the root collection and that are common to all instances of an email document within an item type. These attributes can be IBM Content Manager attributes or user-defined attributes.

In addition, there are field definitions that address fields in the child component.

<field nm="MAILBOX_ID" type="STRING">
  <search>mailboxid</search>
  <reference>EMAIL_REF.MAILBOX_ID</reference>
</field>

<field nm="EMAIL_REF" type="REFERENCE" 
     ref_coll="ICCEmailInstance" multivalue="true">
  <relationship type="CHILD"></relationship>
</field>
			
If the field definition addresses a field in the child component, you must also define a reference field for the collection that defines this child component.
<doc_type_collection id="ICCEmailInstance"
     nm="ICCEmailInstance" collectionType="CMCHILDCOMP">
  <repositories>
    <repository id="%ICC_UNIQUE_CONNECTION_NAME%">
      <doc_types>
        <doc_type>
          <name>AFUEChild</name>
        </doc_type>
      </doc_types>
    </repository>
  </repositories>
  <fields>
    <field nm="MAILBOX_ID" type="STRING">
      <attr>ICCMailboxID</attr>
    </field>
    <field nm="USER_SPECIFIC_CUSTOM_STRING" type="STRING">
      <attr>User_String</attr>
    </field>
			 
  </fields>
</doc_type_collection>
This doc_type_collection section defines the collection for the child component. Therefore, the collection type is set to CMCHILDCOMP. The child component tracks the references to all copies of the same email document.

The fields section in this collection maps search fields to repository attributes that are specific to each instance of an email document within an item type. These attributes can be IBM Content Manager attributes or user-defined attributes.
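
To show how the fragments in the table fit together, the following sketch assembles the root collection for the bundled email data model. It assumes that the repositories and fields sections nest inside the root doc_type_collection element in the same way as they do in the ICCEmailInstance collection, and it picks ICC_Exchange_Email_Bundled as the collection type. The field definitions that the table elides are left out, so treat this as an illustration of the structure rather than a complete mapping.

```xml
<!-- Sketch only: nesting is inferred from the ICCEmailInstance collection -->
<doc_type_collection id="Default Mail" nm="Default Mail"
     collectionType="ICC_Exchange_Email_Bundled">
  <repositories>
    <repository id="%ICC_UNIQUE_CONNECTION_NAME%">
      <doc_types>
        <doc_type>
          <name>ICCEmail</name>
          <children>
            <!-- ref_coll names the collection that describes AFUEChild -->
            <child ref_coll="ICCEmailInstance">
              AFUEChild
            </child>
          </children>
        </doc_type>
      </doc_types>
    </repository>
  </repositories>
  <fields>
    <!-- Further root-level field definitions are omitted here -->
    <field nm="MAILBOX_ID" type="STRING">
      <search>mailboxid</search>
      <!-- Resolved through EMAIL_REF to the field MAILBOX_ID
           in the ICCEmailInstance collection -->
      <reference>EMAIL_REF.MAILBOX_ID</reference>
    </field>
    <field nm="EMAIL_REF" type="REFERENCE"
         ref_coll="ICCEmailInstance" multivalue="true">
      <relationship type="CHILD"></relationship>
    </field>
  </fields>
</doc_type_collection>
```

Read from the search side, the field MAILBOX_ID in the root collection points to EMAIL_REF, EMAIL_REF points to the ICCEmailInstance collection, and the field MAILBOX_ID in that collection maps to the repository attribute ICCMailboxID.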

Archive mappings for the compound email data model

With the compound email data model, email is stored in an email item type with two child components. Attachments are stored separately in an attachment item type. In its root component, the email item type contains the attributes that are common to all instances of the email document, such as the email document itself or its subject. One child component, the email instance (EI) child, contains the attributes that are unique to each instance of the email document, such as the mailbox from which the email document was archived. The other child component, the attachment instance (AI) child, contains the attributes that are unique to an attachment in each email instance. Attachment content is stored separately.

The archive mappings for the compound email data model contain one collection definition for each component of the email item type: one for the root component, one for the EI child component, and one for the AI child component. The attachment item type contains just the attachment content but no attributes. Attachment content cannot be accessed independently of its attributes and thus of its associated document. Therefore, an explicit collection definition is not required. The internal collection definition EDMDefaultContent is used for processing the attachment item type.

In the following table, strings that are enclosed in percent signs (%) are placeholders for values that are defined in the repository or in the configuration store.

Entries in the archive mapping file IBM Content Manager data model
<doc_type_collection id="COLLECTION" nm="COLLECTION" 
    collectionType="ICC_Exchange_Email_Compound">
  ...
</doc_type_collection>
This doc_type_collection section defines the root collection for the compound email data model. The collectionType attribute of the doc_type_collection element can have these values:
  • ICC_Exchange_Email_Compound
  • ICC_Domino_Email_Compound
Content Collector 4.0.1 Fix Pack 13 (4.0.1.13) introduces the following collection types for Db2 Text Search (Db2TS):
  • ICC_DOMINO_EMAIL_Db2TS_COMPOUND
  • ICC_EXCHANGE_EMAIL_Db2TS_COMPOUND
<repositories>
  <repository id="%ICC_UNIQUE_CONNECTION_NAME%">
    <doc_types>
      ...
    </doc_types>
  </repository>
</repositories>
The repositories section in the root collection defines the repositories that Content Collector can access. Each repository ID corresponds to a repository connection that is defined in the Content Collector configuration database.
<doc_types>
  <doc_type>
    <name>ICCEmail</name>
    <children>
      <child ref_coll="ICCEmailInstance">
          AFUEChild
      </child> 
      <child ref_coll="ICCAttachmentInstance">
          AFUAChild
      </child>
    </children>
  </doc_type>
</doc_types>					
The doc_type section in the root collection defines the email item type that holds the distinct email instances (DEI). The DEI contains all email data that is common to all instances of one email document:
  • The email object.
  • Attributes that are shared across all instances of the document.
It also defines the references to the two child components of the DEI, the email instance (EI) and the attachment instance (AI).
<fields>
  <field nm="field_nm" type="datatype">
    <attr>attr_name</attr>
    <search>index_field</search>
  </field>
  <field nm="SUBJECT" type="STRING">
    <attr>ICCSubject</attr>
    <search>subject</search>
  </field>
		...	
</fields>
The fields section in the root collection maps search fields to repository attributes that are common to all item types that are defined in the root collection and that are common to all instances of an email document within an item type. These attributes can be IBM Content Manager attributes or user-defined attributes.

In addition, there are field definitions that address fields in the child components.

A sample definition for an attribute that is common to all instances of an email document is the definition for the field SUBJECT.

<field nm="MAILBOX_ID" type="STRING">
  <search>mailboxid</search>
  <reference>EMAIL_REF.MAILBOX_ID</reference>
</field>

<field nm="EMAIL_REF" type="REFERENCE" 
    ref_coll="ICCEmailInstance" multivalue="true">
  <relationship type="CHILD"></relationship>
</field>
			
<field nm="USER_SPECIFIC_CUSTOM_STRING" type="STRING">
  <search>icc_custom_metadata</search>
  <reference>
     EMAIL_REF.USER_SPECIFIC_CUSTOM_STRING
  </reference>
</field>

To address a field in the EI child component, you must also define a reference field for the collection that defines this child component. This field has the type REFERENCE. Then, address the field in the EI child component by including a reference element in the field definition.

A sample definition for an attribute that is specific to an email instance is the definition for the field MAILBOX_ID.

<field nm="ATTACH_REF" type="REFERENCE" 
    ref_coll="ICCAttachmentInstance" multivalue="true">
  <relationship type="CHILD"></relationship>
</field>

<field nm="CORRELATION_KEY" type="STRING">
  <reference>ATTACH_REF.CORRELATION_KEY</reference>
</field>

<field nm="ATTACHMENT_NAME" type="STRING">
  <reference>ATTACH_REF.ATTACHMENT_NAME</reference>
</field>
To address a field in the AI child component, you must also define a reference field for the collection that defines this child component.
<doc_type_collection id="ICCEmailInstance" 
     nm="ICCEmailInstance" collectionType="CMCHILDCOMP">
  <repositories>
    <repository id="%ICC_UNIQUE_CONNECTION_NAME%">
      <doc_types>
        <doc_type>
          <name>AFUEChild</name>
        </doc_type>
      </doc_types>
    </repository>
  </repositories>
  <fields>
    <field nm="MAILBOX_ID" type="STRING">
      <attr>ICCMailboxID</attr>
    </field>
    <field nm="USER_SPECIFIC_CUSTOM_STRING" type="STRING">
      <attr>User_String</attr>
    </field>
			 
  </fields>
</doc_type_collection>
This doc_type_collection section defines the collection for the EI child component. Therefore, the collection type is set to CMCHILDCOMP. The EI child component tracks the references to all copies of the same email document.

The fields section in this collection maps search fields to repository attributes that are specific to each instance of an email document within an item type. These attributes can be IBM Content Manager attributes or user-defined attributes.

<doc_type_collection id="ICCAttachmentInstance" 
     nm="ICCAttachmentInstance" collectionType="CMCHILDCOMP">
  <repositories>
    <repository id="%ICC_UNIQUE_CONNECTION_NAME%">
      <doc_types>
        <doc_type>
          <name>AFUAChild</name>
        </doc_type>
      </doc_types>
    </repository>
  </repositories>
		 
  <fields>
    <field nm="ATTACHMENT_NAME" type="STRING">
      <attr>AFUFilename</attr>
    </field>
    <field nm="CORRELATION_KEY" type="STRING">
      <attr>AFUCorrelationKey</attr>
    </field>
    <field nm="CONTENT_REF" type="REFERENCE" 
       ref_coll="EDMDefaultContent">
      <attr>AFUContentRef</attr>
    </field>
    <field nm="CONTENT" type="STRING">
      <reference>CONTENT_REF.CONTENT</reference>
    </field>
  </fields>
</doc_type_collection>
This doc_type_collection section defines the collection for the AI child component. Therefore, the collection type is set to CMCHILDCOMP. The AI child component tracks the references to separately stored attachments (in an attachment item type).

The fields section in this collection contains only required fields. Do not change these fields.

There is no need to define a collection for the attachment item type because attachments cannot be searched, previewed, viewed, or restored independently of their associated email documents.
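
The reference chains in the compound email data model are easier to follow when the root-level fields are read side by side. The following annotated sketch repeats the fields section of the root collection with the definitions from the table; only the comments are new. It omits the generic field_nm entry and any further definitions that the table elides, and the comments describe the resolution path as it can be read from the collection definitions above, not from the product's internal processing.

```xml
<fields>
  <!-- Common to all instances: maps directly to a root attribute -->
  <field nm="SUBJECT" type="STRING">
    <attr>ICCSubject</attr>
    <search>subject</search>
  </field>

  <!-- Link to the EI child collection (item type child AFUEChild) -->
  <field nm="EMAIL_REF" type="REFERENCE"
      ref_coll="ICCEmailInstance" multivalue="true">
    <relationship type="CHILD"></relationship>
  </field>
  <!-- EMAIL_REF.MAILBOX_ID: field MAILBOX_ID in ICCEmailInstance,
       which maps to the attribute ICCMailboxID -->
  <field nm="MAILBOX_ID" type="STRING">
    <search>mailboxid</search>
    <reference>EMAIL_REF.MAILBOX_ID</reference>
  </field>

  <!-- Link to the AI child collection (item type child AFUAChild) -->
  <field nm="ATTACH_REF" type="REFERENCE"
      ref_coll="ICCAttachmentInstance" multivalue="true">
    <relationship type="CHILD"></relationship>
  </field>
  <!-- ATTACH_REF.CORRELATION_KEY: field CORRELATION_KEY in
       ICCAttachmentInstance, which maps to AFUCorrelationKey -->
  <field nm="CORRELATION_KEY" type="STRING">
    <reference>ATTACH_REF.CORRELATION_KEY</reference>
  </field>
  <!-- ATTACH_REF.ATTACHMENT_NAME: field ATTACHMENT_NAME in
       ICCAttachmentInstance, which maps to AFUFilename -->
  <field nm="ATTACHMENT_NAME" type="STRING">
    <reference>ATTACH_REF.ATTACHMENT_NAME</reference>
  </field>
</fields>
```

In each case the part before the period names a field of the type REFERENCE in the same collection, and the part after the period names a field in the collection that ref_coll identifies. The AI child collection applies the same pattern one level further down: its field CONTENT uses CONTENT_REF.CONTENT to reach the internal collection EDMDefaultContent.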

Archive mappings for application archiving

These are the mappings for those item types in the repository that are used for archiving documents from Notes applications. Configure access to documents archived from Notes applications only in combination with access to email documents.

The archive mappings are required for retrieving an archived document when a user clicks the respective stub or restores the document.

In the following table, strings that are enclosed in percent signs (%) are placeholders for values that are defined in the repository or in the configuration store.

Entries in the archive mapping file IBM Content Manager data model
<doc_type_collection id="COLLECTION" nm="COLLECTION" 
     collectionType="Internal">
  ...
</doc_type_collection>

This doc_type_collection section defines the root collection for application archiving. The collectionType attribute of the doc_type_collection element must have the value Internal. This collection type is used for collections that do not require cross-checking, for example, for model validation.

<doc_type>
  <name>%ICC_CONFIG_ITEMTYPE_NAME%</name>
  <children>
    <child ref_coll="ICCAppAttachmentInstance">
       %ICC_CONFIG_ITEMTYPECHILD_AI%
    </child>
  </children>
</doc_type>
The doc_type section in the root collection defines the item type that holds the content of the application document. It also defines the reference to the child component that contains the attachments.
<fields>
  <field nm="CONTENT" type="STRING">
  </field>

  <field nm="ATTACH_REF" type="REFERENCE" 
     ref_coll="ICCAppAttachmentInstance" multivalue="true">
    <relationship type="CHILD"></relationship>
  </field>
			
  <field nm="CORRELATION_KEY" type="STRING">
    <reference>ATTACH_REF.CORRELATION_KEY</reference>
  </field>

  <field nm="ATTACHMENT_NAME" type="STRING">
    <reference>ATTACH_REF.ATTACHMENT_NAME</reference>
  </field>
</fields>

The fields section in this collection contains only required fields. Do not change these fields.

<doc_type_collection id="ICCAppAttachmentInstance" 
   nm="ICCAppAttachmentInstance" collectionType="Dependent">
  <repositories>
    <repository id="%ICC_UNIQUE_CONNECTION_NAME%">	
      <doc_types>
        <doc_type>
          <name>%ICC_CONFIG_ITEMTYPECHILD_AI%</name>
        </doc_type>
      </doc_types>
    </repository>
  </repositories>	
  <fields>
    <field nm="ATTACHMENT_NAME" type="STRING">
      <attr>%AFU_CONFIG_ATTR_FILENAME%</attr>
    </field>
    <field nm="CORRELATION_KEY" type="STRING">
      <attr>%ICC_CONFIG_ATTR_CORRELATIONKEY%</attr>
    </field>
    <field nm="CONTENT_REF" type="REFERENCE" 
        ref_coll="EDMDefaultContent">
      <attr>%ICC_CONFIG_ATTR_CONTENTREF%</attr>
    </field>
    <field nm="CONTENT" type="STRING">
      <reference>CONTENT_REF.CONTENT</reference>
    </field>
  </fields>
</doc_type_collection>
This doc_type_collection section defines the collection for the attachment instances. This child component tracks the references to separately stored attachments (in an attachment item type). The collectionType attribute of the doc_type_collection element must have the value Dependent. A collection of the type Dependent is always referred to by another collection in the archive mapping and contains only a subset of the definitions that are required for accessing the complete archived document. A collection of the type CMCHILDCOMP, for example, is a special type of a dependent collection.

The fields section in this collection contains only required fields. Do not change these fields.

There is no need to define a collection for the attachment item type because an attachment cannot be restored independently of its associated document.
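
As an illustration of how the root collection for application archiving might look when the fragments above are combined, consider the following sketch. The table shows the doc_type element without its surrounding elements, so the repositories, repository, and doc_types wrappers are an assumption that is based on the structure of the ICCAppAttachmentInstance collection. All placeholders are kept as they appear in the table.

```xml
<!-- Sketch only: the wrappers around doc_type are an assumption -->
<doc_type_collection id="COLLECTION" nm="COLLECTION"
     collectionType="Internal">
  <repositories>
    <repository id="%ICC_UNIQUE_CONNECTION_NAME%">
      <doc_types>
        <doc_type>
          <name>%ICC_CONFIG_ITEMTYPE_NAME%</name>
          <children>
            <child ref_coll="ICCAppAttachmentInstance">
              %ICC_CONFIG_ITEMTYPECHILD_AI%
            </child>
          </children>
        </doc_type>
      </doc_types>
    </repository>
  </repositories>
  <fields>
    <field nm="CONTENT" type="STRING">
    </field>
    <!-- Link to the dependent collection for attachment instances -->
    <field nm="ATTACH_REF" type="REFERENCE"
       ref_coll="ICCAppAttachmentInstance" multivalue="true">
      <relationship type="CHILD"></relationship>
    </field>
    <field nm="CORRELATION_KEY" type="STRING">
      <reference>ATTACH_REF.CORRELATION_KEY</reference>
    </field>
    <field nm="ATTACHMENT_NAME" type="STRING">
      <reference>ATTACH_REF.ATTACHMENT_NAME</reference>
    </field>
  </fields>
</doc_type_collection>
```

The value of ref_coll in both the child element and the ATTACH_REF field matches the id of the dependent collection, and the child element's placeholder matches the name of the doc_type in that collection.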

Archive mappings for File System

Content Collector does not enforce a formal data model for File System documents, but offers a sample item type for archiving File System documents and provides the respective archive mapping file. Even if you do not use the sample or if you use only some of the properties from the sample on a custom item type, you must provide an archive mapping file.

The archive mappings are required for retrieving an archived document when a user clicks the respective stub. If any required field definitions are missing, Content Collector cannot assign a proper file name when retrieving a document and uses a placeholder instead. In this case, the application for displaying the document content receives no valid file name and, therefore, might not be able to display the document.

In the following table, strings that are enclosed in percent signs (%) are placeholders for values that are defined in the repository or in the configuration store.

Entries in the archive mapping file IBM Content Manager data model
<doc_type_collection id="File System" nm="File System" 
collectionType="ICC_FILE">
  <repositories>
    <repository id="%ICC_UNIQUE_CONNECTION_NAME%">
      <doc_types>
        <doc_type>
          <name>%ICC_CONFIG_ITEMTYPE_NAME%</name>
        </doc_type>
      </doc_types>
    </repository>
  </repositories>
		
  <fields>
    <field nm="CONTENT" type="STRING">
    </field>
    <field nm="FILENAME" type="STRING">
      <attr>%ICC_CONFIG_ATTR_FILENAME%</attr>
    </field>
  </fields>
</doc_type_collection>	
This doc_type_collection section defines the collection for the File System item type. The collectionType attribute of the doc_type_collection element must have the value ICC_FILE.

The repositories section in the collection defines the repositories that Content Collector can access. Each repository ID corresponds to a repository connection that is defined in the Content Collector configuration database.

The doc_type section defines the File System item type that holds the file object, the file properties, and the File System instance (FI). The FI tracks all references.

The fields section in this collection contains only required fields. Do not change these fields.
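
To make the placeholder substitution concrete, the following sketch shows the File System collection with the three placeholders replaced. The values CM8_Archive, FileDocument, and FileName are hypothetical and stand for the connection name, item type name, and file name attribute of a particular installation; use the values that are defined in your own repository and configuration store.

```xml
<!-- Hypothetical values: CM8_Archive, FileDocument, FileName -->
<doc_type_collection id="File System" nm="File System"
    collectionType="ICC_FILE">
  <repositories>
    <!-- Replaces %ICC_UNIQUE_CONNECTION_NAME% -->
    <repository id="CM8_Archive">
      <doc_types>
        <doc_type>
          <!-- Replaces %ICC_CONFIG_ITEMTYPE_NAME% -->
          <name>FileDocument</name>
        </doc_type>
      </doc_types>
    </repository>
  </repositories>

  <fields>
    <field nm="CONTENT" type="STRING">
    </field>
    <field nm="FILENAME" type="STRING">
      <!-- Replaces %ICC_CONFIG_ATTR_FILENAME% -->
      <attr>FileName</attr>
    </field>
  </fields>
</doc_type_collection>
```

Only the placeholder values change. The field names CONTENT and FILENAME and the collection type ICC_FILE stay as they are in the table.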

Archive mappings for Microsoft SharePoint

Content Collector does not enforce a formal data model for Microsoft SharePoint documents, but offers a sample item type for archiving Microsoft SharePoint documents and provides the respective archive mapping file. Even if you do not use the sample or if you use only some of the properties from the sample on a custom item type, you must provide an archive mapping file.

The archive mappings are required for retrieving an archived document when a user clicks the respective stub. If any required field definitions are missing, Content Collector cannot assign a proper file name when retrieving a document and uses a placeholder instead. In this case, the application for displaying the document content receives no valid file name and, therefore, might not be able to display the document.

In the following table, strings that are enclosed in percent signs (%) are placeholders for values that are defined in the repository or in the configuration store.

Entries in the archive mapping file IBM Content Manager data model
<doc_type_collection id="Sharepoint" nm="Sharepoint" 
collectionType="ICC_SHAREPOINT">
  <repositories>
    <repository id="%ICC_UNIQUE_CONNECTION_NAME%">
      <doc_types>
        <doc_type>
          <name>%ICC_CONFIG_ITEMTYPE_NAME%</name>
        </doc_type>
      </doc_types>
    </repository>
  </repositories>
		
  <fields>
    <field nm="CONTENT" type="STRING">
    </field>
    <field nm="FILENAME" type="STRING">
      <attr>%ICC_CONFIG_ATTR_FILENAME%</attr>
    </field>
  </fields>
</doc_type_collection>	
This doc_type_collection section defines the collection for the Microsoft SharePoint item type. The collectionType attribute of the doc_type_collection element must have the value ICC_SHAREPOINT.

The repositories section in the collection defines the repositories that Content Collector can access. Each repository ID corresponds to a repository connection that is defined in the Content Collector configuration database.

The doc_type section defines the Microsoft SharePoint item type that holds the file object, the properties, and the Microsoft SharePoint instance (SI). The SI tracks all references.

The fields section in this collection contains only required fields. Do not change these fields.