Edit a collection
You can edit an existing collection and modify the fields, enrichment, and facets that were defined when the collection was created. You can also specify options for exploring the collection.
You can open a collection for editing in IBM Watson® Explorer Admin Console or in IBM Watson Explorer Content Miner. The editing screen has six tabs where you can modify the collection. There is also a sidebar that displays information about the collection.
Edit
You can modify the name and description of the collection. You can also set the time zone of the collection. This setting is used in date-related analysis.
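To illustrate why the time zone matters for date-related analysis, the following minimal sketch shows one instant falling on different calendar dates depending on the offset used to interpret it. It is a generic Python illustration, not the product's implementation. The helper name `local_date` and the fixed UTC offsets are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone


def local_date(instant: datetime, utc_offset_hours: int) -> str:
    """Return the calendar date (ISO format) of an instant at a fixed UTC offset."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return instant.astimezone(tz).date().isoformat()


# One document timestamp, stored as an unambiguous UTC instant.
instant = datetime(2024, 1, 1, 2, 0, tzinfo=timezone.utc)

print(local_date(instant, 9))   # 2024-01-01 (UTC+9)
print(local_date(instant, -5))  # 2023-12-31 (UTC-5)
```

A document counted under January in one time zone would be counted under December in the other, so date facets and trends depend on the collection's time zone.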
If this is a secured collection, you can disable pre-filtering or post-filtering of search results.
- Ignore document-level access controls in the index
- Disable pre-filtering of search results.
- Do not validate current credentials before returning results
- Disable post-filtering of search results.
For more information, see Document-level security.
You can view, but not change, the setting of the Domain Adaptation Curator.
If IBM Watson Explorer oneWEX is installed on IBM® Cloud Private with multiple index partitions, you can view, but not change, the following advanced options.
- Number of index partitions
- Specifies the number of index partitions.
- Enable index replication
- Enables a backup index replica.
Fields
You have more control over fields when editing a collection than you did when you created the collection. You can specify field indexing options in the Field indexing option table. The table columns are described here.
- Fields
- The dataset field name.
- Field type
- The dataset field type.
- Index type
- A drop-down list of index types. The allowable types depend on the field type.
For example, for Field type=String, the Index type options are:
- String
- Case sensitive, exact match.
- Analyzable text content
- Case normalized, fuzzy match.
- Tokenized text
- Case normalized, text is tokenized (broken up into meaningful elements, or tokens).
- Free text searchable
- Makes the field available for a free-text search.
- Metadata facet
- Specifies whether to use the field as a metadata facet. Not applicable to the Analyzable text content or Tokenized text index types.
You can enable or disable n-gram segmentation.
You can modify the title and date fields. You can only view the body field.
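The differences between the index types and n-gram segmentation can be pictured with a small sketch. This is a conceptual Python illustration only. The function names are hypothetical, the tokenizer is a simple regular expression rather than the product's analyzer, and fuzzy matching is not modeled.

```python
import re


def exact_key(value: str) -> str:
    """String index type: the value is kept as is (case sensitive, exact match)."""
    return value


def normalized_key(value: str) -> str:
    """Case normalization only; the product's fuzzy matching is not modeled here."""
    return value.casefold()


def tokens(value: str) -> list[str]:
    """Tokenized text: case normalized, then split into word-like tokens."""
    return re.findall(r"\w+", value.casefold())


def char_ngrams(value: str, n: int = 2) -> list[str]:
    """N-gram segmentation: overlapping character sequences of length n."""
    text = value.casefold()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


print(exact_key("IBM") == exact_key("ibm"))            # False
print(normalized_key("IBM") == normalized_key("ibm"))  # True
print(tokens("Watson Explorer, v12"))                  # ['watson', 'explorer', 'v12']
print(char_ngrams("index"))                            # ['in', 'nd', 'de', 'ex']
```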
Enrichment
The Enrichment tab is identical to the enrichment step when creating a collection.
Facet
The Facet tab is identical to the facet step when creating a collection.
Exploration
You can configure options to improve the precision of search results, ensuring that the most relevant results are ranked higher in the result set, and to customize query results by associating dictionaries.
- Ranker
- You can select a ranker from the drop-down list. For more information, see Rankers.
- Natural language query
- Improve the user search experience by using Natural Language Processing. The following options are available.
- Disjunctify threshold
- The minimum number of annotations required for the disjunctify strategy to take place.
- Maximum number of annotations
- The maximum number of annotations taken into account to build the query.
- Annotation conversion strategies
- Specify how each annotation in the original query is converted. Click Edit to open the Conversion Strategy Configuration dialog. The following strategies are available:
- Original text
- Converts the annotation into a multi-term query. Searches for documents that contain the terms in the annotated span.
- Refinement
- Converts the annotation into a facet refinement query. Searches for documents that have the same annotation as the annotated query.
- Facet value
- Converts the annotation span into a multi-term query. Searches for documents that contain the terms in the annotation's facet value.
- Phrase
- Converts the annotation span into a phrase query. Searches for documents that contain exactly the same sequence of terms as the annotated span.
- Blacklist words
- A list of noise words to be removed from the query. The blacklist filters out terms from natural language queries.
- Whitelist words
- A list of words that are candidates for query terms.
- Stop words
- You can use a stop words dictionary. Click Upload to upload a stop words dictionary. The dictionary is a UTF-8-encoded text file with one stop word on each line. Stop words filter out query terms from normal queries.
- Synonym
- Upload a synonym list file. Each line is a set of comma-delimited words that are synonyms. The file must be encoded as UTF-8.
- Spotlight
- Matches user query text to a map of top results, configured using an elevate.xml file. For more information, see The Query Elevation Component.
- Document relevancy score
- Modify the document query relevancy score by boosting specified fields. Click Edit and then select a boost factor for a field selected from the drop-down list.
If there are non-field search prefixes in the query, all the fields configured here will be searched.
- Training of machine learning models
- Select machine-learning models. You can select Vector representation of words or Document recommendation on collaboration activities.
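The stop words and synonym files described above are plain UTF-8 text, so they can be generated with a short script. The following is a minimal sketch, assuming one stop word per line and one comma-delimited synonym group per line as stated above. The function names are hypothetical, and writing no spaces around the commas is an assumption.

```python
from pathlib import Path


def write_stop_words(path: str, words: list[str]) -> None:
    """Write one stop word per line, encoded as UTF-8."""
    Path(path).write_text("\n".join(words) + "\n", encoding="utf-8")


def write_synonyms(path: str, groups: list[list[str]]) -> None:
    """Write one comma-delimited synonym group per line, encoded as UTF-8."""
    lines = [",".join(group) for group in groups]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")


def read_synonyms(path: str) -> list[list[str]]:
    """Read a synonym file back into groups, skipping blank lines and empty entries."""
    groups = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        words = [w.strip() for w in line.split(",") if w.strip()]
        if words:
            groups.append(words)
    return groups
```

For example, `write_synonyms("synonyms.txt", [["car", "automobile"], ["tv", "television"]])` produces a two-line file, and `read_synonyms` can be used to check the file before uploading it.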
Document flags
You can enable and create document flags on the Document flags tab. After you have created flags, you can apply them to documents in IBM Watson Explorer Content Miner. For more information, see Documents view.
The Document flags tab displays existing flags, which you can edit or delete. You can also add new flags by clicking Add flag. The Add flag dialog opens, where you can name the new flag, add a description, and set the flag color.
Index rebuilding after modifying a collection
When you make changes to a collection and click Save, you may get either a Rebuild Index dialog or a Warning dialog. These are described below.
- Rebuild Index
- You will see this dialog if you modify metadata facets or enrichment options. You have three choices.
- Cancel
- Saving of the collection is canceled.
- No
- The collection is saved, but the modified settings are applied only to documents that are processed after the indexer process is updated with the saved settings. You can update the indexer explicitly by stopping and then restarting it.
- Yes
- The collection is saved and all documents are reevaluated and indexed with the new settings.
- Warning
- You will see this dialog if you modify the index type or sortable options, or enable n-gram segmentation. You can either cancel the save or select Restart a full index build and click OK. If you restart a full index build, the collection is saved, all the indexed data is erased, and IBM Watson Explorer starts recreating indexes for all of the documents.
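The rules above can be summarized as a small lookup. This is only a sketch of the documented behavior, not product code. The change keys (such as `"metadata_facet"`) and the function name are hypothetical labels chosen for the example, and the sketch does not cover what happens when several kinds of changes are saved together.

```python
from typing import Optional

REBUILD_INDEX = "Rebuild Index"
WARNING = "Warning"

# Dialog shown on Save, keyed by the kind of setting that was changed.
DIALOG_BY_CHANGE = {
    "metadata_facet": REBUILD_INDEX,
    "enrichment": REBUILD_INDEX,
    "index_type": WARNING,
    "sortable": WARNING,
    "ngram_enabled": WARNING,
}

# Rebuild Index choices: (collection saved?, documents that get the new settings).
REBUILD_INDEX_CHOICES = {
    "Cancel": (False, "none"),
    "No": (True, "documents processed after the indexer is updated"),
    "Yes": (True, "all documents"),
}


def dialog_for(change: str) -> Optional[str]:
    """Return the dialog for one kind of change, or None if it is not listed here."""
    return DIALOG_BY_CHANGE.get(change)


print(dialog_for("enrichment"))     # Rebuild Index
print(dialog_for("ngram_enabled"))  # Warning
```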