Fast-Indexing
To use content elements for Ranking, Sorting, and Filtering, or for Refinements and Binning, you must tell the indexer to treat them specially. This is done by fast-indexing those content elements. Fast-indexing is the general term for indexing methods that make content quick to access and flexible enough to be used to filter and sort query results. Watson™ Explorer Engine supports two fast-indexing methods:
- Fast-Indexing allows you to define a list of name|type entries in the fast-index option on a search collection's Indexing tab, and stores this fast-index data in memory for quick access. This list identifies each content element that you want to fast-index and its datatype. Alternatively, you can add the fast-index="type" attribute to the content elements in your data that you want to be able to use in fast-index comparisons.
- Indexed Fast-Indexing offers the same fast-index comparisons as fast-indexing, but permanently stores additional data in the index for a search collection. This can improve run-time/compare-time performance, but increases indexing time and index storage requirements. To use indexed fast-indexing, you must specify the indexed-fast-index="type" attribute on the content elements in your data that you want to be able to use in fast-index comparisons. (See Activating Indexed Fast-Indexing for more detailed information about defining indexed fast-indexed contents.)
You should choose only one fast-indexing method for any given content name in the collection. If both the fast-index and indexed-fast-index attributes are set for the same content name, the benefit of indexed fast-indexing is lost.
Which fast-indexing method to use depends largely on the characteristics of your system. Contact IBM product support for further information.
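The two ways of declaring a fast-indexed content, and the rule of one method per content name, can be sketched in Python. This is a minimal illustration, not part of the product: the helper names are hypothetical, the field names are invented, and the `<content name="...">` element shape is an assumption about how content appears in your data. Only the name|type entry format, the fast-index and indexed-fast-index attribute names, and the type names come from this section.

```python
import xml.etree.ElementTree as ET

# Type names listed in this section.
VALID_TYPES = {"date", "float", "int", "double", "number", "set"}


def check_type(name, type_name):
    """Reject type names that this section does not list."""
    if type_name not in VALID_TYPES:
        raise ValueError(f"unknown fast-index type for {name!r}: {type_name!r}")


def fast_index_entries(fields):
    """Return name|type entries for a mapping of content name -> type."""
    for name, type_name in fields.items():
        check_type(name, type_name)
    return [f"{name}|{type_name}" for name, type_name in fields.items()]


def content_element(name, value, type_name, indexed=False):
    """Serialize a content element carrying exactly one fast-indexing attribute.

    Assumption: content is written as <content name="...">value</content>.
    """
    check_type(name, type_name)
    attribute = "indexed-fast-index" if indexed else "fast-index"
    element = ET.Element("content", {"name": name, attribute: type_name})
    element.text = value
    return ET.tostring(element, encoding="unicode")


def check_single_method(fast_names, indexed_names):
    """Raise if any content name is declared with both methods."""
    overlap = sorted(set(fast_names) & set(indexed_names))
    if overlap:
        raise ValueError(f"both methods set for: {', '.join(overlap)}")


if __name__ == "__main__":
    # Hypothetical field names, for illustration only.
    print(fast_index_entries({"price": "float", "published": "date"}))
    print(content_element("price", "19.99", "float"))
    print(content_element("published", "2015-06-01", "date", indexed=True))
    check_single_method(["price"], ["published"])
```

Running a check like `check_single_method` before indexing is one way to catch a content name that has accidentally been declared with both methods.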
Valid types that can be used with fast-index and indexed-fast-index are as follows:
- date - a special data type that automatically invokes the IBM viv:parse-date function on a date string to produce an integer that can be fast-indexed. Values occupy 4 bytes, and allowable dates fall between 1901-12-14 and 2038-01-18.
- float - single-precision floating-point numbers with decimal points. 4 bytes.
- int - integers. Variable number of bytes.
- double or number - double-precision floating point numbers with decimal points. 8 bytes.
- set - a set of values, such as a character string. This is the default type.
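The date limits above are consistent with a 4-byte signed integer counting seconds from the Unix epoch. The source does not state how viv:parse-date encodes its result, so the sketch below is only an illustration under that assumption, and it treats all dates as UTC. The helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
ONE_SECOND = timedelta(seconds=1)

# Range of a signed 4-byte integer.
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def from_epoch_seconds(seconds):
    """Convert a count of seconds since the Unix epoch to a UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)


def to_epoch_seconds(moment):
    """Convert a timezone-aware datetime to whole seconds since the epoch."""
    return (moment - EPOCH) // ONE_SECOND


def day_fits_in_four_bytes(date_string):
    """True if every second of the given UTC day (YYYY-MM-DD) fits in int32."""
    start = datetime.strptime(date_string, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    end = start + timedelta(days=1) - ONE_SECOND
    return INT32_MIN <= to_epoch_seconds(start) and to_epoch_seconds(end) <= INT32_MAX


if __name__ == "__main__":
    print(from_epoch_seconds(INT32_MIN).isoformat())  # 1901-12-13T20:45:52+00:00
    print(from_epoch_seconds(INT32_MAX).isoformat())  # 2038-01-19T03:14:07+00:00
    for day in ("1901-12-13", "1901-12-14", "2038-01-18", "2038-01-19"):
        print(day, day_fits_in_four_bytes(day))
```

Under this assumption, 1901-12-14 and 2038-01-18 are the first and last calendar days that fit entirely within the 4-byte range, which matches the limits given for the date type.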