Fast-Indexing

To use content elements for Ranking, Sorting, Filtering, and Refinements and Binning, you must tell the indexer to treat them specially. This is done by fast-indexing those content elements. Fast-indexing is the general term for indexing methods that make content quick to access and flexible enough to be used to filter and sort query results.

Note: When fast-indexing is specified, all instances of that content element are stored in RAM. Therefore, to optimize search application performance, consider carefully which content elements you fast-index.

Watson™ Explorer Engine supports two different fast-indexing methods:
  • Fast-Indexing allows you to define a list of name|type entries in the fast-index option on a search collection's Indexing tab, and stores this fast-index data in memory for quick access. This list identifies each content element that you want to fast-index and its datatype. Alternatively, you can add the fast-index="type" attribute to the content elements in your data that you want to be able to use in fast-index comparisons.
  • Indexed Fast-Indexing is the same as fast-indexing in terms of being able to do fast-index comparisons, but permanently stores additional data in the index for a search collection. This can improve run-time/compare-time performance, but increases indexing time and index storage requirements. To use indexed fast-indexing, you must specify the indexed-fast-index="type" attribute on the content elements in your data that you want to be able to use in fast-index comparisons. (See Activating Indexed Fast-Indexing for more detailed information about defining indexed fast-indexed contents.)
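
The two ways of marking a content element described above can be illustrated with a short Python sketch. It is not part of the product and makes several assumptions: the field names and values are hypothetical, the name|type entries are assumed to be written one per line, and content elements are assumed to be XML elements named content that are identified by a name attribute. Only the name|type form and the fast-index and indexed-fast-index attribute names come from this documentation.

```python
import xml.etree.ElementTree as ET

# Hypothetical field names and types, chosen for illustration only.
FIELDS = {"price": "float", "published": "date", "author": "set"}


def fast_index_option(fields):
    """Render name|type entries for the fast-index option.

    Assumption: entries are separated by newlines.
    """
    return "\n".join(f"{name}|{ftype}" for name, ftype in fields.items())


def content_element(name, value, ftype, indexed=False):
    """Serialize one content element with a fast-indexing attribute.

    Assumption: the element is called 'content' and carries a 'name' attribute.
    """
    attr = "indexed-fast-index" if indexed else "fast-index"
    elem = ET.Element("content", {"name": name, attr: ftype})
    elem.text = value
    return ET.tostring(elem, encoding="unicode")


option_text = fast_index_option(FIELDS)
price_xml = content_element("price", "19.99", "float")
published_xml = content_element("published", "2021-03-04", "date", indexed=True)

print(option_text)
print(price_xml)
print(published_xml)
```

The first helper corresponds to the list entered on the Indexing tab; the second corresponds to marking the data itself, with indexed=True selecting the indexed fast-indexing attribute.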

Choose only one fast-indexing method for any given content name in a collection. If both the fast-index and indexed-fast-index attributes are set for the same content name, the benefit of indexed fast-indexing is lost.
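
A check for this rule can be run over your data before indexing. The following Python sketch is one possible approach, not a product feature: it assumes the content elements have already been parsed into (name, attributes) pairs, and the sample names are hypothetical. It looks only at attributes in the data and does not consider entries made in the fast-index option.

```python
from collections import defaultdict

FAST = "fast-index"
INDEXED = "indexed-fast-index"


def conflicting_names(elements):
    """Return the content names that use both fast-indexing attributes.

    `elements` is an iterable of (name, attributes) pairs, where
    attributes is a dict of attribute name to value.
    """
    seen = defaultdict(set)
    for name, attrs in elements:
        for attr in (FAST, INDEXED):
            if attr in attrs:
                seen[name].add(attr)
    # A name conflicts when both attributes were seen for it.
    return sorted(name for name, kinds in seen.items() if len(kinds) == 2)


# Hypothetical sample data: 'price' is marked both ways.
sample = [
    ("price", {FAST: "float"}),
    ("price", {INDEXED: "float"}),
    ("author", {FAST: "set"}),
]

conflicts = conflicting_names(sample)
print(conflicts)
```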

Deciding when to use each of these fast-indexing methods largely depends upon the characteristics of your system. Contact IBM product support for further information.

Note: When you use any form of fast-indexing, the content elements that you are fast-indexing must be defined as fields in your project's syntax before you can use them in queries. If the elements that you are fast-indexing are not part of the default syntax, you must add them to the syntax for your project.

Valid types that can be used with fast-index and indexed-fast-index are as follows:

  • date - a special data type that automatically invokes the IBM viv:parse-date function on a date string to produce an integer that can be fast-indexed. 4 bytes. Allowable dates are between 1901-12-14 and 2038-01-18.
  • float - single-precision floating point numbers with decimal points. 4 bytes.
  • int - integers. Variable number of bytes.
  • double or number - double-precision floating point numbers with decimal points. 8 bytes.
  • set - a set of values, such as a character string. This is the default type.
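
The date limits and byte sizes listed above can be checked with a short Python sketch. It rests on an assumption that this documentation does not state: that the integer produced for a date is a signed 32-bit count of seconds since 1970-01-01 UTC. That reading is consistent with the 4-byte size and the allowable range, but it is an inference rather than a documented fact.

```python
import struct
from datetime import datetime, timedelta, timezone

# Assumption: dates are stored as signed 32-bit seconds since the Unix epoch.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1

earliest = EPOCH + timedelta(seconds=INT32_MIN)
latest = EPOCH + timedelta(seconds=INT32_MAX)

# Neither limit falls on midnight, so the first and last whole
# calendar days lie one day inside each limit.
first_full_day = (earliest + timedelta(days=1)).date()
last_full_day = (latest - timedelta(days=1)).date()

# Standard sizes of single- and double-precision floating point values.
float_size = struct.calcsize("<f")
double_size = struct.calcsize("<d")

print(first_full_day, last_full_day)
print(float_size, double_size)
```

Under that assumption, the first and last whole days match the allowable range given for the date type, and the float and double sizes match the 4-byte and 8-byte figures in the list.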