API reference

API available

The complete API reference for Watson NLP can be found here:

Watson NLP API reference

Semantics of Span offsets

In Watson NLP, a Span is a contiguous region of a Text object, identified by its beginning and ending offsets in that Text object. The beginning offset is inclusive and the ending offset is exclusive, as the following example shows. Assume that your input text is:

Amelia Earhart is a pilot.

The text across Span [0-6] is Amelia. This Span can be visualized as:

 A m e l i a
^ ^ ^ ^ ^ ^ ^
0 1 2 3 4 5 6

Likewise, the text across Span [20-25] is pilot.

A Span of [x-x] is empty: it marks the position at offset x, between the end of one character and the beginning of the next, and covers no text. In the previous example, [0-0] is the empty string before the character A. Likewise, a Span of [3-3] is the empty string between the characters e and l.
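These offset semantics match ordinary Python string slicing, so the examples above can be checked with a plain string. The following is a minimal sketch that does not call the Watson NLP API; span_text is a hypothetical helper introduced only for illustration.

```python
# Illustration only: plain Python slicing, not a Watson NLP call.
text = "Amelia Earhart is a pilot."


def span_text(text: str, begin: int, end: int) -> str:
    """Return the text covered by the span [begin-end] (hypothetical helper).

    begin is inclusive and end is exclusive, as in the examples above.
    """
    return text[begin:end]


print(span_text(text, 0, 6))        # Amelia
print(span_text(text, 20, 25))      # pilot
print(repr(span_text(text, 0, 0)))  # '' (empty span before "A")
print(repr(span_text(text, 3, 3)))  # '' (empty span between "e" and "l")
```

The length of the covered text is always end minus begin, which is why [x-x] is empty.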

Span offsets in Watson NLP APIs follow the semantics of String objects in the respective programming language, to ensure interoperability with other libraries written in that programming language. Specifically, the APIs use code points (UTF-32) to represent Span offsets.

For a detailed description of code units and code points, and of the relationship between them, see Character.
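The difference between code points and code units matters when text contains characters outside the Basic Multilingual Plane, such as emoji. The sketch below assumes offsets are counted in code points, as stated above, and shows one way to convert them for a consumer that counts UTF-16 code units, as Java and JavaScript strings do. The function to_utf16_offset is a hypothetical helper, not part of the Watson NLP API.

```python
# Illustration only: converts a code point offset to a UTF-16 code unit offset.
# Python str objects are indexed by code point, so slicing uses code point offsets.


def to_utf16_offset(text: str, cp_offset: int) -> int:
    """Return the UTF-16 code unit offset for a code point offset (hypothetical helper)."""
    # Each UTF-16 code unit is 2 bytes; utf-16-le adds no byte order mark.
    return len(text[:cp_offset].encode("utf-16-le")) // 2


# U+1F600 is a single code point but takes two UTF-16 code units.
text = "\U0001F600 pilot"

begin, end = 2, 7              # code point offsets of "pilot"
print(text[begin:end])         # pilot
print(to_utf16_offset(text, begin), to_utf16_offset(text, end))  # 3 8
```

For text that contains only Basic Multilingual Plane characters, the two kinds of offsets are identical.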