Watson Natural Language Processing library

The Watson Natural Language Processing library provides natural language processing functions for syntax analysis and out-of-the-box pre-trained models for a wide variety of text processing tasks, such as sentiment analysis, keyword extraction, and classification. The Watson Natural Language Processing library is available for Python only.

With Watson Natural Language Processing, you can turn unstructured data into structured data, which makes the data easier to understand and to transfer, particularly when you work with a mix of unstructured and structured data. Examples of such data are call center records, customer complaints, social media posts, and problem reports. The unstructured data is often part of a larger data record that includes columns with structured data. Extracting meaning and structure from the unstructured data and combining it with the structured columns gives you a deeper understanding of the input data and can help you make better decisions.
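The idea of combining extracted information with existing structured columns can be sketched in plain Python. This is a minimal illustration, not part of the library: `enrich_records` and `toy_sentiment` are hypothetical names, and `toy_sentiment` is only a keyword-based stand-in for a real model. In a notebook, the analyzer would instead wrap a call to a loaded Watson Natural Language Processing model.

```python
from typing import Callable, Dict, List


def enrich_records(records: List[dict], text_field: str,
                   analyzer: Callable[[str], Dict[str, object]]) -> List[dict]:
    """Return copies of the records with the analyzer's output added as new columns."""
    enriched = []
    for record in records:
        extracted = analyzer(record[text_field])
        # Keep the structured columns and add the extracted ones next to them
        enriched.append({**record, **extracted})
    return enriched


def toy_sentiment(text: str) -> Dict[str, object]:
    # Hypothetical stand-in for a real model call: flags a few negative keywords
    negative_words = {"broken", "late", "refund"}
    words = {w.strip(".,!?").lower() for w in text.split()}
    label = "negative" if words & negative_words else "neutral"
    return {"sentiment": label}


records = [
    {"ticket_id": 1, "product": "router", "complaint": "The device arrived broken."},
    {"ticket_id": 2, "product": "modem", "complaint": "How do I reset the password?"},
]

enriched = enrich_records(records, "complaint", toy_sentiment)
for row in enriched:
    print(row["ticket_id"], row["product"], row["sentiment"])
```

Passing the analyzer as a function keeps the record handling independent of the model, so the stand-in can be swapped for a real model call without changing `enrich_records`.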

Watson Natural Language Processing provides pre-trained models in over 20 languages. They are curated by a dedicated team of experts, and evaluated for quality on each specific language. These pre-trained models can be used in production environments without you having to worry about license or intellectual property infringements.

Although you can create your own models, the easiest way to get started with Watson Natural Language Processing is to run the pre-trained models on unstructured text to perform language processing tasks.

Here are some examples of language processing tasks available in Watson Natural Language Processing pre-trained models:

  • Language detection: detect the language of the input text
  • Syntax: tokenization, lemmatization, part of speech tagging, and dependency parsing
  • Entity extraction: find mentions of entities (like person, organization, or date)
  • Noun phrase extraction: extract noun phrases from the input text
  • Text classification: analyze text and then assign a set of pre-defined tags or categories based on its content
  • Sentiment classification: classify the input document as positive, negative, or neutral
  • Tone classification: classify the tone in the input document (like excited, frustrated, or sad)
  • Emotion classification: classify the emotion of the input document (like anger or disgust)
  • Keyword extraction: extract noun phrases that are relevant in the input text
  • Relations: detect relations between two entities
  • Hierarchical categories: assign individual nodes within a hierarchical taxonomy to the input document
  • Embeddings: map individual words or larger text snippets into a vector space

Watson Natural Language Processing encapsulates natural language functionality through blocks and workflows. Blocks and workflows support functions to load, run, train, and save a model.

For more information, refer to Working with pre-trained models.

Here are some examples of how you can use the Watson Natural Language Processing library:

Running syntax analysis on a text snippet:

import watson_nlp

# Load the syntax model for English
syntax_model = watson_nlp.load('syntax_izumo_en_stock')

# Run the syntax model and print the result
syntax_prediction = syntax_model.run('Welcome to IBM!')
print(syntax_prediction)
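To work with the syntax result programmatically instead of printing it, one option is to flatten it into rows. The following sketch assumes that the prediction can be converted to a dictionary (for example with a `to_dict()` method) that contains a `tokens` list whose items carry `span.text`, `lemma`, and `part_of_speech`. That shape and the helper name `token_rows` are assumptions to verify against your installed version, and the sample values are hand-written, not actual model output.

```python
def token_rows(prediction_dict):
    """Flatten an assumed syntax-prediction dictionary into (text, lemma, POS) tuples."""
    rows = []
    for token in prediction_dict.get("tokens", []):
        rows.append((
            token["span"]["text"],
            token.get("lemma"),           # None if lemmatization was not run
            token.get("part_of_speech"),  # None if POS tagging was not run
        ))
    return rows


# Hand-written sample in the assumed shape (illustrative values only)
sample = {
    "tokens": [
        {"span": {"begin": 0, "end": 7, "text": "Welcome"}, "lemma": "welcome", "part_of_speech": "INTJ"},
        {"span": {"begin": 8, "end": 10, "text": "to"}, "lemma": "to", "part_of_speech": "ADP"},
        {"span": {"begin": 11, "end": 14, "text": "IBM"}, "lemma": "IBM", "part_of_speech": "PROPN"},
        {"span": {"begin": 14, "end": 15, "text": "!"}},
    ]
}

rows = token_rows(sample)
for text, lemma, pos in rows:
    print(text, lemma, pos)
```

In a notebook, you might pass the dictionary form of `syntax_prediction` to `token_rows` instead of the sample.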

Extracting entities from a text snippet:

import watson_nlp

# Load the multilingual entity extraction workflow
entities_workflow = watson_nlp.load('entity-mentions_transformer-workflow_multilingual_slate.153m.distilled')

# Run the workflow on English text and print the extracted mention pairs
entities = entities_workflow.run("IBM's CEO Arvind Krishna is based in the US", language_code="en")
print(entities.get_mention_pairs())
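Extracted mentions are often easier to use once they are grouped by entity type. The sketch below assumes that each mention pair is a `(mention text, entity type)` tuple; that pair layout, the type labels in the sample, and the helper name `group_mentions_by_type` are assumptions for illustration, not confirmed library behavior.

```python
from collections import defaultdict


def group_mentions_by_type(mention_pairs):
    """Group (text, type) pairs into {type: [unique texts in first-seen order]}."""
    grouped = defaultdict(list)
    for text, entity_type in mention_pairs:
        if text not in grouped[entity_type]:
            grouped[entity_type].append(text)
    return dict(grouped)


# Hand-written sample in the assumed (text, type) layout
sample_pairs = [
    ("IBM", "Organization"),
    ("Arvind Krishna", "Person"),
    ("US", "Location"),
    ("IBM", "Organization"),  # repeated mention, kept only once
]

grouped = group_mentions_by_type(sample_pairs)
print(grouped)
```

In a notebook, the list returned by `entities.get_mention_pairs()` could take the place of `sample_pairs`, provided it matches the assumed layout.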

For more examples of how to use the Watson Natural Language Processing library, refer to Watson Natural Language Processing library usage samples.

Using Watson Natural Language Processing in a notebook

The Watson Natural Language Processing library is available only if the Jupyter Notebooks with Python 3.10 or Python 3.9 service is installed. Additionally, the pre-trained Natural Language Processing models must be installed on the IBM Cloud Pak for Data platform. See Specifying additional installation options for default Runtime for Python 3.10 and Specifying additional installation options for Python 3.9.

You can run Python notebooks that use the Watson Natural Language Processing library in the provided default environments.

The Runtime 22.x and 23.1 environments might not be large enough to run notebooks that use the pre-trained models. For example, running the Syntax and Sentiment models requires an environment with 1 vCPU and 4 GB RAM. To work with larger environments, create a custom environment template of type Default (only CPU) or GPU. Refer to Creating environment templates. When you create this template, consider the following:

  • The environment must have at least 4 GB of memory, and one of the following Software versions:

    • Runtime 23.1 on Python 3.10
    • Runtime 22.2 on Python 3.10
    • Runtime 22.1 on Python 3.9 *
    • JupyterLab with Runtime 23.1 on Python 3.10
    • JupyterLab with Runtime 22.2 on Python 3.10
    • JupyterLab with Runtime 22.1 on Python 3.9 *

    * Indicates that the Runtime 22.1 on Python 3.9 environment template is deprecated.
  • Use environment type Default or GPU. You can only select type GPU when creating a custom template if the Jupyter notebooks with Python for GPU service is installed on the IBM Cloud Pak for Data platform. GPU environments are not available by default. For details, see GPU environments.
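The requirements above can be turned into a small preflight check at the top of a notebook. This is a sketch under stated assumptions: `environment_problems` is a hypothetical helper, and the memory size is passed in by hand (taken from the environment template) rather than detected, because the memory visible to a containerized runtime is not always what the operating system reports.

```python
import sys

# Values taken from the requirements listed above
SUPPORTED_PYTHON = {(3, 9), (3, 10)}
MIN_MEMORY_GB = 4


def environment_problems(python_version, memory_gb):
    """Return a list of human-readable problems; an empty list means the check passed."""
    problems = []
    major_minor = tuple(python_version[:2])
    if major_minor not in SUPPORTED_PYTHON:
        problems.append(
            f"Python {major_minor[0]}.{major_minor[1]} is not one of the listed versions (3.9, 3.10)"
        )
    if memory_gb < MIN_MEMORY_GB:
        problems.append(
            f"{memory_gb} GB of memory is below the {MIN_MEMORY_GB} GB minimum"
        )
    return problems


# Check the running interpreter against a template that provides 4 GB of memory
print(environment_problems(sys.version_info, memory_gb=4))
```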
