Adding text extractors

You can add a text extractor to an agentic workflow to extract text from a document. A text extractor eliminates the need for manual data entry when an agentic workflow runs, which reduces document processing time. You can pass the extracted text as input to downstream nodes in the agentic workflow, such as logic blocks for formatting, generative prompts for analyzing keywords, and other activities.

Note: The text extractor accepts Microsoft Excel (.xlsx) files only for text extraction. You cannot use these files for key-value pair (KVP) extraction. Only the .xlsx format is supported; the older .xls format is not accepted.

Also, other workflow nodes, such as document extractor and document classifier, do not support .xlsx files.

When you configure a text extractor to extract semantic key-value pairs (KVP) from documents, you can choose a model from the list of available models. You can also add your own custom model through AI Gateway. For more information, see Adding custom AI models.

An example use case is an agentic workflow that uses a text extractor node to analyze feedback in a document. When the agentic workflow runs in a chat, the agent can prompt the user to upload the document. The agentic workflow then extracts text from the document, and the other nodes process the extracted text to generate the expected output, such as key points or a summary.

To add a text extractor to an agentic workflow:

Open the agentic workflow in the workflow builder.
Click the Add flow items icon.
Select the Flow nodes tab.
Drag Text extractor to the agentic workflow.
Select the required properties:
- Detect handwriting: Extract handwritten notes from uploaded files.
- Keep document layout: Preserve the original formatting of the document.
- Enable text hints: Improve recognition using contextual hints.
- Output as object: Choose how the output variable is formatted:
  - Disabled: The extractor produces an output variable called document_ref, which is the URL to the file containing the extracted text and key‑value pairs.
  - Enabled: The extractor produces an output variable called text, which is a JSON string object containing the entire extraction result, including plain text and document structure metadata.
  After you choose how the output is formatted, you can use the output variable for data mapping. For more information, see Mapping data.

- Extract key‑value pairs: Identify semantic key‑value pairs in documents. For more information, see Extracting semantic key-value pairs (KVP) from documents.
- Select a page range: Specify a page range to extract content from specific sections of large documents. Instead of processing entire documents, use the From and To options to define the exact page range that you want to extract.
  For example, when only the first few pages or specific pages of a large document contain relevant information, you can extract only those pages rather than processing the entire document.

Alternatively, to add a text extractor, click the connector line between the start and end nodes, then select Add a flow activity > Text extractor.

Extracting semantic key-value pairs (KVP) from documents

You can configure a text extractor to extract semantic key-value pairs (KVP) from documents. Because semantic KVP extraction focuses on the pairing between keys and values rather than on their position, it can adapt to variations in document format and layout.

To extract the key-value pairs from documents:

Select the text extractor node in the agentic workflow.
Set the Extract key-value pairs switch to on.
Click Add schema.

Specify the fields and tables that you want to extract from the documents by using a valid JSON schema. The following example shows a schema for an invoice:

[
    {
        "document_type": "Invoice",
        "document_description": "An invoice is a standard document issued by a seller to a buyer, outlining products or services provided, quantities, prices, and payment terms.",
        "fields": {
            "invoice_number": {
                "description": "A unique identifier assigned by the vendor for this invoice.",
                "example": "2023-AUS-987654"
            },
            "document_date": {
                "description": "Date of the document.",
                "example": "2025-07-05"
            },
            "vendor_name": {
                "description": "Legal or trade name of the company issuing the invoice. Usually located in the header or footer, near the logo, or billing details.",
                "example": "ABC Supply Company Ltd"
            },
            "vendor_number": {
                "description": "Internal identifier used by the buyer's system to refer to the vendor.",
                "example": "VEND-1023"
            }
        }
    }
]

Select a model to use for the text extractor.
From the Models list, select a model or click View all foundation models to open the model selection dialog, which lists all available models. You can search for a model or choose one from the list. After you select a model, click Save. Any notices associated with the selected model, such as deprecation notices or third‑party license requirements, are displayed.

Certain models include a status tag in the dialog to indicate states such as Recommended or Third party. A warning icon indicates that a model might be withdrawn or deprecated in a subsequent release.
Enter the KVP Force Schema Name.

To edit an existing schema, select the text extractor node in the agentic workflow, and click Edit schema.
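
Before you paste a schema into the Add schema dialog, it can help to check it locally. The following Python sketch is one possible way to do that. It assumes that the keys shown in the invoice example (document_type, document_description, fields, and per-field description and example) are the ones to check for; that list is inferred from the example, not from a published specification, and validate_kvp_schema is a hypothetical helper, not part of the product.

```python
import json

# Keys inferred from the invoice example; treat them as an assumption.
DOCUMENT_KEYS = ("document_type", "document_description", "fields")
FIELD_KEYS = ("description", "example")


def validate_kvp_schema(schema_text):
    """Return a list of problems found in a schema string (empty if none)."""
    try:
        schema = json.loads(schema_text)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err.msg} (line {err.lineno})"]

    if not isinstance(schema, list):
        return ["top level must be a JSON array of document definitions"]

    problems = []
    for index, doc in enumerate(schema):
        if not isinstance(doc, dict):
            problems.append(f"entry {index}: must be a JSON object")
            continue
        for key in DOCUMENT_KEYS:
            if key not in doc:
                problems.append(f"entry {index}: missing '{key}'")
        fields = doc.get("fields", {})
        if not isinstance(fields, dict):
            problems.append(f"entry {index}: 'fields' must be a JSON object")
            continue
        for name, spec in fields.items():
            if not isinstance(spec, dict):
                problems.append(f"field '{name}': must be a JSON object")
                continue
            for key in FIELD_KEYS:
                if key not in spec:
                    problems.append(f"field '{name}': missing '{key}'")
    return problems


sample = """
[
    {
        "document_type": "Invoice",
        "document_description": "A bill issued by a seller to a buyer.",
        "fields": {
            "invoice_number": {
                "description": "A unique identifier for this invoice.",
                "example": "2023-AUS-987654"
            }
        }
    }
]
"""

print(validate_kvp_schema(sample))        # prints []
print(validate_kvp_schema(sample + "}"))  # reports a JSON syntax problem
```

A stray closing brace after the final bracket, as in the second call, is a common copy-and-paste slip that this kind of check catches early.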

Variance in kvp_model_name values for semantic KVP extraction

On-premises

Note:

The variance in kvp_model_name values is applicable only for On-premises deployments.

When a flow tool uses a default kvp_model_name or the API caller specifies one at run time, it is important to understand the subtle differences between the passed kvp_model_name values to ensure the expected results.

Models configured with internal foundation models

When you configure your models using IBM watsonx.ai, the value that is passed in kvp_model_name is the same for both SaaS and On-premises deployments.

For example, consider the following kvp_model_name value: watsonx/mistralai/mistral-small-3-1-24b-instruct-2503.

Here,

watsonx is the Provider ID
mistralai/mistral-small-3-1-24b-instruct-2503 is the model card

Since the Provider ID is watsonx, you can use the same value watsonx/mistralai/mistral-small-3-1-24b-instruct-2503 for both SaaS and On-premises deployments.

Note:

If the Provider ID is watsonx, it indicates IBM watsonx.ai configuration. The same kvp_model_name works for semantic KVP extraction in both SaaS and On-premises deployments.

Models configured with external AI gateway

To configure the external models using AI Gateway in On-premises deployment, refer to Registering external models through AI gateway.

When you configure your models using external AI Gateway, the value that is passed in kvp_model_name is different in SaaS and On-premises deployments as the models are imported.

For example, consider the following kvp_model_name value: groq/openai/gpt-oss-120b.

Here,

groq is the Provider ID
openai/gpt-oss-120b is the model card

See Provider ID for more details.

Because the Provider ID is not watsonx, you must prefix the value with virtual-model in On-premises deployments. That is, pass virtual-model/groq/openai/gpt-oss-120b in kvp_model_name.

Note:

If the Provider ID differs from watsonx, it indicates that the configuration uses an external AI Gateway. In such cases, you must prefix the value with virtual-model for semantic KVP extraction in On-premises deployments.

The following table summarizes these examples:

Table 1. Variance in kvp_model_name values passed in SaaS and On-premises
Model name	Value in `kvp_model_name`	Provider ID	Value to be passed in SaaS	Value to be passed in On-premises
mistral-small-3-1-24b-instruct-2503	watsonx/mistralai/mistral-small-3-1-24b-instruct-2503	watsonx	watsonx/mistralai/mistral-small-3-1-24b-instruct-2503	watsonx/mistralai/mistral-small-3-1-24b-instruct-2503
gpt-oss-120b	groq/openai/gpt-oss-120b	groq	groq/openai/gpt-oss-120b	virtual-model/groq/openai/gpt-oss-120b
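
The prefix rule in the table can be expressed as a small helper for API callers that set kvp_model_name at run time. The following Python sketch is illustrative only: resolve_kvp_model_name and its on_premises flag are hypothetical names, and the logic assumes that the Provider ID is always the first path segment of the value, as in the two examples.

```python
VIRTUAL_MODEL_PREFIX = "virtual-model"


def resolve_kvp_model_name(model_name, on_premises):
    """Return the kvp_model_name value to pass for the given deployment type."""
    provider_id = model_name.split("/", 1)[0]

    # SaaS deployments and watsonx-backed models use the value unchanged.
    if not on_premises or provider_id == "watsonx":
        return model_name

    # Do not add the prefix twice if the caller already supplied it.
    if provider_id == VIRTUAL_MODEL_PREFIX:
        return model_name

    # External AI Gateway model in an On-premises deployment.
    return f"{VIRTUAL_MODEL_PREFIX}/{model_name}"


print(resolve_kvp_model_name("groq/openai/gpt-oss-120b", on_premises=True))
# prints virtual-model/groq/openai/gpt-oss-120b
print(resolve_kvp_model_name("groq/openai/gpt-oss-120b", on_premises=False))
# prints groq/openai/gpt-oss-120b
```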

Semantic key-value pair extraction: When to use and limitations

When you enable semantic key-value pair (KVP) extraction in the text extractor, the system uses a layout-aware, multimodal model to identify relationships between keys and values within the document. This is the same functionality that is used in the structured document extractor.

When to use semantic KVP extraction

Use semantic KVP extraction when your documents meet the following criteria:

Have a structured or fixed layout
Contain clearly defined key-value pairs (for example, labels and values)
Include tables or form-like sections
Are typically short (1–3 pages), such as invoices, purchase orders, or tax forms

When not to use semantic KVP extraction

Semantic KVP extraction might not perform well in the following scenarios:

Large documents
Documents where the desired values appear in free-flowing paragraphs

Semantic KVP extraction is designed to identify and return these types of values:

A single, atomic value per field from a document. Typical examples include values such as an invoice number, form date, or total amount.
Structured tuples of values with a fixed structure that are grouped within the document. Typical examples include line items in an invoice, or items in a receipt.

Limitations

This capability does not support extracting multiple values for the same field, either within a single page or across multiple pages. For example, it cannot be used to extract all person names in a document, all clauses of a certain type in an agreement, or all potential PII occurrences. For these use cases, consider using full text extraction combined with downstream processing such as generative prompts, or other specialized approaches.
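
As a rough picture of what downstream processing of full extracted text can look like, the following Python sketch collects every email address in a block of text, which is a multi-value case that semantic KVP extraction does not cover. It assumes that the extracted plain text is already available as a Python string; the sample text, the find_all_emails helper, and the simplified regular expression are illustrative and are not part of the product.

```python
import re

# Simplified pattern for illustration; it does not cover every valid address.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_all_emails(extracted_text):
    """Return every distinct email address, in order of first appearance."""
    found = []
    for match in EMAIL_PATTERN.findall(extracted_text):
        if match not in found:
            found.append(match)
    return found


sample_text = (
    "Contact jane.doe@example.com or billing@example.org. "
    "Escalations: jane.doe@example.com."
)

print(find_all_emails(sample_text))
# prints ['jane.doe@example.com', 'billing@example.org']
```

For less regular targets, such as clauses of a certain type in an agreement, a generative prompt node is likely a better fit than pattern matching.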

Mapping data to inputs

By default, auto-mapping is enabled. However, you can also map values to the inputs manually.

To map values to inputs, complete the following steps:

Select the text extractor node and then click Edit data mapping.
Specify the input values for data mapping. For more information about data mapping, see Mapping data.

Text extractor limits and restrictions

Text extractors have the following limits and restrictions.


Area	Description
Maximum file size	10 MB (except for Microsoft Excel files). Note: The maximum file size for Microsoft Excel files is 0.1 MB.
Maximum number of uploaded files	5 files
Accepted file types	.doc, .docx, .jpe, .jpeg, .jpg, .pdf, .png, .ppt, .pptx, .tif, .tiff, and .xlsx
Maximum number of pages	600 pages
Maximum number of images	No limit
Maximum number of images per page	No limit
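
If files are collected by your own application before they reach the workflow, a pre-upload check against these limits can catch problems early. The following Python sketch is one possible approach, not product behavior: check_uploads is a hypothetical helper, it assumes that 1 MB means 1024 × 1024 bytes (the table does not specify this), and it does not check the page limit because that requires opening each file.

```python
import os

MAX_FILES = 5
MAX_BYTES = 10 * 1024 * 1024             # 10 MB; assumes 1 MB = 1024 * 1024 bytes
MAX_XLSX_BYTES = int(0.1 * 1024 * 1024)  # 0.1 MB limit for Microsoft Excel files
ACCEPTED_TYPES = {
    ".doc", ".docx", ".jpe", ".jpeg", ".jpg", ".pdf",
    ".png", ".ppt", ".pptx", ".tif", ".tiff", ".xlsx",
}


def check_uploads(files):
    """Check (filename, size_in_bytes) tuples and return a list of problems."""
    problems = []
    if len(files) > MAX_FILES:
        problems.append(f"too many files: {len(files)} (maximum {MAX_FILES})")

    for name, size in files:
        extension = os.path.splitext(name)[1].lower()
        if extension not in ACCEPTED_TYPES:
            problems.append(f"{name}: file type '{extension}' is not accepted")
            continue
        limit = MAX_XLSX_BYTES if extension == ".xlsx" else MAX_BYTES
        if size > limit:
            problems.append(f"{name}: {size} bytes exceeds the {limit}-byte limit")
    return problems


print(check_uploads([("invoice.pdf", 2_000_000)]))  # prints []
print(check_uploads([("legacy.xls", 1_000), ("big.xlsx", 200_000)]))
```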