Adding text extractors
You can add a text extractor to an agentic workflow to extract text from a document. A text extractor eliminates the need for manual data entry when an agentic workflow runs, which reduces document processing time. The extracted text can be passed as input to downstream nodes in the agentic workflow, such as logic blocks for formatting, generative prompts for analyzing keywords, and other activities.
Note: Other workflow nodes, such as the document extractor and document classifier, do not support .xlsx files.
An example use case is an agentic workflow that uses a text extractor node to analyze feedback in a document. When the agentic workflow runs in a chat, the agent can prompt the user to upload the document. The agentic workflow then extracts text from the document, and the other nodes process the extracted text to generate the expected output, such as key points or a summary.
To add a text extractor to an agentic workflow:
- Open the agentic workflow in the workflow builder.
- Click the Add flow items icon.
- Select the Flow nodes tab.
- Drag Text extractor to the agentic workflow.
- Select the required properties:
  - Detect handwriting: Extract handwritten notes from uploaded files.
  - Keep document layout: Preserve the original formatting of the document.
  - Enable text hints: Improve recognition using contextual hints.
  - Output as object: Choose how the output variable is formatted:
    - Disabled: The extractor produces an output variable called document_ref, which is the URL to the file containing the extracted text and key-value pairs.
    - Enabled: The extractor produces an output variable called text, which is a JSON string object containing the entire extraction result, including plain text and document structure metadata.

    After you choose how the output is formatted, you can use the output variable for mapping data. For more information, see Mapping data.
  - Extract key-value pairs: Identify semantic key-value pairs in documents. For more information, see Extracting semantic key-value pairs (KVP) from documents.
  - Select a page range: Specify a page range to extract content from specific sections of large documents. Instead of processing entire documents, use the From and To options to define the exact page range that you want to extract.

    When you have a large document where only the first few pages or specific pages contain relevant information, you can extract only those pages rather than processing the entire document.
Alternatively, to add a text extractor, click the connector line between the start and end nodes, then select Add a flow activity > Text extractor.
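The two Output as object settings hand downstream nodes differently shaped values. The following is a minimal Python sketch of how downstream code might handle either case. Only the variable names document_ref and text come from the description above. The helper name read_extractor_output and the idea that the node output arrives as a dictionary are assumptions made for illustration.

```python
import json


def read_extractor_output(output: dict) -> dict:
    """Normalize a text extractor result (hypothetical helper).

    Assumes the node output is a dict holding either `text` (a JSON string,
    when Output as object is enabled) or `document_ref` (a URL, when disabled).
    """
    if "text" in output:
        # Enabled: `text` is a JSON string, so parse it before use.
        return {"kind": "inline", "result": json.loads(output["text"])}
    if "document_ref" in output:
        # Disabled: only a URL is available; fetching it is left to the caller.
        return {"kind": "reference", "url": output["document_ref"]}
    raise ValueError("Expected a 'text' or 'document_ref' output variable")
```

Branching on which variable is present keeps the rest of the downstream logic independent of how the property was set.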
Extracting semantic key-value pairs (KVP) from documents
You can configure a text extractor to extract semantic key-value pairs (KVP) from documents. Semantic KVP extraction focuses on key-value pairings, so it can adapt to variations in document format and layout.
To extract the key-value pairs from documents:
- Select the text extractor node in the agentic workflow.
- Set the Extract key-value pairs switch to on.
- Click Add schema.
- Specify the fields and tables that you want to extract from the documents by using a valid JSON schema. Here is an example:

      [
        {
          "document_type": "Invoice",
          "document_description": "An invoice is a standard document issued by a seller to a buyer, outlining products or services provided, quantities, prices, and payment terms.",
          "fields": {
            "invoice_number": {
              "description": "A unique identifier assigned by the vendor for this invoice.",
              "example": "2023-AUS-987654"
            },
            "document_date": {
              "description": "Date of the document.",
              "example": "2025-07-05"
            },
            "vendor_name": {
              "description": "Legal or trade name of the company issuing the invoice. Usually located in the header or footer, near the logo, or billing details.",
              "example": "ABC Supply Company Ltd"
            },
            "vendor_number": {
              "description": "Internal identifier used by the buyer's system to refer to the vendor.",
              "example": "VEND-1023"
            }
          }
        }
      ]

- Select a model to use for the text extractor.

  From the Models list, select a model or click View all foundation models to open the model selection dialog, which lists all available models. You can search for a model or choose one from the list. After you select a model, click Save. Any notices associated with the selected model, such as deprecation notices or third-party license requirements, are displayed.

  Certain models include a status tag in the dialog to indicate states such as Recommended or Third party. A warning icon indicates that a model might be withdrawn or deprecated in a subsequent release.
- Enter the KVP Force Schema Name.
To edit an existing schema, select the text extractor node in the agentic workflow, and click Edit schema.
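Because the schema must be valid JSON, it can help to check it before pasting it into the Add schema dialog. The following Python sketch parses a schema string and reports problems. The set of expected keys (document_type, fields, and a description and example for each field) is inferred from the example schema only and is an assumption, not a documented requirement. The function name check_kvp_schema is hypothetical.

```python
import json

# Keys seen on every field in the example schema; treated here as required (an assumption).
FIELD_KEYS = {"description", "example"}


def check_kvp_schema(schema_text: str) -> list:
    """Return a list of problems found in a schema string (empty when none)."""
    try:
        schema = json.loads(schema_text)
    except json.JSONDecodeError as err:
        # Catches syntax slips such as an unmatched closing brace.
        return [f"not valid JSON: {err.msg} (line {err.lineno}, column {err.colno})"]
    if not isinstance(schema, list):
        return ["top level should be a list of document types"]

    problems = []
    for index, doc in enumerate(schema):
        if not isinstance(doc, dict):
            problems.append(f"entry {index}: should be an object")
            continue
        if "document_type" not in doc:
            problems.append(f"entry {index}: missing 'document_type'")
        fields = doc.get("fields")
        if not isinstance(fields, dict) or not fields:
            problems.append(f"entry {index}: 'fields' should be a non-empty object")
            continue
        for name, spec in fields.items():
            present = set(spec) if isinstance(spec, dict) else set()
            for key in sorted(FIELD_KEYS - present):
                problems.append(f"entry {index}, field '{name}': missing '{key}'")
    return problems
```

An empty list means the string parsed and matched the shape of the example. It does not guarantee that the product accepts the schema.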
Variance in kvp_model_name values for semantic KVP extraction
On-premises
The variance in kvp_model_name values is applicable only for On-premises deployments.
When a flow tool uses a default kvp_model_name or the API caller specifies one at run time, it is important to understand the subtle differences between the passed kvp_model_name values to ensure the expected results.
Models configured with internal foundation models
When you configure your models using IBM watsonx.ai, the value that is passed in kvp_model_name is the same for both SaaS and On-premises deployments.
For example, consider this value in kvp_model_name: watsonx/mistralai/mistral-small-3-1-24b-instruct-2503.
Here,
- watsonx is the Provider ID
- mistralai/mistral-small-3-1-24b-instruct-2503 is the model card
Since the Provider ID is watsonx, you can use the same value watsonx/mistralai/mistral-small-3-1-24b-instruct-2503 for both SaaS and On-premises deployments.
If the Provider ID is watsonx, it indicates IBM watsonx.ai configuration. The same kvp_model_name works for semantic KVP extraction in both SaaS and On-premises deployments.
Models configured with external AI gateway
To configure external models by using AI Gateway in an On-premises deployment, see Registering external models through AI gateway.
When you configure your models by using an external AI Gateway, the value that is passed in kvp_model_name differs between SaaS and On-premises deployments because the models are imported.
For example, consider this value in kvp_model_name: groq/openai/gpt-oss-120b.
Here,
- groq is the Provider ID
- openai/gpt-oss-120b is the model card
See Provider ID for more details.
Because the Provider ID is not watsonx, prefix the value with virtual-model in On-premises deployments. That is, you must pass the value in kvp_model_name as virtual-model/groq/openai/gpt-oss-120b.
If the Provider ID differs from watsonx, it indicates that the configuration uses an external AI Gateway. In such cases, you must prefix the value with virtual-model for semantic KVP extraction in On-premises deployments.
Refer to the following table with examples for more clarity:
| Model name | Value in kvp_model_name | Provider ID | Value to be passed in SaaS | Value to be passed in On-premises |
|---|---|---|---|---|
| mistral-small-3-1-24b-instruct-2503 | watsonx/mistralai/mistral-small-3-1-24b-instruct-2503 | watsonx | watsonx/mistralai/mistral-small-3-1-24b-instruct-2503 | watsonx/mistralai/mistral-small-3-1-24b-instruct-2503 |
| gpt-oss-120b | groq/openai/gpt-oss-120b | groq | groq/openai/gpt-oss-120b | virtual-model/groq/openai/gpt-oss-120b |
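The prefixing rule can be expressed as a small helper. The following Python sketch applies the rule described in this section: the Provider ID is the first path segment, and values with a Provider ID other than watsonx get the virtual-model prefix in On-premises deployments. The function name and the on_premises flag are hypothetical and are not part of any product API.

```python
def kvp_model_name_for(value: str, on_premises: bool) -> str:
    """Return the kvp_model_name to pass for the given deployment type."""
    if value.startswith("virtual-model/"):
        # Already prefixed; avoid adding the prefix twice.
        return value
    provider_id = value.split("/", 1)[0]
    if on_premises and provider_id != "watsonx":
        # External AI Gateway model in an On-premises deployment.
        return f"virtual-model/{value}"
    # watsonx models, and all SaaS values, are passed unchanged.
    return value
```

For example, kvp_model_name_for("groq/openai/gpt-oss-120b", on_premises=True) returns virtual-model/groq/openai/gpt-oss-120b, which matches the last row of the table.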
Semantic key-value pair extraction: When to use and limitations
When you enable semantic key-value pair (KVP) extraction in the text extractor, the system uses a layout-aware, multimodal model to identify relationships between keys and values within the document. This is the same functionality that is used in the structured document extractor.
When to use semantic KVP extraction
Use semantic KVP extraction when your documents meet the following criteria:
- Have a structured or fixed layout
- Contain clearly defined key-value pairs (for example, labels and values)
- Include tables or form-like sections
- Are typically short (1–3 pages), such as invoices, purchase orders, or tax forms
When not to use semantic KVP extraction
Semantic KVP extraction might not perform well in the following scenarios:
- Large documents
- Documents where the desired values appear in free-flowing paragraphs
Semantic KVP extraction is intended for the following kinds of values:
- A single, atomic value per field from a document. Typical examples include values such as an invoice number, form date, or total amount.
- Structured tuples of values with a fixed structure that are grouped within the document. Typical examples include line items in an invoice, or items in a receipt.
Limitations
This capability does not support extracting multiple values for the same field, either within a single page or across multiple pages. For example, it cannot be used to extract all person names in a document, all clauses of a certain type in an agreement, or all potential PII occurrences. For these use cases, consider using full text extraction combined with downstream processing such as generative prompts, or other specialized approaches.
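As an illustration of full text extraction combined with downstream processing, the following Python sketch collects every email-like string from extracted text, which is one kind of repeated value that semantic KVP extraction cannot return. The pattern is deliberately simple and is an assumption for illustration only. A generative prompt or a dedicated PII detector is likely a better fit for real documents.

```python
import re

# Simplified email pattern; it does not cover every valid address form.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_all_emails(extracted_text: str) -> list:
    """Return each distinct email-like string, in order of first appearance."""
    seen = {}
    for match in EMAIL_PATTERN.finditer(extracted_text):
        # Compare case-insensitively but keep the first spelling found.
        seen.setdefault(match.group(0).lower(), match.group(0))
    return list(seen.values())
```

The same approach, scanning the full extracted text and keeping every match, applies to other repeated values such as names or clause headings, with a matcher suited to each.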
Mapping data to inputs
By default, auto-mapping is enabled. However, you can also map values to the inputs manually.
To map values to inputs, complete the following steps:
- Select the text extractor node and then click Edit data mapping.
- Specify the input values for data mapping. For more information about data mapping, see Mapping data.
Text extractor limits and restrictions
Text extractors have the following limits and restrictions.
| Area | Description |
|---|---|
| Maximum file size | 10 MB (except for Microsoft Excel files). Note: The maximum file size for Microsoft Excel files is 0.1 MB. |
| Maximum number of uploaded files | 5 files |
| Accepted file types | .doc, .docx, .jpe, .jpeg, .jpg, .pdf, .png, .ppt, .pptx, .tif, .tiff, and .xlsx |
| Maximum number of pages | 600 pages |
| Maximum number of images | No limit |
| Maximum number of images per page | No limit |
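The limits in the table can be checked before files are uploaded. The following Python sketch is one way to do that. It assumes that 1 MB means 1024 × 1024 bytes and that the 600-page limit applies to each file; neither is stated in the table. The function name check_uploads and the tuple layout are hypothetical.

```python
from pathlib import Path

MB = 1024 * 1024  # assumption: the documented limits use binary megabytes
MAX_FILES = 5
MAX_PAGES = 600  # assumption: applied per file
MAX_BYTES = 10 * MB
MAX_XLSX_BYTES = int(0.1 * MB)
ACCEPTED = {
    ".doc", ".docx", ".jpe", ".jpeg", ".jpg", ".pdf",
    ".png", ".ppt", ".pptx", ".tif", ".tiff", ".xlsx",
}


def check_uploads(files: list) -> list:
    """Check (name, size_in_bytes, page_count) tuples and return any problems."""
    problems = []
    if len(files) > MAX_FILES:
        problems.append(f"{len(files)} files exceeds the limit of {MAX_FILES}")
    for name, size, pages in files:
        extension = Path(name).suffix.lower()
        if extension not in ACCEPTED:
            problems.append(f"{name}: file type is not accepted")
            continue
        # Microsoft Excel files have a much smaller size limit.
        limit = MAX_XLSX_BYTES if extension == ".xlsx" else MAX_BYTES
        if size > limit:
            problems.append(f"{name}: {size} bytes exceeds the limit of {limit}")
        if pages > MAX_PAGES:
            problems.append(f"{name}: {pages} pages exceeds the limit of {MAX_PAGES}")
    return problems
```

An empty result means that the files fit the documented limits under the assumptions above. The service itself remains the authority on what it accepts.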