Every day businesses receive thousands of customer reviews, support tickets and feedback messages. Hidden within this flood of text data are patterns that reveal what customers like and where products fall short. However, manually reading and categorizing this information is time-consuming and inconsistent.
Text classification is a fundamental natural language processing (NLP) task that solves this issue by automatically categorizing text into predefined labels. These categorizations can then be used by AI systems to make decisions or trigger actions.
In this tutorial, you’ll build an AI agent that uses a text classification model to analyze product reviews and determine customer sentiment. The agent will classify text as positive, negative or neutral and respond with insights about the feedback. From spam detection to content moderation, text classification powers countless AI applications across industries. Follow along to learn how to integrate this capability into your intelligent systems.
Traditional data science approaches to this classification problem rely on machine learning algorithms like Naive Bayes and logistic regression. Before training a model with these algorithms, documents must go through text preprocessing and be converted into numerical features that the model can interpret.
Common techniques include bag-of-words (BoW) or term frequency–inverse document frequency (TF-IDF) vectors. In many workflows, these features are generated from datasets stored in CSV files and loaded into a pandas DataFrame for modeling with tools like scikit-learn (sklearn) and NumPy.
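To make those representations concrete, here is a minimal, dependency-free sketch of how bag-of-words counts and TF-IDF weights are computed. In practice you would use scikit-learn's CountVectorizer or TfidfVectorizer rather than hand-rolling this; the tiny corpus below is purely illustrative.

```python
import math
from collections import Counter

docs = [
    "great product fast shipping",
    "terrible product broke fast",
    "average product nothing special",
]

# Bag-of-words: each document becomes a mapping of term -> raw count.
bow = [Counter(doc.split()) for doc in docs]

def tf_idf(term, doc_counts, corpus):
    """TF-IDF = term frequency in the document * inverse document frequency."""
    tf = doc_counts[term] / sum(doc_counts.values())
    df = sum(1 for d in corpus if term in d)  # documents containing the term
    idf = math.log(len(corpus) / df)
    return tf * idf

# "product" appears in every review, so its IDF (and TF-IDF weight) is zero;
# "great" is distinctive to the first review, so it gets a positive weight.
print(tf_idf("product", bow[0], bow))      # 0.0
print(tf_idf("great", bow[0], bow) > 0.0)  # True
```

The key intuition: terms that appear everywhere carry no discriminative signal, while terms concentrated in few documents are weighted up.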
While effective for simple tasks, these methods require explicit feature engineering: developers must manually define how text is represented in the training data. Neural networks take a different approach. Transformer-based architectures learn dense numerical embeddings that capture semantic relationships, which makes them more effective on complex language tasks.
This tutorial takes the transformer approach. Rather than manually engineering features, you’ll use a fine-tuned DistilBERT model that already understands semantic relationships in text.
Text classifiers help machines make sense of large volumes of unstructured data by automatically assigning labels to text. They power everyday AI tasks like email filtering, content moderation, support ticket routing and sentiment analysis. Combining a Python text classification tool with an AI agent can make these tasks faster, more accurate and easier to scale.
Modern AI applications often combine specialized machine learning models with conversational AI agents. Instead of relying on a single model for every task, agents can invoke tools designed for specific capabilities.
In this example, the AI agent uses a sentiment analysis tool powered by a fine-tuned version of DistilBERT, a lightweight open source transformer-based deep learning model derived from BERT. The agent itself uses IBM Granite® as the large language model (LLM) to handle natural language interaction and orchestrate the workflow. The classification tool performs the prediction, while the agent interprets the results and responds to the user.
This modular approach improves performance, flexibility and scalability, allowing AI systems to combine the strengths of multiple models.
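In miniature, that division of labor looks like the sketch below: a specialized classifier returns a structured prediction, and a separate agent layer turns it into a user-facing response. Every name here is illustrative, not part of any framework, and the keyword-based "classifier" is only a stand-in for the real transformer model.

```python
def classify_sentiment(text: str) -> dict:
    """Stand-in for a specialized model: returns a label plus a confidence score."""
    # A real system would call a fine-tuned transformer here.
    negative_cues = {"broke", "terrible", "refund", "disappointed"}
    hits = sum(word in negative_cues for word in text.lower().split())
    label = "NEGATIVE" if hits else "POSITIVE"
    return {"label": label, "score": 0.9 if hits else 0.8}

def agent_respond(review: str) -> str:
    """Stand-in for the LLM agent: invokes the tool, then interprets the result."""
    result = classify_sentiment(review)
    return f"This review looks {result['label'].lower()} (confidence {result['score']:.0%})."

print(agent_respond("The charger broke after two days, terrible quality."))
```

The point is the boundary: the tool knows nothing about conversation, and the agent knows nothing about model internals, so either side can be swapped independently.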
You’ll create a fully functional AI system for sentiment analysis that demonstrates practical classification in action, while gaining hands-on experience with NLP and agent workflows.
Your agent will:
By the end, you’ll have a modular, interactive AI agent system and hands-on experience integrating NLP models, building agent-invokable tools and configuring workflows for real-world applications.
This guide includes installation steps for the watsonx Orchestrate Agent Development Kit (ADK).
Note: The watsonx Orchestrate ADK is only available with the Developer Edition. It lets you build, test and deploy AI agents locally before publishing them to your watsonx Orchestrate environment.
You have two options to set up your project:
Option A: Clone the tutorial repository
Clone our GitHub repository to get all project files preconfigured:
Navigate to the
Option B: Create from scratch
If you prefer to build the project step-by-step, create a new project directory:
This directory is where you’ll be working as you follow along, creating each file manually.
Creating a virtual environment isolates your project dependencies from other Python projects on your system.
This activation command differs depending on your operating system.
macOS and Linux:
Windows:
You should see a
The IBM watsonx Orchestrate Agent Development Kit (ADK) is a Python library and CLI tool that enables you to build, test and deploy AI agents.
With your virtual environment activated, install the ADK by using pip:
This command installs the watsonx Orchestrate ADK along with its dependencies.
Now we’ll create a Python tool that performs sentiment analysis. In the watsonx Orchestrate ADK, tools are Python functions that agents can invoke to accomplish specific tasks.
Create a requirements.txt file in your project directory and copy and paste the following dependencies:
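The exact contents depend on the tutorial repository, but a Hugging Face sentiment tool of this kind needs at least the transformers library and a backend such as PyTorch. Treat the following as an illustrative shape and use the versions from the repository if you cloned it:

```text
transformers
torch
```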
Orchestrate also uses this file to install the specified dependencies when running any Python-based agent tools.
Install the dependencies:
Create a file inside your project directory named
This tool was written based on the parameters and guidance from Authoring Python-Based tools. However, if you already have a Python tool ready to test, the ADK can automatically import and convert ordinary Python files, as well as generate docstrings for functions in the source file. This is done by using the Auto-discover feature, a simple command that converts a Python file into a format ready to be uploaded to Orchestrate.
The sentiment analysis tool performs text classification through several key steps:
This loads a DistilBERT transformer model from Hugging Face that has been fine-tuned for sentiment analysis on the Stanford Sentiment Treebank (SST-2) dataset, which makes it well suited for classifying product reviews and similar short-form feedback.
Depending on your use case, you might want to swap in a different model. For example, a model trained on news articles would be better suited for media monitoring, while a domain-specific model fine-tuned on customer support data might perform better for ticket routing.
You can browse models on Hugging Face and replace the model string in the pipeline to optimize for your specific classification task. If you do swap models, performance metrics such as accuracy, precision and F1 score can help you evaluate whether the new model is a better fit for your data.
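When comparing candidate models, those metrics are straightforward to compute from a labeled evaluation set. Here is a minimal stdlib sketch for a binary positive/negative task (no metrics library assumed; the sample labels are made up for illustration):

```python
def evaluate(y_true, y_pred, positive="POSITIVE"):
    """Accuracy, precision, recall and F1 for a binary classification task."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# One positive review misclassified as negative:
y_true = ["POSITIVE", "NEGATIVE", "POSITIVE", "NEGATIVE"]
y_pred = ["POSITIVE", "NEGATIVE", "NEGATIVE", "NEGATIVE"]
print(evaluate(y_true, y_pred))
```

For anything beyond a quick check, scikit-learn's classification_report gives the same numbers per class with less code.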
2. Register the tool with
The
3. Validate input and perform classification
The tool first validates that the input text is not empty, then passes it to the transformer model. The
4. Extract key phrases (Post-processing)
This helper function identifies the most important words in the review by tokenizing the text, removing common words (for example, the, is or a), and keeping only meaningful terms longer than 3 characters. This provides explainability for classification by showing users which specific words influenced the sentiment decision.
5. Return structured results (Output)
The tool packages everything into a structured dictionary that makes the results easy to use and understand. This structured format allows the agent to easily present results to users and enables downstream systems to process the sentiment data programmatically.
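The post-processing and output steps above can be sketched in isolation. The helper below mirrors the described logic (tokenize, drop common stopwords, keep words longer than three characters) and packages a mock classification result into the kind of dictionary the tool returns. The stopword list and dictionary keys are illustrative; the real tool's may differ.

```python
import re

STOPWORDS = {"the", "this", "that", "with", "have", "from", "very", "really"}

def extract_key_phrases(text: str, top_n: int = 5) -> list:
    """Keep meaningful words: lowercase tokens longer than 3 chars, minus stopwords."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if len(t) > 3 and t not in STOPWORDS][:top_n]

def package_result(text: str, label: str, score: float) -> dict:
    """Structured output: easy for an agent to present and for code to parse."""
    return {
        "sentiment": label,
        "confidence": round(score, 4),
        "key_phrases": extract_key_phrases(text),
        "original_text": text,
    }

review = "The battery life is amazing and the screen quality really impressed me"
print(package_result(review, "POSITIVE", 0.9987))
```

Surfacing the key phrases alongside the label is what gives the agent something explainable to say, rather than a bare "POSITIVE".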
Before integrating with an agent, test your tool to ensure that it works correctly:
You should see output similar to:
Note about the warning message: You may see a message like:
This is an informational warning from the watsonx Orchestrate ADK and does not affect the functionality of your tool. The ADK is being strict about docstring format parsing, but your sentiment analysis results will be accurate.
Now that you have a working tool, you’ll create an AI agent that can use it to help users analyze product reviews through natural conversation. Our Product Review Sentiment Analyzer agent combines conversational AI with advanced sentiment analysis to provide comprehensive review insights.
Agents are defined by using YAML configuration files that specify the agent’s behavior, capabilities and available tools.
Create a file named
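As a rough shape, a native agent definition looks like the following. The field names follow the ADK's native-agent YAML schema; treat the agent name, model ID, tool name and wording as placeholders to adapt to your project:

```yaml
spec_version: v1
kind: native
name: product_review_sentiment_analyzer
llm: watsonx/ibm/granite-3-8b-instruct
description: >
  Analyzes product reviews and reports customer sentiment.
instructions: >
  When the user provides a product review, call the sentiment analysis tool,
  then summarize the sentiment, confidence and key phrases conversationally.
tools:
  - analyze_sentiment
```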
Let’s break down the key fields:
Now that you have created your sentiment analysis tool and agent configuration, you’ll set up a local watsonx Orchestrate environment to test your agent.
Inside your project directory, create a
Otherwise, create a file within your
Open the
WO_DEVELOPER_EDITION_SOURCE: The source ID for the watsonx Orchestrate Developer Edition. Set this field to orchestrate.
WO_INSTANCE: This URL is your watsonx Orchestrate service instance. You can find this information by logging in to your watsonx Orchestrate account and navigating to your instance details. Click your profile icon > Settings, then select the API details tab.
The URL follows this format:
Copy and paste your service instance URL to replace the template value in your
WO_API_KEY: This key is your watsonx Orchestrate API key, which authenticates your connection to IBM Cloud services. To create it, click “Generate API Key” on the API details tab; you’ll be redirected to your IBM Cloud account dashboard to generate the key.
Replace
Your complete
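Based on the three variables described above, the completed file takes this general shape (the instance URL and API key values are placeholders for your own):

```text
WO_DEVELOPER_EDITION_SOURCE=orchestrate
WO_INSTANCE=<your-service-instance-url>
WO_API_KEY=<your-api-key>
```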
When you have configured the environment variables in your
This command starts the local watsonx Orchestrate server and loads your credentials from the
If the server initialization fails or hangs, try starting from a clean slate by using the following steps:
This command stops and removes all containers created for watsonx Orchestrate.
2. Restart the installation:
After resetting, run the start command again:
3. Check the container status and server logs:
You can view service logs for the Orchestrate server to check for warnings or errors:
If the preceding steps do not work, reset the server and completely remove the server environment:
Now you’ll import your sentiment analysis tool and agent into the local environment and test them with sample product reviews that act as test data to validate the model’s predictions.
Before importing the agent, you need to import the sentiment analysis tool so it’s available in your watsonx Orchestrate environment.
Run this command to import the tool:
This command imports the Python tool directly and uses the
Now that the tool is imported, you can import the agent that uses it. Use the orchestrate agents import command to import your agent from the YAML configuration file:
The agent configuration and tool references are validated before the agent is registered with the Orchestrate environment.
Now that both the tool and agent are imported, you can start the chat interface with your agent:
This command opens a web-based chat interface in your default browser at http://localhost:3000/chat-lite. If the browser doesn’t open automatically, you can manually navigate to that URL.
Click the agent dropdown menu at the upper left of the chat interface and select
The agent is now active and ready to analyze product reviews.
With your agent selected, try asking it to analyze various product reviews. Here are some examples to test out different sentiment types:
Example 1: Positive review
The expected response should display the sentiment tool results in a conversational tone.
Example 2: Negative review
Example 3: Neutral review
Example 4: Mixed review
Try testing the agent with your own product reviews or real customer feedback.
When you’re finished testing your agent, you can stop the watsonx Orchestrate server.
When you’re done working in the project, deactivate your Python virtual environment:
You can reactivate it anytime by running the activation command from step 1.
You’ve built a working AI agent that uses text classification to analyze product reviews and respond with meaningful sentiment insights by combining a fine-tuned DistilBERT classification tool with a conversational IBM Granite agent in watsonx Orchestrate.
The pattern you’ve implemented here is the same foundation used in production AI systems across industries. Rather than simply returning a label, the agent uses classification results to decide how to respond, which is what separates an NLP script from an intelligent agent workflow.
As a next step, consider extending the agent to trigger actions based on classification results. A high-confidence negative review could automatically create a support ticket, a positive one could be fed into a marketing pipeline and a neutral response could be queued for human follow-up. You could also apply the same architecture to entirely different problems (spam detection, content moderation or support ticket routing) by swapping the model and updating the agent instructions.
Text classification is a small capability with a large impact. As you scale the system, performance metrics will help you monitor and improve model quality over time. Now that you understand how to integrate it into an agent workflow, you have a reusable pattern for building smarter, more responsive AI systems.