The Art of Automation: Chapter 2 - Robotic Process Automation (RPA)

RPA is made up of three core technologies: workflow automation, screen scraping and AI. The unique combination of these technologies allows RPA to solve the productivity challenge of manual desktop tasks.

Covered in this chapter

A practical definition of robotic process automation (RPA) and the three core technologies that power it
The intended users of RPA and the best usages of the technology
Practical limitations of RPA
RPA and its role and use with other popular automation technologies, including AI
What’s next for RPA

Defining robotic process automation

Robotic process automation (RPA) uses a program (in this case, a software robot) to mimic human users’ interaction with their desktop to perform tasks, for example, copying information from an Excel spreadsheet to a form, or inserting customer data and placing an order on a website. While we assume many human tasks have been automated in today’s digital world, a large portion of our daily work still requires manual labor, and much of that work is repetitive.

Imagine you are a data clerk responsible for processing incoming invoices sent to you by email or fax. You have to read each incoming invoice (it could be a PDF document or a fax image) and enter the order manually into your ordering application. If the order comes from a new customer, you might also have to create the customer account manually. With RPA, the robot can use various OCR (Optical Character Recognition) and intelligent document processing techniques to read the invoice and then simulate the mouse clicks and keystrokes on the computer screen to enter the information into the ordering application.
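The invoice scenario above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular RPA product works: it assumes OCR has already turned the invoice into plain text with labelled lines, and the field labels, the Invoice type and the action tuples are all hypothetical names made up for this example.

```python
import re
from dataclasses import dataclass


@dataclass
class Invoice:
    customer: str
    invoice_number: str
    total: float


def parse_invoice(ocr_text: str) -> Invoice:
    """Extract three fields from OCR output that uses 'Label: value' lines."""

    def field(label: str) -> str:
        pattern = rf"^{re.escape(label)}\s*:\s*(.+)$"
        match = re.search(pattern, ocr_text, re.MULTILINE | re.IGNORECASE)
        if match is None:
            raise ValueError(f"missing field: {label}")
        return match.group(1).strip()

    return Invoice(
        customer=field("Customer"),
        invoice_number=field("Invoice No"),
        # Strip thousands separators and a leading currency sign before converting.
        total=float(field("Total").replace(",", "").lstrip("$")),
    )


def plan_ui_actions(invoice: Invoice) -> list:
    """Translate the invoice into the clicks and keystrokes a clerk would perform.

    Each action is (verb, target control, text); the control names are
    hypothetical and stand in for fields of the ordering application.
    """
    return [
        ("click", "New Order", ""),
        ("type", "Customer Name", invoice.customer),
        ("type", "Invoice Number", invoice.invoice_number),
        ("type", "Amount", f"{invoice.total:.2f}"),
        ("click", "Submit", ""),
    ]
```

A real bot would hand each planned action to a UI automation layer; here the plan is kept as plain data so the two stages (reading the document, then driving the screen) stay visible.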

One key difference between RPA and other automation methods, such as scripts or APIs, is that RPA is not limited to the command line or APIs; it can also work through user interfaces. Despite advances in various modernization techniques, there are still many legacy business applications (e.g., CICS, IMS, SAP) or native applications (e.g., Windows-based) that do not provide modern APIs or command-line interfaces to automate against. In some cases, the user just doesn’t have access to the APIs: if you’re using a third-party, web-based application like a banking website or online bookstore, the chances of the provider giving regular users access to its backend API are very small. To automate tasks involving these systems, you need RPA.

The three core technologies in RPA

Robotic process automation (RPA) is made up of three core technologies: workflow automation, screen scraping and artificial intelligence (AI). It is the unique combination of these three technologies that allows RPA to solve a productivity challenge with manual desktop tasks that would otherwise have weak Return on Investment (ROI).

When RPA was first introduced, there was an impression that it was just screen scraping technology. In a sense, that’s not wrong; RPA is an evolution of screen scraping with a smarter use of related technology like screen assistance, more intelligent parsing of UI data (e.g., native Windows controls, the Web browser DOM) and a more scalable way of managing many robots at the same time.

Before the mass-market introduction of RPA, there were roughly three categories of workflow automation: fully manual, semi-automated human-centric and fully automated straight-through process. The purpose of almost all automation projects is to shift the percentage of fully manual processes to fully automated straight-through processes. The desire to have as many straight-through processes as possible is what helped drive the API economy, since every service must be API accessible and programmed to eliminate all human interventions.

The problem is that moving from a human-centric process to a straight-through process requires an investment, and sometimes that investment can be significantly more than the benefits gained. As a result, we see that many initial automation projects focus on business-critical processes where the ROI is stronger. These critical business processes often account for only 10% of the processes in the company. The majority of the rest are what we call “long-tail processes”: they are human-centric, but not significant enough to justify the investment required to build a new API or re-engineer the process.
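To make the ROI argument concrete, here is a small worked calculation. Every figure in it is hypothetical and chosen only for illustration; the formula is the simple ROI definition (total benefit minus total cost, divided by total cost), not a model taken from this chapter.

```python
def simple_roi(annual_benefit: float, upfront_cost: float,
               annual_running_cost: float, years: int) -> float:
    """Return (total benefit - total cost) / total cost over the given period."""
    total_benefit = annual_benefit * years
    total_cost = upfront_cost + annual_running_cost * years
    return (total_benefit - total_cost) / total_cost


# Hypothetical long-tail process that saves 20,000 per year once automated.
ANNUAL_SAVING = 20_000
YEARS = 3

# Hypothetical option A: re-engineer the process and build a new API.
roi_api = simple_roi(ANNUAL_SAVING, upfront_cost=150_000,
                     annual_running_cost=5_000, years=YEARS)

# Hypothetical option B: automate the same desktop steps with an RPA bot.
roi_rpa = simple_roi(ANNUAL_SAVING, upfront_cost=15_000,
                     annual_running_cost=5_000, years=YEARS)

print(f"API re-engineering ROI over {YEARS} years: {roi_api:.0%}")
print(f"RPA bot ROI over {YEARS} years: {roi_rpa:.0%}")
```

With these made-up inputs the API project never pays for itself within three years, while the cheaper bot does. A business-critical process with a much larger annual benefit would reverse the first result, which is why such processes tend to be automated first.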

This is where using RPA is compelling and practical. RPA is good at automating a set of repetitive desktop tasks that would otherwise be difficult and time-consuming to automate without proper API integration. The fact that many RPA solutions (including IBM Robotic Process Automation) provide low-code authoring tools combined with screen recording and smarter vision technology makes it even easier and faster for users to build the solution.

This gives us a middle ground between human-centric and straight-through processes: a hybrid approach in which humans, robots and APIs together drive automation.

Intended users and boundaries of RPA

To identify if there are opportunities to use robotic process automation (RPA) within your organization, there are three places you can look into first:

Areas where you have a medium-to-large population of human task workers who are largely doing repetitive and manual work (e.g., order processing from emails or record reconciliation between systems).
Disparate systems that do not have APIs or where APIs are not accessible. Typically, we would have considered these situations as not automatable due to lack of APIs, but it is now possible with RPA.
Manual steps as part of a bigger task. I generally refer to these as micro-tasks. For example, as part of preparing a sales report, the sales executive might need to copy groups of data from different marketing websites. While the sales strategy would still require the sales executive’s intuition and experience, copying the data or formatting the report can be done more routinely using RPA.

Attended vs. unattended RPA

There are two major forms of robots in robotic process automation (RPA): attended and unattended. When RPA was first introduced to the market, the majority of the robots were ‘unattended.’ In essence, an unattended bot was like a cron job, except that the bots were dispatched (usually on a schedule) to run on a designated computer.

‘Attended bots,’ later introduced, are bots that can be launched on demand by the users on their computers. In these cases, the bots are likely just automating a portion of the overall task, and not the entire task.

There are two main advantages to attended bots compared to unattended bots:

Attended bots allow users to automate a subset of tasks as part of a larger, more complex, human-driven process where full automation might be difficult or wouldn’t produce the best outcome (e.g., when certain knowledge-based decision-making has to take place in between the steps).
Attended bots allow users to run automations on their computers without requiring IT to provision additional computing resources.

Recently, we have seen a growing trend of companies deploying more attended bots alongside the more traditional unattended bots.
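The difference between the two forms can be sketched as follows. This is a simplified illustration, assuming a bot is just a callable and that an unattended bot carries a daily run time; the Bot class and both function names are hypothetical, and a real RPA server would also handle machine assignment, credentials and retries.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import Callable, List, Optional


@dataclass
class Bot:
    name: str
    task: Callable[[], str]
    # A scheduled time marks the bot as unattended; None means attended.
    run_at: Optional[time] = None

    @property
    def unattended(self) -> bool:
        return self.run_at is not None


def dispatch_due(bots: List[Bot], now: datetime) -> List[str]:
    """Cron-like dispatch: run each unattended bot scheduled for this minute."""
    results = []
    for bot in bots:
        if bot.unattended and (bot.run_at.hour, bot.run_at.minute) == (now.hour, now.minute):
            results.append(bot.task())
    return results


def launch_on_demand(bot: Bot) -> str:
    """Attended use: the user starts the bot for one step of a larger task."""
    return bot.task()


# Hypothetical bots for illustration.
nightly_reconcile = Bot("reconcile-records", lambda: "records reconciled", run_at=time(2, 0))
format_report = Bot("format-report", lambda: "report formatted")
```

Calling dispatch_due with a timestamp of 02:00 runs only the scheduled bot, while format_report does nothing until the user calls launch_on_demand for it.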

Limitations of RPA

RPA, or robotic “process” automation, is a misnomer. It should have been called robotic “task” automation. RPA is good at automating a task and includes workflow automation, but it is not intended to be used to orchestrate work across multiple people or multiple systems. One would typically use Business Process Management software like IBM Business Automation Workflow for that purpose, which is more suitable for complex interactions between automation and humans and can orchestrate across multiple automation, decision and AI technologies.
There are also many tasks that require human cognition and intuition. RPA bots are programs and can make use of AI to help them make sense of the world, but they cannot think for themselves beyond simple and well-defined tasks. Some RPA vendors might lead you to believe RPA can solve all automation problems, but in reality, customers who applied RPA with unachievable expectations are now realizing they need a more holistic, end-to-end approach to their automation solutions.
RPA does not replace API integration. In places where you have an API and can use it, it is almost always more reliable and scalable to use API-based integration, particularly in high-throughput and large-scale operations where performance metrics and business analytics are also required.
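One way to read the three limitations above is as a rough decision order. The sketch below is my own simplification for illustration, not a rule from any product: the WorkItem fields and the returned labels are hypothetical, and real decisions would weigh cost, volume and risk as well.

```python
from dataclasses import dataclass


@dataclass
class WorkItem:
    needs_human_judgment: bool      # requires cognition or intuition
    spans_people_or_systems: bool   # needs orchestration across people or systems
    has_accessible_api: bool        # an API exists and you are allowed to use it


def choose_approach(item: WorkItem) -> str:
    """Pick an automation approach, checking the RPA limitations first."""
    if item.needs_human_judgment:
        return "human task"
    if item.spans_people_or_systems:
        return "workflow orchestration"
    if item.has_accessible_api:
        return "API integration"
    # Repetitive, well-defined desktop work with no usable API.
    return "RPA"
```

Under this reading, RPA is what remains after the other options are ruled out: a single, well-defined task on systems that offer no usable API.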

Examples/use cases of RPA used together with other automation technologies

Robotic process automation (RPA) as part of an overall business or IT process: In this case, the bots are participants in the overall process and work on tasks assigned to them. When a bot fails, it logs an error, and human operators are often required to investigate the reason for the failure or even complete the rest of the task manually. When RPA is used as part of an overall business process, the bot can delegate or escalate any failures to its teammates or manager.
RPA as part of an integration solution: When used together with an integration product like IBM App Connect, RPA can be part of an overall integration flow by basically providing an API to otherwise non-API systems.
RPA leveraging business rules and AI to make better decisions: Using business rules and AI to help the bot make better judgments makes the overall system more robust.
RPA leveraging intelligent document processing: RPA can use intelligent document processing to read unstructured and semi-structured documents. Intelligent document processing has advanced significantly in recent years in terms of the variety of document types it can handle accurately and how it can be integrated into other automation tools and workflows. Using intelligent document processing as a built-in feature of RPA can help you automate faster and more simply.
RPA as an interactive virtual agent, leveraging natural language processing: You can now leverage the power of conversation to initiate the execution of bots or provide additional information to them, which helps incorporate RPA into your existing workflows and makes it more consumable for non-technical users. Top RPA vendors (including IBM) embed this feature in their standard RPA offering, with no additional installation required.
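The first use case in the list above, a bot that escalates its failures within a larger process, can be sketched like this. It is a minimal illustration, assuming the surrounding process exposes some queue of human tasks; HumanTaskQueue, run_bot_step and the sample failure are hypothetical and do not correspond to any specific workflow product’s API.

```python
import logging
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

logger = logging.getLogger("rpa_process")


@dataclass
class HumanTaskQueue:
    """Stand-in for the workflow system's queue of tasks awaiting a person."""
    items: List[Tuple[str, str]] = field(default_factory=list)

    def escalate(self, task_name: str, reason: str) -> None:
        self.items.append((task_name, reason))


def run_bot_step(task_name: str, bot_action: Callable[[], str],
                 queue: HumanTaskQueue) -> str:
    """Run one bot task; on failure, log the error and hand the task to a human."""
    try:
        return bot_action()
    except Exception as exc:
        logger.error("bot failed on %s: %s", task_name, exc)
        queue.escalate(task_name, str(exc))
        return "escalated"


def enter_order() -> str:
    # Simulated failure: the target screen changed and a control is missing.
    raise RuntimeError("Submit button not found")
```

Running run_bot_step("enter-order", enter_order, queue) leaves one entry in the queue with the failure reason, so a teammate can pick up the task instead of the failure sitting unnoticed in a log.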

How and where artificial intelligence (AI) will help

Today, there are several areas where robotic process automation (RPA) makes use of AI (with the full expectation that this will expand to more use cases in the future). Fundamentally, what RPA tries to do is mimic a human’s actions as they perform their tasks.

These are four examples:

Reading unstructured or semi-structured documents: For example, invoices, scanned identity cards, handwritten notes or email. In these cases, RPA can make use of basic Optical Character Recognition software, but for the robot to be smart, it will have to use a more advanced content extraction system like the IBM Automation Document Processing capabilities, which use deep-learning techniques to extract contextual information from invoices or ID cards (e.g., getting the customer address, item part numbers or even performing automated correction of information).
Integration with chat and voice: A typical example is using RPA to build chatbots or voice bots that integrate with existing services like IBM watsonx Assistant. IBM watsonx Assistant uses AI to identify and create intent recommendations, spot trends and emerging issues as they happen, and automatically learn from user choices to improve interactions.
Computer vision: Reading the screen using computer vision to understand the user interface (e.g., window, scrollbar, button) instead of relying on traditional screen pixel coordinates. This helps the solution be more accurate and reliable and helps make bots more portable across different devices.
Automating tasks discovered by task mining: This is a relatively new area, built on the idea that we can identify and create highly effective bots by observing how people work. This also gets into the concept of “learning” bots. This is obviously very desirable and has huge ROI benefits. IBM sees this as an active area of research for RPA and automation in general.
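The computer vision example above is easiest to see in code. The sketch below assumes some vision component has already detected the controls on screen and returned them with bounding boxes; the UIElement type, the sample screens and find_element are hypothetical, and no real vision library is called.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class UIElement:
    kind: str     # e.g. "button", "scrollbar"
    label: str    # visible text recognized on the control
    x: int        # left edge of the bounding box, in pixels
    y: int        # top edge of the bounding box, in pixels
    width: int
    height: int

    def center(self) -> Tuple[int, int]:
        return (self.x + self.width // 2, self.y + self.height // 2)


def find_element(elements: List[UIElement], kind: str, label: str) -> UIElement:
    """Locate a control by what it is and what it says, not by where it sits."""
    for element in elements:
        if element.kind == kind and element.label.lower() == label.lower():
            return element
    raise LookupError(f"no {kind} labelled {label!r} on screen")


# The same form as detected on two devices with different layouts (made-up data).
laptop_screen = [UIElement("button", "Submit", 100, 200, 80, 30)]
desktop_screen = [UIElement("button", "Submit", 400, 600, 120, 40)]
```

A bot that hard-codes the laptop’s click position would miss the button on the desktop layout. Looking the button up by kind and label on each screen yields the right click point in both cases, which is the portability benefit described above.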

What’s next for robotic process automation (RPA)?

RPA’s undeniable strength is in clear ROI and being programmable by everyday business users. RPA, as a concept, is very powerful and possesses great potential for further innovation and application. RPA allows enterprises to bridge from their legacy or existing systems into the more modern API economy without requiring them to modernize their platforms first. There are several areas where RPA will evolve:

The robots will get smarter and understand the world better by leveraging different AI techniques like natural language understanding, computer vision and document extraction. This allows RPA bots to help with more tasks.
Everyone will have a bot on their desktop. I see companies deploying bots on every employee’s desktop to help everyone in the organization automate part of their everyday work.
RPA will become part of an overall automation solution, with tighter integration with workflow management systems and with process-mining software that identifies opportunities where bots can be used.
Creating a bot will get much easier than it is today. There will be better recording capabilities, more intrinsic use of computer vision and more use of task-mining techniques to create bots.

Learn more

Make sure you check out The Art of Automation podcast, especially Episode 4 in which I sit down with Jerry Cuomo to discuss RPA.

Check out the other chapters in the ongoing series, The Art of Automation:

The Art of Automation: Landing Page

Allen Chan

DE & CTO, Digital Business Automation