June 17, 2021 By John J Thomas 4 min read

Industry focus on trustworthy AI is being driven by several forces: corporate social responsibility posture, concerns around reputational risk and a growing set of regulations. Organizations recognize that they need a systematic approach to ensure AI and machine learning can be trusted and operationalized. Aspects of trustworthy AI include fairness, robustness, privacy, explainability and transparency. Common use case patterns include conducting health checks of existing applications and creating an organization-wide framework for AI governance. This article focuses on a third pattern: operationalizing new applications.

AI lifecycle

Organizations need a systematic approach to build and operationalize new AI applications in a trustworthy manner. This approach must take into consideration the end-to-end data science and AI lifecycle. The AI lifecycle consists of a sequence of stages that bring together various personas and best practices. (See the O’Reilly report Operationalizing AI for a general view.) The stages of the lifecycle include:

  • Scope and plan
  • Collect and organize
  • Build and test
  • Validate and deploy
  • Monitor and manage

In an ideal scenario, aspects of trustworthy AI should be addressed at each of these stages and not just at the end. We need appropriate guardrails at each stage, from business definition, data exploration and model building to model validation, deployment, monitoring and ongoing management. Let us take a closer look at each stage and how various aspects of trustworthy AI are addressed. 

Scope and plan

This stage guides the prioritization of use cases and the development of an AI action plan. The team — composed of a business stakeholder, data scientist, data owner, operations lead, and other roles — first focuses on the business use case, defining business value and specifying business KPIs. It then addresses the technical task, translating the business goal into specific AI tasks to solve. Finally, the team develops a structured action plan for a solution to the identified technical tasks in support of the business goal.

Applying Enterprise Design Thinking principles is a best practice for this stage. Design Thinking helps the team identify and define the business and technical aspects of fairness, robustness, explainability, and so on in the context of the use case. This stage helps answer questions such as:

  • What are the business expectations for fairness or transparency?
  • What regulations do we need to comply with?
  • How do we get access to sensitive data attributes?
  • What is the granularity and frequency at which explanations need to be provided?

It’s also important to take note of any organizational data and AI policies that the team will need to follow while building out these use cases, to prevent any last-minute challenges when trying to validate for production use.

Collect and organize

This stage allows the data consumer to find and access relevant datasets. Data science teams can “shop for data” in a central catalog, searching by either metadata or business terms. They can understand the data, including its owner, lineage, relationship to other datasets, and so on. It’s important to provide data scientists with a technical view of data lineage, so they can understand each data transformation that might impact how they create and use features.

Based on that exploration, they can request a data feed. Once approved, the datasets can be made available to the data science team in their data science development environment.

If the team needs to work with regulated personal data such as personally identifiable information (PII) or protected health information (PHI), the data steward or a data provider must ensure the data shared adheres to regulations through appropriate anonymization.
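As a rough illustration of what preparing regulated data for sharing can look like, here is a minimal Python sketch. It drops direct identifiers and replaces join keys with a keyed hash. The column names and helper functions are hypothetical, and keyed hashing is pseudonymization rather than full anonymization; whether it satisfies a given regulation is a question for the data steward, not something this sketch settles.

```python
import hashlib
import hmac

# Hypothetical column names; a real project would take these from the data catalog.
DIRECT_IDENTIFIERS = {"name", "email"}
PSEUDONYMIZED_KEYS = {"patient_id"}


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash so records can still be joined."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def prepare_record(record: dict, secret_key: bytes) -> dict:
    """Drop direct identifiers and pseudonymize join keys before sharing."""
    shared = {}
    for column, value in record.items():
        if column in DIRECT_IDENTIFIERS:
            continue  # this column never leaves the data provider
        if column in PSEUDONYMIZED_KEYS:
            shared[column] = pseudonymize(str(value), secret_key)
        else:
            shared[column] = value
    return shared


record = {"name": "Jane Doe", "email": "jane@example.com", "patient_id": 1042, "age": 57}
print(prepare_record(record, secret_key=b"keep-this-key-with-the-data-steward"))
```

The key stays with the data provider, so the data science team can join records on the hashed identifier without being able to recover the original value.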

Build and test

Data science teams explore and prepare the data, and build, train and test their AI/ML models during this stage. The activities during this stage are best undertaken as a set of agile sprints. It is important to check the data for bias at this stage, before any model building work starts. This is an inner guardrail for fairness; similar guardrails can also be put in place during and after the model building steps. Model robustness, explainability and other aspects can be similarly accounted for and tested during this stage.
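One way to check data for bias before modeling is to compare the rate of favorable labels across groups. The Python sketch below computes a disparate impact ratio on a toy dataset. The column names, the group labels and the 0.8 cutoff (a commonly cited rule of thumb) are assumptions made for illustration; the right metric and threshold depend on the use case and the applicable regulations.

```python
from collections import defaultdict


def favorable_rates(rows, group_key, label_key, favorable_label=1):
    """Share of favorable labels per group in the training data."""
    totals = defaultdict(int)
    favorable = defaultdict(int)
    for row in rows:
        group = row[group_key]
        totals[group] += 1
        if row[label_key] == favorable_label:
            favorable[group] += 1
    return {group: favorable[group] / totals[group] for group in totals}


def disparate_impact(rows, group_key, label_key, unprivileged, privileged):
    """Favorable-label rate of the unprivileged group divided by the privileged group's."""
    rates = favorable_rates(rows, group_key, label_key)
    if rates[privileged] == 0:
        raise ValueError("privileged group has no favorable labels")
    return rates[unprivileged] / rates[privileged]


# Toy data: group A is approved 3 times out of 4, group B 1 time out of 4.
rows = [
    {"group": "A", "approved": 1}, {"group": "A", "approved": 1},
    {"group": "A", "approved": 1}, {"group": "A", "approved": 0},
    {"group": "B", "approved": 1}, {"group": "B", "approved": 0},
    {"group": "B", "approved": 0}, {"group": "B", "approved": 0},
]
ratio = disparate_impact(rows, "group", "approved", unprivileged="B", privileged="A")
ASSUMED_THRESHOLD = 0.8  # assumption: rule-of-thumb cutoff, not a universal standard
print(f"disparate impact = {ratio:.2f}, flag for review = {ratio < ASSUMED_THRESHOLD}")
```

A ratio well below 1 suggests the historical labels favor one group, which is worth investigating before a model learns that pattern.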

Once these tests complete successfully, an MLOps pipeline allows for the model and related assets to be moved from the development environment to a pre-production validation environment.

Validate and deploy

This stage involves validation of the model and deployment into production. Validation activities could be performed by a team other than the one that built the model, such as the organization’s model validation or model risk management team or an external entity.

This team validates quality, robustness, fairness, etc. and generates validation reports. It is important to capture the validation results for reference and comparison. Model factsheets, local and global explanations and other metrics are also checked.
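Capturing validation results in a structured form makes them easier to reference and compare later. The sketch below, in Python, checks a set of metrics against acceptance criteria and produces a report as a plain dictionary. The metric names, threshold values and model identifier are hypothetical; a validation team would substitute its own criteria.

```python
import json
from datetime import datetime, timezone

# Hypothetical acceptance criteria: ("min", x) means the value must be >= x,
# ("max", x) means the value must be <= x.
THRESHOLDS = {
    "accuracy": ("min", 0.85),
    "disparate_impact": ("min", 0.80),
    "robustness_error_rate": ("max", 0.10),
}


def build_validation_report(model_id, metrics, thresholds=THRESHOLDS):
    """Compare each metric with its rule and record the outcome."""
    checks = []
    for name, (direction, limit) in thresholds.items():
        value = metrics.get(name)
        if value is None:
            passed = False  # a missing metric fails validation
        elif direction == "min":
            passed = value >= limit
        else:
            passed = value <= limit
        checks.append(
            {"metric": name, "value": value, "rule": f"{direction} {limit}", "passed": passed}
        )
    return {
        "model_id": model_id,
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "checks": checks,
        "approved": all(check["passed"] for check in checks),
    }


report = build_validation_report(
    "credit-risk-v3",  # hypothetical model identifier
    {"accuracy": 0.91, "disparate_impact": 0.76, "robustness_error_rate": 0.04},
)
print(json.dumps(report, indent=2))
```

In this example the model is not approved, because the fairness metric falls below its assumed minimum even though accuracy and robustness pass.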

If the model passes validation, the Ops team can promote it to the production environment using an MLOps pipeline. The model is deployed there, either for online or batch invocation.

Monitor and manage

In this stage, the Ops team sets up ongoing monitoring and management of the AI/ML model in production. The team configures monitors for periodic scheduled collection of metrics. Quality, robustness, and fairness are monitored at frequencies dictated by business needs. Monitoring for data and accuracy drift and generating explanations for selected transactions as well as for global behavior are examples of ongoing activities.
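To make data drift monitoring concrete, the following Python sketch computes the Population Stability Index (PSI), one commonly used drift measure, between a baseline sample and a production sample of a single feature. The article does not prescribe a specific drift metric, so PSI, the bin edges and the sample values here are illustrative assumptions.

```python
import math


def bin_shares(values, bin_edges, eps=1e-6):
    """Fraction of values in each bin; bin_edges must be sorted ascending."""
    if not values:
        raise ValueError("cannot compute shares of an empty sample")
    counts = [0] * (len(bin_edges) + 1)
    for value in values:
        index = sum(value >= edge for edge in bin_edges)  # number of edges passed
        counts[index] += 1
    # Floor each share at eps so empty bins do not cause division by zero.
    return [max(count / len(values), eps) for count in counts]


def population_stability_index(baseline, production, bin_edges):
    """PSI = sum over bins of (p - b) * ln(p / b), where b and p are bin shares."""
    base = bin_shares(baseline, bin_edges)
    prod = bin_shares(production, bin_edges)
    return sum((p - b) * math.log(p / b) for b, p in zip(base, prod))


# Hypothetical feature values at training time and in production.
baseline = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
production = [0.5, 0.6, 0.6, 0.7, 0.8, 0.8, 0.9, 0.9]
edges = [0.25, 0.5, 0.75]

print(f"PSI vs. itself:     {population_stability_index(baseline, baseline, edges):.3f}")
print(f"PSI vs. production: {population_stability_index(baseline, production, edges):.3f}")
```

A scheduled job could compute a value like this for each monitored feature and hand it to whatever threshold logic the business has agreed on.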

If a monitor detects that a threshold has been breached, the business can choose how to respond, with actions ranging from alerts to corrective steps. Bias mitigation or model re-training steps can be configured to ensure continued trustworthy behavior. Models can be decommissioned based on either business or technical criteria. This stage establishes the outermost guardrails for trustworthy AI.
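The range of responses to a breached threshold can be expressed as a simple escalation policy. The Python sketch below maps a monitored value to an action using two levels per monitor. The monitor name, the threshold values and the action labels are hypothetical, and it assumes metrics where higher is better; real policies are set by the business.

```python
from dataclasses import dataclass


@dataclass
class Monitor:
    """Hypothetical monitor configuration for a metric where higher is better."""
    name: str
    alert_below: float    # values under this level raise an alert
    retrain_below: float  # values under this level trigger corrective retraining


def decide_action(monitor: Monitor, value: float) -> str:
    """Return the most severe action whose threshold has been breached."""
    if value < monitor.retrain_below:
        return "retrain"
    if value < monitor.alert_below:
        return "alert"
    return "ok"


fairness = Monitor(name="disparate_impact", alert_below=0.85, retrain_below=0.80)
for observed in (0.92, 0.83, 0.74):
    print(f"{fairness.name}={observed}: {decide_action(fairness, observed)}")
```

Checking the most severe threshold first keeps the policy unambiguous when a value breaches both levels.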

AI governance

These AI lifecycle stages fit into an overall AI governance framework for the enterprise. Such a framework allows multiple use cases to follow the lifecycle stages consistently, regardless of which development tools are used. It helps automate documentation across the lifecycle and provides a consistent view to various stakeholders.

Getting started

Operationalizing trustworthy AI requires us to bring together people (expertise), process (best practices), and platform (technology). IBM Cloud Pak for Data and IBM Spectrum Fusion provide the technology platform that supports the various stages of the end-to-end AI lifecycle. The platform can fit into existing environments and complement existing model development and deployment tools. It can run on a wide variety of cloud (IBM, AWS, Azure, GCP, etc.) and on-premises infrastructure choices, providing a true hybrid cloud capability.

IBM provides a set of service offerings that bring together education, expertise, best practices, and technology to help customers get started with operationalizing trustworthy AI.
