December 7, 2020 By Demi Ajayi 3 min read

Last time on the NLP blog series, we explored how BERT and GPT models change the game for NLP. These models have many exciting potential applications, such as natural language generation (NLG), which is useful for automating communication, report writing, and summarization; conversational assistants; question-answering platforms; and query understanding. However, there are several key considerations to investigate before adopting a new model for your business use case.

Bias: As with all machine learning, it is important to understand any implicit bias in the training data. Applications built on massive language models, such as NLG, are particularly prone to harmful bias when they are not properly evaluated. For instance, there have been incidents of applications generating text that is offensive or that negatively stereotypes its subject. Because these models require especially massive training data, it’s very important to be cognizant of the potential for bias and to keep a human in the loop when refining the models to eliminate bias.
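
One way to make the human-in-the-loop review concrete is to track how often reviewers flag generated text for each subject group. The following is a minimal sketch, not a complete fairness audit: the function name, the group labels, and the review records are all hypothetical placeholders, and it assumes reviewers have already marked each generated text as flagged or not.

```python
from collections import defaultdict


def flagged_rate_by_group(reviews):
    """Return the share of generated texts flagged by reviewers, per group.

    `reviews` is an iterable of (group, flagged) pairs, where `flagged` is
    True when a human reviewer marked the text as offensive or stereotyped.
    """
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for group, is_flagged in reviews:
        totals[group] += 1
        if is_flagged:
            flagged[group] += 1
    return {group: flagged[group] / totals[group] for group in totals}


# Placeholder review records for illustration only (not real data).
reviews = [
    ("group_a", False), ("group_a", False), ("group_a", True), ("group_a", False),
    ("group_b", True), ("group_b", True), ("group_b", False), ("group_b", False),
]

rates = flagged_rate_by_group(reviews)
# Gap between the most-flagged and least-flagged group.
gap = max(rates.values()) - min(rates.values())
print(rates, gap)
```

A large gap between groups does not prove bias on its own, but it is a signal that the outputs for the more frequently flagged group deserve closer human review.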

Explainability/Transparency: It is also important to understand the algorithmic workings of any model you use. Transparency about how results are derived, along with actual explanations of those results, is critical to ensuring you have a model you can trust. Increasingly, AI providers such as IBM are moving toward creating standards of fairness, explainability and transparency in the models they provide.

Computational costs: As mentioned, GPT-3 has over 100 billion parameters (175 billion in its largest version). Building applications with this model is an incredibly computationally intensive task. Other massive deep learning models are less computationally intensive than GPT-3 but still often require significant computing power to return results quickly in real-world settings. GPUs, which are significantly more expensive than conventional CPUs, are often used to speed up computation in these applications. As businesses consider applications of massive deep language models, they will also have to weigh the cost of these models against the performance benefit they deliver.
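
A rough back-of-the-envelope calculation helps show why model size drives cost. The sketch below estimates only the memory needed to hold a model's weights, assuming 2 bytes per parameter for 16-bit floats and 4 bytes for 32-bit floats. The function name is hypothetical, the parameter counts are the published sizes of GPT-3 (175 billion) and BERT-base (110 million), and the estimate ignores activations, caches, and other serving overhead.

```python
def weights_memory_gb(num_parameters: int, bytes_per_parameter: int = 2) -> float:
    """Approximate memory for model weights alone, in gigabytes (10^9 bytes)."""
    return num_parameters * bytes_per_parameter / 1e9


# Published parameter counts; everything else here is an illustrative assumption.
MODEL_SIZES = {
    "GPT-3": 175_000_000_000,
    "BERT-base": 110_000_000,
}

for name, num_parameters in MODEL_SIZES.items():
    fp16_gb = weights_memory_gb(num_parameters, bytes_per_parameter=2)
    fp32_gb = weights_memory_gb(num_parameters, bytes_per_parameter=4)
    print(f"{name}: ~{fp16_gb:g} GB at 16-bit, ~{fp32_gb:g} GB at 32-bit")
```

Under these assumptions, the weights of a 175-billion-parameter model come to roughly 350 GB at 16-bit precision, which is far more than a single typical GPU holds, while a 110-million-parameter model fits in well under 1 GB.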

Data: Another consideration is training data. Businesses have to consider how much data they have (or can invest in acquiring) to meet the demands of training these models. Models that need less training data for a given application are often very large themselves (such as GPT-3), which introduces a trade-off between computational costs and data.

Accuracy & Evaluation: For applications with established evaluation metrics (such as question answering, or traditional text analytics tasks like classification and sentiment analysis), it’s important to pick the model that meets your needed level of accuracy, while considering the trade-offs discussed under data and computational costs. For applications with less established means of evaluating accuracy (such as NLG for summarization and conversation), it’s critical to choose or develop an evaluation scheme suitable to your use case before adopting these models for business use. NLG is often evaluated by human annotators, though there has been incremental progress in developing automated evaluation tools. Here, the scalability and reliability of the evaluation tool are additional considerations. For instance, evaluating whether an AI-generated report is coherent, exhaustive, and well-written enough for use in a business setting will require much deeper analysis and higher standards than evaluating its ability to compose Shakespearean sonnets.
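
To illustrate the difference between the two kinds of evaluation, here is a minimal sketch in plain Python. Accuracy and F1 are standard classification metrics. The `unigram_recall` function is only a simplified word-overlap score in the spirit of automated summarization metrics, not an official implementation of any published metric. All function names and example inputs are assumptions made for illustration.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive):
    """Harmonic mean of precision and recall for the `positive` label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def unigram_recall(reference, candidate):
    """Share of reference words (with multiplicity) that appear in the candidate."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    overlap = sum((ref_counts & cand_counts).values())
    return overlap / sum(ref_counts.values())


# Illustrative sentiment labels (placeholders, not real results).
y_true = ["pos", "neg", "pos", "pos"]
y_pred = ["pos", "pos", "neg", "pos"]
print(accuracy(y_true, y_pred))
print(f1_score(y_true, y_pred, positive="pos"))

# Illustrative summary comparison.
print(unigram_recall("the cat sat on the mat", "the cat sat"))
```

A word-overlap score like this scales easily but says nothing about coherence or factual correctness, which is why human annotation is still common for NLG.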

When kicking off your pilot project, focus on training a model to perform the specific task you are trying to achieve, and validate it with data specific to that task.

Afterward, you can conduct testing to measure baseline performance. Using the considerations listed in this blog, you can assemble a checklist to determine which models will best help you launch your pilot. These elements of data science will all play a critical role in the machine learning pipeline, both in your pilot and thereafter.
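
One possible way to turn such a checklist into a comparison is a simple weighted score across the five considerations discussed above. This is only a sketch under stated assumptions: the 1-to-5 rating scale, the weights, the candidate names, and every rating are hypothetical placeholders that a team would replace with its own assessments.

```python
CONSIDERATIONS = ("bias", "explainability", "computational_cost", "data", "accuracy")


def checklist_score(ratings, weights):
    """Weighted average of 1-5 ratings (higher is better) across all considerations."""
    missing = [c for c in CONSIDERATIONS if c not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    total_weight = sum(weights[c] for c in CONSIDERATIONS)
    return sum(ratings[c] * weights[c] for c in CONSIDERATIONS) / total_weight


# Hypothetical weights: this example team cares most about bias and accuracy.
weights = {"bias": 2, "explainability": 1, "computational_cost": 1, "data": 1, "accuracy": 2}

# Hypothetical candidates and ratings, for illustration only.
candidates = {
    "candidate_small": {"bias": 4, "explainability": 4, "computational_cost": 5, "data": 3, "accuracy": 3},
    "candidate_large": {"bias": 3, "explainability": 2, "computational_cost": 1, "data": 5, "accuracy": 5},
}

scores = {name: checklist_score(ratings, weights) for name, ratings in candidates.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

The score is only as good as the ratings behind it, so it works best as a way to structure a discussion of trade-offs, not as an automatic decision rule.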

Get started with IBM Watson NLP.
