Abstractive text summarization tutorial


In this tutorial, you’ll learn how to create abstractive text summaries with a local transformer model from the Hugging Face library.

Text summarization is a core task in artificial intelligence (AI) and natural language processing (NLP) that turns long, complex documents into short, easy-to-read summaries while preserving the main ideas. Modern transformer models make this task possible by understanding text, highlighting the most important ideas and generating clear, comprehensible summaries.

What is abstractive text summarization?

Abstractive summarization is a type of automatic text summarization in which a system generates new sentences that paraphrase and condense the meaning of a source text. The goal is to produce a summary that captures the core ideas by using different wording and structure, rather than copying sentences verbatim.

From a tooling perspective, abstractive summarization typically combines NLP preprocessing steps with neural language models that perform the actual generation. Traditional NLP techniques such as tokenization, sentence segmentation and word embedding representations are used to structure and encode the input text. Meanwhile, the summarization model learns how to generate new sentences from these representations.
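As a rough illustration of these preprocessing steps, here is a minimal, library-free sketch of sentence segmentation and word tokenization (real pipelines use trained subword tokenizers, so treat this purely as a toy example):

```python
import re

def segment_sentences(text):
    # Naive sentence segmentation: split after ., ! or ? followed by whitespace
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    # Naive word tokenization: lowercase alphanumeric runs, punctuation dropped
    return re.findall(r"[a-z0-9']+", sentence.lower())

doc = "Transformers changed NLP. They power modern summarizers!"
sentences = segment_sentences(doc)
tokens = [tokenize(s) for s in sentences]
print(sentences)  # → ['Transformers changed NLP.', 'They power modern summarizers!']
print(tokens[0])  # → ['transformers', 'changed', 'nlp']
```

A real summarization model would replace these regexes with a learned tokenizer and map each token to a dense embedding vector, but the basic structuring of text into sentences and tokens is the same.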

There are two main approaches to automatic text summarization:

Extractive text summarization: Selects and copies the most important sentences directly from the original text, similar to highlighting key sentences with a marker. This method is faster and simpler to implement, but it is limited to the wording and structure of the source document. NLP techniques here typically involve sentence scoring, keyword extraction or graph-based ranking algorithms.

Abstractive text summarization: Generates new sentences that capture the core meaning of the text, much like how a human would write a summary in their own words. This approach is more flexible and natural-sounding, but it is also more computationally intensive. NLP techniques used include encoder-decoder models, attention mechanisms and contextual embeddings, which allow the system to understand relationships between words and generate new text.
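To make the contrast concrete, here is a minimal word-frequency sentence scorer, one of the simplest extractive techniques (an illustrative sketch only, not a production scorer):

```python
import re
from collections import Counter

def extractive_summary(text, n=1):
    # Split into sentences and count lowercase word frequencies across the text
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    # Score each sentence by the summed frequency of its words
    def score(sentence):
        return sum(freq[w] for w in re.findall(r"[a-z']+", sentence.lower()))

    # Copy the n highest-scoring sentences verbatim, preserving original order
    top = sorted(sentences, key=score, reverse=True)[:n]
    return " ".join(s for s in sentences if s in top)

text = "Transformers model language. Transformers summarize language well. Cats sleep."
print(extractive_summary(text))  # → "Transformers summarize language well."
```

Notice that the output is a sentence copied verbatim from the input; an abstractive system, by contrast, would be free to rephrase it.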

The evolution of modern text summarization

This technical progress wasn’t achieved overnight. Early NLP systems focused on explicitly modeling linguistic structure. Techniques from information extraction (IE) were used to identify entities, relations and events by using hand-crafted rules or statistical models.1 During this period, most text summarization methods were extractive, selecting important sentences rather than generating new text.

Neural extractive models represented the next step. One influential example, SummaRuNNer, a recurrent neural network (RNN)-based sequence model, showed that neural models can capture document-level context and outperform traditional extractive techniques.2 Early neural models included RNNs and long short-term memory (LSTM) networks, which helped capture sequential dependencies across long documents. Convolutional neural networks (CNNs) were also applied to text for local syntactic feature extraction, complementing sequential models.3

The idea of abstractive summarization became more practical with the introduction of encoder-decoder neural models, which can map an input sequence to a variable-length output sequence suitable for tasks such as summarization. 

In these models, the encoder processes the input text and converts it into a series of contextual representations that capture the meaning and relationships between words. The decoder generates the output sequence token by token, attending to relevant parts of the input through attention mechanisms to ensure coherence and to preserve information. This structure allows the model to produce entirely new sentences rather than relying on predefined templates or extracted facts. 

In recent years, state-of-the-art transformer-based models have achieved strong results on large datasets such as Gigaword and CNN/DailyMail, a widely used collection of news articles. Pretraining on large corpora enabled these models to generalize across domains and produce fluent summaries.

Some systems incorporate a knowledge base or learning-based lexical modules to improve factual correctness and contextual understanding, particularly in specialized domains. These ideas are closely related to retrieval-augmented generation (RAG) approaches, where a model can retrieve relevant documents or facts from an external source and then generate abstractive summaries that integrate this information. 

More broadly, abstractive summarization underlies many modern applications, from RAG-based QA systems to automated report generation, demonstrating its role as a building block in practical AI systems.

Abstractive text summarization is a form of document summarization, closely related to tasks like machine translation and natural language generation. Earlier syntactic text summarization techniques relied on grammatical rules, whereas modern approaches leverage neural architectures for summary generation and rewriting.

How does abstractive summarization work?

Abstractive summarization relies on advanced language models such as BART, T5 or PEGASUS, which are implemented as sequence-to-sequence (seq2seq) transformer models. These models transform input documents into numerical representations that capture contextual meaning, then generate concise summaries that convey the same ideas in new words.

The summarization process begins with tokenization, where the input text is split into tokens (words or subwords). These tokens are converted into numerical representations and processed by the encoder, which uses self-attention to understand how different parts of the text relate to each other. Self-attention allows the model to weigh the importance of each token relative to every other token in the sequence. This way, the model can capture long-range dependencies and contextual relationships across the document. The encoder produces contextual representations that capture the document’s information, which the decoder then uses to generate the final summary.

The decoder generates the summary token by token, by using the encoder’s contextual representations and attention mechanisms to focus on the most relevant parts of the input. It also considers previously generated tokens to maintain coherence. Some models might directly copy certain words or phrases from the input, which is useful for names, numbers or technical terms.
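The self-attention weighting described above can be sketched numerically. The following toy NumPy example computes scaled dot-product attention over a handful of token vectors; it omits the learned query, key and value projections that a real transformer layer uses:

```python
import numpy as np

def self_attention(X):
    # X: (num_tokens, dim) token representations.
    # In a real transformer, Q, K and V come from learned projections;
    # here we use X directly to keep the sketch minimal.
    d = X.shape[1]
    scores = X @ X.T / np.sqrt(d)                   # pairwise token similarities
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)   # softmax: each row sums to 1
    return weights @ X                              # each token becomes a weighted mix

X = np.random.default_rng(0).normal(size=(4, 8))    # 4 tokens, 8-dim embeddings
out = self_attention(X)
print(out.shape)  # → (4, 8): one contextualized vector per token
```

Each output row mixes information from every token in the sequence, which is how the encoder builds contextual representations that span the whole document.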

By combining these techniques, the model produces human-like summaries that paraphrase and condense the original text instead of copying it verbatim. 

What kind of models are best for abstractive summarization?

Modern abstractive summarization is dominated by transformer-based sequence-to-sequence models, which treat summarization as a generation task: given an input sequence (the corpus or document), the model generates an output sequence (the summary).

BART (Bidirectional and auto-regressive transformers)

BART is a transformer-based encoder-decoder model designed for text generation tasks. Its encoder is bidirectional, meaning that it uses context from both the left and the right of each token, allowing it to fully understand the context around each word.

BART is pretrained by using denoising objectives, where the model learns to reconstruct original text from corrupted versions (for example, with masked tokens, deleted spans or shuffled sentences). This pretraining strategy makes BART well suited for abstractive summarization tasks, and fine-tuned BART models achieve strong performance on standard benchmarks.4 For this tutorial, we’re using facebook/bart-large-cnn, a BART model that is widely considered one of the best pretrained, open-source models available for abstractive text summarization.
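To give a feel for what a denoising objective looks like, here is a toy corruption function in the spirit of BART’s text-infilling objective, where a contiguous span of tokens is replaced by a single mask token (an illustrative sketch, not the actual pretraining code):

```python
import random

def corrupt(tokens, mask_token="<mask>", seed=0):
    # Replace one contiguous span of tokens with a single mask token,
    # loosely mimicking BART's text-infilling corruption.
    rng = random.Random(seed)
    span_len = rng.randint(1, max(1, len(tokens) // 3))
    start = rng.randint(0, len(tokens) - span_len)
    return tokens[:start] + [mask_token] + tokens[start + span_len:]

original = ["the", "model", "learns", "to", "reconstruct", "corrupted", "text"]
corrupted = corrupt(original)
print(corrupted)  # the pretraining task: recover `original` from `corrupted`
```

During pretraining, the model sees the corrupted sequence and is trained to regenerate the original, which teaches it to produce fluent text conditioned on partial input.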

Other popular seq2seq models for summarization

While this tutorial example uses BART, several other transformer-based seq2seq models are commonly used for abstractive summarization:

  • T5 (Text-to-text transfer transformer): Pretrained with transfer learning on a large mixture of tasks, all framed in a text-to-text format.5 Because T5 handles every NLP task with the same input/output format, it is highly flexible.
  • PEGASUS: Specially designed for summarization, PEGASUS introduces a pretraining objective that trains models to generate important missing sentences from documents, often yielding strong results on long-form content.6
  • Long-form models (for example, Long-T5, LED): These models extend the seq2seq methodology with attention mechanisms to handle much longer documents, making them well suited for summarizing reports, research papers or legal texts.

Requirements

  • Python: version 3.9 or higher (check yours with python --version)
  • RAM: 8 GB, enough to load the model, store intermediate data and handle small-to-medium texts
  • Storage: ~2–3 GB free for model downloads

Steps

Step 1. Clone the GitHub repository

To get hands-on experience and run this project, clone the GitHub repository by using https://github.com/IBM/ibmdotcom-tutorials as the HTTPS URL. For detailed steps on how to clone a repository, refer to the GitHub documentation.

Step 2. Set up your environment

This tutorial uses a Jupyter Notebook to demonstrate abstractive text summarization with pretrained transformer models from Hugging Face. Jupyter Notebooks are versatile tools that allow you to combine code, text and visualizations in a single environment. You can run this notebook in your local IDE or explore cloud-based options like watsonx.ai® Runtime, which provides a managed environment for running Jupyter Notebooks.

Step 3. Install dependencies for abstractive text summarization

Before we run our abstractive summarization example, we need to install a few Python libraries from Hugging Face and PyTorch. These libraries provide the tools and pretrained models needed to process text, run neural networks and generate summaries.

!pip install -q transformers torch sentencepiece
print("Dependencies installed!")
 
 
  • transformers: Hugging Face’s library that provides access to pretrained models like BART, T5 and many more, along with easy-to-use pipelines for tasks such as summarization.
  • torch: The underlying deep-learning framework (PyTorch) used to run these models efficiently on CPUs or GPUs.
  • sentencepiece: A tokenizer used by many transformer models to split text into tokens the model can process.

Step 4. Import the pipeline function

The pipeline function from the Hugging Face Transformers library is a ready-to-use interface for running machine learning models on common tasks. For abstractive summarization, it automatically loads the proper tokenizer and pretrained model for summarizing text and handles all the steps from preprocessing to output generation. This function allows us to generate summaries with just a few lines of code.

 

from transformers import pipeline
 

Step 5. Load the summarization pipeline

This step sets up the summarization pipeline by using the pretrained BART model.

summarizer = pipeline(
    "summarization",
    model="facebook/bart-large-cnn"
)
 
 

The summarizer object now contains the tokenizer, model and all necessary post-processing so you can input text and get a summary. This cell might take a few minutes to download the model.

Other summarization models from the Hugging Face Transformers library can be used by changing the model argument. Different models might vary in speed, summary length and output style.

 

Step 6. Prepare your text

Let’s start with a simple example of abstractive summarization. You can replace this text with any text you want, such as an article excerpt, a blog post or your own notes. For best results, use well-formed prose rather than fragmented or heavily formatted text.

This variable text will be passed into the summarization pipeline in the next step.

text = """
Abstractive summarization is a technique in natural language processing
that generates concise summaries by rephrasing the original content.
Rather than copying sentences directly, the model creates new text
that captures the core meaning of the source.
"""
 

Step 7. Generate the summary

In this step, we pass our prepared text (text) into the summarizer pipeline. The model reads the input text, identifies the main points and generates a shorter version in its own words.

The max_length and min_length parameters control how long the generated summary can be, allowing us to balance brevity and completeness. Setting do_sample=False disables randomness during generation, ensuring that the model produces the same summary each time the cell is run.

The summarization pipeline always returns a list of results so it can handle multiple input texts at once. Because we provide a single input here, we extract the first result from the list and print the generated text, the summary.

summary = summarizer(
    text,
    max_length=50,   # maximum number of tokens in the generated summary
    min_length=20,   # minimum number of tokens in the generated summary
    do_sample=False  # False: deterministic output; True enables randomness
)

print(summary[0]["summary_text"])
 
 
# Output

The model creates new text that captures the core meaning of the source. It generates concise summaries by rephrasing the original content.
 
 

Example abstractive summary

An example of the kind of output that you might see after running the summarization pipeline on the original text is:

The model creates new text that captures the core meaning of the source. It generates concise summaries by rephrasing the original content.
 
 

This summary illustrates abstractive summarization because the model does not simply copy sentences from the input. Instead, it paraphrases and condenses the original context, expressing the main ideas by using different wording and sentence structure. While the meaning is preserved, the phrasing is new.

Try modifying the input text or adjusting the parameters to see how the summary changes. You can also experiment with longer paragraphs or different writing styles to observe how the model adapts. 

The do_sample parameter controls whether the model introduces randomness during text generation:

  • do_sample=True: The model samples from multiple possible tokens at each step, introducing randomness. This method can produce different summaries each time you run it, while still preserving the core meaning of the text.
  • do_sample=False: The model selects the most likely next token at each step, producing deterministic output. Running the same input with the same parameters will always produce the same summary.

For example, here is the output generated from the same input text, but with do_sample=True:

Rephrasing is a technique in natural language processing that generates concise summaries. Rather than copying sentences directly, the model creates new text that captures the core meaning of the source. 

This tradeoff between consistency (do_sample=False) and variability (do_sample=True) is common in text generation tasks.
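This greedy-versus-sampling tradeoff can be sketched with a toy next-token distribution (plain Python, not the actual decoding code):

```python
import random

# Toy next-token probability distribution at one decoding step
probs = {"summary": 0.5, "text": 0.3, "model": 0.2}

def greedy(probs):
    # do_sample=False: always pick the most likely token (deterministic)
    return max(probs, key=probs.get)

def sample(probs, rng):
    # do_sample=True: draw a token proportionally to its probability
    return rng.choices(list(probs), weights=list(probs.values()))[0]

rng = random.Random(42)
print(greedy(probs))                           # always "summary"
print([sample(probs, rng) for _ in range(5)])  # can differ from run to run
```

Real decoders repeat this choice at every generation step, so even small per-step randomness can produce noticeably different summaries.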

 

Practical limitations of abstractive summarization

While models like BART or T5 generate fluent and concise summaries, they have some practical limitations: 

  • Factual consistency and hallucination: Sometimes, models generate content that sounds plausible but isn’t supported by the original text. This phenomenon, called hallucination, is especially important in sensitive domains like medicine, finance or law.
  • Context length: Summaries might miss important details when the input text is very long.
  • Domain specificity: Models trained on general datasets might struggle with specialized texts. 

Research has proposed methods to address these issues. For example, Zhang et al. (2020) developed methods to measure and optimize factual correctness, making summaries more reliable, particularly in domains like radiology reports.7

Comparing generated outputs to a reference summary by using evaluation metrics is essential for assessing quality. Common evaluation metrics include ROUGE (which measures n-gram overlap between generated and reference summaries), BLEU (originally developed for machine translation) and METEOR (which considers synonymy and stemming). These metrics provide quantitative ways to evaluate how well the generated summary preserves content and meaning.
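As a rough sketch of how ROUGE-style overlap is computed, here is a minimal ROUGE-1 F1 score in pure Python (real evaluations typically use libraries such as rouge-score, which also handle stemming and multiple references):

```python
from collections import Counter

def rouge1_f1(generated, reference):
    # Unigram (ROUGE-1) overlap between generated and reference summaries
    gen = Counter(generated.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((gen & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(gen.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(rouge1_f1("the cat sat", "the cat sat on the mat"))  # ≈ 0.67
```

Higher-order ROUGE variants (ROUGE-2, ROUGE-L) apply the same idea to bigrams and longest common subsequences, rewarding summaries that preserve the reference’s phrasing and ordering.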

Conclusion

In this notebook, we explored abstractive text summarization and how modern transformer-based models can generate concise, human-like summaries by understanding and rephrasing the original text. Using the Hugging Face pipeline API and a pretrained BART model, we were able to move from raw input text to a meaningful summary with just a few lines of code.

Unlike extractive summarization, which selects and reuses existing sentences, abstractive summarization creates new text that captures the core ideas of the source. This method makes it more flexible and natural-sounding, but also more computationally complex. By working through each step, you’ve seen how these systems work in practice. 

If you’re interested in exploring other approaches to text summarization, in particular extractive methods, check out this Python text summarization tutorial that covers classic techniques such as Luhn, LexRank and Latent Semantic Analysis (LSA). Comparing extractive and abstractive approaches side by side can help deepen your understanding of when and why each method is used.

Footnotes

1 Angeli, Gabor, Melvin Jose Johnson Premkumar, and Christopher D. Manning. “Leveraging linguistic structure for open domain information extraction.” In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp. 344-354. 2015. https://doi.org/10.3115/v1/P15-1034. 

2 Nallapati, Ramesh, Feifei Zhai, and Bowen Zhou. 2017. “SummaRuNNer: A Recurrent Neural Network Based Sequence Model for Extractive Summarization of Documents.” In Proceedings of the 31st AAAI Conference on Artificial Intelligence, 3075–3081. https://doi.org/10.1609/aaai.v31i1.10958.

3 Jiang, Xinyu, Bowen Zhang, Yunming Ye, and Zhenhua Liu. “A hierarchical model with recurrent convolutional neural networks for sequential sentence classification.” In CCF International conference on natural language processing and Chinese computing, pp. 78-89. Cham: Springer International Publishing, 2019.

4 Venkataramana, Attada, K. Srividya, and R. Cristin. “Abstractive text summarization using bart.” In 2022 IEEE 2nd Mysore Sub Section International Conference (MysuruCon), pp. 1-6. IEEE, 2022. https://doi.org/10.1109/MysuruCon55714.2022.9972639.

5 Raffel, Colin, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu. “Exploring the limits of transfer learning with a unified text-to-text transformer.” Journal of machine learning research 21, no. 140 (2020): 1-67. https://arxiv.org/abs/1910.10683.

6 Zhang, Jingqing, Yao Zhao, Mohammad Saleh, and Peter Liu. “Pegasus: Pre-training with extracted gap-sentences for abstractive summarization.” In International conference on machine learning, pp. 11328-11339. PMLR, 2020.

7 Zhang, Yuhao, Derek Merck, Emily Tsai, Christopher D. Manning, and Curtis Langlotz. 2020. “Optimizing the Factual Correctness of a Summary: A Study of Summarizing Radiology Reports.” In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 5108–5120. https://doi.org/10.18653/v1/2020.acl-main.458.

Vanna Winland

AI Advocate & Technology Writer

Related solutions
IBM® watsonx Orchestrate™

Easily design scalable AI assistants and agents, automate repetitive tasks and simplify complex processes with IBM® watsonx Orchestrate™.

Explore watsonx Orchestrate
Natural language processing tools and APIs

Accelerate the business value of artificial intelligence with a powerful and flexible portfolio of libraries, services and applications.

Explore NLP solutions
AI consulting and services

Reinvent critical workflows and operations by adding AI to maximize experiences, real-time decision-making and business value.

Explore AI services