Text Summarization

Types of text summarization. Source: Chauhan 2018.

On the web, everyone can be a publisher. We're already seeing vast amounts of information being published daily in the form of restaurant/movie/book reviews, blogs, status updates, and more. In addition, traditional print publications (newspapers, magazines, technical journals, whitepapers) are also available online. It's impossible for anyone to keep track of recent publications even if limited to one domain. This is where text summarization can help.

A summary, created automatically by algorithms, typically contains the most important information. The summary should be mindful of the reader and the communication goals. It may also help the reader decide if the original text is worth reading in full. The summary can also help improve document indexing for information retrieval. An automated summary is often less biased than a human-written summary.

Discussion

  • What are some real-world applications of text summarization?

    Here are some everyday examples of text summarization: news headlines, outlines for students, movie previews, meeting minutes, biographies for resumes or obituaries, abridged versions of books, newsletter production, financial research, patent research, legal contract analysis, tweeting about new content, chatbots that answer questions, email summaries, and more.

    When Google Search presents search results, some entries are accompanied by auto-generated summaries. Google may be leveraging a knowledge graph for this purpose. Google's approach to summarization is mainly entity centric. Summarization extends to timelines and events about entities.

    Doctors write long medical notes containing nutritional information for pregnant mothers. When these were reduced to short crisp summaries, pregnant mothers found them a lot easier to understand.

  • Which are the main approaches to text summarization?
    Illustrating extractive vs abstractive summarization. Source: Adapted from Opidi 2019.

    With extractive summarization, the summary contains sentences picked and reproduced verbatim from the original text. With abstractive summarization, the algorithm interprets the text and generates a summary, possibly using new phrases and sentences.

    Extractive summarization is data-driven, easier and often gives better results. Abstractive summarization is how humans tend to summarize text but it's hard for algorithms since it involves semantic representation, inference and natural language generation. Often abstractive summarization relies on text extracts.

    For extraction, sentences are scored and those with highest scores are selected. Scoring criteria may include word frequencies, location heuristics, sentence similarity, rhetorical relations, and semantic roles.

    Typically an intermediate representation is used to select relevant summary content. With topic representation, the intent is to identify the main topics in the text. Topic words, word frequencies (including TF-IDF), clustering, LSA and LDA have been applied to summarization. With indicator representation, a feature set is used to rank and select sentences. Examples of this approach are graph-based methods and machine learning.
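
    As a rough illustration of frequency-based topic representation, the sketch below scores each sentence by the mean TF-IDF weight of its words and keeps the top-scoring sentences in their original order. Treating every sentence as a "document" for the IDF statistic, the simple tokenizer, and the names `tfidf_sentence_scores` and `extract_top` are assumptions made for this example, not part of any cited method.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tfidf_sentence_scores(sentences):
    """Score each sentence by the mean TF-IDF weight of its distinct words.

    Assumption: each sentence counts as one 'document' when computing IDF.
    """
    tokenized = [tokenize(s) for s in sentences]
    n = len(tokenized)
    # Document frequency: in how many sentences does each word occur?
    df = Counter(word for tokens in tokenized for word in set(tokens))
    scores = []
    for tokens in tokenized:
        if not tokens:
            scores.append(0.0)
            continue
        tf = Counter(tokens)
        total = sum((tf[w] / len(tokens)) * math.log(n / df[w]) for w in tf)
        scores.append(total / len(tf))
    return scores


def extract_top(sentences, k=1):
    """Return the k best-scoring sentences, kept in document order."""
    scores = tfidf_sentence_scores(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

    Words that occur in every sentence get an IDF of zero, so a sentence made only of such words scores zero. A mean over words tends to favour short sentences, which is one reason practical systems combine several features.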

  • What are the challenges and requirements of multi-document summarization?
    Pipeline of multi-document summarization. Source: Jurafsky and Martin 2009, fig. 23.18.

    The pipeline for multi-document summarization (MDS) has the same basic steps as for single-document summarization (SDS): content selection, information ordering, and sentence realization. However, MDS has some unique challenges:

    • Redundancy: A single document has far less redundancy than a topically related group of documents. The summary shouldn't repeat similar sentences. Maximal Marginal Relevance (MMR) is a scoring scheme that penalizes sentences similar to those already selected.
    • Temporal Ordering: A stream of news articles might be reporting the unfolding of an event. The summary should order them correctly and be sensitive to later developments overriding earlier ones.
    • Cohesion and Coreference: Both are important for information ordering. Sometimes cohesion might demand a certain ordering but cause coreference problems, such as a person's shortened name appearing before the full name.
    • Compression Ratio: Summarization becomes more difficult when more compression is demanded.
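
    The redundancy penalty of MMR can be sketched as a greedy selection. At each step, the candidate with the best trade-off between relevance to a query and similarity to already selected sentences is picked. The bag-of-words cosine similarity, the trade-off parameter `lam`, and the function names are assumptions made for this illustration.

```python
import math
import re
from collections import Counter


def bow(text):
    """Bag-of-words vector as a Counter of lowercase tokens."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two Counter vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def mmr_select(sentences, query, k=2, lam=0.7):
    """Greedily pick k sentences, trading relevance against redundancy.

    lam=1.0 ranks by relevance only; lower values penalize sentences
    that resemble ones already selected.
    """
    vectors = [bow(s) for s in sentences]
    q = bow(query)
    selected = []
    candidates = list(range(len(sentences)))
    while candidates and len(selected) < k:
        def mmr(i):
            relevance = cosine(vectors[i], q)
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected),
                default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best = max(candidates, key=mmr)
        selected.append(best)
        candidates.remove(best)
    return [sentences[i] for i in selected]
```

    With the sentences "storm hits coast", "storm hits coast hard" and "rescue teams arrive" and the query "storm coast rescue", a pure relevance ranking (`lam=1.0`) returns the two near-duplicates, whereas `lam=0.5` replaces the second one with the rescue sentence.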

    MDS may cluster similar documents and passages. The summary should include sufficient context and the right level of detail. Factual inconsistencies across documents can be reported. Finally, users must be allowed to filter out irrelevant content, dig deeper into the sources via attribution, or compare related passages across documents.

  • How does text summarization vary across domains or contexts?
    IBM Science Summarizer for computer science domain. Source: Erera et al. 2019, fig. 1.

    Summarization must tune its output to each domain or context. For example, summarization of a news article would involve different considerations from that of a corporate sales report.

    General text summarization techniques might not do well in specific domains. Summarizers might therefore use domain-specific knowledge. CaseSummarizer is one such tool for summarizing legal documents. In the biomedical domain, summaries are created of literature, treatments, drug information, clinical notes, health records, and more.

    Summarizing scientific literature is a challenge due to length, complexity, and structure (tables and figures). IBM Science Summarizer is a tool that IBM created to summarize computer science publications. It extracts domain-specific entities of types task, dataset and metric.

    Often there are extra clues about what might be important in a document. Summarization can use these for content selection. For example, comments and discussions on a blog post point to interesting content segments. Likewise, citations in scientific papers are useful pointers. For web summarization, it's possible to look at other pages linking to a particular page and determine the most suitable sentences.

  • How has machine learning been applied to text summarization?
    Some features used by an ML classifier for text summarization. Source: Wong et al. 2008, tables 1-3.

    The common ML approach is to view text summarization as a classification problem. The algorithm is trained in a supervised manner on the original text, an extractive summary and a set of features. It learns to classify sentences as either summary sentences or non-summary sentences.

    Classifiers could be based on naive-Bayes, decision trees, SVM, HMM, and CRF. Often each sentence is classified independently of others. However, since HMM and CRF capture dependencies, they outperform other techniques.

    The problem with supervised algorithms is in creating labelled data for training. This problem is worse for MDS. In a semi-supervised approach, a small amount of labelled data is used along with a much larger amount of unlabelled data. The algorithm learns iteratively by classifying some unlabelled data in each iteration.

  • Could you describe neural network architectures for text summarization?
    Pointer-generator network. Source: See et al. 2017, fig. 3.

    The typical approach is to do sequence-to-sequence modelling since input is a sequence of words and the summary is also a sequence of words. In an encoder-decoder architecture, the encoder uses LSTM to give an input representation. The decoder is also an LSTM that generates the output sequence. An attention layer between the encoder and the decoder helps in determining the most relevant words for the summary.
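
    To make the role of the attention layer concrete, here is a minimal sketch in plain Python. It uses dot-product scoring as a simplifying assumption (published summarization models often use a learned additive score instead), and the vectors stand in for LSTM hidden states. The names `attend` and `softmax` are chosen for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_states):
    """Return attention weights over encoder states and the context vector.

    Each encoder state is scored by its dot product with the decoder
    state; the context vector is the attention-weighted sum of states.
    """
    scores = [sum(d * e for d, e in zip(decoder_state, h))
              for h in encoder_states]
    weights = softmax(scores)
    dim = len(decoder_state)
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context
```

    The weights sum to one and are largest for the input positions whose states align best with the current decoder state. The decoder then conditions its next word on the context vector.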

    Seq2seq models, LSTMs and attention layers have made abstractive summarization possible, even if they're not yet state-of-the-art compared to extractive summarization methods. These models are trained end-to-end without modelling each step of a traditional summarization pipeline. They also don't need a specialized vocabulary or pre-processing. This end-to-end approach has been applied successfully to short output sequences, such as news headlines or short email responses.

    In a pointer-generator network, a generator provides new words whereas a pointer copies words from source text. Seq2seq models often produce repetitive sentences. A coverage model avoids repetitions.
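
    The interplay of generator, pointer and coverage can be sketched as follows, loosely following the formulation in See et al. 2017. A generation probability `p_gen` mixes the vocabulary distribution with the attention distribution over source tokens, and a coverage penalty grows when the model attends again to positions it has already covered. The function names and toy inputs are assumptions for this example.

```python
def final_distribution(p_gen, vocab_dist, attention, source_tokens):
    """Mix generating from the vocabulary with copying from the source.

    P(w) = p_gen * P_vocab(w) + (1 - p_gen) * (attention mass on w).
    Source words missing from the vocabulary can still get probability.
    """
    final = {word: p_gen * p for word, p in vocab_dist.items()}
    for weight, token in zip(attention, source_tokens):
        final[token] = final.get(token, 0.0) + (1.0 - p_gen) * weight
    return final


def coverage_loss(attention_steps):
    """Sum over decoder steps of sum_i min(attention_i, coverage_i).

    Coverage is the running sum of attention from earlier steps, so
    re-attending to the same source positions is penalized.
    """
    coverage = [0.0] * len(attention_steps[0])
    loss = 0.0
    for attention in attention_steps:
        loss += sum(min(a, c) for a, c in zip(attention, coverage))
        coverage = [c + a for c, a in zip(coverage, attention)]
    return loss
```

    In the first function, a source word such as a rare proper name receives probability purely through copying. In the second, attending twice to the same position yields a positive loss, while attending to different positions yields zero.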

    Fernandes et al. showed that sequence encoders with a graph component do better at capturing long-distance relationships.

  • How do I evaluate text summarization algorithms?

    Human evaluation is the simplest. In 2004, Recall-Oriented Understudy for Gisting Evaluation (ROUGE) was created to automate evaluation by comparing against hand-crafted summaries. ROUGE-N, ROUGE-L, ROUGE-W, ROUGE-S, and ROUGE-SU are some metrics in this family.
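
    As an illustration of the idea behind ROUGE-N, the sketch below computes n-gram recall of a candidate summary against a single reference summary. Whitespace tokenization, the single reference and the function names are simplifying assumptions; the official toolkit supports multiple references, stemming and further options.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """Fraction of reference n-grams that also appear in the candidate.

    Counts are clipped, so an n-gram repeated in the candidate is only
    credited as often as it occurs in the reference.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total
```

    For the candidate "the cat sat on the mat" and the reference "the cat was on the mat", five of the six reference unigrams are matched (ROUGE-1 recall of 5/6) and three of the five reference bigrams are matched (ROUGE-2 recall of 0.6).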

    Different people produce different summaries of the same text. Meaning shared across different human summaries is called a Summary Content Unit (SCU). With a focus on meaning, the Pyramid Method evaluates a summary using SCUs.

    While there's no universal system of metrics, text summarizers are typically evaluated based on TREC, DUC and MUC systems. DUC (2001-2007) became a summarization track in TAC (2008-).

    Datasets for supervised training of MDS algorithms are not common. For summarizing a single or a few documents, commonly used datasets are Gigaword, CNN/DailyMail, TAC (2008-2011) and DUC (2003-2004). ELI5 and WikiSum can be used for longform question answering and MDS respectively. Opinosis is a dataset of 51 article-summary pairs.

    Released in 2018, Cornell Newsroom is the largest dataset for training and evaluating summarization systems. Spanning 1998-2017 and containing 1.3 million articles, it's been collected from newsrooms of 38 major publications. Summaries are obtained from search and social metadata.

  • What are some useful resources for text summarization?
    MDSWriter is a useful annotation tool for multi-document summarization. Source: Meyer et al. 2016, fig. 1.

    Pengfei Liu has curated a useful list of datasets, research papers, and groups researching on text summarization.

    In Python, Gensim versions before 4.0 include a module for text summarization, which implements the TextRank algorithm. The module was removed in Gensim 4.0. An original implementation of the same algorithm is available as the PyTextRank package. PyTeaser is a Python implementation of Scala's TextTeaser.

    Back in 2016, Google released a baseline TensorFlow implementation for summarization.

Milestones

Apr
1958
Ignore too common words and least frequent words. Source: Luhn 1958, fig. 1.

Luhn makes use of word frequencies to determine sentences most significant for summarization. Frequently occurring words close to one another suggest significant sentences. Thresholds are set to ignore most frequent and least frequent words. For example, in biology, the word 'cell' is too common and can be ignored. Luhn's algorithm, extractive in nature, is simple in that it doesn't merge word variations (differ, different, differently).
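
The idea can be sketched in a few lines. The sketch assumes the commonly described form of Luhn's significance factor: within a sentence, significant words separated by at most a few insignificant words form a cluster, and the cluster scores the square of its significant-word count divided by its length. The gap of 4, the explicit count thresholds and the function names are assumptions for this illustration.

```python
import re
from collections import Counter


def frequency_band(text, min_count, max_count):
    """Words whose frequency lies between the two thresholds (inclusive).

    Words above max_count are treated as too common and words below
    min_count as too rare.
    """
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    return {w for w, c in counts.items() if min_count <= c <= max_count}


def luhn_score(sentence, significant, max_gap=4):
    """Best cluster score in the sentence: (significant words)^2 / span."""
    words = re.findall(r"[a-z]+", sentence.lower())
    positions = [i for i, w in enumerate(words) if w in significant]
    if not positions:
        return 0.0
    best = 0.0
    start = prev = positions[0]
    count = 1
    for pos in positions[1:]:
        if pos - prev - 1 <= max_gap:
            count += 1          # still inside the current cluster
        else:
            best = max(best, count ** 2 / (prev - start + 1))
            start, count = pos, 1   # start a new cluster
        prev = pos
    return max(best, count ** 2 / (prev - start + 1))
```

Sentences are then ranked by this score and the top ones are extracted. No stemming is done here, in line with the note above that word variations are not merged.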

Apr
1969

In addition to word frequencies, Edmundson makes use of pragmatic or cue words, title and heading words, and structural indicators such as sentence location. He notes that these improve text extraction. Example cue words are 'significant', 'impossible' and 'hardly'. They're classified as positively relevant, negatively relevant and irrelevant. He hypothesizes that significant sentences or paragraphs occur very early and very late in the section or document. He also observes that future algorithms must consider language syntax and semantics. Statistical evidence alone is inadequate.
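
One way to picture how such evidence is combined is a weighted sum of the four feature scores. The sketch below is an assumption-laden toy: the feature definitions, the equal default weights and the name `edmundson_score` are invented for illustration and are not the values or procedures from the paper.

```python
def edmundson_score(sentence, index, total, title, cue_weights, key_words,
                    weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of cue, key, title and location evidence.

    cue_weights maps cue words to +1 (positively relevant) or -1
    (negatively relevant); irrelevant words are simply absent.
    """
    words = sentence.lower().split()
    title_words = set(title.lower().split())
    cue = sum(cue_weights.get(w, 0) for w in words)
    key = sum(1 for w in words if w in key_words)
    title_score = sum(1 for w in words if w in title_words)
    # Location: reward the first and last sentence of the section.
    location = 1 if index in (0, total - 1) else 0
    a_cue, a_key, a_title, a_loc = weights
    return a_cue * cue + a_key * key + a_title * title_score + a_loc * location
```

A sentence that opens the section, contains a positive cue word, a frequent key word and a title word scores high, whereas a mid-section sentence with a negative cue word such as 'hardly' scores below zero.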

1995

Kupiec et al. implement a supervised machine learning algorithm based on the naive-Bayes classifier. The algorithm is trained on hand-selected extracts. The features considered include sentence length cut-off, fixed-phrase, paragraph, thematic word, and uppercase word. For example, the model ignores short sentences. It picks out thematic words, proper names and acronyms. Words such as 'conclusions', 'summary' or 'discussion' are more likely to be in the summary.
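
A minimal naive-Bayes sentence classifier over binary features might look like the sketch below. The Laplace smoothing, the two toy features and the function names are assumptions made for this example; they are not taken from the paper.

```python
import math


def train_naive_bayes(feature_rows, labels):
    """Estimate priors and per-feature likelihoods from labelled sentences.

    feature_rows: tuples of 0/1 feature values, one tuple per sentence.
    labels: 1 if the sentence is in the hand-selected extract, else 0.
    """
    model = {}
    n_features = len(feature_rows[0])
    for label in (0, 1):
        rows = [r for r, y in zip(feature_rows, labels) if y == label]
        prior = (len(rows) + 1) / (len(labels) + 2)
        # Laplace-smoothed P(feature_j = 1 | label)
        likelihoods = [(sum(r[j] for r in rows) + 1) / (len(rows) + 2)
                       for j in range(n_features)]
        model[label] = (prior, likelihoods)
    return model


def summary_probability(model, features):
    """P(sentence belongs in the summary | features), features independent."""
    log_scores = {}
    for label, (prior, likelihoods) in model.items():
        log_p = math.log(prior)
        for value, p in zip(features, likelihoods):
            log_p += math.log(p if value else 1.0 - p)
        log_scores[label] = log_p
    m = max(log_scores.values())
    exp = {k: math.exp(v - m) for k, v in log_scores.items()}
    return exp[1] / (exp[0] + exp[1])
```

With features such as (long enough, contains a fixed phrase), sentences are ranked by this probability and the top fraction forms the extract.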

Dec
1997
Tree as an abstraction of discourse structure. Source: Marcu 1997, fig. 2.1.

For his PhD thesis on text summarization, Marcu takes inspiration from Rhetorical Structure Theory (RST). He looks at the rhetorical relation between two non-overlapping text spans called nucleus and satellite. Examples of such relations are justification, evidence, restatement, and concession. Text is decomposed into smaller units connected by rhetorical relations. In the example, Justification is the relation between Mars weather and its distant orbit.

Apr
2000
An overview of clustering for text summarization. Source: Kumar et al. 2016, fig. 4.

Radev et al. propose centroid-based summarization for multi-document summarization. Similar documents and sentences are grouped into clusters. Each cluster may represent a different sub-topic. A cluster centroid is a pseudo-document representative of the cluster. The summary would include sentences similar to the centroids.

Oct
2000
Multi-document graph. Source: Radev 2000, fig. 4.

Since RST is limited to single documents, Radev introduces Cross-document Structure Theory (CST) for multi-document summarization. He proposes multi-document graphs as a useful abstraction to represent relations at word, phrase, paragraph and document levels. He identifies 24 cross-document relations, such as Identity (same text), Subsumption (one sentence is contained in another), and Follow-up (additional information reflecting new developments). Summarization is done in four steps: clustering, document structure analysis, link analysis, and personalized graph-based summarization.

May
2004

Barzilay and Lee propose a domain-sensitive content model. They use a Hidden Markov Model (HMM) in which the states are domain topics and each state generates sentences relevant to its topic. State transitions model topic change. An n-gram model is used to generate sentences. This model jointly learns both content selection and information ordering.

Jul
2004

Inspired by Google's PageRank algorithm, Mihalcea proposes TextRank, a graph-based algorithm. Each sentence is a node in the graph. Edges correspond to sentence similarities using a metric such as cosine similarity. A weighted graph is constructed from the text. A ranking algorithm (such as HITS, the Positional Power Function (POS) or PageRank) is run on the graph. Graph nodes with the best scores are selected for the summary.
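
The ranking step can be sketched with a weighted PageRank iteration over a sentence similarity graph. Bag-of-words cosine similarity, a damping factor of 0.85, a fixed number of iterations and the function names are assumptions for this illustration.

```python
import math
import re
from collections import Counter


def bow(sentence):
    """Bag-of-words vector as a Counter of lowercase tokens."""
    return Counter(re.findall(r"[a-z]+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two Counter vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def textrank(sentences, damping=0.85, iterations=50):
    """Score sentences by weighted PageRank on their similarity graph."""
    vectors = [bow(s) for s in sentences]
    n = len(sentences)
    # Edge weights: pairwise similarity, no self-loops.
    weights = [[cosine(vectors[i], vectors[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        # Each node shares its score with neighbours in proportion
        # to edge weight; isolated nodes pass nothing on.
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0)
            for i in range(n)
        ]
    return scores
```

A sentence with no words in common with the others keeps only the baseline score of `1 - damping`, while sentences strongly connected to many others rise to the top.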

2006

Wu proposes event-based summarization. Event terms could be verbs (incorporate) or action nouns (incorporation). Event elements are typically named entities (Person, Organisation, Location, Time). A document is represented as an event map on which the PageRank algorithm is employed. The work of Li et al. is also event-based and it looks at intra-event and inter-event relevance.

Sep
2015

Rush et al. apply neural networks for abstractive summarization. Previous work on abstractive summarization relied on linguistic constraints or syntactic transformations. The proposed approach applies a neural language model along with an attention-based input encoder. They experiment with three different encoders: bag-of-words, convolutional (TDNN) and attention-based. The model using attention-based encoder performs best. Experiments are limited to headline generation based on only the first sentence. The model is trained on English Gigaword corpus. This work is improved by many others in 2016.

Aug
2016
Hierarchical encoder with hierarchical attention. Source: Nallapati et al. 2016, fig. 3.

Nallapati et al. use an attentional encoder-decoder RNN for abstractive summarization. Input embedding is feature-rich with word, POS, NER, TF, and IDF. A pointer-generator model handles rare or OOV words. The attention mechanism is hierarchical at word and sentence levels. Since existing datasets are limited to single sentence summaries, they present a new dataset from CNN/DailyMail news stories with an average of 53 words and 3.72 sentences in the summaries. This work establishes a baseline for abstractive summarization of long texts.

Jan
2018
Original self-attention decoder (left) and its modified versions. Source: Liu et al. 2018, fig. 1.

As an exercise in multi-document summarization, Liu et al. attempt to generate Wikipedia articles. In the extractive stage, they select the most important content tokens. For the abstractive stage, they use a scalable decoder-only transformer architecture in which input and output sequences are combined into a single sequence. To make it scale for longer sequences, they introduce memory-compressed attention and local attention. The final model has five layers alternating between memory-compressed and local attention.

Oct
2019
Use of a knowledge graph and attention to generate answer to a question. Source: Fan et al. 2019, fig. 5.

Fan et al. show that using knowledge graph representations of the text as input to a seq2seq model gives better performance. The graph is linearized before it's given to a transformer encoder. Graph construction involves merging nodes and resolving coreferences.

Sep
2019
Architecture of BERTSUM. Source: Liu 2019, fig. 1.

Liu proposes BERTSUM, a modification of BERT for summarization. The model encodes multiple sentences as a single input sequence. Interval segment embeddings are used to distinguish the sentences. For fine-tuning and capturing document-level features, he tries different summarization layers: simple classifier, RNN, inter-sentence transformer. He finds that a two-layer inter-sentence transformer performs best.
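
The input construction can be sketched as follows. Each sentence is wrapped in `[CLS]` and `[SEP]` tokens, and segment ids alternate between 0 and 1 from one sentence to the next. Whitespace splitting stands in for BERT's WordPiece tokenizer, and the function name is hypothetical.

```python
def bertsum_inputs(sentences):
    """Build tokens, interval segment ids and [CLS] positions.

    The vector at each [CLS] position would later serve as the
    representation of the sentence that follows it.
    """
    tokens, segment_ids, cls_positions = [], [], []
    for i, sentence in enumerate(sentences):
        cls_positions.append(len(tokens))
        piece = ["[CLS]"] + sentence.split() + ["[SEP]"]
        tokens.extend(piece)
        # Odd and even sentences get different segment ids.
        segment_ids.extend([i % 2] * len(piece))
    return tokens, segment_ids, cls_positions
```

For three sentences, the segment ids form the pattern 0…0, 1…1, 0…0, and the summarization layers score the three `[CLS]` vectors to decide which sentences to extract.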

References

  1. Allahyari, Mehdi, Seyedamin Pouriyeh, Mehdi Assefi, Saeid Safaei, Elizabeth D. Trippe, Juan B. Gutierrez, and Krys Kochut. 2017. "Text Summarization Techniques: A Brief Survey." arXiv, v3, July 28. Accessed 2020-02-20.
  2. Barzilay, Regina, and Lillian Lee. 2004. "Catching the Drift: Probabilistic Content Models, with Applications to Generation and Summarization." Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics: HLT-NAACL 2004, pp. 113-120, May. Accessed 2020-02-20.
  3. Brownlee, Jason. 2017. "A Gentle Introduction to Text Summarization." Machine Learning Mastery, August 7. Accessed 2020-02-20.
  4. Chauhan, Kushal. 2018. "Unsupervised Text Summarization using Sentence Embeddings." Jatana, on Medium, August 6. Accessed 2020-02-20.
  5. DUC. 2014. "Document Understanding Conferences: Homepage." NIST, September 9. Accessed 2020-02-20.
  6. Das, Dipanjan, and André F. T. Martins. 2007. "A Survey on Automatic Text Summarization." Carnegie Mellon University, November 21. Accessed 2020-02-20.
  7. Edmundson, H. P. 1969. "New Methods in Automatic Extracting." Journal of the ACM, vol. 16, no. 2, pp. 264-285, April. doi:10.1145/321510.321519. Accessed 2020-02-20.
  8. Erera, Shai, Michal Shmueli-Scheuer, Guy Feigenblat, Ora Peled Nakash, Odellia Boni, Haggai Roitman, Doron Cohen, Bar Weiner, Yosi Mass, Or Rivlin, Guy Lev, Achiya Jerbi, Jonathan Herzig, Yufang Hou, Charles Jochim, Martin Gleize, Francesca Bonin, and David Konopnicki. 2019. "A Summarization System for Scientific Documents." Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 211-216, November. Accessed 2020-02-20.
  9. Fan, Angela, Claire Gardent, Chloe Braud, and Antoine Bordes. 2019. "Using Local Knowledge Graph Construction to Scale Seq2Seq Models to Multi-Document Inputs." arXiv, v1, October 18. Accessed 2020-02-20.
  10. Fernandes, Patrick, Miltiadis Allamanis, and Marc Brockschmidt. 2019. "Structured Neural Summarization." arXiv, v2, February 20. Accessed 2020-02-20.
  11. Goldstein, Jade, Vibhu Mittal, Jaime Carbonell, and Mark Kantrowitz. 2000. "Multi-Document Summarization By Sentence Extraction." NAACL-ANLP 2000 Workshop: Automatic Summarization. Accessed 2020-02-20.
  12. Jurafsky, Daniel, and James H. Martin. 2009. "Question Answering and Summarization." Chapter 23 in: Speech and Language Processing, Second Edition, Prentice-Hall, Inc. Accessed 2020-02-20.
  13. Kumar, Yogan Jaya, Ong Sing Goh, Halizah Basiron, Ngo Hea Choon, and Puspalata C Suppiah. 2016. "A Review on Automatic Text Summarization Approaches." J. of Comp. Sci., Science Publications, April 29. Accessed 2020-02-20.
  14. Kupiec, Julian, Jan Pedersen, and Francine Chen. 1995. "A trainable document summarizer." SIGIR '95: Proceedings of the 18th annual international ACM SIGIR conference on Research and development in information retrieval, pp. 68-73, July. doi:10.1145/215206.215333. Accessed 2020-02-20.
  15. Lebanoff, Logan, Kaiqiang Song, and Fei Liu. 2018. "Adapting the Neural Encoder-Decoder Framework from Single to Multi-Document Summarization." arXiv, v2, August 28. Accessed 2020-02-20.
  16. Li, Wenchen. 2017. "Text summarization: applications." Medium, May 25. Accessed 2020-02-20.
  17. Li, Wenjie, Mingli Wu, Qin Lu, Wei Xu, and Chunfa Yuan. 2006. "Extractive Summarization using Inter- and Intra- Event Relevance." Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, pp. 369-376, July. Accessed 2020-02-20.
  18. Liu, Yang. 2019. "Fine-tune BERT for Extractive Summarization." arXiv, v2, September 5. Accessed 2020-02-20.
  19. Liu, Pengfei. 2020. "Modern History for Text Summarization." NLP Historiograpy. Accessed 2020-02-20.
  20. Liu, Peter, and Xin Pan. 2016. "Text summarization with TensorFlow." Google AI Blog, August 24. Accessed 2020-02-20.
  21. Liu, Peter J., Mohammad Saleh, Etienne Pot, Ben Goodrich, Ryan Sepassi, Lukasz Kaiser, and Noam Shazeer. 2018. "Generating Wikipedia by Summarizing Long Sequences." arXiv, v1, January 30. Accessed 2020-02-20.
  22. Luhn, H. P. 1958. "The automatic creation of literature abstracts." IBM Journal of Research and Development, pp. 159-165, April. doi:10.1147/rd.22.0159. Accessed 2020-02-20.
  23. Marcu, Daniel. 1997. "The Rhetorical Parsing, Summarization, and Generation of Natural Language Texts." PhD Thesis, University of Toronto, December. Accessed 2020-02-20.
  24. Mathur, Pranay, Aman Gill, and Aayush Yadav. 2017. "Text Summarization in Python: Extractive vs. Abstractive techniques revisited." Rare Technologies, April 5. Accessed 2020-02-20.
  25. Meyer, Christian M., Darina Benikova, Margot Mieskes, and Iryna Gurevych. 2016. "MDSWriter: Annotation Tool for Creating High-Quality Multi-Document Summarization Corpora." Proceedings of ACL-2016 System Demonstrations, pp. 97-102, August. Accessed 2020-02-20.
  26. Mihalcea, Rada. 2004. "Graph-based Ranking Algorithms for Sentence Extraction, Applied to Text Summarization." Proceedings of the ACL Interactive Poster and Demonstration Sessions, pp. 170-173, July. Accessed 2020-02-20.
  27. Moradi, Milad, and Nasser Ghadiri. 2019. "Text Summarization in the Biomedical Domain." arXiv, v1, August 6. Accessed 2020-02-20.
  28. Nallapati, Ramesh, Bowen Zhou, Cicero dos Santos, Çağlar Gülçehre, and Bing Xiang. 2016. "Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond." Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning, ACL, pp. 280-290, August. Accessed 2020-02-20.
  29. Opidi, Alfrick. 2019. "A Gentle Introduction to Text Summarization in Machine Learning." Blog, FloydHub, April 15. Accessed 2020-02-20.
  30. Pai, Aravind. 2019. "Comprehensive Guide to Text Summarization using Deep Learning in Python." Blog, Analytics Vidhya, June 10. Accessed 2020-02-20.
  31. Pawar, Manish. 2018. "Ai Text Summarizer." Medium, November 20. Accessed 2020-02-20.
  32. Polsley, Seth, Pooja Jhunjhunwala, and Ruihong Huang. 2016. "CaseSummarizer: A System for Automated Summarization of Legal Texts." Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: System Demonstrations, pp. 258-262, December. Accessed 2020-02-20.
  33. Radev, Dragomir. 2000. "A Common Theory of Information Fusion from Multiple Text Sources Step One: Cross-Document Structure." 1st SIGdial Workshop on Discourse and Dialogue, ACL, pp. 74-83, October. Accessed 2020-02-20.
  34. Radev, Dragomir R., Hongyan Jing, and Malgorzata Budzikowska. 2000. "Centroid-based summarization of multiple documents: sentence extraction, utility-based evaluation, and user studies." NAACL-ANLP 2000 Workshop: Automatic Summarization, v2, April. Accessed 2020-02-20.
  35. Ratia, Tomas. 2018. "20 Applications of Automatic Summarization in the Enterprise." Blog, Frase, July 17. Accessed 2020-02-20.
  36. Rush, Alexander M., Sumit Chopra, and Jason Weston. 2015. "A Neural Attention Model for Abstractive Sentence Summarization." Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 379-389, September. Accessed 2020-02-20.
  37. See, Abigail, Peter J. Liu, and Christopher D. Manning. 2017. "Get To The Point: Summarization with Pointer-Generator Networks." arXiv, v2, April 25. Accessed 2020-02-20.
  38. Wong, Kam-Fai, Mingli Wu, and Wenjie Li. 2008. "Extractive Summarization Using Supervised and Semi-Supervised Learning." Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008), pp. 985-992, August. Accessed 2020-02-20.
  39. Wu, Mingli. 2006. "Investigations on Event-Based Summarization." Proceedings of the COLING/ACL 2006 Student Research Workshop, pp. 37-42, July. Accessed 2020-02-20.
  40. i2 Decisions. 2019. "Text Summarization." Case Studies, i2 Decisions, April 5. Updated 2019-05-21. Accessed 2020-02-20.

Further Reading

  1. Jurafsky, Daniel and James H. Martin. 2009. "Question Answering and Summarization." Chapter 23 in: Speech and Language Processing, Second Edition, Prentice-Hall, Inc. Accessed 2020-02-20.
  2. Allahyari, Mehdi, Seyedamin Pouriyeh, Mehdi Assefi, Saeid Safaei, Elizabeth D. Trippe, Juan B. Gutierrez, and Krys Kochut. 2017. "Text Summarization Techniques: A Brief Survey." arXiv, v3, July 28. Accessed 2020-02-20.
  3. Das, Dipanjan, and André F. T. Martins. 2007. "A Survey on Automatic Text Summarization." Carnegie Mellon University, November 21. Accessed 2020-02-20.
  4. Pai, Aravind. 2019. "Comprehensive Guide to Text Summarization using Deep Learning in Python." Blog, Analytics Vidhya, June 10. Accessed 2020-02-20.
  5. Paulus, Romain, Caiming Xiong, and Richard Socher. 2020. "Your TL;DR by an AI: A Deep Reinforced Model for Abstractive Summarization." Salesforce Einstein, Salesforce. Accessed 2020-02-20.
  6. Chauhan, Kushal. 2018. "Unsupervised Text Summarization using Sentence Embeddings." Jatana, on Medium, August 6. Accessed 2020-02-20.

Cite As

Devopedia. 2020. "Text Summarization." Version 2, February 21. Accessed 2024-06-25. https://devopedia.org/text-summarization

Last updated on
2020-02-21 17:22:09
