Natural Language Generation

NLG and some of its sub-fields. Source: Santhanam and Shaikh 2019, fig. 1.

There's a lot of structured data that's often easier to understand when described in natural language. Highlights from a financial spreadsheet, next week's weather prediction, and a short summary of a long technical report are some examples. Natural Language Generation (NLG) is the process of generating descriptions or narratives in natural language from structured data.

NLG is a sub-field of Natural Language Processing (NLP). NLG often works closely with Natural Language Understanding (NLU), another sub-field of NLP. Where the input is unstructured text, NLU helps produce a structured representation that NLG can consume. Generating language engages more of the human brain than understanding it; likewise, computers may find NLG a more difficult task than NLU.

A mature NLG system can free humans from mundane writing, create narratives quickly, enable almost real-time reporting, and streamline operations.

Discussion

  • What are some applications of NLG?
    When comparing products, NLG is used to explain recommendations. Source: CoGenTex 2020.

    There are plenty of practical NLG applications: analysis for business intelligence dashboards, IoT device status and maintenance reporting, individual client financial portfolio summaries, personalized customer communications, and more. NLG, along with NLU, is at the core of chatbots and voice assistants.

    A familiar example is Gmail's Smart Compose, which reads email content and suggests short responses. The Associated Press uses NLG to generate thousands of corporate earnings reports in seconds. During the December 2019 UK elections, BBC News published 689 local stories, totalling about 100K words, in 10 hours. Public datasets served as the input, and articles were generated in a style and tone suited to local audiences.

    In the computer science domain, NLG has been used to write specifications from UML diagrams and to describe source code changes.

    Forge.ai processes unstructured data using mainly supervised NLU models. Since training data is limited, they overcome this by using NLG to synthesize training data from human-annotated examples and knowledge sources. Similarly, synthesized electronic health records can enable de-identified data sharing among healthcare providers and train ML models.

  • What's expected of a good NLG system?
    Overview of an NLG system. Source: Bateman and Zock 2005, fig. 15.1.

    NLG must include in its response the information that's most relevant to the user in the current context, at an appropriate level of detail. For effective communication, information must be presented in a sensible order, so organization and structure are important. One sentence must follow logically from another. Likewise, NLG must know when to group sentences into paragraphs or even sections.

    NLG must conform to the syntax of the language. In addition, it's a good idea to use common expressions. A sophisticated system could be trainable, domain-independent, and even capable of sarcasm and idiomatic expressions.

    Style matters. If the intent is to present a high-level financial summary to top management, the writing style must be formal, concise and fact-based. If the intent is to convince readers of a certain point of view, the style could be argumentative with references to supporting evidence.

    On the whole, NLG is about making choices and pursuing specific communicative goals.

  • What's the typical pipeline in an NLG application?
    Workings of the Arria NLG Engine. Source: Arria 2016.

    A typical NLG pipeline has these basic steps (a minimal code sketch follows the list):

    • Content Determination and Text Planning: Determine the information to be communicated. Structure the information. It's also called Macro Planning or Document Planning. Information could come from a knowledge base. Selection of information must consider goals and preferences of both writer and reader.
    • Sentence Planning: Decide how information must be split into sentences and paragraphs. Plan for a flowing narrative. Also called Micro Planning, this involves techniques such as referring expressions, aggregation, lexicalization and grammaticalization.
    • Surface Realization: Generate individual sentences in a grammatically correct manner. This involves syntax selection and inflection.
    • Physical Presentation: The output could be written or spoken text. This step handles punctuation and layout for written output, or articulation for speech.
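
    Here's a minimal Python sketch of the first three stages for a toy weather report; the field names, thresholds and phrasing rules are invented for illustration and not taken from any real system.

    ```python
    # Toy three-stage NLG pipeline: content determination, sentence
    # planning, surface realization. All rules here are illustrative.
    ORDER = {"temperature": 0, "rain": 1}   # narrative ordering rule

    data = {"city": "Aberdeen", "temp_max_c": 6, "rain_prob": 0.8}

    def determine_content(record):
        """Content determination: select the facts worth reporting."""
        messages = [("temperature", record["city"], record["temp_max_c"])]
        if record["rain_prob"] >= 0.5:      # mention rain only when likely
            messages.append(("rain", record["city"], record["rain_prob"]))
        return messages

    def plan_sentences(messages):
        """Sentence planning: order messages into a sensible narrative."""
        return sorted(messages, key=lambda m: ORDER[m[0]])

    def realize(message):
        """Surface realization: render one message as a grammatical sentence."""
        kind, city, value = message
        if kind == "temperature":
            return f"Tomorrow, {city} will reach a high of {value} degrees Celsius."
        return f"The chance of rain is {int(value * 100)} percent."

    report = " ".join(realize(m) for m in plan_sentences(determine_content(data)))
    print(report)
    # Tomorrow, Aberdeen will reach a high of 6 degrees Celsius.
    # The chance of rain is 80 percent.
    ```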
  • Could you describe the main components or tasks in NLG?
    Illustrating a few components of NLG in medical domain. Source: Gatt and Krahmer 2018, fig. 1.

    We describe a few NLG components (a toy code sketch follows the list):

    • Aggregation: Combining two sentences into one using a conjunction, such as "Sam has high blood pressure and low blood sugar." Aggregation could also be applied to paragraphs and higher-order structures.
    • Lexicalization: This is about word choice. Between 'depart' and 'leave', "the train departed" is more formal than "the train left".
    • Referring Expression Generation: Using preceding context, select words or phrases to identify domain entities. An example is the use of pronouns, such as "I just saw Mrs. Black. She has a high temperature."
    • Using Discourse Markers: The inclusion of the word 'also' makes this sentence more fluent: "If Sam goes to the hospital, he should also go to the store."
    • Linguistic Realization: Consider the sentence "There are 20 trains each day from Aberdeen to Glasgow." Following syntactic rules, NLG has added the function words 'from' and 'to'. Due to morphology, the plural form of 'train' has been used. Due to orthography, the first word is capitalized and a full stop added at the end.
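
    As a toy illustration of two of these components, here's a hedged Python sketch of aggregation and referring expression generation; the rules are simplistic placeholders for the rich grammars and discourse models real systems use.

    ```python
    # Toy micro-planning rules: aggregation and referring expressions.
    def aggregate(subject, predicates):
        """Aggregation: merge predicates sharing a subject with a conjunction."""
        return f"{subject} {' and '.join(predicates)}."

    def refer(entity, mentioned, pronoun):
        """Referring expression generation: use a pronoun on repeat mention."""
        if entity in mentioned:
            return pronoun
        mentioned.add(entity)
        return entity

    print(aggregate("Sam", ["has high blood pressure", "low blood sugar"]))
    # Sam has high blood pressure and low blood sugar.

    seen = set()
    first = refer("Mrs. Black", seen, "she")
    second = refer("Mrs. Black", seen, "she").capitalize()
    print(f"I just saw {first}. {second} has a high temperature.")
    # I just saw Mrs. Black. She has a high temperature.
    ```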
  • What are the main technical approaches to building NLG systems?
    Illustrating the use of Markov Chain in NLG. Source: Dejeu 2017.

    There are two broad approaches to NLG:

    • Template-based: Texts have a pre-defined structure with gaps that are filled with data (see the sketch after this list). These systems evolved to allow greater control via scripts and business rules, and developed further to include linguistic capability to generate grammatically correct text.
    • Dynamic Creation: At a micro level, sentences are created dynamically from semantic representations and a desired linguistic structure. At a macro level, sentences are organized into a logical narrative as suited for the audience and the purpose of communication.
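
    A minimal sketch of the template-based approach with one business rule; the template and field names are invented for illustration.

    ```python
    # Template-based NLG: a fixed skeleton with gaps, filled from data.
    TEMPLATE = ("{company} reported revenue of ${revenue}M, "
                "{direction} {change}% year over year.")

    def fill(record):
        # Business rule: choose wording based on the sign of the change.
        direction = "up" if record["change"] >= 0 else "down"
        return TEMPLATE.format(company=record["company"],
                               revenue=record["revenue"],
                               direction=direction,
                               change=abs(record["change"]))

    print(fill({"company": "Acme Corp", "revenue": 120, "change": -3.5}))
    # Acme Corp reported revenue of $120M, down 3.5% year over year.
    ```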

    For dynamic creation of text, a Markov Chain model was initially applied. Given the current word and its relationships with other words, this model predicts or generates the next word. However, such models often lacked structure and context.

    More recently, Language Modelling has been applied to NLG. Given recent words, the model predicts the next word. An n-gram model is a specific way to build a language model. Later, neural networks were successfully applied to language modelling.
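
    The following sketch shows both ideas on a toy corpus: a first-order Markov chain is exactly a bigram language model that predicts the next word from the current one.

    ```python
    import random
    from collections import Counter, defaultdict

    # Count bigrams: for each word, how often each next word follows it.
    corpus = ("the train departed . the train left on time . "
              "the train arrived late .").split()
    bigrams = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        bigrams[prev][nxt] += 1

    def generate(word, max_words=8):
        out = [word]
        for _ in range(max_words):
            followers = bigrams[word]
            if not followers:           # no observed continuation
                break
            # Sample the next word proportionally to bigram counts.
            word = random.choices(list(followers),
                                  weights=followers.values())[0]
            out.append(word)
            if word == ".":
                break
        return " ".join(out)

    print(generate("the"))   # e.g. "the train left on time ."
    ```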

    Some classify NLG into Basic NLG, Templated NLG and Advanced NLG.

  • How have neural networks been applied to build NLG models?
    BERT used for question generation, one token at a time. Source: Chan and Fan 2019, fig. 2.

    RNNs, LSTMs and transformer networks have been used for NLG models. RNNs exploit the sequential nature of text and "remember" previous words to predict the next word. LSTMs are better than RNNs at remembering longer word sequences. They also have a mechanism to selectively forget when the context changes.

    Transformers have become popular due to their state-of-the-art performance. They look at the relationships among words in context. All words are represented individually in the vector space instead of reducing them to a single fixed-length vector. Transformers have been used to build language models and thereby enable NLG. Two popular ones are BERT and GPT-2. However, it's been remarked that GPT-2 lacks language understanding.

    Neural networks typically take an end-to-end approach to NLG. We could use an encoder-decoder architecture along with an attention layer. The encoder produces a hidden representation of the input text. The decoder generates the text. Compared to template-based systems, neural network models give less control but are more flexible and expressive.
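
    As a concrete illustration, here's a short sketch of text generation with a pretrained GPT-2, assuming the Hugging Face transformers library and PyTorch are installed; the decoding hyperparameters (such as top_k) are arbitrary choices for this example.

    ```python
    # pip install transformers torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    prompt = "There are 20 trains each day from Aberdeen to"
    inputs = tokenizer(prompt, return_tensors="pt")

    # Autoregressive decoding: the model emits one token at a time,
    # conditioned on the prompt and everything generated so far.
    output_ids = model.generate(**inputs, max_length=40, do_sample=True,
                                top_k=50, pad_token_id=tokenizer.eos_token_id)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
    ```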

  • How can I evaluate NLG models?
    Possible evaluation criteria and methods for NLG. Source: Gatt and Krahmer 2018, fig. 8.

    Evaluating NLG systems is a challenge in itself and has been addressed since the mid-1990s. Evaluations are useful to assess the underlying theory of an NLG system, compare two NLG systems, or determine if a particular system is improving. In black box evaluation, we evaluate the entire system. In glass box evaluation, we evaluate the system's component parts, such as document structuring or aggregation.

    Evaluations should look at accuracy (conveying the desired meaning) and fluency (text flows and is readable). These two factors are not necessarily correlated. Evaluations can be by humans or automated. Multiple methods and metrics are preferred.

    Evaluation metrics can be word-based (TER, BLEU, ROUGE), grammar-based (readability, spelling errors, characters per word, etc.) or measure semantic similarity. Another way to categorize the metrics is n-gram overlap (BLEU, GTM, CIDEr), content overlap (MASI, SPICE) and string distance (Levenshtein, TER). BLEU has been criticized by researchers.
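
    For instance, BLEU can be computed with NLTK; a minimal sketch, assuming NLTK is installed, with toy reference and candidate sentences:

    ```python
    # pip install nltk
    from nltk.translate.bleu_score import SmoothingFunction, sentence_bleu

    reference = "there are 20 trains each day from aberdeen to glasgow".split()
    candidate = "20 trains run daily from aberdeen to glasgow".split()

    # Smoothing avoids zero scores when higher-order n-grams don't match.
    score = sentence_bleu([reference], candidate,
                          smoothing_function=SmoothingFunction().method1)
    print(f"BLEU: {score:.3f}")
    ```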

    The E2E NLG Challenge (2017-18) used five metrics: BLEU, NIST, METEOR, ROUGE-L, and CIDEr; plus RankME. Rank-based Magnitude Estimation (RankME) enables reliable and consistent human evaluation.

  • Could you describe some tools to help with NLG?

    Among the commercial tools are Arria NLG, AX Semantics, Yseop, Quill, and Wordsmith. Among the open source tools are SimpleNLG and NaturalOWL. Wordsmith is from Automated Insights, which can be regarded as a pioneer in NLG. Quill, from Narrative Science, makes use of deep learning.

    Some NLG platforms (such as Wordsmith) can be integrated with business intelligence platforms so that dashboards can be enhanced with descriptive explanations or highlights.

Milestones

1950

During the 1950s and 1960s, NLG is not seen as a dedicated field of study but as a component of machine translation systems. The approach at this time is template-based, so NLG produces the same output for all listeners in all situations. Only in the 1970s do researchers look into NLG for the purpose of communication. In the 1980s, NLG becomes a dedicated field of research.

Nov
1970
Example of sentence form determination. Source: Simmons and Slocum 1970, table 6.

Simmons and Slocum adapt semantic networks to the task of NLG. Nodes are word senses. Connecting paths are relations. Essentially, the network defines the grammar as ordered sets of syntactic transformations. The grammar includes voice, form, aspect, tense, mood, and so on. This is used by the algorithm when generating text. They implement this in LISP.

1974
Discourse production: game commentary on a sequence of moves. Source: Davey 1974, sec. 2.5.

Davey makes the case that production (NLG) is not a reverse process of comprehension (NLU). While language syntax may benefit comprehension, it's not sufficient for discourse-based production. From a given semantic representation, the form that the sentence would take is unpredictable.

Sep
1989
The same event is described differently by different stakeholders. Source: Hovy 1990, table 2.

Hovy notes that beyond the literal meaning of words, we often tune our communication in relation to the listener, the situation, or interpersonal goals. Hovy therefore studies the importance of pragmatics. NLG has often asked two questions: "what shall I say?" and "how shall I say it?" NLG should also concern itself with "why should I say it?" Hovy also implements a program called PAULINE, which he acknowledges is primitive, noting that much more research remains to be done.

1992

As one of the earliest commercial applications of NLG, Goldberg et al. describe a system that generates bilingual weather reports from graphical weather predictions.

1993
Word representations stored in the lexicon are learned via a neural network. Source: Miikkulainen 2002, fig. 3.

Miikkulainen proposes DISCERN as a sub-symbolic neural network model for NLG. It can create summaries or answer questions. The system includes parsers at both sentence and story levels. For generation, there's a story generator and a sentence generator, helped by a memory block and a lexicon. Distributed real-valued vectors represent words at both lexical and semantic levels. Representations are based on thematic case roles. Other early neural networks for NLG from the 1990s include ANA and FIG.

Mar
1997
Architecture of an NLG system. Source: Reiter and Dale 1997, fig. 3.

Reiter and Dale note that the most common architecture for NLG is a three-stage pipeline. They describe each stage along with the essential tasks that belong to it. They give guidelines to help developers build practical NLG systems. They also note that NLG may not be the right approach for some use cases: alternatives include presenting information graphically, using mail-merge systems, or employing human authors.

Jun
2000

The first International Natural Language Generation Conference (INLG) is held, as a continuation of international workshops on NLG held regularly during 1982-1998. INLG becomes an almost biennial event. Full papers of these conferences are available online.

2007

For college basketball in the US, StatSheet is launched to provide game previews, injury reports, and recaps. Much of this content is generated automatically. Soon it becomes the top website for college basketball statistics. In 2011, StatSheet changes its name to Automated Insights and partners with the NFL for automated content production.

2013
Screenshot of a demo showing generated text (left panel) in retail promotional analysis. Source: Automated Insights 2020.

NLG enters the fields of data analysis and business intelligence. A particular example is Allstate Industries adopting NLG. In 2015, Gartner names NLG as an official technology space and predicts that by 2020 NLG will be a standard feature of 90% of BI and analytics platforms. By 2016, NLG gets integrated into many BI and data visualization platforms such as Tableau, MicroStrategy and Power BI.

Jun
2016
Data counterfeiting used for multi-domain NLG. Source: Wen et al. 2016, fig. 1.

By exploiting similarities across domains, Wen et al. propose a method to train models for multi-domain NLG. Synthesized out-of-domain training data is created by data counterfeiting. Then the model is fine-tuned using a much smaller dataset of in-domain utterances. A Semantically Conditioned Long Short-Term Memory (SC-LSTM) network is used, in which reading gates control the semantic input features presented to the network.

Aug
2016

Researchers show that NLG improves decision making when users are presented with uncertain data. They use graphs of weather forecasts with likelihoods. Decisions are better by 24% with NLG, and by 44% when NLG is combined with graphs. The improvement is as high as 87% for women.

Mar
2017
MR-Reference example and restaurant domain ontology. Source: Dušek et al. 2018, sec. 2.1.

Researchers initiate the E2E NLG Challenge. Whereas traditional text generation methods involve many distinct steps, end-to-end generation attempts to combine all steps into a single model trained on vast amounts of annotated text. For the Challenge, a crowdsourced dataset of 50k instances in the restaurant domain is used. Training data consists of meaning representations (MRs) and corresponding reference texts. In 2018, from 62 submissions, a seq2seq-based ensemble model with a re-ranker is judged the overall winner.
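
For a sense of the data, here's an illustrative MR-reference pair in the dataset's attribute[value] format; the attribute names and values below are representative examples rather than verbatim quotes from the dataset.

```text
MR:        name[The Eagle], eatType[coffee shop], food[French],
           priceRange[moderate], area[riverside], familyFriendly[yes],
           near[Burger King]
Reference: The Eagle is a family-friendly French coffee shop near
           Burger King in the riverside area, with moderate prices.
```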

References

  1. AISmartz. 2019. "The Past and the Present of Natural Language Generation." Blog, AISmartz, August 22. Accessed 2020-02-18.
  2. Arria. 2016. "How the Arria NLG Engine works - Data in, Language out." Arria NLG Plc, on YouTube, December 2. Accessed 2020-02-18.
  3. Arria NLG. 2019. "Arria Natural Language Generation Technology Expands BBC's Coverage of UK Elections." PR Newswire, December 17. Accessed 2020-02-18.
  4. Automated Insights. 2018. "The Ultimate Guide to Natural Language Generation." Automated Insights, on Medium, January 31. Accessed 2020-02-18.
  5. Automated Insights. 2020. "Demo: Promotional Analysis." Wordsmith Demo, Automated Insights. Accessed 2020-02-18.
  6. Bateman, John and Michael Zock. 2005. "Natural Language Generation." Chapter 15 in: Mitkov, R. (ed), The Oxford Handbook of Computational Linguistics (1 ed.), January. Accessed 2020-02-18.
  7. Chan, Ying-Hong, and Yao-Chung Fan. 2019. "BERT for Question Generation." Proceedings of the 12th International Conference on Natural Language Generation, pp. 173-177, October-November. Accessed 2020-02-18.
  8. CoGenTex. 2020. "Recommender." CoGenTex. Accessed 2020-02-18.
  9. Dale, Robert. 1995. "An Introduction to Natural Language Generation." ESSLI, Macquarie University, Sydney. Accessed 2020-02-18.
  10. Davey, Anthony. 1974. "The Formalisation Of Discourse Production." PhD Thesis, University of Edinburgh. Accessed 2020-02-19.
  11. Dejeu, Alexander. 2017. "From “What is a Markov Model” to “Here is how Markov Models Work”." Hackernoon, January 8. Accessed 2020-02-18.
  12. Dušek, Ondřej, Jekaterina Novikova, and Verena Rieser. 2018. "Findings of the E2E NLG Challenge." arXiv, v1, October 2. Accessed 2020-02-18.
  13. Gatt, Albert and Emiel Krahmer. 2018. "Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation." Journal of Artificial Intelligence Research, vol. 61, pp. 65-170, January. Accessed 2020-02-18.
  14. Gkatzia, Dimitra, Oliver Lemon, and Verena Rieser. 2016. "Natural Language Generation enhances human decision-making with uncertain information." arXiv, v2, August 15. Accessed 2020-02-18.
  15. Glascott, Mary Grace. 2017. "Defined: Natural Language Generation." Narrative Science, on Medium, September 29. Accessed 2020-02-18.
  16. Goldberg, E., N. Driedger, and R. I. Kittredge. 1994. "Using natural-language processing to produce weather forecasts." IEEE Expert, vol. 9, no. 2, pp. 45-53, April. Accessed 2020-02-18.
  17. Hammervold, Kathrine. 2001. "Neural Networks and Sentence Generation." Universitetet i Bergen, November 20. Accessed 2020-02-18.
  18. Hovy, Eduard H. 1990. "Pragmatics and natural language generation." Artificial Intelligence, Elsevier, vol. 43, no. 2, pp. 153-197, May. Accessed 2020-02-18.
  19. INLG. 2000. "Proceedings of the First International Conference on Natural Language Generation." Association for Computational Linguistics, June 12-16. Accessed 2020-02-18.
  20. Joshi, Naveen. 2019. "The state of the art in natural language generation." Blog, Allerin, April 9. Accessed 2020-02-18.
  21. Kaput, Mike. 2020. "The Beginner’s Guide to Using Natural Language Generation to Scale Content Marketing." Blog, Marketing AI Institute, January 16. Accessed 2020-02-18.
  22. Lee, Scott. 2018. "Natural Language Generation for Electronic Health Records." arXiv, v1, June 1. Accessed 2020-02-18.
  23. Loyola, Pablo, Edison Marrese-Taylor, and Yutaka Matsuo. 2017. "A Neural Architecture for Generating Natural Language Descriptions from Source Code Changes." arXiv, v1, April 17. Accessed 2020-02-18.
  24. Mairesse, François. 2006. "Natural Language Generation." University of Sheffield, April 28. Accessed 2020-02-18.
  25. Marcus, Gary. 2020. "GPT-2 and the Nature of Intelligence." The Gradient, January 25. Accessed 2020-02-18.
  26. Mellish, C. and R. Dale. 1998. "Evaluation in the context of natural language generation." Computer Speech and Language, Academic Press, vol. 12, pp. 349-373. Accessed 2020-02-18.
  27. Meziane, F., N. Athanasakis, and S. Ananiadou. 2008. "Generating Natural Language Specifications From UML Class Diagrams." University of Salford, Manchester. Accessed 2020-02-18.
  28. Miikkulainen, Risto. 2002. "Text and Discourse Understanding: The DISCERN System." In: Dale, R., H. Moisl and H. Somers (eds), A Handbook of Natural Language Processing: Techniques and Applications for the Processing of Language as Text, pp. 905-919. Accessed 2020-02-18.
  29. Neely, Jake. 2018. "How We’re Using Natural Language Generation to Scale." Hackernoon, April 11. Accessed 2020-02-18.
  30. Novikova, Jekaterina, Ondřej Dušek, and Verena Rieser. 2017. "Data-driven Natural Language Generation: Paving the Road to Success." arXiv, v1, June 28. Accessed 2020-02-18.
  31. Novikova, Jekaterina, Ondřej Dušek, and Verena Rieser. 2018. "RankME: Reliable Human Ratings for Natural Language Generation." arXiv, v1, March 15. Accessed 2020-02-18.
  32. Pressman, Laura. 2018. "The history of natural language generation." Blog, Automated Insights, November 8. Accessed 2020-02-18.
  33. Reiter, Ehud. 1996. "Building Natural Language Generation Systems." arXiv, v1, May 2. Accessed 2020-02-18.
  34. Reiter, Ehud. 2018. "Where is NLG Most Successful Commercially?" Blog, October 30. Accessed 2020-02-18.
  35. Reiter, Ehud, and Robert Dale. 1997. "Building applied natural language generation systems." Natural Language Engineering, Cambridge University Press, vol. 3, no. 1, pp. 57-87, March. Accessed 2020-02-18.
  36. Retresco. 2019a. "How does Natural Language Generation work?" Retresco GmbH, March 4. Accessed 2020-02-18.
  37. Retresco. 2019b. "NLG research: Does text generation using neural networks work? And how?" Retresco GmbH, June 25. Accessed 2020-02-18.
  38. Santhanam, Sashank, and Samira Shaikh. 2019. "A Survey of Natural Language Generation Techniques with a Focus on Dialogue Systems - Past, Present and Future Directions." arXiv, v1, June 2. Accessed 2020-02-18.
  39. SciForce. 2019. "A Comprehensive Guide to Natural Language Generation." SciForce, on Medium, July 4. Accessed 2020-02-18.
  40. Simmons, R. F. and J. Slocum. 1970. "Generating English discourse from semantic networks." Technical Report No. NL-3, University of Texas, November. Accessed 2020-02-18.
  41. Sunnak, Abhishek, Sri Gayatri Rachakonda, and Oluwaseyi Talabi. 2019. "Evolution of Natural Language Generation." SFU Big Data Science, on Medium, March 16. Accessed 2020-02-18.
  42. Tatman, Rachael. 2019. "Evaluating Text Output in NLP: BLEU at your own risk." Towards Data Science, on Medium, January 16. Accessed 2020-02-20.
  43. Wen, Tsung-Hsien, Milica Gašić, Nikola Mrkšić, Lina M. Rojas-Barahona, Pei-Hao Su, David Vandyke, and Steve Young. 2016. "Multi-domain Neural Network Language Generation for Spoken Dialogue Systems." Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 120-129, June. Accessed 2020-02-18.
  44. Xie, Ziang. 2018. "Neural Text Generation: A Practical Guide." Stanford University, March 22. Accessed 2020-02-18.

Further Reading

  1. Gatt, Albert and Emiel Krahmer. 2018. "Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation." Journal of Artificial Intelligence Research, vol. 61, pp. 65-170, January. Accessed 2020-02-18.
  2. Bateman, John and Michael Zock. 2005. "Natural Language Generation." Chapter 15 in: Mitkov, R. (ed), The Oxford Handbook of Computational Linguistics (1 ed.), January. Accessed 2020-02-18.
  3. Reiter, Ehud, and Robert Dale. 1997. "Building applied natural language generation systems." Natural Language Engineering, Cambridge University Press, vol. 3, no. 1, pp. 57-87, March. Accessed 2020-02-18.
  4. Xie, Ziang. 2018. "Neural Text Generation: A Practical Guide." Stanford University, March 22. Accessed 2020-02-18.
  5. Santhanam, Sashank, and Samira Shaikh. 2019. "A Survey of Natural Language Generation Techniques with a Focus on Dialogue Systems - Past, Present and Future Directions." arXiv, v1, June 2. Accessed 2020-02-18.


Cite As

Devopedia. 2020. "Natural Language Generation." Version 3, February 20. Accessed 2024-06-25. https://devopedia.org/natural-language-generation