Natural Language Understanding

Some aspects of text that NLU understands. Source: Waldron 2015.

Given some text, Natural Language Understanding (NLU) is about enabling computers to understand the meaning of the text. Once meaning is understood, along with the context, computers can interact with humans in a natural way.

If a human were to ask a computer a question, NLU attempts to understand the question. Such an understanding leads to a semantic representation of the input text. The representation is then fed into other related systems to generate a suitable response.

Language is what makes us human and manifests our intelligence. NLU is a challenging NLP task, often considered an AI-Hard problem. It combines elements of syntactic and semantic parsing, and predicate logic.


  • How is NLU different from NLP?
    NLP = NLU + NLG. Source: Lucid Thoughts 2019.

    Natural Language Processing (NLP) is an umbrella term that includes both Natural Language Understanding (NLU) and Natural Language Generation (NLG). NLP turns unstructured data into structured data. NLU is more specifically about the meaning or semantics. For example, if the user is asking about today's weather or the traffic conditions on a particular route, NLU helps in understanding the intent of the user's query. NLG is invoked when framing answers in natural language.

    Voice-based human-computer interaction such as Apple Siri or Amazon Alexa is a typical example. Speech is converted to text using Automatic Speech Recognition (ASR). NLU then takes the text and outputs a semantic representation of the input. Once relevant facts are gathered, NLG helps in forming the answer. Text-to-Speech (TTS) synthesis finally converts the textual answer to speech.

    Apart from sub-fields such as ASR and TTS, NLP consists of basic language processing tasks such as sentence segmentation, tokenization, handling stopwords, lemmatization, POS tagging and syntactic parsing.
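
    As a rough illustration of three of these basic tasks (sentence segmentation, tokenization and stopword handling), here is a minimal sketch using only Python's standard library. The regular expressions and the tiny stopword list are simplifying assumptions; production systems typically rely on dedicated NLP libraries with far more careful rules.

```python
import re

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"a", "an", "the", "is", "are", "in", "on", "of", "to", "and"}

def split_sentences(text):
    """Naively split after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def tokenize(sentence):
    """Lowercase the sentence and keep runs of letters, digits and apostrophes."""
    return re.findall(r"[a-z0-9']+", sentence.lower())

def remove_stopwords(tokens):
    """Drop tokens that appear in the stopword list."""
    return [t for t in tokens if t not in STOPWORDS]

text = "The traffic is heavy on the bridge. Is the weather good today?"
for sentence in split_sentences(text):
    print(remove_stopwords(tokenize(sentence)))
```

    Abbreviations such as "Dr." would break this naive sentence splitter, which is one reason even these "basic" tasks are usually delegated to trained models or curated rule sets.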

    There's also Natural Language Inference (NLI). Given a premise, NLI attempts to infer if a hypothesis is true, false or indeterminate.

  • What are the typical challenges in NLU?

    Consider the sentence "We saw her duck". 'We' could be a Chinese name. 'Her' could refer to another person introduced earlier in the text. 'Duck' could refer to a bird or the action of ducking. Likewise, 'saw' could be a noun or a verb. Language is highly ambiguous, and this variety of possible interpretations is what makes NLU a challenging task.

    Ambiguity could be syntactic, such as "I saw the man with the binoculars". An example of word sense ambiguity is "I need to go to the bank".

    Synonymy is also a problem for NLU: many different sentences can express the same meaning, because language allows for variety and complex constructions.

    Humans often communicate with errors and less than perfect grammar. NLU systems have to account for these as well. In addition, human language includes sarcasm. A sentence may have a literal meaning (semantics) but also a different intended meaning (pragmatics).

  • Could you explain semantic parsing?
    Semantic parsing translates directions to a robot into procedural steps. Source: MacCartney 2019, slide 20.

    Semantic parsing translates text into a formal meaning representation. This representation is something that's easier for machines to process. In some ways, semantic parsing is similar to machine translation except that in the latter, the final representation is human readable.

    The form of the representation depends on the purpose. It could use scalars or vectors. It could be continuous or discrete, such as tuples for relation extraction.

    Consider the question "Which country had the highest carbon emissions last year?" Assuming the answer is to be searched in a relational database, the representation would take the form of a database query: SELECT FROM country, co2_emissions WHERE = co2_emissions.country_id AND co2_emissions.year = 2014 ORDER BY co2_emissions.volume DESC LIMIT 1.
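
    To make this concrete, the sketch below runs such a query against an in-memory SQLite database. The query as quoted above does not show the selected column or the left side of the join condition, so `country.name` and `country.id` are assumptions made for this example. The table layout and the toy rows are likewise made up.

```python
import sqlite3

# In-memory database with a hypothetical schema and made-up toy data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE country (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE co2_emissions (country_id INTEGER, year INTEGER, volume REAL);
    INSERT INTO country VALUES (1, 'Country A'), (2, 'Country B');
    INSERT INTO co2_emissions VALUES
        (1, 2014, 10.0), (2, 2014, 25.0), (1, 2013, 40.0);
""")

# Meaning representation of the question, expressed as SQL.
# country.name and country.id are assumed column names.
query = """
    SELECT country.name
    FROM country, co2_emissions
    WHERE country.id = co2_emissions.country_id
      AND co2_emissions.year = 2014
    ORDER BY co2_emissions.volume DESC
    LIMIT 1
"""
top_country = conn.execute(query).fetchone()[0]
print(top_country)
```

    The hard part of semantic parsing is producing the query from the question, for instance resolving "last year" to a concrete year. Executing the query afterwards is routine.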

    In a robotic application, the representation might be a sequence of steps to guide the robot from one place to another. In smartphones that process voice commands, the representation might be categorized into intents and their arguments.
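
    A representation made of an intent plus its arguments can be sketched with a few hand-written patterns. This is only a toy rule-based parser: the intent names, argument names and regular expressions are all hypothetical, and real voice assistants generally use trained classifiers and slot taggers instead.

```python
import re

# Hypothetical intents and patterns, made up for illustration.
INTENT_PATTERNS = [
    ("get_weather",
     re.compile(r"\bweather\b.*?\bin (?P<city>[a-z ]+)")),
    ("set_alarm",
     re.compile(r"\balarm\b.*?\b(?:for|at) "
                r"(?P<time>\d{1,2}(?::\d{2})?\s?(?:am|pm)?)")),
]

def parse_command(text):
    """Map a command to an intent and its arguments (slots)."""
    normalized = text.lower().strip(" ?.!")
    for intent, pattern in INTENT_PATTERNS:
        match = pattern.search(normalized)
        if match:
            return {"intent": intent, "arguments": match.groupdict()}
    return {"intent": "unknown", "arguments": {}}

print(parse_command("What's the weather in Paris?"))
print(parse_command("Set an alarm for 7 am"))
```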

  • Which are the typical NLU tasks?
    Some NLU tasks and applications. Source: SciForce 2019.

    Since NLU's focus is on meaning, here are some typical NLU tasks:

    • Sentiment Analysis: Understand if the text is expressing positive or negative sentiment. Emotion detection is a more granular form of sentiment analysis.
    • Named Entity Recognition: Identify and classify named entities (Person, Organization, Location, etc.) in the text.
    • Relation Extraction: Identify and classify the relation between named entities.
    • Semantic Role Labelling: Identify and label parts of a sentence with their semantic roles. This helps in answering questions of the type "who did what to whom".
    • Word Sense Disambiguation: Looking at the context of word usage, figure out the correct sense since a word can have multiple senses.
    • Inductive Reasoning: Given some facts, use logic to infer relations not stated explicitly in the text.
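
    As one example from the list above, word sense disambiguation can be sketched as picking the sense whose definition shares the most words with the surrounding context. This is a simplified overlap heuristic in the spirit of the Lesk algorithm, not a method described in the sources cited here. The sense labels, glosses and stopword list below are made up for illustration.

```python
import re

# Hypothetical mini sense inventory; real systems use a lexical resource.
SENSES = {
    "bank": {
        "financial_institution":
            "an institution where people deposit money and take loans",
        "river_side": "sloping land beside a river or lake",
    }
}
STOPWORDS = {"a", "an", "and", "at", "i", "of", "or", "the", "to", "we", "where"}

def content_words(text):
    """Lowercased alphabetic words of the text, minus stopwords."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS

def disambiguate(word, sentence):
    """Pick the sense whose gloss overlaps most with the sentence context.

    On a tie (including zero overlap), the first listed sense wins.
    """
    context = content_words(sentence) - {word}
    scores = {
        sense: len(context & content_words(gloss))
        for sense, gloss in SENSES[word].items()
    }
    return max(scores, key=scores.get)

print(disambiguate("bank", "I need to deposit money at the bank"))
print(disambiguate("bank", "We fished from the bank of the river"))
```

    With no helpful context, as in "I need to go to the bank", the overlap is zero for every sense and the heuristic can only fall back on a default.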

    When combined with NLG, other tasks that require NLU are question answering, text summarization, chatbots, and voice assistants. The use of NLU in chatbots and voice assistants has become increasingly important. NLU helps chatbots to better understand user intent, and to respond correctly and in a more natural way.

  • Could you share examples of real-world NLU systems?

    Many examples are in relation to chatbots or voice assistants. Microsoft offers Language Understanding Intelligent Service (LUIS) that developers can use to quickly build natural language interfaces into their apps, bots and IoT devices.

    A similar offering from IBM is Watson Assistant. IBM also offers Watson Natural Language Understanding to extract entities, keywords, categories, sentiment, emotion, relations, and syntax.

    From Google, we have Dialogflow for voice and text-based conversational interfaces. Other examples are Amazon Lex, SAP Conversational AI, Rasa NLU, and Snips.

  • Which are the main approaches to NLU?

    One approach is to initialize an NLU system with some knowledge, structure and common sense. The system then learns from experience via reinforcement learning. Since some see this as introducing biases, an alternative approach is to require the system to learn everything by itself. Just as humans learn by interacting with their environments, NLU systems can also benefit from such embodied learning. The ability to detect human emotions can lead to deeper understanding of language.

    These two approaches are mirrored in Western philosophy by nativism (core inbuilt knowledge) and empiricism (learned by experience).

    NLU systems could combine elements of statistical approaches with knowledge resources such as FrameNet or Wikidata. FrameNet is a lexical database of word senses with examples. FrameNet can therefore help NLU in obtaining common sense knowledge. Other common sense datasets include Event2Mind and SWAG.

    Su et al. noted the duality between NLU and NLG. Via dual supervised learning they trained a model to jointly optimize on both tasks. Their approach gave state-of-the-art F1 score for NLU.

    There are four categories of NLU systems: distributional, frame-based, model-theoretical, and interactive learning.

  • Which are the common benchmarks for evaluating NLU systems?

    The General Language Understanding Evaluation (GLUE) benchmark has nine sentence or sentence-pair NLU tasks. It offers a good diversity of genres, linguistic variations and difficulty. It's also model agnostic and comes with a leaderboard. In 2019, it was extended to SuperGLUE.

    Quora Question Pairs (QQP) is a dataset of question pairs. The task is to determine if two questions mean the same. Sharma et al. showed that a Continuous Bag-of-Words neural network model gave the best performance. Incidentally, QQP is included in GLUE.

    SentEval is a toolkit for evaluating the quality of universal sentence representations. The tasks include sentiment analysis, semantic similarity, paraphrase detection, entailment, and more.

    CLUTRR is a diagnostic benchmark suite for inductive reasoning. It was created to evaluate if NLU systems can generalize in a systematic and robust way.

    For evaluating chatbots, Snips released three benchmarks for built-in intents and custom intent engines.

    Most models are trained to exploit statistical patterns rather than learn the meaning. Hence, it's easy for someone to construct examples to expose how poorly a model performs. Inspired by this, Adversarial NLI is another benchmark dataset that was produced by having humans in the training loop.

  • Are current NLU systems capable of real understanding?

    Back in 2019, it was reported that NLU systems are doing little more than pattern matching. There's no real understanding in terms of agents, objects, settings, relations, goals, beliefs, etc.

    OpenAI's GPT-2 was trained on 40GB of data with no prior knowledge. When prompted with a few words, GPT-2 can complete the sentence sensibly. But despite its fluency, GPT-2 doesn't understand what it's talking about. It fails at answering simple questions. In other words, it's good at NLG but not at NLU.

    Language Models (LMs) have proven themselves in many NLP tasks. However, their success in reasoning has been shown to be poor or context-dependent. LMs capture statistics of the language rather than reasoning. In other words, prediction does not imply or equate to understanding.

    The failure to "understand" could be due to lack of grounding. For example, dictionaries define words in terms of other words, which too are defined by other words. Real understanding can come only when words are associated with sensory experiences grounded in the real world. Without grounding, NLU systems are simply mapping a set of symbols to another set or representation.



Joseph Weizenbaum at MIT creates ELIZA, a program that takes inputs and responds in the manner of a psychotherapist. ELIZA has no access to knowledge databases. It only looks at keywords, does pattern matching and gives sensible responses. Many users get fooled by ELIZA's human-like behaviour, although Weizenbaum insists that ELIZA has no understanding of either language or the situation.
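
The keyword-and-pattern idea can be sketched in a few lines. This is not Weizenbaum's script; the rules and replies below are made up purely to show how canned templates can produce plausible responses without any understanding.

```python
import re

# Hypothetical rules: (pattern, reply template). Made up for illustration.
RULES = [
    (re.compile(r"\bi need (.+)", re.I), "Why do you need {0}?"),
    (re.compile(r"\bi am (.+)", re.I), "How long have you been {0}?"),
    (re.compile(r"\b(mother|father)\b", re.I), "Tell me more about your {0}."),
]
DEFAULT_REPLY = "Please go on."

def respond(text):
    """Return the reply of the first matching rule, else a stock phrase."""
    cleaned = text.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(cleaned)
        if match:
            return template.format(match.group(1).lower())
    return DEFAULT_REPLY

print(respond("I need a holiday."))
print(respond("My mother worries about me"))
print(respond("It rained today"))
```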

A typical task given to the robot in SHRDLU. Source: Winograd 1971, fig. 11.

Terry Winograd at MIT describes SHRDLU in his PhD thesis. The task is to guide a robotic arm to move children's blocks. SHRDLU can understand block types, colours, sizes, verbs describing movements, and so on. In later years, SHRDLU is considered a successful AI system. However, attempts to apply it to complex real-world environments prove disappointing. A modern variation of SHRDLU is SHRDLURN due to Wang et al. (2016).

A sentence is understood within the frame of 'Commercial Transaction'. Source: Yao 2017.

Marvin Minsky at MIT publishes A Framework for Representing Knowledge. He defines a frame as a structure that represents a stereotyped situation. Given a situation, a suitable frame is selected along with its associated information. Then it's customized to fit the current situation. This is the frame-based approach to NLU.
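
A frame can be pictured as a named structure with slots that get filled in for a particular situation. The sketch below is only an illustration of that idea under simple assumptions: the `Frame` class, its slot names and the example sentence are hypothetical, not taken from Minsky's memo.

```python
from dataclasses import dataclass, field

@dataclass
class Frame:
    """A stereotyped situation with named slots and default fillers."""
    name: str
    slots: dict = field(default_factory=dict)

    def instantiate(self, **fillers):
        """Return a copy of the slots customized for one situation."""
        unknown = set(fillers) - set(self.slots)
        if unknown:
            raise KeyError(f"{self.name} has no slots named {sorted(unknown)}")
        return {**self.slots, **fillers}

# Slot names are assumptions; None marks a slot with no default filler.
commercial_transaction = Frame(
    name="Commercial Transaction",
    slots={"buyer": None, "seller": None, "goods": None, "money": None},
)

# Hypothetical sentence: "Ann sold a bike to Ben for $200."
event = commercial_transaction.instantiate(
    seller="Ann", buyer="Ben", goods="a bike", money="$200"
)
print(event)
```

Selecting which frame applies and deciding which words fill which slots are the difficult steps. The data structure itself is simple.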


Pereira and Warren develop CHAT-80, a natural language interface to databases. Implemented in Prolog, it uses hand-built lexicon and grammar. It can answer questions about geography, such as "What countries border Denmark?"


Apple releases the Macintosh 128K along with a computer mouse. Although the mouse was invented 20 years earlier, it's the Macintosh that makes it popular, and with it the Graphical User Interface (GUI). This causes some companies to shift focus from researching natural language interfaces to adopting GUIs.

Single-symbol SCT for fare.fare_id for ATIS task. Source: Kuhn 1995, fig. 1.

Kuhn proposes the Semantic Classification Tree (SCT), which automatically learns semantic rules from training data. This overcomes the need to hand code and debug a large number of rules. The learned rules are seen to be robust to grammatical and lexical errors in input. In general, the 1990s see a growing use of statistical approaches to NLU.


Gobbi et al. compare many different algorithms used for concept tagging, a sub-task of NLU. Among the algorithms compared are generative (WFST), discriminative (SVM, CRF) and neural networks (RNN, LSTM, GRU, CNN, attention). LSTM-CRF models show the best performance. Adding a CRF top layer to a neural network improves performance with only a modest increase in the number of parameters.

Concepts (meaning space) and words (linguistic space). Source: Khashabi et al. 2019, fig. 1.

Reasoning is one of the tasks of NLU with practical use in applications such as question answering, reading comprehension, and textual entailment. In a graph-based approach, Khashabi et al. show the impossibility of reasoning in a noisy linguistic graph if it requires many hops in the meaning graph. Meaning space is internal conceptualization in the human mind. It's free of noise and uncertainty. Linguistic space is where thought is expressed via language and has plenty of room for imperfections.


  1. Baron, Justine. 2018. "Recast.AI will be renamed SAP Conversational AI early 2019!" Blog, SAP Conversational AI, November 15. Accessed 2020-02-17.
  2. COSO IT. 2019. "Evolution of chat bots from NLP to NLU." COSO IT, March 1. Accessed 2020-02-17.
  3. Conneau, Alexis, and Douwe Kiela. 2018. "SentEval: An Evaluation Toolkit for Universal Sentence Representations." arXiv, v1, March 14. Accessed 2020-02-15.
  4. DeVault, David. 2013. "Natural language understanding in dialogue systems." University of Southern California. Accessed 2020-02-15.
  5. Expert System. 2019. "Natural Language Understanding: What is it and how is it different from NLP." Expert System, January 22. Accessed 2020-02-15.
  6. Filiz, Fahrettin. 2018. "Natural Language Understanding." Medium, January 28. Accessed 2020-02-15.
  7. FrameNet. 2020. "About FrameNet." Accessed 2020-02-17.
  8. GLUE. 2020. "Homepage." GLUE. Accessed 2020-02-15.
  9. Geitgey, Adam. 2018. "Natural Language Processing is Fun!" Medium, July 18. Accessed 2020-02-15.
  10. Gobbi, Jacopo, Evgeny Stepanov, and Giuseppe Riccardi. 2018. "Concept Tagging for Natural Language Understanding: Two Decadelong Algorithm Development." arXiv, v1, July 27. Accessed 2020-02-15.
  11. Goebel, Tobias. 2017. "Don’t Confuse Speech Recognition with Natural Language Understanding When Talking Bots." Blog, Aspect, July 26. Accessed 2020-02-15.
  12. Hughes, Neil. 2013. "Inventor of the computer mouse dies at 88." AppleInsider, July 4. Accessed 2020-02-17.
  13. IBM Cloud. 2018. "Watson Natural Language Understanding." IBM Cloud, August 29. Accessed 2020-02-15.
  14. Jones, Karen Sparck. 1994. "Natural Language Processing: A Historical Review." In: Zampolli A., Calzolari N., Palmer M. (eds), Current Issues in Computational Linguistics: In Honour of Don Walker, Linguistica Computazionale, vol. 9, Springer, Dordrecht. doi:10.1007/978-0-585-35958-8_1. Accessed 2020-02-15.
  15. Khashabi, Daniel, Erfan Sadeqi Azer, Tushar Khot, Ashish Sabharwal, and Dan Roth. 2019. "On the Capabilities and Limitations of Reasoning for Natural Language Understanding." arXiv, v2, September 11. Accessed 2020-02-15.
  16. Kuhn, Roland. 1995. "The Application of Semantic Classification Trees to Natural Language Understanding." IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 17, no. 5, pp. 449-460, April-May. Accessed 2020-02-15.
  17. Lucid Thoughts. 2019. "NLP vs. NLU: Natural Language Processing vs. Natural Language Understanding." Lucid Thoughts, on YouTube, October 3. Accessed 2020-02-15.
  18. MacCartney, Bill. 2014. "Understanding Natural Language Understanding." ACM SIGAI Bay Area Chapter Inaugural Meeting, July 16. Accessed 2020-02-15.
  19. MacCartney, Bill. 2019. "Introduction to semantic parsing." CS224U, Stanford University, May 8. Accessed 2020-02-17.
  20. Marcus, Gary. 2020. "GPT-2 and the Nature of Intelligence." The Gradient, January 25. Accessed 2020-02-15.
  21. Minsky, Marvin. 1974. "A Framework for Representing Knowledge." MIT-AI Laboratory Memo 306, June. Accessed 2020-02-18.
  22. Nie, Yixin, Adina Williams, Emily Dinan, Mohit Bansal, Jason Weston, and Douwe Kiela. 2019. "Adversarial NLI: A New Benchmark for Natural Language Understanding." arXiv, v1, October 31. Accessed 2020-02-18.
  23. Ruder, Sebastian. 2018. "10 Exciting Ideas of 2018 in NLP." December 19. Accessed 2020-02-17.
  24. Ruder, Sebastian. 2019. "The 4 Biggest Open Problems in NLP." January 15. Accessed 2020-02-15.
  25. Ruder, Sebastian. 2019b. "Natural language inference." NLP-progress, September 9. Accessed 2020-02-18.
  26. SciForce. 2019. "NLP vs. NLU: from Understanding a Language to Its Processing." KDnuggets, July. Accessed 2020-02-17.
  27. Sharma, Lakshay, Laura Graesser, Nikita Nangia, and Utku Evci. 2019. "Natural Language Understanding with the Quora Question Pairs Dataset." arXiv, v1, July 1. Accessed 2020-02-15.
  28. Sinha, Koustuv, Shagun Sodhani, Jin Dong, Joelle Pineau, and William L. Hamilton. 2019. "CLUTRR: A Diagnostic Benchmark for Inductive Reasoning from Text." arXiv, v2, September 4. Accessed 2020-02-17.
  29. Snips GitHub. 2019. "snipsco/nlu-benchmark." Snips GitHub, April 3. Accessed 2020-02-15.
  30. Socher, Richard. 2018. "AI’s Next Great Challenge: Understanding the Nuances of Language." Harvard Business Review, July 25. Accessed 2020-02-15.
  31. Su, Shang-Yu, Chao-Wei Huang, and Yun-Nung Chen. 2019. "Dual Supervised Learning for Natural Language Understanding and Generation." GroundAI, May 15. Accessed 2020-02-15.
  32. Talmor, Alon, Yanai Elazar, Yoav Goldberg, and Jonathan Berant. 2019. "oLMpics -- On what Language Model Pre-training Captures." arXiv, v1, December 31. Accessed 2020-02-17.
  33. Waldron, Mike. 2015. "Structured vs Unstructured Data: Exploring an Untapped Data Reserve." AYLIEN, April 15. Accessed 2019-02-17.
  34. Wang, Sida I., Percy Liang, and Christopher D. Manning. 2016. "Learning Language Games through Interaction." arXiv, v1, June 8. Accessed 2020-02-18.
  35. Wikipedia. 2020. "Natural-language understanding." Wikipedia, January 13. Accessed 2020-02-15.
  36. Wikipedia. 2020b. "ELIZA." Wikipedia, February 5. Accessed 2020-02-17.
  37. Wikipedia. 2020c. "SHRDLU." Wikipedia, February 3. Accessed 2020-02-17.
  38. Winograd, Terry. 1971. "Procedures as a Representation for Data in a Computer Program for Understanding Natural Language." PhD Thesis, MIT, January. Accessed 2020-02-17.
  39. Yao, Mariya. 2017. "4 Approaches To Natural Language Processing & Understanding." Topbots, May 21. Accessed 2020-02-15.

Further Reading

  1. Yao, Mariya. 2017. "4 Approaches To Natural Language Processing & Understanding." Topbots, May 21. Accessed 2020-02-15.
  2. Jones, Karen Sparck. 1994. "Natural Language Processing: A Historical Review." In: Zampolli A., Calzolari N., Palmer M. (eds), Current Issues in Computational Linguistics: In Honour of Don Walker, Linguistica Computazionale, vol. 9, Springer, Dordrecht. doi:10.1007/978-0-585-35958-8_1. Accessed 2020-02-15.
  3. Coucke, Alice. 2017. "Benchmarking Natural Language Understanding Systems: Google, Facebook, Microsoft, Amazon, and Snips." Snips, on Medium, June 2. Accessed 2020-02-15.
  4. Priest, Mike. 2018. "Six challenges in NLP and NLU - and how solves them.", November 20. Accessed 2020-02-15.
  5. Stanford. 2019. "CS224U: Natural Language Understanding." Stanford University. Accessed 2020-02-15.

Cite As

Devopedia. 2020. "Natural Language Understanding." Version 2, February 18. Accessed 2023-11-13.
Contributed by
1 author

Last updated on
2020-02-18 05:41:25