Aspect-Based Opinion Mining
Aspect-Based Opinion Mining (ABOM) involves extracting aspects or features of an entity and figuring out opinions about those aspects. It's a method of text classification that has evolved from sentiment analysis and Named Entity Recognition (NER). ABOM is thus a combination of aspect extraction and opinion mining. While opinions about entities are useful, opinions about aspects of those entities are more granular and insightful.
The ABOM workflow consists of initial text pre-processing, POS tagging, splitting sentences to extract aspects, and classifying the aspects into various dimensions or buckets.
A 2018 survey states that 90% of the total data in the world was generated in the preceding two years in the form of tweets, text, images and video. ABOM is therefore important in analyzing this unstructured data.
What is opinion mining and its importance?
Suppose written or spoken content expresses an opinion about some subject. Extracting such opinions is called opinion mining. For example, "Adam was very satisfied with the flavour of black tea at Starbucks". Here we extract a positive opinion about black tea (entity) whose aspect is flavour.
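The extracted opinion in this example can be held in a small record. The sketch below is only illustrative: the class name `Opinion` and its field names are assumptions made for this example, not a standard schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opinion:
    """One extracted opinion (field names are illustrative assumptions)."""
    holder: str    # who expresses the opinion
    entity: str    # what the opinion is about
    aspect: str    # which feature of the entity is evaluated
    polarity: str  # "positive", "negative" or "neutral"


# The example sentence about Adam and black tea, filled in by hand.
opinion = Opinion(holder="Adam", entity="black tea",
                  aspect="flavour", polarity="positive")
print(opinion)
```

An opinion mining system would produce many such records, one per opinion expressed in the text.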
With increased use of online transactions and service-based industries, there has been immense growth of consumer reviews and opinions through voice or text. Opinion mining finds out sentiments from these reviews and estimates customer satisfaction.
Sentiment analysis and opinion mining are related. Sentiment analysis is done first to extract positive, negative or neutral sentiments. This leads to opinion mining using various text classifiers.
What do you mean by aspects and why are they relevant?
Consider the example of "ice-cream" as the entity. Aspects or features of this entity include flavour, temperature, taste, presentation, etc. A person may say that they disliked the ice-cream, but this expresses an opinion on the entity as a whole. If we analyze this deeper, we may find that the person liked the flavour and the presentation but didn't like the taste. Thus, figuring out opinions on specific aspects gives more useful information that can aid decision making.
With "restaurant" as the entity, its aspects could include food, service, price, ambience, etc. Often user reviews or ratings of the restaurant as a whole are not actionable. However, if we know that the food was great but the music was terrible, the restaurant owner can take action to improve specific aspects.
What's the workflow of ABOM?
ABOM involves data collection, data cleaning, extraction of aspects and entities, classification of aspects and entities, sentiment scoring of aspects, evaluation and validation.
Text is typically split into sentences and processed with Named Entity Recognition (NER), which extracts the entities. With aspect mining, we find out features or characteristics of these entities. There are some automated aspect extractor APIs. The output from such APIs can be validated and improved manually.
Data pre-processing is time consuming. Selecting and training aspects of entities are complex tasks. Some approaches include aggregate score of opinion words, SentiWordNet, aspect table, dependency relations, emotion analysis using lexicon and semantic representation.
Opinion mining usually takes a rule-based approach, in which we frame certain rules to identify the most used words. These words are further analyzed and processed. A dictionary-based approach is to curate common aspect terms and use them for further classification according to various domains.
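A dictionary-based approach can be sketched in a few lines. The dictionary `ASPECT_TERMS`, the function `tag_aspects` and the sample reviews below are all made up for illustration; a real dictionary would be curated per domain and be far larger.

```python
import re
from collections import Counter

# Hypothetical hand-curated dictionary for the restaurant domain.
ASPECT_TERMS = {
    "food": {"food", "pizza", "pasta", "dessert", "taste"},
    "service": {"service", "waiter", "staff"},
    "price": {"price", "bill", "cheap", "expensive"},
    "ambience": {"ambience", "music", "decor"},
}


def tag_aspects(sentence):
    """Return the sorted aspect categories whose terms occur in the sentence."""
    tokens = set(re.findall(r"[a-z]+", sentence.lower()))
    return sorted(cat for cat, terms in ASPECT_TERMS.items() if tokens & terms)


reviews = [
    "The pizza was great but the music was terrible.",
    "Friendly staff and a fair price.",
    "The waiter forgot our dessert.",
]

# Count how often each aspect category is mentioned across all reviews.
category_counts = Counter(cat for review in reviews for cat in tag_aspects(review))

for review in reviews:
    print(tag_aspects(review), "<-", review)
print(category_counts)
```

Exact word matching like this misses inflections and synonyms, which is one reason such dictionaries are usually combined with stemming, lexicons or learned models.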
Could you describe the classification of opinions in ABOM?
Classification of opinions depends entirely on the dataset and the business problem. Opinion classification could be trend based, aspect based, sentence based, etc., applied to tweets, reviews, blogs and other sources. The most versatile classification is into positive and negative polarity.
Similarly, classification can be done on a multi-aspect basis using co-occurrence of aspects and sentiments, aspect-sentiment hierarchy and polarity classification of sentiments. These are different approaches to classification in which aspects are placed into different buckets according to polarity from sentiment scores or the design/occurrence of aspects with POS tags.
Several researchers work on automatic aspect generation and classification after effective cleaning of unwanted and irrelevant sentences. Insignificant words or sentences make the data noisy and degrade classification accuracy. To address this, automatic aspect extraction methods such as fuzzy aspect-based opinion classification can be used. This method addresses the problems of infrequent and coreferential aspects and gives good classification accuracy.
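Placing aspects into polarity buckets from sentiment scores can be sketched as follows. The scores, the range of -1 to 1 and the neutral threshold of 0.1 are assumptions chosen for illustration, not values prescribed by any particular method.

```python
from collections import defaultdict


def polarity(score, threshold=0.1):
    """Map a sentiment score in [-1, 1] to a polarity label (threshold is an assumption)."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


# Hypothetical (aspect, sentiment score) pairs produced by an earlier scoring step.
scored_aspects = [
    ("service", 0.8),
    ("ambience", -0.6),
    ("price", 0.05),
    ("service", -0.3),
    ("food", 0.9),
]

# Bucket each aspect mention by the polarity of its score.
buckets = defaultdict(list)
for aspect, score in scored_aspects:
    buckets[polarity(score)].append(aspect)

print(dict(buckets))
```

The same aspect can land in more than one bucket when different reviewers disagree, as "service" does here.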
What are some techniques for extracting aspects?
Among the supervised techniques are Hidden Markov Model (HMM), Conditional Random Fields (CRFs) and dependency tree kernels. Among the unsupervised ones are frequency or statistical methods, rule-based methods, and Pointwise Mutual Information (PMI). Latent Dirichlet Allocation (LDA) from topic modelling has also been adapted and applied.
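As a rough illustration of a frequency-based method, the sketch below treats nouns that occur in enough sentences as candidate aspects. The sentences are POS-tagged by hand with Penn Treebank-style tags; in practice a POS tagger would supply them. The function name `frequent_nouns` and the cutoff `min_count` are assumptions for this example.

```python
from collections import Counter


def frequent_nouns(tagged_sentences, min_count=2):
    """Return nouns that occur in at least min_count sentences, most frequent first."""
    counts = Counter()
    for sentence in tagged_sentences:
        # Count each noun at most once per sentence.
        counts.update({word.lower() for word, tag in sentence if tag.startswith("NN")})
    return [word for word, count in counts.most_common() if count >= min_count]


# Hand-tagged toy reviews of a phone.
tagged = [
    [("The", "DT"), ("battery", "NN"), ("lasts", "VBZ"), ("long", "RB")],
    [("Great", "JJ"), ("battery", "NN"), ("and", "CC"), ("screen", "NN")],
    [("The", "DT"), ("screen", "NN"), ("is", "VBZ"), ("sharp", "JJ")],
    [("My", "PRP$"), ("brother", "NN"), ("likes", "VBZ"), ("it", "PRP")],
]

candidates = frequent_nouns(tagged)
print(candidates)
```

Here "brother" is dropped because it appears only once. Frequent nouns that are not real aspects still need pruning, for example with a co-occurrence score.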
PMI is a score that indicates how often a candidate aspect co-occurs with an entity. Low PMI implies it's probably not an aspect. PMI has been used along with Term Frequency-Inverse Document Frequency (TF-IDF).
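In its standard form, PMI(a, e) = log2( P(a, e) / (P(a) P(e)) ), where P(a, e) is the probability that candidate a and entity e occur together. The sketch below estimates these probabilities from document counts over a toy corpus. The corpus and the function name `pmi` are made up for illustration, and published systems estimate the probabilities in other ways.

```python
import math


def pmi(docs, candidate, entity):
    """PMI of two terms, with probabilities estimated from document counts."""
    n = len(docs)
    n_candidate = sum(1 for doc in docs if candidate in doc)
    n_entity = sum(1 for doc in docs if entity in doc)
    n_both = sum(1 for doc in docs if candidate in doc and entity in doc)
    if n_both == 0:
        return float("-inf")  # the two terms never co-occur
    # P(a, e) / (P(a) * P(e)) simplifies to n_both * n / (n_candidate * n_entity).
    return math.log2(n_both * n / (n_candidate * n_entity))


# Toy corpus: each document is reduced to its set of tokens.
docs = [
    {"camera", "lens", "sharp"},
    {"camera", "lens", "zoom"},
    {"camera", "battery"},
    {"phone", "battery"},
    {"phone", "weather"},
    {"weather", "rain"},
]

scores = {term: pmi(docs, term, "camera") for term in ("lens", "battery", "weather")}
print(scores)
```

In this corpus "lens" gets a positive score with "camera", "battery" scores zero because it co-occurs no more than chance would predict, and "weather" never co-occurs at all.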
Some techniques are able to extract both explicit and implicit aspects. Association rule mining and clustering have been adapted for extracting implicit aspects. Dependency rules and lexicons such as WordNet and SenticNet have been used to identify implicit aspects.
Some techniques jointly model both aspects and opinions. One well-known technique is called double propagation that exploits syntactic relations of opinion words and aspects. From known opinion words, aspects can be extracted. From known aspects, new opinion words can be identified; and so on.
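The bootstrapping idea behind double propagation can be sketched with a single relation. This sketch assumes the input is a list of (adjective, noun) modifier pairs already produced by a dependency parser, and it uses one rule in each direction. The full technique uses a richer set of syntactic rules, so this is a simplification and not the published algorithm.

```python
def double_propagation(modifier_pairs, seed_opinions):
    """Grow aspect and opinion-word sets from seed opinion words.

    modifier_pairs: (adjective, noun) pairs where the adjective modifies the noun.
    """
    opinions = set(seed_opinions)
    aspects = set()
    changed = True
    while changed:  # repeat until no new word is found
        changed = False
        for adjective, noun in modifier_pairs:
            # A known opinion word modifying a noun marks that noun as an aspect.
            if adjective in opinions and noun not in aspects:
                aspects.add(noun)
                changed = True
            # A known aspect modified by an adjective marks it as an opinion word.
            if noun in aspects and adjective not in opinions:
                opinions.add(adjective)
                changed = True
    return aspects, opinions


# Hand-written pairs standing in for parser output.
pairs = [
    ("great", "screen"),
    ("crisp", "screen"),
    ("crisp", "sound"),
    ("tinny", "sound"),
    ("red", "car"),
]

aspects, opinions = double_propagation(pairs, seed_opinions={"great"})
print(sorted(aspects), sorted(opinions))
```

Starting from the single seed "great", the loop finds "screen", then "crisp", then "sound" and "tinny". The pair ("red", "car") is never reached because neither word connects to the seed.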
Neural network approaches such as CNN, LSTM, and attention mechanisms have been applied as well.
Could you share some tips and best practices for doing ABOM?
When extracting an aspect, we must always know which entity it belongs to. Consider for example "The picture quality of this camera is amazing" versus "I love this camera". In the former sentence, "picture quality" is the aspect. In the latter sentence, the camera entity is evaluated as a whole and hence the aspect is "general".
It's essential to consider the domain of application. The phrase "please go and read the book" is a positive opinion for a book review but a negative one for a movie review. This implies that we should train our models with domain-specific data.
Sometimes opinion words on their own are inadequate. They need to be analyzed along with the aspect. For example, even within the same domain of digital cameras, "long" can either be positive or negative depending on the context, such as "long battery life" versus "takes long time to focus".
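One simple way to handle such context-dependent words is to look up polarity by the pair of opinion word and aspect before falling back to a word-level lexicon. Both lexicons below, and the aspect label "focus time", are tiny made-up examples for the digital camera domain.

```python
# Hypothetical context-dependent entries: (opinion word, aspect) -> polarity.
CONTEXT_POLARITY = {
    ("long", "battery life"): "positive",
    ("long", "focus time"): "negative",
}

# Hypothetical context-independent fallback lexicon.
DEFAULT_POLARITY = {
    "amazing": "positive",
    "terrible": "negative",
}


def opinion_polarity(opinion_word, aspect):
    """Prefer the aspect-specific polarity, then the general one, else 'unknown'."""
    key = (opinion_word, aspect)
    if key in CONTEXT_POLARITY:
        return CONTEXT_POLARITY[key]
    return DEFAULT_POLARITY.get(opinion_word, "unknown")


print(opinion_polarity("long", "battery life"))
print(opinion_polarity("long", "focus time"))
print(opinion_polarity("amazing", "picture quality"))
```

Returning "unknown" for an unseen pair keeps the ambiguity visible instead of guessing a polarity for words like "long".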
For a developer, what are some resources and projects to learn ABOM?
One can start an ABOM project from online reviews. There are many e-commerce and social websites containing thousands of reviews on any topic, person or product. Twitter, Facebook, Amazon, MouthShut, and Yelp are example sources.
These reviews can be scraped using an API or Python libraries such as BeautifulSoup or Scrapy. Sample projects include aspect-based sentiment analysis for e-commerce websites or restaurants. In one work on latent aspect rating analysis, ratings were predicted using the latent opinions in review text data. This work can be replicated in different industries such as hospitality, automotive, and education.
What are some industrial applications of ABOM?
ABOM applications are spread across various industries, wherever there is customer feedback. Examples include travel, automotive, hotel, tourism, e-commerce and many more.
Usually, before any hotel booking, people read reviews. We can distribute these into several buckets like service, ambience or hygiene. This makes it easier for customers to compare different hotels. Similarly, there are several websites for travel like Yelp, TripAdvisor or MakeMyTrip that categorize the service into relevant aspects and show them in the form of ratings.
In every industry, reviews are broken down into aspects that are rated according to domain knowledge. Good domain knowledge is essential for better decision making. Before any drug trials or testing, a lot of expert opinion is needed from doctors and domain-specific experts. This work takes a very long time to reach a conclusion. ABOM simplifies it by rating several aspects like adverse reactions, efficacy of a drug, symptoms and conditions of patients.
Work on opinion mining starts with the realization that opinions on various aspects are more insightful than a summary of reviews. At this time, ABOM is called feature-based opinion mining. Researchers start working on text summarization and opinion mining due to the large-scale availability of customer reviews and an increasing demand for customer satisfaction.
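Once each review has been assigned to an aspect and given a rating, comparing hotels comes down to averaging ratings per aspect. The hotel names, aspects and ratings below are invented for illustration, and the 1 to 5 rating scale is an assumption.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (hotel, aspect, rating) rows, one per aspect mention in a review.
rated_mentions = [
    ("Hotel A", "service", 4),
    ("Hotel A", "hygiene", 5),
    ("Hotel A", "service", 2),
    ("Hotel B", "service", 5),
    ("Hotel B", "hygiene", 3),
]


def aspect_ratings(rows):
    """Average the ratings for every (hotel, aspect) pair."""
    grouped = defaultdict(list)
    for hotel, aspect, rating in rows:
        grouped[(hotel, aspect)].append(rating)
    return {key: mean(values) for key, values in grouped.items()}


summary = aspect_ratings(rated_mentions)
for (hotel, aspect), rating in sorted(summary.items()):
    print(hotel, aspect, rating)
```

A customer who cares most about hygiene can then compare the hotels on that aspect alone instead of relying on one overall score.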
As a neural network approach to ABOM, Long Short-Term Memory (LSTM) is used for extracting opinion target expressions (OTEs) and aspect sentiment polarities. Using various types of neural networks is an art now for researchers and we hope to see better accuracy and automation in ABOM across various industries.
- Afzaal, Muhammad. 2015. "A novel framework for aspect-based opinion classification for tourist places." 2015 Tenth International Conference on Digital Information Management (ICDIM). Accessed 2019-11-05.
- Afzaal, Muhammad. 2016. "Fuzzy Aspect Based Opinion Classification System for Mining Tourist Reviews." Advances in Fuzzy Systems. Accessed 2019-11-05.
- Afzaal, Muhammad. 2019. "Multiaspect‐based opinion classification model for tourist reviews." Wiley Online Library. Accessed 2019-11-05.
- Al-Smadi, Mohammad. 2019. "Using long short-term memory deep neural networks for aspect-based sentiment analysis of Arabic reviews." International Journal of Machine Learning and Cybernetics. Accessed 2019-11-10.
- Cavalcanti, Diana. 2017. "Aspect-Based Opinion Mining in Drug Reviews." Springer International Publishing. Accessed 2019-11-13.
- Da'u, Aminu, and Naomie Salim. 2019. "Aspect extraction on user textual reviews using multi-channel convolutional neural network." PeerJ Computer Science, 5:e191, May 06. Accessed 2019-11-10.
- Das, Manoj Kumar. 2017. "Opinion mining and sentiment classification: A review." 2017 International Conference on Inventive Systems and Control (ICISC). Accessed 2019-10-31.
- Eirinaki, Magdalini. 2011. "Feature-based opinion mining and ranking." Science Direct. Accessed 2019-11-10.
- Griffith, A. 2018. "90 percent of the big data we generate is an unstructured mess." PCmag. Accessed 2019-10-28.
- Hannemann, Jan. 2001. "Aspect Mining Tool." Software Practices Lab. Accessed 2019-10-31.
- He, Ruidan, Wee Sun Lee, Hwee Tou Ng, and Daniel Dahlmeier. 2017. "An Unsupervised Neural Attention Model for Aspect Extraction." Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Association for Computational Linguistics, pp. 388–397, July. Accessed 2019-11-10.
- Hemmatian, Fatemeh. 2019. "A survey on classification techniques for opinion mining and sentiment analysis." Artificial Intelligence Review. Accessed 2019-11-04.
- Hu, Minqing. 2004. "Mining Opinion Features in Customer Reviews." American Association for Artificial Intelligence. Accessed 2019-11-10.
- Joshi, Achyut, Ishika Arora, Sumedha Raman, and Andrew Giannotto. 2018. "Aspect Extraction." Accessed 2019-11-10.
- K Samha, Amani. 2016. "Aspect-Based Opinion Mining Using Dependency Relations." International Journal of Computer Science Trends and Technology (IJCST) – Volume 4 Issue 1, Jan - Feb 2016. Accessed 2019-11-04.
- Kobayashi, Nozomi. 2007. "Extracting Aspect-Evaluation and Aspect-of Relations in Opinion Mining." Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pp. 1065–1074. Accessed 2019-11-10.
- Liu, Bing. 2011. "Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data." Second Edition, Springer. Accessed 2019-11-10.
- Liu, Bing. 2012. "Sentiment Analysis and Opinion Mining." Vol 5, No 1 , Pages 1-167, Synthesis Lectures on Human Language Technologies. Accessed 2019-10-28.
- Marrese, Edison. 2013. "Identifying Customer Preferences about Tourism Products using an Aspect-Based Opinion Mining Approach." Procedia Computer Science. Accessed 2019-11-10.
- McFadden, Renata Rand. 2013. "Survey of aspect mining case study software and benchmarks." 2013 Proceedings of IEEE Southeastcon. Accessed 2019-10-31.
- Meena, Sanjay. 2019. "Your Guide to Sentiment Analysis." Seek Blog, on Medium, February 08. Accessed 2019-10-11.
- Min, Peter. 2018. "Aspect-Based Opinion Mining (NLP with Python)." Medium. Accessed 2019-11-10.
- Polignano, Marco. 2018. "An Emotion-driven Approach for Aspect-based Opinion Mining." IIR 2018, May 28-30, 2018, Rome, Italy. Accessed 2019-11-04.
- Rana, Toqir A. and Yu-N Cheah. 2016. "Aspect extraction in sentiment analysis: comparative analysis and survey." Artificial Intelligence Review, vol. 46, no. 4, pp 459–483, February. Accessed 2019-11-10.
- Reddy, Saujanya. 2016. "Aspect-Based-Sentiment-Analysis-IRE-Major-Project." GitHub. Accessed 2019-11-12.
- S. Tewari, Anand. 2018. "Personalized Product Recommendation Using Aspect-Based Opinion Mining of Reviews." Proceedings of International Ethical Hacking Conference 2018 pp 443-453. Accessed 2019-11-10.
- Smeureanu, Ion. 2012. "Applying Supervised Opinion Mining Techniques on Online User Reviews." Semantic Scholar. Accessed 2019-10-31.
- Somabhai, Patel Bhumi. 2015. "A Survey on Feature Based Opinion Mining For Tourism Industry." Journal of Engineering Computers & Applied Sciences(JECAS). Accessed 2019-11-13.
- T C, Chinsha. 2015. "A syntactic approach for aspect based opinion mining." Proceedings of the 2015 IEEE 9th International Conference on Semantic Computing (IEEE ICSC 2015). Accessed 2019-11-04.
- Tripathi, Nitesh. 2019. "Aspect-Based Sentiment Analysis in Product Reviews: Unsupervised Way." Medium. Accessed 2019-11-13.
- Vivekanandan, K. and J. Soonu Aravindan. 2014. "Aspect-based Opinion Mining: A Survey." International Journal of Computer Applications, vol. 106, no. 3, pp. 975-8887, November. Accessed 2019-10-31.
- Wang, Hongning. 2010. "Latent Aspect Rating Analysis on Review Text Data: A Rating Regression Approach." Urbana Champaign. Accessed 2019-11-12.
- Zhang, Lei. 2014. "Aspect and Entity Extraction for Opinion Mining." Studies in Big Data vol 1, Springer. Accessed 2019-10-28.
- Zhang, Lei and Bing Liu. 2014. "Aspect and Entity Extraction for Opinion Mining." Accessed 2019-11-15.
- İrsoy, Ozan. 2014. "Opinion Mining with Deep Recurrent Neural Networks." Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Accessed 2019-11-14.
- Liu, Pengfei. 2015. "Fine-grained Opinion Mining with Recurrent Neural Networks and Word Embeddings." Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Accessed 2019-11-14.
- Cambria, Erik. 2013. "Application of multi-dimensional scaling and artificial neural networks for biologically inspired opinion mining." Volume 4, April 2013, Pages 41-53, Biologically Inspired Cognitive Architectures. Accessed 2019-11-14.
- Vinodhini, G. 2012. "Sentiment Analysis and Opinion Mining: A Survey." International Journal of Advanced Research in Computer Science and Software Engineering. Accessed 2019-11-14.
- Aspect Extraction
- Deep Learning for ABOM
- Sentiment Analysis
- Topic Modelling
- Natural Language Processing
- Structured vs Unstructured Data