Sentiment Analysis

Summary

image
Sentiment Analysis. Source: paxcom.net

Sentiment Analysis is the process of analyzing people's opinions, feelings, and attitudes towards a specific product, organization or service.

We often consider what other people think when making decisions. Before the advent of the Internet, many of us relied on friends and family for product or service recommendations, or for information when buying a product. The Internet makes it far easier to gather the opinions of the general population.

In a world where colossal amounts of user-generated content are produced every day, it is practically impossible for a human workforce to collect all the data and determine the opinions expressed in it. Hence the need for computer algorithms that automatically classify reviews by polarity: positive, negative or neutral.

Milestones

3 BC

The history of Sentiment Analysis begins in Ancient Greece with the concept of 'Doxa', which referred to popular belief. Voting was used as a means of collecting public opinion, which in turn was used to formulate policies as well as to bring down the opposition.

1824

The earliest known example of opinion polling is a straw poll conducted by The Harrisburg Pennsylvanian to determine who had the lead in the 1824 race for the United States Presidency. The poll showed Andrew Jackson ahead of John Quincy Adams. Jackson went on to win Pennsylvania and the nationwide popular vote, although the House of Representatives ultimately elected Adams as President. The perceived accuracy of straw polls soon made them popular throughout America. Nationwide surveys began to be conducted a few years later and these correctly predicted the next few Presidential winners.

1997

Hatzivassiloglou and McKeown coined the term 'semantic orientation'. The semantic orientation of an opinion or sentence is characterized by whether it is positive, negative or neutral. This concept forms the basis of sentiment analysis techniques even today. For instance, ParallelDots, Inc. offers a tool that categorizes an input sentence based on its semantic orientation.

2004

Bo Pang and Lillian Lee's paper on sentiment classification based on Machine Learning techniques was the next great breakthrough. They proposed a 'subjectivity detector' that keeps only the sentences labelled 'subjective' in a document and applies text categorization techniques to the resulting extract. The subjective sentences were selected by finding minimum cuts in a graph, and classifiers such as Naive Bayes and SVM were then used to determine polarity. They reported an accuracy of 86.4% with the NB polarity classifier.

2005

Daniel Gruhl and team conducted one of the first studies to determine whether online comments predict the sales figures of a product. They obtained sales data for books from Amazon.com and used automated query generation algorithms to predict the rise and fall in sales of certain books, based on blog mentions of and online chatter about each book. They found that positive comments led to increased sales.

2012

Onur Kucuktunc and fellow researchers pioneered large-scale sentiment analysis of Yahoo! Answers. They found that the sentiment of answers differed according to the attributes of users, and that the best-rated answers tended to have a neutral tone. They also identified what feelings a question evoked in a user reading it. These findings began to be used in advertising and recommendations.

2014

In June 2014, a researcher named Aleksandr Kogan collected a database containing information on about 87 million Facebook users and provided it to the voter-profiling company Cambridge Analytica. Cambridge Analytica used it to build 30 million "psychographic" profiles of voters. The data was allegedly used to attempt to influence voter opinion on behalf of the politicians who hired the company.

2017

The next advancement came when Soujanya Poria and colleagues identified the sentiments present in online videos; their findings were published in 2017. They applied sentiment analysis at the utterance level. An 'utterance' is a unit of speech bounded by pauses, and each utterance was classified separately.

Dec 2017
image
Hathaway or Hathaway? Source: Google

Sentiment analysis seems to rake in the profits for Warren Buffett. Shares of Buffett-owned Berkshire Hathaway were found to rise by as much as 2.94% following the release of Anne Hathaway's movies. This trend was written about by Dan Mirvish, who reasoned that automated trading programs were picking up online chatter about 'Hathaway' and applying it to the stock markets.

2018
image
An example usage. Source: Google

The evolution of Sentiment Analysis has been rapid. While the basic idea of analysing public opinion remains the same, the tools available and the areas of use have grown tremendously.

Discussion

  • What kind of questions are answered by Sentiment Analysis?

    Since Sentiment Analysis tools classify a sample of text as positive, negative or neutral, some of the questions which can be answered using Sentiment Analysis are as follows:

    • Is a given product review positive or negative?
    • Is a customer satisfied or dissatisfied based on his email response?
    • Based on a sample of tweets, how are people responding to a given ad campaign/product release/news item?
    • How have bloggers' attitudes about the president changed since the election?
  • What are the steps involved in Sentiment Analysis?
    image
    Sentiment Analysis Process. Source: Devopedia
    • Data acquisition: The collection of data is an important phase since a proper dataset needs to be defined for analyzing and classifying the text in the dataset.
    • Text preprocessing: Once the data is collected, preprocessing reduces the noise in it. This is done by removing unnecessary stop words, repeated words, emoticons and URLs, and by stemming the remaining words.
    • Feature selection and extraction: Proper selection and extraction of features plays a key role in determining the accuracy of the model. Hence, the appropriate feature extraction technique must be chosen for extracting the features.
    • Sentiment classification: In this phase, various sentiment classification techniques are applied to classify the text. Some popular sentiment classification techniques are Naïve Bayes (NB) and Support Vector Machines (SVM).
    • Polarity detection: After classifying the sentiments, the polarity of the sentiment is determined. The goal of polarity detection is to decide whether a text expresses positive, negative or neutral sentiment.
    • Validation and evaluation: Finally, validation and evaluation of the obtained results is performed so as to determine the overall accuracy of the techniques used for sentiment analysis.
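
    The phases above can be sketched end to end in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the reviews in TRAIN and TEST, the STOP_WORDS set and all function names are made up for this example. Features are plain bag-of-words counts, the classifier is a multinomial Naive Bayes with add-one smoothing, and stemming is omitted to keep the sketch free of external libraries.

```python
import math
import re
from collections import Counter, defaultdict

# Data acquisition: toy labelled reviews (hypothetical, for illustration only).
TRAIN = [
    ("I loved this phone, the battery is great", "positive"),
    ("Great camera and a lovely screen", "positive"),
    ("Excellent value, I am very happy", "positive"),
    ("Terrible battery, I hated it", "negative"),
    ("The screen is awful and the camera is bad", "negative"),
    ("Very poor value, I am unhappy", "negative"),
]
TEST = [
    ("great screen, very happy", "positive"),
    ("awful phone, poor battery", "negative"),
]

# Assumed stop-word list; a real system would use a fuller one.
STOP_WORDS = {"i", "the", "this", "is", "a", "and", "it", "am", "very"}


def preprocess(text):
    """Text preprocessing: lowercase, drop URLs, keep letters, remove stop words."""
    text = re.sub(r"https?://\S+", " ", text.lower())
    tokens = re.findall(r"[a-z]+", text)
    return [t for t in tokens if t not in STOP_WORDS]


def train_naive_bayes(examples):
    """Feature extraction (bag of words) and per-class word counting."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts[label].update(preprocess(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocab


def classify(text, model):
    """Sentiment classification: pick the class with the highest log-posterior."""
    word_counts, class_counts, vocab = model
    tokens = [t for t in preprocess(text) if t in vocab]
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for token in tokens:
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log((word_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(examples, model):
    """Validation and evaluation: fraction of examples labelled correctly."""
    correct = sum(classify(text, model) == label for text, label in examples)
    return correct / len(examples)


model = train_naive_bayes(TRAIN)
print(classify("great screen, very happy", model))
print(accuracy(TEST, model))
```

    On this toy data both held-out reviews are classified correctly. A real evaluation would need a far larger labelled dataset and would typically report precision, recall and F1 alongside accuracy.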
  • What are the various approaches for sentiment analysis?
    image
    Sentiment Analysis approaches. Source: Devopedia
    • Machine Learning based: Classifies the text as positive, negative or neutral using Machine Learning classification algorithms and linguistic features.
    • Lexicon-based: Makes use of sentiment lexicons, which are collections of annotated and preprocessed sentiment terms. Sentiment values are assigned to words that describe the positive, negative or neutral attitude of the speaker. It is further classified as:

    1) Dictionary-based method: It uses a small set of seed words and an online dictionary. An initial set of seed words with known orientations is collected, and online dictionaries are then searched for their probable synonyms and antonyms to grow the set. The sample is classified based on the presence of such signalling sentiment words.

    2) Corpus-based method: Uses corpus data to identify sentiment words. Even though it is not as effective as the dictionary-based scheme, it is helpful in finding domain-specific and context-specific sentiment words in the corpus data. The algorithm has access not only to sentiment labels, but also to context.

    • Hybrid: It is a combination of both Machine Learning and lexicon-based approaches.
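
    As a rough sketch of the dictionary-based method, the code below grows a lexicon from two seed words and then classifies text by summing word scores. TOY_DICTIONARY is a hypothetical stand-in for an online dictionary, and the seed words, scores and function names are assumptions made for this example only.

```python
import re
from collections import deque

# Hypothetical stand-in for an online dictionary: word -> (synonyms, antonyms).
TOY_DICTIONARY = {
    "good": (["fine", "great"], ["bad"]),
    "great": (["superb"], ["awful"]),
    "bad": (["poor", "awful"], ["good"]),
    "poor": ([], ["fine"]),
}

# Seed words with known orientation: +1 positive, -1 negative.
SEEDS = {"good": 1, "bad": -1}


def expand_seeds(seeds, dictionary):
    """Grow a sentiment lexicon from seed words.

    Synonyms inherit the orientation of the word they were reached from;
    antonyms get the opposite orientation. The first orientation assigned
    to a word is kept.
    """
    lexicon = dict(seeds)
    queue = deque(seeds)
    while queue:
        word = queue.popleft()
        synonyms, antonyms = dictionary.get(word, ([], []))
        related_words = [(s, 1) for s in synonyms] + [(a, -1) for a in antonyms]
        for related, sign in related_words:
            if related not in lexicon:
                lexicon[related] = sign * lexicon[word]
                queue.append(related)
    return lexicon


def lexicon_polarity(text, lexicon):
    """Classify text by the sum of the scores of its sentiment words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    score = sum(lexicon.get(token, 0) for token in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


lexicon = expand_seeds(SEEDS, TOY_DICTIONARY)
print(sorted(lexicon.items()))
print(lexicon_polarity("The food was superb", lexicon))
print(lexicon_polarity("Service was poor and awful, food superb", lexicon))
```

    A corpus-based variant would replace the dictionary lookup with statistics gathered from domain text, for example which words tend to co-occur with the seeds.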
  • What are the advantages and limitations of the Sentiment Analysis approaches?

    Machine Learning based:

    • Advantage: Unlike Lexicon-based, these models can be built for a specific purpose or context.
    • Limitation: Obtaining labeled data for training could be difficult or expensive.

    Lexicon-based:

    • Advantage: No training is required.
    • Limitation: Accuracy depends on the lexical resources. Lexicons contain a finite number of words, and each word is assigned a fixed sentiment orientation and score.

    Hybrid:

    • Advantage: It incorporates the best of the Machine Learning based and Lexicon-based approaches.

    Generally, Machine Learning based models perform better than Lexicon-based models, but they require large quantities of labeled data.
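
    The two lexicon limitations named above, a finite vocabulary and a fixed orientation per word, can be seen in a small sketch. LEXICON and its scores are hypothetical values chosen for this example, not taken from any published resource.

```python
import re

# Hypothetical fixed lexicon: each word has one score regardless of context.
LEXICON = {"good": 1, "great": 1, "bad": -1, "unpredictable": -1}


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def lexicon_score(text):
    """Sum the fixed scores of known words; unknown words contribute 0."""
    return sum(LEXICON.get(token, 0) for token in tokenize(text))


def coverage(text):
    """Fraction of tokens that the lexicon knows about."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(token in LEXICON for token in tokens) / len(tokens)


# Fixed orientation: the same word gets the same score, although a reader
# would likely take the first sentence as praise and the second as a complaint.
print(lexicon_score("The plot was unpredictable"))
print(lexicon_score("The steering was unpredictable"))

# Finite vocabulary: none of these words is in the lexicon, so the text
# looks neutral whatever the writer meant.
print(lexicon_score("This phone is meh"), coverage("This phone is meh"))
```

    A model trained on labeled reviews from the relevant domain can learn such context-dependent usage, which is one reason Machine Learning based models are built for a specific purpose or context.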

  • What are some tools available for Sentiment Analysis at present?
    • Python NLTK: A Python-based toolkit for text processing, including classification, tokenization, stemming, tagging, parsing and much more.
    • GATE, the General Architecture for Text Engineering: A Java suite of tools used for all sorts of natural language processing tasks, including information extraction in many languages.
    • OpinionFinder: A system that processes documents and automatically identifies subjective sentences as well as various aspects of subjectivity within sentences, including agents who are sources of opinion, direct subjective expressions and speech events, and sentiment expressions.
    • LingPipe: A toolkit for processing text using computational linguistics.
    • LIWC (Linguistic Inquiry and Word Count): A computerized text analysis tool that reads a given text and counts the percentage of words that reflect different emotions, thinking styles and social concerns.
  • What are the challenges involved in Sentiment Analysis?

    Human language is intricate. People often express opinions in complex ways. To mention a few:

    • Named entity recognition: Locating and classifying named entities in text into pre-defined categories such as the names of persons, organizations and locations. E.g.: Is "300 Spartans" a group of Greeks or a movie?
    • Anaphora resolution: The problem of resolving references to earlier or later items in the discourse. E.g.: "We watched the movie and went to dinner. It was awful." What does "It" refer to?
    • Parsing: This refers to resolving a sentence into its component parts. What are the subject and object of the sentence, and which one does the verb and/or adjective actually refer to?
    • Rhetorical modes: The analysed posts often contain sarcasm, irony, implication, etc., which are particularly difficult to detect.
    • Social media text: It is not uncommon to find reviews and opinions containing slang, abbreviations, missing capitals and poor punctuation, which make sentiment analysis even more challenging.
    • Visual sentiment analysis: Posts often contain a mixture of visual and textual information. The sentiment polarities implied by texts may contradict the sentiments of images, which poses a challenge for textual sentiment analysis.
  • Where is Sentiment Analysis being used at present?

    A very broad answer to this can be broken up into three categories:

    • Brand Monitoring - Sentiment Analysis is used to gauge how a brand, product or company has been received by the public. In fact, private companies like Unamo offer this as a service.
    • Customer Service - Customer service agents classify incoming mail into 'urgent' and 'non-urgent', in order to be able to serve the more frustrated customers quicker. The speech analytics platform Callminer Eureka implements AI and ML techniques to draw insight from consumer interactions, in order to offer quality customer service.
    • Market Research and Analysis - Opinion mining plays a crucial role in business intelligence, by helping analysts understand why a particular product was well received or not. Stock markets and hedge funds have been known to shift with the shift in sentiments on social media.

    Sentiment analysis has also driven forward various other initiatives. Bing Search used the concept in their newly launched Multi-Perspective Answers product.

    Apart from the above, sentiment analysis is used in political science, sociology and psychology. Other applications include flame detection, identifying the child-suitability of videos, and bias identification in news sources.

References

  1. Alessia, D., Fernando Ferri, Patrizia Grifoni, and Tiziana Guzzo. 2015. "Approaches, tools and applications for sentiment analysis implementation." International Journal of Computer Applications 125, no. 3.
  2. Asimuzzaman, Md, Pinku Deb Nath, Farah Hossain, Asif Hossain, and Rashedur M. Rahman. 2017. "Sentiment analysis of bangla microblogs using adaptive neuro fuzzy system." In 2017 13th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD), pp. 1631-1638. IEEE.
  3. Bing. 2018. “Toward a More Intelligent Search: Bing Multi-Perspective Answers.”
  4. Dai, Shuanglu, and Hong Man. 2018. "Integrating Visual and Textual Affective Descriptors for Sentiment Analysis of Social Media Posts." In 2018 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR). IEEE.
  5. Forbes. 2015. “How Quant Traders Use Sentiment To Get An Edge On The Market.”
  6. Forbes. 2017. “AI Is Super-Charging The Customer Service World.”
  7. Gruhl, D., R. Guha, et al. 2005. "The predictive power of online chatter."
  8. Guevara, Juan, Joana Costa, Jorge Arroba, and Catarina Silva. 2018. "Harvesting opinions in Twitter for sentiment analysis." In 2018 13th Iberian Conference on Information Systems and Technologies (CISTI). IEEE.
  9. Huffpost. 2017. "The Hathaway Effect: How Anne Gives Warren Buffett a Rise."
  10. Katarya, Rahul, and Ashima Yadav. 2018. "A comparative study of genetic algorithm in sentiment analysis." In 2018 2nd International Conference on Inventive Systems and Control (ICISC). IEEE.
  11. Kharde, Vishal, and Prof Sonawane. 2016. "Sentiment analysis of twitter data: a survey of techniques." arXiv preprint arXiv:1601.06971.
  12. Kucuktunc, O., B. B. Cambazoglu, et al. 2012. "A large-scale sentiment analysis for Yahoo! Answers." In Proceedings of the 5th ACM International Conference on Web Search and Data Mining.
  13. Pang, Bo, and Lillian Lee. 2004. "A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts."
  14. Poria, Soujanya, Erik Cambria, et al. 2017. "Context-Dependent Sentiment Analysis in User-Generated Videos."
  15. Sankar, H., and V. Subramaniyaswamy. 2017. "Investigating sentiment analysis using machine learning approach." In 2017 International Conference on Intelligent Sustainable Systems (ICISS), pp. 87-92. IEEE.
  16. Shayaa, Shahid, Noor Ismawati Jaafar, Shamshul Bahri, Ainin Sulaiman, Phoong Seuk Wai, Yeong Wai Chung, Arsalan Zahid Piprani, and Mohammed Ali Al-Garadi. 2018. "Sentiment Analysis of Big Data: Methods, Applications, and Open Challenges." IEEE Access.
  17. Stack Overflow. 2011.
  18. Unamo. 2018.
  19. Vohra, S. M., and J. B. Teraiya. 2013. "A comparative study of sentiment analysis techniques."
  20. Wikipedia. 2018a. "Doxa"
  21. Wikipedia. 2018b. "Sentiment analysis."
  22. Wikipedia. 2018c. "Facebook–Cambridge Analytica data scandal."

Further Reading

  1. Bannister, Kristian. 2015. Understanding Sentiment Analysis: What It Is & Why It’s Used.
  2. Shinde, P.D. and Rathod, S., 2018. A Comparative Study of Sentiment Analysis Techniques.
  3. Sentiment Analysis: Concept, Analysis and Applications
  4. Sentiment Analysis: nearly everything you need to know
  5. NPTEL - Sentiment Analysis

Last update: 2018-07-29 11:38:56 by Varsha2018
Creation: 2018-07-25 07:31:32 by Varsha2018

Cite As

Devopedia. 2018. "Sentiment Analysis." Version 48, July 29. Accessed 2018-08-14. https://devopedia.org/sentiment-analysis