Monday, July 31, 2017

Acquiring and Exploiting Lexical Knowledge for Twitter Sentiment Analysis [Problem Statement]


 


1.3 Research Problem

We refer to message-level polarity classification as the task of automatically classifying tweets into sentiment categories. This problem has been successfully tackled by representing tweets from a corpus of hand-annotated examples as feature vectors and training classification algorithms on them (Mohammad, Kiritchenko and Zhu, 2013). A popular choice for building the feature space X is the vector space model (Salton, Wong and Yang, 1975), in which each distinct word, or unigram, found in the corpus is mapped to an individual feature. Word n-grams, which are consecutive sequences of n words, can also be used analogously. Each tweet is represented as a sparse vector whose active dimensions (dimensions different from zero) correspond to the words or n-grams found in the message. The value of each active dimension can be calculated using different weighting schemes, such as binary weights or frequency-based weights with various normalisation schemes.
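As a concrete illustration, the sparse unigram representation described above can be sketched as follows. The toy corpus, the whitespace tokeniser and the function names are illustrative assumptions, not the exact preprocessing used in the cited work:

```python
from collections import Counter

def build_vocabulary(corpus):
    """Map every distinct unigram in the corpus to a feature index."""
    vocab = {}
    for tweet in corpus:
        for token in tweet.lower().split():  # naive whitespace tokeniser (assumption)
            if token not in vocab:
                vocab[token] = len(vocab)
    return vocab

def vectorise(tweet, vocab, weighting="binary"):
    """Return a sparse {feature index: weight} representation of a tweet."""
    counts = Counter(t for t in tweet.lower().split() if t in vocab)
    if weighting == "binary":
        return {vocab[t]: 1 for t in counts}
    # frequency weights, normalised by the number of in-vocabulary tokens
    total = sum(counts.values())
    return {vocab[t]: c / total for t, c in counts.items()}

corpus = ["so happy with this phone", "sad and angry about the service"]
vocab = build_vocabulary(corpus)
vec = vectorise("happy happy service", vocab)  # only two active dimensions
```

Note that only the dimensions for words actually present in the message are stored, which keeps the representation compact even when the vocabulary grows with the corpus.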


The message-level sentiment label space Y corresponds to the different sentiment categories that can be expressed in a tweet, e.g., positive, negative, and neutral. Because sentiment is a subjective judgment, the ground-truth sentiment category of a tweet must be determined by a human evaluator; hence, the manual annotation of tweets into sentiment classes is a time-consuming and labour-intensive task. We refer to this problem as the label sparsity problem. Because supervised machine learning models are impractical in the absence of labelled tweets, the label sparsity problem imposes practical limitations on using these techniques for classifying the sentiment of tweets.


Crowdsourcing tools such as Amazon Mechanical Turk or CrowdFlower allow clients to use human intelligence to perform tasks in exchange for a monetary payment set by the client. They have been successfully used for manually labelling tweets into sentiment classes (Nakov, Rosenthal, Kozareva, Stoyanov, Ritter and Wilson, 2013). Nevertheless, a classifier trained on a particular collection of manually annotated tweets will not necessarily perform well on tweets about topics that were not included in the training data, or on tweets written in a different period of time. This is because the relation between messages and the corresponding sentiment label can change from one domain to another, or over time. We refer to this problem as the sentiment drift problem.


Opinions on social media are expressed in different domains, such as politics, products, movie reviews, and sports. More specifically, opinions are expressed about particular topics, entities or subjects within a certain domain. For example, "Barack Obama" is a specific entity of the domain "politics".


The words and expressions that define the sentiment of a text passage are referred to in the literature as opinion words (Liu, 2012). For instance, happy is a positive opinion word and sad is a negative one. As has been studied in (Engström, 2004; Read, 2005), many opinion words are domain-dependent: words or expressions that are considered positive or negative in a certain domain will not necessarily have the same relevance or orientation in a different context. This situation is illustrated in the following examples taken from real posts on Twitter:

1. For me the queue was pretty small and it was only a 20 minute wait I think but was so worth it!!! :D @raynwise

2. Odd spatiality in Stuttgart. Hotel room is so small I can barely turn around but surroundings are inhumanly vast & long under construction.

3. My girlfriend just called me to say good night because she accident (sic) fell asleep without saying it earlier :) #ShesTooCute

4. I got some RAGE over this #Harambe accident. This is why there should be NO zoos.
Here we can see that the opinion words "small" and "accident" can be used to express opposite sentiments in different contexts. This is a manifestation of the sentiment drift problem, and its main consequence is that a sentiment classifier trained on data from a particular domain will not necessarily achieve the same classification performance on other topics or domains.
Temporal changes in the sentiment pattern are another manifestation of sentiment drift. The relation between messages and their corresponding sentiment label for a particular topic is non-stationary, i.e., it can change over time (Durant and Smith, 2007; Bifet and Frank, 2010; Bifet, Holmes and Pfahringer, 2011; Silva, Gomide, Veloso, Meira and Ferreira, 2011; Calais Guerra, Veloso, Meira Jr and Almeida, 2011; Guerra, Meira and Cardie, 2014). For instance, when an unexpected event associated with the topic occurs suddenly (e.g., a scandal linked to a public figure), new expressions conveying sentiment can arise spontaneously, such as #trumpwall and #PrayForParis. Additionally, existing words or expressions can change in frequency, affecting the polarity pattern of the topic. Hence, the accuracy of a sentiment classifier affected by such changes would decrease over time.
This problem was empirically studied in (Durant and Smith, 2007) by training sentiment classifiers using training and testing data from different time periods. The results indicated a significant decrease in classification performance as the time difference between the training and testing data increased.
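The core of such an evaluation protocol is simply to partition the labelled tweets by timestamp so that the classifier is trained on the past and tested on the future. A minimal sketch, with placeholder data and dates rather than the original study's corpus:

```python
from datetime import date

def temporal_split(examples, cutoff):
    """Split (timestamp, text, label) examples so that training data
    strictly precedes the cutoff and test data follows it, mimicking
    deployment on future tweets."""
    train = [e for e in examples if e[0] < cutoff]
    test = [e for e in examples if e[0] >= cutoff]
    return train, test

# Toy labelled stream (placeholder data, not the cited study's).
stream = [
    (date(2016, 1, 5), "so happy with this phone", "positive"),
    (date(2016, 2, 9), "sad about the service", "negative"),
    (date(2016, 6, 1), "what a terrible accident", "negative"),
    (date(2016, 7, 3), "this view is so worth it", "positive"),
]

train, test = temporal_split(stream, date(2016, 5, 1))
```

Moving the cutoff earlier while keeping the test period fixed widens the temporal gap, which is the quantity the cited experiment varied.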
A possible approach to overcoming the sentiment drift problem is to constantly update the sentiment classifier with new labelled data (Silva et al., 2011). However, as discussed in (Silva et al., 2011; Calais Guerra et al., 2011; Guerra et al., 2014), the high volume and sparsity of social streams make the continuous acquisition of sentiment labels infeasible, even with crowdsourcing tools. The label sparsity and sentiment drift problems are therefore connected: countering drift demands a steady supply of fresh labels, which is precisely what label sparsity rules out.
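When fresh labels are available, the model update itself is cheap; the bottleneck is obtaining the labels. A minimal perceptron-style sketch of an online update, assuming sparse feature dictionaries and labels in {+1, -1} (the update rule and feature names are illustrative, not those used in the cited works):

```python
def online_update(weights, features, label, lr=1.0):
    """Score a tweet with the current weights and, on a mistake,
    nudge the weights towards the true label (+1 or -1).
    `features` is a sparse {token: value} dict; `weights` is mutated in place."""
    score = sum(weights.get(f, 0.0) * v for f, v in features.items())
    predicted = 1 if score >= 0 else -1
    if predicted != label:
        for f, v in features.items():
            weights[f] = weights.get(f, 0.0) + lr * label * v
    return predicted

weights = {}
online_update(weights, {"sad": 1.0, "service": 1.0}, -1)  # mistake: weights move negative
online_update(weights, {"happy": 1.0}, +1)                # correct, no update needed
```

Each labelled tweet triggers at most one sparse weight update, so the classifier can in principle track drift as it happens; the point of the paragraph above is that the stream of labels needed to drive these updates is what cannot be sustained.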
The research problem considered in this thesis is how to derive accurate polarity classifiers for Twitter in label sparsity conditions without relying on the costly process of human annotation.






