.
1.1 BACKGROUND AND PROBLEM DEFINITION
.
In computational linguistics, sentiment analysis is treated as a classification problem. It involves natural language processing (NLP) on many levels and inherits its challenges. A wide variety of applications could benefit from its results, such as news analytics, marketing, question answering and knowledge bases. The challenge of this field is to improve the machine’s ability to understand texts in the same way as human readers do. Taking advantage of the huge number of opinions expressed on the internet, especially on social media blogs, is vital for many companies and institutions, whether in terms of product feedback, public mood or investor opinions.
.
The present thesis investigates different ways of improving sentiment classification performance. To address this problem, three key issues are examined. The first is improving sentiment classification through text pre-processing. The second is improving it by utilising text properties. The third is improving it by transferring sentiment knowledge from one domain to another. These issues are explained in the following.
.
...
.
1.2 AIM AND OBJECTIVES
.
The main aim of this thesis is to explore key ways of improving sentiment classification performance. To achieve this, there are three distinct objectives. The first objective is to improve sentiment prediction through text pre-processing. A wide variety of pre-processing methods is presented and an appropriate feature selection method is chosen for the analysis. Document-level sentiment classification is performed with a focus on product reviews, using movie reviews as an example.
.
The second objective is to improve sentiment classification by exploiting various text properties. The example here is financial news, which has two relevant properties. Firstly, financial news contains announcements of financial events that could be utilised in sentiment prediction. A model that employs news events in sentiment classification is proposed. Secondly, financial news allows the investors' (readers') opinions to be captured through stock market returns. It is argued in this thesis that in some tasks, such as financial forecasting, the sentiment expressed in the responses of content readers (for instance, through trading behaviour) may be more useful for creating predictive models. A new model is presented that predicts financial news sentiment using a novel method for capturing reader sentiment.
.
Furthermore, financial news covers a wide variety of domains, such as economics, accounting and law. Therefore, the third objective is to improve sentiment classification by investigating cross-domain sentiment analysis. A method for selecting domain-dependent and domain-independent words is proposed, and a new model for cross-domain sentiment analysis is evaluated against other approaches.
.