Friday, March 31, 2017

Approaches to Cross-Domain Sentiment Analysis: A Systematic Literature Review


 

Abstract

Sentiment analysis has received a lot of attention from researchers working in the fields of natural language processing and text mining.

However, there is a lack of annotated data sets that can be used to train a model for all domains, which is hampering the accuracy of sentiment analysis. 

Many research studies have attempted to tackle this issue and to improve cross-domain sentiment classification. 

In this paper, we present the results of a comprehensive systematic literature review of the methods and techniques employed in cross-domain sentiment analysis.

We focus on studies published between 2010 and 2016.

From our analysis of those works, it is clear that there is no perfect solution. 

Hence, one aim of this review is to provide an overview of the techniques, methods, and approaches that have been used to address cross-domain sentiment analysis, in order to help researchers develop new and more accurate techniques in the future.

https://ieeexplore.ieee.org/abstract/document/7891035



Wednesday, March 1, 2017

The Effects of Emoji in Sentiment Analysis


Abstract: This study investigates the use of Emoji characters on social networks and the effects of Emoji in text mining and sentiment analysis. Because it provides live access to text-based public opinions, we chose Twitter as the information source for our analysis. We collected text data for some global positive and negative events to analyze the impact of Emoji characters in sentiment analysis. In our analysis, we noticed that the utilization of Emoji characters in sentiment analysis results in higher sentiment scores. Furthermore, we observed that the usage of Emoji characters in sentiment analysis appeared to have a higher impact on the overall sentiment of positive opinions than of negative opinions.

Key words: Emoji, opinion mining, sentiment analysis, Twitter.
https://www.researchgate.net/publication/320446679_The_Effects_of_Emoji_in_Sentiment_Analysis