Friday, March 31, 2017

Approaches to Cross-Domain Sentiment Analysis: A Systematic Literature Review


 

Abstract

Sentiment analysis has received a lot of attention from researchers working in the fields of natural language processing and text mining.

However, there is a lack of annotated data sets that can be used to train a model for all domains, which is hampering the accuracy of sentiment analysis. 

Many research studies have attempted to tackle this issue and to improve cross-domain sentiment classification. 

In this paper, we present the results of a comprehensive systematic literature review of the methods and techniques employed in cross-domain sentiment analysis.

We focus on studies published during the period of 2010-2016. 

From our analysis of those works, it is clear that there is no perfect solution. 

Hence, one of the aims of this review is to create a resource in the form of an overview of the techniques, methods, and approaches that have been used to attempt to solve the problem of cross-domain sentiment analysis in order to assist researchers in developing new and more accurate techniques in the future.

https://ieeexplore.ieee.org/abstract/document/7891035


