Abstract
Sentiment analysis has received a lot of attention from researchers working in the fields of natural language processing and text mining.
However, annotated data sets that can be used to train a model for every domain are lacking, which hampers the accuracy of sentiment analysis.
Many research studies have attempted to tackle this issue and to improve cross-domain sentiment classification.
In this paper, we present the results of a comprehensive systematic literature review of the methods and techniques employed in cross-domain sentiment analysis.
We focus on studies published between 2010 and 2016.
From our analysis of those works, it is clear that there is no perfect solution.
Hence, one aim of this review is to provide an overview of the techniques, methods, and approaches that have been used to address the problem of cross-domain sentiment analysis, in order to assist researchers in developing new and more accurate techniques.
https://ieeexplore.ieee.org/abstract/document/7891035