Languages' Impact on Emotional Classification Methods

Alexander Christoffer Eilertsen; Dennis Højbjerg Rose; Peter Langballe Erichsen; Rasmus Engesgaard Christensen; Rudra Pratap Deb Nath

DOI: http://dx.doi.org/10.15439/2019F143

Citation: Proceedings of the 2019 Federated Conference on Computer Science and Information Systems, M. Ganzha, L. Maciaszek, M. Paprzycki (eds). ACSIS, Vol. 18, pages 277–286 (2019)

Abstract. Little research has examined whether Emotional Classification (EC) findings for one language carry over to other languages. If they do, the amount of language-specific research needed can be greatly reduced. We therefore propose a framework for testing the following null hypothesis: the change in classification accuracy for Emotional Classification caused by changing a single preprocessor or classifier is independent of the target language, at a significance level of 0.05. We test this hypothesis on an English and a Danish data set, using three classification algorithms: Support-Vector Machine, Naive Bayes, and Random Forest. Our statistical test yielded a p-value of 0.12852, so we could not reject the null hypothesis; the effect of such changes may therefore be independent of the language. Failing to reject does not confirm the hypothesis, however, and more research on cross-language EC is needed before EC results can be transferred between languages.
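
The decision rule described in the abstract can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the abstract does not name the statistical test, so a paired, exact two-sided Wilcoxon signed-rank test is assumed here, and the function names and the sample accuracy changes are hypothetical. Only the p-value 0.12852 and the significance level 0.05 come from the text.

```python
from itertools import product


def signed_rank_p_value(x, y):
    """Exact two-sided Wilcoxon signed-rank p-value for paired samples.

    Assumed test for illustration; enumerates all sign assignments,
    so it is only practical for small samples.
    """
    # Paired differences; zero differences carry no sign and are dropped.
    diffs = [round(a - b, 9) for a, b in zip(x, y)]
    diffs = [d for d in diffs if d != 0]
    n = len(diffs)
    if n == 0:
        return 1.0

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    total = sum(ranks)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    observed = abs(w_plus - total / 2)

    # Under the null hypothesis every sign pattern is equally likely.
    extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - total / 2) >= observed:
            extreme += 1
    return extreme / 2 ** n


def rejects_null(p_value, alpha=0.05):
    """Reject the null hypothesis only when p falls below alpha."""
    return p_value < alpha


# Hypothetical accuracy changes (percentage points) for the same six
# single-component swaps, measured once per language. Made-up numbers.
english_deltas = [1.2, -0.4, 2.1, 0.3, -1.0, 0.8]
danish_deltas = [0.9, -0.9, 1.5, 0.6, -0.2, 0.1]

p = signed_rank_p_value(english_deltas, danish_deltas)
print("illustrative p-value:", p, "reject:", rejects_null(p))

# The p-value reported in the abstract.
print("reported p-value rejects null:", rejects_null(0.12852))
```

With the reported p-value, `rejects_null` returns `False`, which matches the abstract's conclusion that the null hypothesis could not be rejected.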
