Personality Prediction Based on Twitter Information in Bahasa Indonesia
Derwin Suhartono, Veronica Ong, Anneke D. S. Rahmanto, Williem, Aryo E. Nugroho, Esther W. Andangsari, Muhamad N. Suprayogi
DOI: http://dx.doi.org/10.15439/2017F359
Citation: Proceedings of the 2017 Federated Conference on Computer Science and Information Systems, M. Ganzha, L. Maciaszek, M. Paprzycki (eds). ACSIS, Vol. 11, pages 367–372 (2017)
Abstract. The widespread use of social media presents an opportunity for automated analysis of a social media user based on his or her profile information, activities, and status updates, because users share an abundant amount of information. This is especially true for countries with a high number of active users, such as Indonesia. Extracting information from social media can yield insightful results if done correctly. Recent studies have leveraged associations between language and personality to build personality prediction systems. The current study attempts to build a personality prediction system based on a Twitter user's information for Bahasa Indonesia, the native language of Indonesia. The system is built on Support Vector Machine and XGBoost classifiers trained with 329 instances. Evaluation using 10-fold cross-validation shows that the system reached a highest average accuracy of 76.2310% with Support Vector Machine and 97.9962% with XGBoost.
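The evaluation protocol named in the abstract, 10-fold cross-validation over 329 instances with accuracy averaged across folds, can be illustrated with a minimal sketch. This is not the authors' code: the labels are synthetic placeholders, the helper names (`k_fold_indices`, `cross_val_accuracy`) are hypothetical, and a trivial majority-class classifier stands in for Support Vector Machine and XGBoost so that the example needs only the Python standard library.

```python
import random
from collections import Counter

def k_fold_indices(n, k=10, seed=0):
    """Shuffle range(n) and split it into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    base, extra = divmod(n, k)  # the first `extra` folds get one more item
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(idx[start:start + size])
        start += size
    return folds

def cross_val_accuracy(labels, fit, predict, k=10, seed=0):
    """Hold out each fold once, train on the rest, and average the accuracies."""
    folds = k_fold_indices(len(labels), k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for f_i, f in enumerate(folds) if f_i != i for j in f]
        model = fit(train_idx)
        correct = sum(predict(model, j) == labels[j] for j in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / k

# Assumption: 329 synthetic binary labels, NOT the study's data.
labels = [i % 3 == 0 for i in range(329)]

# Assumption: a majority-class baseline stands in for SVM / XGBoost.
def fit_majority(train_idx):
    return Counter(labels[j] for j in train_idx).most_common(1)[0][0]

def predict_majority(model, j):
    return model  # always predicts the majority label seen in training

folds = k_fold_indices(329)
fold_sizes = sorted(len(f) for f in folds)
mean_acc = cross_val_accuracy(labels, fit_majority, predict_majority)
```

With 329 instances, nine folds hold 33 instances and one holds 32, so each model is trained on 296 or 297 instances. Swapping `fit_majority` and `predict_majority` for a real classifier would give the kind of fold-averaged accuracy the abstract reports.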