
Proceedings of the 17th Conference on Computer Science and Intelligence Systems

Annals of Computer Science and Information Systems, Volume 30

Personality Prediction from Social Media Posts using Text Embedding and Statistical Features


DOI: http://dx.doi.org/10.15439/2022F133

Citation: Proceedings of the 17th Conference on Computer Science and Intelligence Systems, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 30, pages 235–240


Abstract. Recent advances in deep-learning-based language models have boosted performance on many downstream tasks, such as sentiment analysis, text summarization, and question answering. Personality prediction from text is a relatively new task that has attracted researchers' attention because of the growing interest in personalized services and the availability of social media data. In this study, we propose a personality prediction system in which text embeddings from large language models such as BERT are combined with multiple statistical features extracted from the input text. For the combination, we use the self-attention mechanism, a popular choice when several information sources need to be merged. Our experiments on the Kaggle MBTI dataset show that adding statistical text features improves system performance relative to using BERT embeddings alone. We also analyze the influence of personality type words on the overall results.
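The fusion step described above can be illustrated with a minimal sketch, under several assumptions that the abstract does not specify. The statistical features in `text_statistics`, the projection `w_proj`, the embedding size `d`, and the mean pooling are all hypothetical choices made for illustration, and a random vector stands in for a real BERT embedding. Each information source is treated as one token, and scaled dot-product self-attention mixes the two before pooling.

```python
import numpy as np


def text_statistics(text: str) -> np.ndarray:
    """Hypothetical statistical features (the paper's actual set is not given here)."""
    words = text.split()
    n_words = len(words)
    avg_word_len = sum(len(w) for w in words) / n_words if n_words else 0.0
    type_token_ratio = len({w.lower() for w in words}) / n_words if n_words else 0.0
    return np.array(
        [n_words, avg_word_len, text.count("?"), text.count("!"), type_token_ratio],
        dtype=np.float64,
    )


def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention_fusion(sources, w_q, w_k, w_v):
    """Fuse rows of `sources` (one row per information source) with self-attention.

    Returns the mean-pooled attended vector and the attention weight matrix.
    """
    q, k, v = sources @ w_q, sources @ w_k, sources @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])  # scaled dot-product scores
    weights = softmax(scores, axis=-1)       # each row sums to 1
    attended = weights @ v
    return attended.mean(axis=0), weights


rng = np.random.default_rng(0)
d = 8  # assumed toy dimension; BERT-base embeddings are 768-dimensional

# Stand-in for a BERT embedding of the post (assumption: random, not a real model output).
text_embedding = rng.normal(size=d)

stats = text_statistics("Why do I always overthink things? I love quiet evenings!")
w_proj = rng.normal(size=(stats.size, d)) * 0.1  # untrained projection to dimension d
stats_embedding = stats @ w_proj

sources = np.stack([text_embedding, stats_embedding])  # shape (2, d)
w_q, w_k, w_v = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))

fused, weights = self_attention_fusion(sources, w_q, w_k, w_v)
print("fused vector shape:", fused.shape)
print("attention weights:\n", weights)
```

In a trained system the projection and attention matrices would be learned jointly with a classifier that takes the fused vector as input; here they are random, so the printed weights only demonstrate the mechanics.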

