
Proceedings of the 2024 Ninth International Conference on Research in Intelligent Computing in Engineering

Annals of Computer Science and Information Systems, Volume 42

A Survey on Sentiment Analysis in Tamil: Critical Analysis


DOI: http://dx.doi.org/10.15439/2024R28

Citation: Proceedings of the 2024 Ninth International Conference on Research in Intelligent Computing in Engineering, Vijender Kumar Solanki, Tran Duc Tan, Pradeep Kumar, Manuel Cardona (eds). ACSIS, Vol. 42, pages 49-62 (2024)


Abstract. This review paper examines the methodologies, techniques, and challenges specific to sentiment analysis and opinion extraction in the Tamil language. As the digital landscape expands, the ability to understand sentiments and opinions expressed in Tamil across diverse online platforms has become increasingly important. The paper traces the evolution of sentiment analysis techniques tailored for Tamil, covering essential components such as feature extraction, lexicon creation, and the application of various algorithms. Special attention is given to the distinctive features of Tamil, including its linguistic complexity, code-switching, and the expression of sentiment in informal contexts. A critical analysis compares the different models. The review also explores strategies for opinion extraction and suggests potential areas for future research and development.
