Polish Information Processing Society

Annals of Computer Science and Information Systems, Volume 9

Position Papers of the 2016 Federated Conference on Computer Science and Information Systems

Daily Touristic Plan Recommendation Using Text Mining


DOI: http://dx.doi.org/10.15439/2016F594

Citation: Position Papers of the 2016 Federated Conference on Computer Science and Information Systems, M. Ganzha, L. Maciaszek, M. Paprzycki (eds). ACSIS, Vol. 9, pages 41–48


Abstract. This study proposes a recommender system for daily touristic plans. Constructing such a system is shown to require text mining, so Sentiment Analysis and Keyword Extraction techniques are evaluated by developing and testing different approaches. The Sentiment Analysis approaches are examined step by step in order to pick the best one for scoring restaurant data. Similarly, Keyword Extraction is evaluated from the perspectives of statistics, visualization, and machine learning. The paper closes by illustrating the structure and flow of the proposed system, built on the approaches chosen from those tested.
