Pitfalls in users’ evaluation of algorithms for text-based similarity detection in medical education
Jakub Ščavnický, Matěj Karolyi, Petra Růžičková, Andrea Pokorná, Hana Harazim, Petr Štourač, Martin Komenda
DOI: http://dx.doi.org/10.15439/2018F163
Citation: Proceedings of the 2018 Federated Conference on Computer Science and Information Systems, M. Ganzha, L. Maciaszek, M. Paprzycki (eds). ACSIS, Vol. 15, pages 109–116 (2018)
Abstract. This paper presents a user evaluation of several approaches to automated similarity detection between study materials and a curriculum description in the field of medical and healthcare education. Our objective is to present an effective methodology for obtaining relevant feedback from medical students and teachers. Two different data sets are processed: electronic study materials, represented by interactive educational algorithms on the AKUTNE.CZ platform, and the curriculum of the General Medicine study programme. For the purposes of this work, text similarity between the two data sets is expressed lexically, i.e. both character-based (n-gram) and term-based similarity methods are used. We present a comparison of five selected approaches to similarity calculation, as well as an objective discussion of our experience with, and the pitfalls of, user evaluation.
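To make the distinction between character-based and term-based lexical similarity concrete, the following is a minimal sketch, not the implementation evaluated in the paper. It assumes Jaccard overlap of character trigrams for the n-gram case and cosine similarity over raw term counts for the term-based case; the abstract does not name the five approaches, so these two measures and the function names `char_ngrams`, `ngram_jaccard` and `term_cosine` are illustrative choices only.

```python
import math
import re
from collections import Counter


def char_ngrams(text, n=3):
    """Return the set of character n-grams of a lowercased, whitespace-normalised string."""
    normalised = " ".join(text.lower().split())
    return {normalised[i:i + n] for i in range(len(normalised) - n + 1)}


def ngram_jaccard(a, b, n=3):
    """Character-based similarity (assumed measure): Jaccard overlap of n-gram sets."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    union = grams_a | grams_b
    if not union:
        # Both texts are shorter than n characters, so there is nothing to compare.
        return 0.0
    return len(grams_a & grams_b) / len(union)


def term_cosine(a, b):
    """Term-based similarity (assumed measure): cosine of raw term-count vectors."""
    counts_a = Counter(re.findall(r"\w+", a.lower()))
    counts_b = Counter(re.findall(r"\w+", b.lower()))
    dot = sum(counts_a[term] * counts_b[term] for term in counts_a)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        # At least one text contains no terms.
        return 0.0
    return dot / (norm_a * norm_b)
```

Under these assumptions, the character-based score tolerates inflection and small spelling differences, because related word forms still share most of their trigrams, whereas the term-based score only rewards exact token matches. This is one reason the two families of methods can rank the same pair of texts differently.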