EpiDoc Data Matching for Federated Information Retrieval in the Humanities

Citation: Proceedings of the 18th Conference on Computer Science and Intelligence Systems, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 35, pages 1069–1074

Abstract. Federated information retrieval (FIR) is growing in importance in humanities research. Unlike traditional centralised information retrieval, where searches run against a logically centralised collection of documents, FIR treats each information system as an independent source with its own characteristics. Searching these systems together as if they were a single centralised source lowers precision in humanities research, even when the research data itself is structured and stored according to standardised guidelines such as EpiDoc. It also makes it necessary to trace the origin of each record in order to avoid incorrect historical conclusions. Matching queries against all data sets in each source is proving less effective. A global search index that enables traceable matching of the key values deemed relevant would provide a more robust solution. In this paper, we propose a novel EpiDoc data matching procedure that facilitates traceable FIR across distinct epigraphic sources.


