Polish Information Processing Society

Annals of Computer Science and Information Systems, Volume 13

Communication Papers of the 2017 Federated Conference on Computer Science and Information Systems

Data Mining with Trusted Knowledge

DOI: http://dx.doi.org/10.15439/2017F216

Citation: Communication Papers of the 2017 Federated Conference on Computer Science and Information Systems, M. Ganzha, L. Maciaszek, M. Paprzycki (eds). ACSIS, Vol. 13, pages 916

Abstract. This paper introduces the concept of Trusted Knowledge (TK). Trusted Knowledge is data from trusted organizations, such as ministries and statistical offices, that can replace a domain expert in the evaluation phase of a data mining task. Two approaches to applying Trusted Knowledge are introduced. The first, called the "Explanation system", offers additional information relevant to the resulting patterns, which helps the user better understand the results of the task. The second, called "A/TK-formulas", filters out the resulting patterns that are consequences of Trusted Knowledge, so that the user can concentrate on the interesting patterns. Conversely, the user can request to see only the resulting patterns that are consequences of TK, to check which of them are in line with TK. The feasibility of the proposed framework is demonstrated in a case study.
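
The filtering idea behind the second approach can be illustrated with a small sketch. It is not the paper's formal definition of A/TK-formulas, which the abstract does not give. It assumes, purely for illustration, that a mined pattern is a rule between sets of attributes, that an item of Trusted Knowledge states that one attribute influences another, and that a rule counts as a consequence of TK when it relates such a pair. The names `Rule`, `TrustedItem`, `is_consequence`, and `split_by_tk`, as well as the attribute names, are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Rule(NamedTuple):
    """A mined pattern, reduced to the attributes on each side (assumption)."""
    antecedent: frozenset
    consequent: frozenset


class TrustedItem(NamedTuple):
    """A simplified item of Trusted Knowledge: `cause` influences `effect`."""
    cause: str
    effect: str


def is_consequence(rule: Rule, tk: List[TrustedItem]) -> bool:
    # Assumed criterion: the rule links an attribute pair already covered by TK.
    return any(
        item.cause in rule.antecedent and item.effect in rule.consequent
        for item in tk
    )


def split_by_tk(rules: List[Rule], tk: List[TrustedItem]) -> Tuple[List[Rule], List[Rule]]:
    """Return (rules explained by TK, remaining rules)."""
    explained = [r for r in rules if is_consequence(r, tk)]
    remaining = [r for r in rules if not is_consequence(r, tk)]
    return explained, remaining


# Made-up example data, not taken from the case study.
tk = [TrustedItem(cause="Region", effect="FlatPrice")]
rules = [
    Rule(frozenset({"Region"}), frozenset({"FlatPrice"})),
    Rule(frozenset({"Age"}), frozenset({"LoanStatus"})),
]

explained, remaining = split_by_tk(rules, tk)
```

Here `remaining` corresponds to the default mode, in which consequences of TK are filtered out, and `explained` corresponds to the converse mode, in which only patterns in line with TK are shown.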
