Polish Information Processing Society

Annals of Computer Science and Information Systems, Volume 21

Proceedings of the 2020 Federated Conference on Computer Science and Information Systems

Data Mining for Process Modeling: A Clustered Process Discovery Approach


DOI: http://dx.doi.org/10.15439/2020F95

Citation: Proceedings of the 2020 Federated Conference on Computer Science and Information Systems, M. Ganzha, L. Maciaszek, M. Paprzycki (eds). ACSIS, Vol. 21, pages 587–590


Abstract. Process mining has emerged as a new scientific research topic at the interface between process modeling and event data gathering. In the search for process models that best fit reality, the process discovery approach creates reference process models from observed behavior. However, although these methods show relevant results, they produce limited results when faced with noisy and divergent tendencies. This work proposes applying a process discovery technique, combined with the k-means clustering technique, to generate new process models, taking their conformance checking measures into account. The proposed solution is applied to an ad hoc workflow. As a result, clustering coupled with process discovery showed significant gains in the generation of process models compared with the standard approach.
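
The idea of clustering traces before discovery can be illustrated with a minimal sketch. It is not the authors' implementation: the bag-of-activities trace encoding, the deterministic seeding, the toy event log, and the use of a directly-follows graph as a stand-in for a full process discovery algorithm are all assumptions made here for illustration, and every function name is hypothetical.

```python
from collections import Counter, defaultdict


def encode(trace, alphabet):
    """Bag-of-activities vector (assumed encoding): count of each activity in the trace."""
    counts = Counter(trace)
    return [float(counts[a]) for a in alphabet]


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's k-means, seeded with the first k distinct points (deterministic)."""
    centroids = []
    for p in points:
        if p not in centroids:
            centroids.append(list(p))
        if len(centroids) == k:
            break
    if len(centroids) < k:
        raise ValueError("fewer distinct points than clusters")
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster goes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def directly_follows(traces):
    """Directly-follows counts, used here as a simple stand-in for a discovered model."""
    dfg = Counter()
    for t in traces:
        for a, b in zip(t, t[1:]):
            dfg[(a, b)] += 1
    return dfg


def clustered_discovery(log, k):
    """Cluster the traces with k-means, then build one model per cluster."""
    alphabet = sorted({a for t in log for a in t})
    labels = kmeans([encode(t, alphabet) for t in log], k)
    sublogs = defaultdict(list)
    for t, lab in zip(log, labels):
        sublogs[lab].append(t)
    return {lab: directly_follows(ts) for lab, ts in sublogs.items()}


# Hypothetical toy log: each trace is the ordered list of activities of one case.
log = [
    ["create", "approve", "pay"],
    ["create", "approve", "pay"],
    ["create", "reject", "rework", "reject", "close"],
    ["create", "reject", "close"],
]
models = clustered_discovery(log, k=2)
```

On this toy log the approval traces and the rejection traces end up in separate clusters, so each per-cluster graph contains only the behavior of its own variant instead of one graph mixing both. In practice, each sub-log would be passed to a real discovery algorithm and the resulting models compared through conformance checking measures, as the abstract describes.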

