Polish Information Processing Society

Annals of Computer Science and Information Systems, Volume 10

Proceedings of the Second International Conference on Research in Intelligent and Computing in Engineering

A Conjoint Analysis of Road Accident Data using K-modes Clustering and Bayesian Networks (Road Accident Analysis using clustering and classification)

DOI: http://dx.doi.org/10.15439/2017R44

Citation: Proceedings of the Second International Conference on Research in Intelligent and Computing in Engineering, Vijender Kumar Solanki, Vijay Bhasker Semwal, Rubén González Crespo, Vishwanath Bijalwan (eds). ACSIS, Vol. 10, pages 53–56

Abstract. Road and traffic accidents are a major concern in today's world. Every country suffers heavy losses from road accidents in terms of public health and property. Road accident analysis therefore plays an important role in the public health domain. It is performed to identify the factors associated with road accidents; knowledge of these factors helps in understanding the circumstances in which accidents occur and can be used to prevent them. One problem in accident analysis is that most road accident data is biased: for example, critical road accidents are very few in comparison to slight/minor injury accidents. Various studies have shown that clustering prior to analysis can increase the efficiency and accuracy of classification. The aim of this study is to perform a conjoint analysis of road accident data to investigate the improvement in classification performance on unbiased data after clustering.
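
The cluster-then-classify idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy records, the attribute values, the function names, and the first-k-distinct-records initialisation are all hypothetical, and a categorical naive Bayes classifier (the simplest Bayesian network structure) stands in for the Bayesian networks named in the title. K-modes groups categorical records by Hamming distance to a per-cluster mode, and a separate classifier is then trained inside each cluster.

```python
from collections import Counter, defaultdict

def hamming(a, b):
    """Number of attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))

def k_modes(records, k, max_iter=20):
    """Cluster categorical records; returns (modes, labels)."""
    # Assumption: deterministic init using the first k distinct records.
    modes = []
    for r in records:
        if r not in modes:
            modes.append(r)
        if len(modes) == k:
            break
    labels = [None] * len(records)
    for _ in range(max_iter):
        # Assign every record to its nearest mode.
        new_labels = [min(range(len(modes)), key=lambda j: hamming(r, modes[j]))
                      for r in records]
        if new_labels == labels:
            break
        labels = new_labels
        # Recompute each mode as the most frequent value per attribute.
        for j in range(len(modes)):
            members = [r for r, l in zip(records, labels) if l == j]
            if members:
                modes[j] = tuple(Counter(col).most_common(1)[0][0]
                                 for col in zip(*members))
    return modes, labels

def train_naive_bayes(records, targets):
    """Count class and per-attribute value frequencies."""
    class_counts = Counter(targets)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> values
    domains = defaultdict(set)           # attribute index -> seen values
    for r, t in zip(records, targets):
        for i, v in enumerate(r):
            value_counts[(i, t)][v] += 1
            domains[i].add(v)
    return class_counts, value_counts, domains

def predict(model, record):
    """Most probable class under naive Bayes with add-one smoothing."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    def score(c):
        p = class_counts[c] / total
        for i, v in enumerate(record):
            p *= (value_counts[(i, c)][v] + 1) / (class_counts[c] + len(domains[i]))
        return p
    return max(class_counts, key=score)

# Hypothetical toy data: (road type, light condition, road surface).
records = [
    ("urban", "day", "dry"), ("urban", "day", "wet"), ("urban", "day", "dry"),
    ("rural", "night", "wet"), ("rural", "night", "dry"), ("rural", "night", "wet"),
]
severity = ["slight", "slight", "slight", "serious", "serious", "slight"]

modes, labels = k_modes(records, k=2)

# Train one classifier per cluster.
models = {}
for j in range(len(modes)):
    idx = [i for i, l in enumerate(labels) if l == j]
    models[j] = train_naive_bayes([records[i] for i in idx],
                                  [severity[i] for i in idx])

def classify(record):
    """Route a record to its nearest cluster, then classify within it."""
    j = min(range(len(modes)), key=lambda c: hamming(record, modes[c]))
    return j, predict(models[j], record)

print(labels)
print(classify(("rural", "night", "dry")))
```

On real accident data the number of clusters and the initial modes would need to be chosen with more care than this fixed initialisation allows.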
