
Proceedings of the 18th Conference on Computer Science and Intelligence Systems

Annals of Computer Science and Information Systems, Volume 35

Gradient boosting models for cybersecurity threat detection with aggregated time series features


DOI: http://dx.doi.org/10.15439/2023F4457

Citation: Proceedings of the 18th Conference on Computer Science and Intelligence Systems, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 35, pages 1311–1315 (2023)


Abstract. The rapid proliferation of Internet of Things (IoT) devices has revolutionized the way we interact with and manage our surroundings. However, this widespread adoption has also brought significant cybersecurity challenges. IoT devices, with their interconnectedness and varying functionalities, present a unique threat landscape that requires tailored detection techniques. Traditional approaches to cybersecurity, which focus primarily on network monitoring and anomaly detection, often fail to identify threats originating from IoT devices because of their dynamic and complex behaviors. This paper presents our solution for the FedCSIS 2023 Challenge: Cybersecurity Threat Detection in the Behavior of IoT Devices. We first aggregated time series features; then, at the feature selection stage, we filtered and combined different categorical and numerical features to generate four different feature sets. Gradient boosting models, i.e. lightgbm, catboost and xgboost, were trained individually with hyper-parameter tuning. Our final submissions comprised the two best individual lightgbm models, with AUC scores of 0.9999 and 0.9998 on different feature sets, which secured 4th place with a final score of 0.9993, and one ensemble combining xgboost, catboost and lightgbm, with an AUC score of 0.9998. The ensemble achieved a final score of 0.9997 but, unfortunately, was not among the final three entries submitted for evaluation.
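The pipeline summarized in the abstract (aggregate each time series into fixed-length features, score with several models, blend the scores, evaluate by AUC) can be illustrated with a minimal, standard-library-only sketch. Everything specific here is an assumption made for illustration: the abstract does not say which aggregation statistics were used or how the ensemble was combined, so the summary statistics, the simple score averaging, the function names and the toy scores (standing in for the outputs of already trained boosting models) are all hypothetical.

```python
from statistics import mean, pstdev


def aggregate_series(values):
    """Collapse one numeric time series into fixed-length summary features.

    The choice of statistics is an assumption, not taken from the paper.
    """
    return {
        "mean": mean(values),
        "std": pstdev(values),  # population standard deviation
        "min": min(values),
        "max": max(values),
        "last": values[-1],
    }


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_ensemble(*model_scores):
    """Blend models by averaging their per-sample scores (assumed scheme)."""
    return [mean(sample) for sample in zip(*model_scores)]


# Toy example: one aggregated series and two hypothetical model outputs.
features = aggregate_series([1, 2, 3, 4])

labels = [0, 0, 1, 1]
model_a = [0.1, 0.6, 0.4, 0.9]  # hypothetical scores, not real results
model_b = [0.2, 0.3, 0.7, 0.6]  # hypothetical scores, not real results
blend = average_ensemble(model_a, model_b)

auc_a = roc_auc(labels, model_a)
auc_b = roc_auc(labels, model_b)
auc_blend = roc_auc(labels, blend)
```

In practice the score lists would come from the predicted probabilities of trained lightgbm, catboost and xgboost classifiers, and the pairwise AUC above would be replaced by a library routine on larger data, since it is quadratic in the number of samples.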
