
Proceedings of the 18th Conference on Computer Science and Intelligence Systems

Annals of Computer Science and Information Systems, Volume 35

Improving the Efficiency of Meta AutoML via Rule-based Training Strategies


DOI: http://dx.doi.org/10.15439/2023F708

Citation: Proceedings of the 18th Conference on Computer Science and Intelligence Systems, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 35, pages 235–246 (2023)


Abstract. Meta Automated Machine Learning (Meta AutoML) platforms support data scientists and domain experts by automating the search for ML models. A Meta AutoML platform employs multiple AutoML solutions that search in parallel for their best ML model. Running multiple AutoML solutions in parallel, however, requires a substantial amount of energy. While individual AutoML solutions use different training strategies to balance energy efficiency against ML model effectiveness, no research has yet addressed optimizing the Meta AutoML process itself. This paper presents a survey of 14 AutoML training strategies that can be applied to Meta AutoML, categorizing them by their broader goal, their advantages, and their adaptability to Meta AutoML. The paper also introduces the concept of rule-based training strategies, together with a proof-of-concept implementation in the Meta AutoML platform OMA-ML. The concept is based on the blackboard architecture and uses a rule-based reasoner to apply training strategies. Applying the "top-3" training strategy can save up to 70% of energy while maintaining similar ML model performance.
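To make the rule-based concept concrete, the following is a minimal Python sketch of a "top-3" strategy, assuming a hypothetical setting in which each AutoML adapter reports a validation score after a short probing phase; the names (AutoMLSession, top_3_rule) are illustrative and are not the OMA-ML API.

    # Minimal sketch of a rule-based "top-3" training strategy (illustrative,
    # not the OMA-ML API). Each AutoML adapter is probed briefly and reports a
    # validation score; the rule then keeps only the three best adapters.
    from dataclasses import dataclass

    @dataclass
    class AutoMLSession:
        name: str           # an AutoML adapter, e.g. "autosklearn"
        score: float = 0.0  # validation score from the short probing phase

    def top_3_rule(sessions):
        """Rule: continue training only the three best-scoring sessions."""
        ranked = sorted(sessions, key=lambda s: s.score, reverse=True)
        return ranked[:3], ranked[3:]  # (keep training, stop early)

    # Usage: probe all adapters, then spend the remaining budget on the top 3.
    sessions = [AutoMLSession("autogluon", 0.93), AutoMLSession("autosklearn", 0.91),
                AutoMLSession("flaml", 0.90), AutoMLSession("tpot", 0.85),
                AutoMLSession("h2o", 0.88)]
    keep, stop = top_3_rule(sessions)
    print("continue training:", [s.name for s in keep])
    print("stop early:", [s.name for s in stop])

After the probing phase, only the three best-scoring AutoML sessions continue to consume training budget; stopping the remaining sessions early is where energy savings of the kind the paper reports (up to 70%) would come from.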
