
Proceedings of the 18th Conference on Computer Science and Intelligence Systems

Annals of Computer Science and Information Systems, Volume 35

Improving the Efficiency of Meta AutoML via Rule-based Training Strategies


DOI: http://dx.doi.org/10.15439/2023F708

Citation: Proceedings of the 18th Conference on Computer Science and Intelligence Systems, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 35, pages 235–246 (2023)


Abstract. Meta Automated Machine Learning (Meta AutoML) platforms support data scientists and domain experts by automating the search for ML models. A Meta AutoML platform employs multiple AutoML solutions that search in parallel for their best ML model. Running multiple AutoML solutions in parallel, however, requires a substantial amount of energy. While individual AutoML solutions use different training strategies to balance energy efficiency against ML model effectiveness, no research has yet addressed optimizing the Meta AutoML process itself. This paper presents a survey of 14 AutoML training strategies that can be applied to Meta AutoML, categorizing them by their broader goal, their advantages, and their adaptability to Meta AutoML. The paper also introduces the concept of rule-based training strategies, together with a proof-of-concept implementation in the Meta AutoML platform OMA-ML. The concept is based on the blackboard architecture and uses a rule-based reasoner to apply training strategies. Applying the "top-3" training strategy can save up to 70% of energy while maintaining similar ML model performance.
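To make the rule-based concept concrete, the following is a minimal Python sketch of a "top-3" strategy, assuming a hypothetical setting in which each AutoML adapter reports a validation score after a short probing phase; the names (AutoMLSession, top_3_rule) are illustrative and are not the OMA-ML API.

    # Minimal sketch of a rule-based "top-3" training strategy (illustrative,
    # not the OMA-ML API). Each AutoML adapter is probed briefly and reports a
    # validation score; the rule then keeps only the three best adapters.
    from dataclasses import dataclass

    @dataclass
    class AutoMLSession:
        name: str           # an AutoML adapter, e.g. "autosklearn"
        score: float = 0.0  # validation score from the short probing phase

    def top_3_rule(sessions):
        """Rule: continue training only the three best-scoring sessions."""
        ranked = sorted(sessions, key=lambda s: s.score, reverse=True)
        return ranked[:3], ranked[3:]  # (keep training, stop early)

    # Usage: probe all adapters, then spend the remaining budget on the top 3.
    sessions = [AutoMLSession("autogluon", 0.93), AutoMLSession("autosklearn", 0.91),
                AutoMLSession("flaml", 0.90), AutoMLSession("tpot", 0.85),
                AutoMLSession("h2o", 0.88)]
    keep, stop = top_3_rule(sessions)
    print("continue training:", [s.name for s in keep])
    print("stop early:", [s.name for s in stop])

After the probing phase, only the three best-scoring AutoML sessions continue to consume training budget; stopping the remaining sessions early is where energy savings of the kind the paper reports (up to 70%) would come from.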
