Proceedings of the 18th Conference on Computer Science and Intelligence Systems

Annals of Computer Science and Information Systems, Volume 35

The scalability in terms of the time and the energy for several matrix factorizations on a multicore machine

DOI: http://dx.doi.org/10.15439/2023F3506

Citation: Proceedings of the 18th Conference on Computer Science and Intelligence Systems, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 35, pages 895–900


Abstract. Scalability is an important aspect of time and energy savings on modern multicore architectures. In this paper, we investigate and analyze scalability in terms of both time and energy. We compare the execution time and energy consumption of the LU (without pivoting) and Cholesky factorizations, both implemented with the Intel Math Kernel Library (MKL), on a multicore machine. To save energy in these multithreaded factorizations, the dynamic voltage and frequency scaling (DVFS) technique was used; this technique allows the clock frequency to be scaled without changing the implementation. An experimental scalability evaluation was performed on an Intel Xeon Gold multicore machine as a function of the number of threads and the clock frequency. Our test results show that scalability in terms of execution time, expressed by the Speedup metric, grows nearly linearly with the number of threads. In contrast, scalability in terms of energy consumed, expressed by the Greenup metric, grows roughly logarithmically with the number of threads. Both kinds of scalability depend on the clock frequency settings and the number of threads.
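As a brief illustration of the two metrics named in the abstract, the Speedup is the ratio of sequential to parallel execution time, and the Greenup (after Abdulsalam et al., reference 2) is the analogous ratio for energy. The sketch below is not from the paper; the measurement values are hypothetical, chosen only to show how the metrics are computed.

```python
# Sketch of the Speedup and Greenup scalability metrics.
# Speedup = T_1 / T_p (1-thread time over p-thread time);
# Greenup = E_1 / E_p (1-thread energy over p-thread energy).

def speedup(t_seq: float, t_par: float) -> float:
    """Time-based scalability: how much faster p threads run."""
    return t_seq / t_par

def greenup(e_seq: float, e_par: float) -> float:
    """Energy-based scalability: how much energy p threads save."""
    return e_seq / e_par

# Hypothetical measurements for a multithreaded factorization:
# 1 thread: 120 s, 3000 J; 8 threads: 18 s, 1100 J.
print(speedup(120.0, 18.0))    # ~6.7x faster
print(greenup(3000.0, 1100.0)) # ~2.7x less energy
```

In practice the energy figures would come from a hardware counter interface such as Intel RAPL (reference 6), and the frequency would be set via DVFS before each run.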

References

  1. Intel Math Kernel Library, 2014. http://software.intel.com/en-us/articles/intel-mkl/.
  2. S. Abdulsalam, Z. Zong, Q. Gu, and M. Qiu. Using the Greenup, Powerup, and Speedup metrics to evaluate software energy efficiency. In 2015 Sixth International Green and Sustainable Computing Conference (IGSC), pages 1–8, 2015. http://dx.doi.org/10.1109/IGCC.2015.7393699.
  3. E. Anderson, Z. Bai, C. Bischof, J. Demmel, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney, and D. Sorensen. LAPACK Users' Guide. SIAM, 1999. http://dx.doi.org/10.1137/1.9780898719604.
  4. J. W. Demmel. Applied Numerical Linear Algebra. SIAM, 1997. http://dx.doi.org/10.1137/1.9781611971446.
  5. J. Dongarra, J. Du Croz, I. S. Duff, and S. Hammarling. A set of level-3 Basic Linear Algebra Subprograms. ACM Trans. Math. Software, 16:1–28, 1990. http://dx.doi.org/10.1145/77626.79170.
  6. K. Khan, M. Hirki, T. Niemi, J. Nurminen, and Z. Ou. RAPL in action: Experiences in using RAPL for power measurements. ACM Transactions on Modeling and Performance Evaluation of Computing Systems (TOMPECS), 3, 01 2018. http://dx.doi.org/10.1145/3177754.
  7. Y. Ngoko and D. Trystram. Scalability in parallel processing. In S. K. Prasad, A. Gupta, A. L. Rosenberg, A. Sussman, and C. C. Weems, editors, Topics in Parallel and Distributed Computing, Enhancing the Undergraduate Curriculum: Performance, Concurrency, and Programming on Modern Platforms, pages 79–109. Springer, 2018. http://dx.doi.org/10.1007/978-3-319-93109-8_4.
  8. M. Weiser, B. Welch, A. J. Demers, and S. Shenker. Scheduling for reduced CPU energy. In Proceedings of the 1st USENIX Symposium on Operating Systems Design and Implementation (OSDI), pages 13–23, 1994.