Polish Information Processing Society

Annals of Computer Science and Information Systems, Volume 15

Proceedings of the 2018 Federated Conference on Computer Science and Information Systems

Analyzing energy/performance trade-offs with power capping for parallel applications on modern multi and many core processors


DOI: http://dx.doi.org/10.15439/2018F177

Citation: Proceedings of the 2018 Federated Conference on Computer Science and Information Systems, M. Ganzha, L. Maciaszek, M. Paprzycki (eds). ACSIS, Vol. 15, pages 339–346


Abstract. In this paper we present extensive results from analyzing energy/performance trade-offs with power capping observed on four different modern CPUs, for three parallel applications: 2D heat distribution, numerical integration and the Fast Fourier Transform. The tested CPUs represent both multi-core CPUs, such as the Intel Xeon E5 and desktop and mobile i7, and the many-core Intel Xeon Phi x200, and together cover the server, desktop and mobile solutions widely used nowadays. We show that by enforcing power caps we can find configurations with lower energy consumption than the default, although mostly for desktop and mobile solutions and at the cost of increased execution time. For each application and each tested CPU, we report how energy consumed, power consumption and execution time change at the point of minimum energy relative to the default configuration with no power limit.
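
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' tooling: the `Run` record, its field names and the helper `min_energy_vs_default` are hypothetical, and the only relation assumed is that energy equals average power multiplied by execution time. Given one uncapped run and one run per enforced power cap, it picks the minimum-energy run and reports the relative change in energy, power and time against the default.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Run:
    """One measured run of an application (hypothetical record layout)."""
    power_cap_w: Optional[float]  # None marks the default run with no power limit
    avg_power_w: float            # average power drawn during the run, in watts
    time_s: float                 # execution time, in seconds

    @property
    def energy_j(self) -> float:
        # Assumed relation: energy = average power * execution time.
        return self.avg_power_w * self.time_s


def min_energy_vs_default(runs: List[Run]) -> Tuple[Run, Dict[str, float]]:
    """Return the minimum-energy run and its fractional changes vs. the uncapped run."""
    default = next(r for r in runs if r.power_cap_w is None)
    best = min(runs, key=lambda r: r.energy_j)
    change = {
        "energy": best.energy_j / default.energy_j - 1.0,
        "power": best.avg_power_w / default.avg_power_w - 1.0,
        "time": best.time_s / default.time_s - 1.0,
    }
    return best, change
```

Negative values in the returned dictionary indicate a reduction relative to the default configuration, and a positive `time` value is the execution-time cost paid for the energy saving. If the uncapped run already uses the least energy, it is returned with all changes equal to zero.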

