A Framework for Generating and Evaluating Parallelized Code
Jarosław Bylina
DOI: http://dx.doi.org/10.15439/2017F230
Citation: Proceedings of the 2017 Federated Conference on Computer Science and Information Systems, M. Ganzha, L. Maciaszek, M. Paprzycki (eds). ACSIS, Vol. 11, pages 493–496 (2017)
Abstract. This work describes a flexible framework for generating multiple (parallel) versions of a program and benchmarking them. The framework is written in Python, with some support from the gnuplot plotting program. An example of its use shows the tuning of a matrix factorization on different architectures (Intel Haswell and Intel Knights Corner) with various parameters for parallelization, vectorization, blocking, etc.
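The generate-and-benchmark idea in the abstract can be illustrated with a minimal Python sketch. It is not the framework's actual interface: the names `parameter_grid`, `benchmark`, and `blocked_sum`, and the parameters `n` and `block`, are hypothetical and chosen only for illustration. The sketch enumerates every combination of tuning parameters and times a stand-in kernel for each one. The real framework generates and measures compiled code variants, which this sketch does not attempt.

```python
import itertools
import time


def parameter_grid(space):
    """Yield one dict per combination of the values listed in `space`."""
    names = list(space)
    for values in itertools.product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def benchmark(run, space, repeats=3):
    """Call run(**params) for every combination and keep the best time."""
    results = []
    for params in parameter_grid(space):
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            run(**params)
            best = min(best, time.perf_counter() - start)
        results.append((params, best))
    return results


def blocked_sum(n, block):
    """Hypothetical stand-in kernel: sum 0..n-1 in chunks of `block`."""
    total = 0
    for start in range(0, n, block):
        total += sum(range(start, min(start + block, n)))
    return total


# Assumed tuning space; a real study would vary threads, vector width, etc.
space = {"n": [10_000], "block": [16, 64, 256]}
results = benchmark(blocked_sum, space)

# Report the variants from fastest to slowest.
for params, seconds in sorted(results, key=lambda item: item[1]):
    print(f"{params}: {seconds:.6f} s")
```

Taking the minimum over several repeats is one common way to reduce timing noise; a mean or median would be an equally reasonable choice under different assumptions.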