Intel Iris Xe-LP as a platform for scientific computing

Filip Krużel; Mateusz Nytko

DOI: http://dx.doi.org/10.15439/2022F132

Citation: Communication Papers of the 17th Conference on Computer Science and Intelligence Systems, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 32, pages 121–128 (2022)

Abstract. In this article, we describe an implementation of the finite element numerical integration algorithm for the Intel Iris Xe-LP Graphics Processing Unit. This GPU is a direct successor of the Xeon Phi accelerator architecture. Although it is an integrated GPU and does not offer substantial performance, tests on it can be treated as a preview of the expected performance of the Intel HPG graphics cards announced for release in 2022. We run our previously developed auto-tuning OpenCL code for finite element numerical integration on the Intel Iris Xe-LP GPU integrated into the Intel i7 11370H CPU and compare the results with those obtained on the Nvidia GeForce RTX 3060 GPU. The article addresses the question of whether the new Intel architecture can compete directly with the more classic GPU architecture, and whether it can be used for complex engineering computations.
