Prof. Paolo Bientinesi, RWTH Aachen

Fast and Scalable Eigensolvers for Multicore and Hybrid Architectures.

Eigenproblems are at the core of a myriad of engineering and scientific applications. In many cases, the size of such problems is not determined by the physics of the application but is limited by the eigensolver's time and memory requirements. While on the one hand the demand is for larger problems, on the other hand the available numerical libraries are not exploiting the parallelism provided in the modern computing environments. In this talk I compare two approaches to parallelism --one that relies on fast multithreaded libraries (BLAS), and another that uses blocking and careful scheduling-- and show that the right choice depends, among other factors, on the specific operation performed. I will then introduce two eigensolvers specifically designed for high-performance architectures. MR3-SMP targets multicore architectures, while EleMRRR is well suited for both small clusters and massively parallel computers which may or may not use multithreading. Experiments on application matrices indicate that our algorithms are both faster and obtain better speedups than all the eigensolvers from LAPACK, ScaLAPACK and Intel's Math Kernel Library (MKL). Finally, I will discuss the use of graphics processing units.