Performance engineering as key technology to reach exascale

Date: 11/11/2017
Speaker: G. Wellein

00:00:00 Introduction
00:05:59 Agenda
00:07:48 Performance Engineering
00:12:17 “Our” PE cycle — white boxes only
00:16:59 Node level performance models – roofline and beyond
00:17:29 Potential hardware capabilities, a.k.a. bottlenecks
00:19:34 The Roofline model: Execution vs. Data transfer
00:25:13 Roofline model
00:27:52 First order correction: Execution Cache Memory [ECM] Model
00:29:06 Roofline vs. ECM model — a 3D 7pt stencil
00:31:06 Node level performance models – hardware models done right
00:31:30 lmbench: Determine basic machine characteristics
00:32:06 lmbench: frd [full read-only bandwidth]
00:36:13 frd performance — the problem
00:36:57 Read-only bandwidth — done better: ddot [a,b]
00:37:49 Performance engineering for sparse linear algebra
00:38:23 ESSEX project — background
00:39:53 PE in sparse linear algebra: Chebyshev polynomials
00:40:48 Compute Chebyshev polynomials / moments
00:46:05 Application model: Code Balance [Bc]
00:47:09 Roofline performance model: CPU
00:47:33 Single-node/-device performance & bottlenecks
00:48:42 Large-scale Heterogeneous Performance
00:49:49 Never trust a stranger
00:52:49 Tall & Skinny [TS] Matrix-Matrix-Multiplication
00:59:37 Multicoloring
00:59:52 Hardware efficient Kaczmarz solver
01:03:08 Summary

Date of publication: 2018-04-10