Automated performance tuning

Jeremy Johnson

doi:10.1145/1837210.1837215

Conference proceeding

Proceedings of the 4th International Workshop on Parallel and Symbolic Computation, pp. 20-21

21 Jul 2010

DOI: https://doi.org/10.1145/1837210.1837215

Keywords: autotuning, code generation and optimization, high-performance computing, parallelism, vectorization

Abstract

This tutorial presents automated techniques for implementing and optimizing numeric and symbolic libraries on modern computing platforms, including SSE-capable, multicore, and GPU systems. Obtaining high performance requires effective use of the memory hierarchy, short vector instructions, and multiple cores. Highly tuned implementations are difficult to obtain and are platform dependent. For example, the Intel Core i7 980 XE has a peak floating-point performance of over 100 GFLOPS and the NVIDIA Tesla C870 has a peak floating-point performance of over 500 GFLOPS; however, achieving close to peak performance on such platforms is extremely difficult. Consequently, automated techniques are now being used to tune and adapt high-performance libraries such as ATLAS (math-atlas.sourceforge.net), PLASMA (icl.cs.utk.edu/plasma), and MAGMA (icl.cs.utk.edu/magma) for dense linear algebra, OSKI (bebop.cs.berkeley.edu/oski) for sparse linear algebra, FFTW (www.fftw.org) for the fast Fourier transform (FFT), and SPIRAL (www.spiral.net) for a wide class of digital signal processing (DSP) algorithms. Intel currently uses SPIRAL to generate parts of its MKL and IPP libraries.
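As an illustration of the general idea of empirical autotuning described in the abstract, the following is a minimal sketch, not the method of any of the libraries named above. It times a cache-blocked matrix multiplication for several candidate tile sizes and keeps the fastest one on the current machine. The function names `blocked_matmul` and `autotune`, the candidate tile sizes, and the problem size are all assumptions chosen for illustration.

```python
import time


def blocked_matmul(a, b, n, block):
    """Multiply two n x n matrices (lists of lists) using square tiles of size `block`."""
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, block):
        for kk in range(0, n, block):
            for jj in range(0, n, block):
                # Work on one tile at a time so its data can stay in cache.
                for i in range(ii, min(ii + block, n)):
                    row_a, row_c = a[i], c[i]
                    for k in range(kk, min(kk + block, n)):
                        aik = row_a[k]
                        row_b = b[k]
                        for j in range(jj, min(jj + block, n)):
                            row_c[j] += aik * row_b[j]
    return c


def autotune(n=48, candidates=(4, 8, 16, 48), repeats=3):
    """Time each candidate tile size and return (fastest_size, timings)."""
    # Hypothetical test inputs; any dense matrices of the target size would do.
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    timings = {}
    for block in candidates:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            blocked_matmul(a, b, n, block)
            # Keep the minimum over repeats to reduce timing noise.
            best = min(best, time.perf_counter() - start)
        timings[block] = best
    return min(timings, key=timings.get), timings


if __name__ == "__main__":
    best_block, timings = autotune()
    for block, seconds in sorted(timings.items()):
        print(f"block={block:3d}  {seconds:.4f} s")
    print("selected block size:", best_block)
```

The selected tile size depends on the machine and will vary between runs, which is the point of the approach: the choice is made by measurement on the target platform rather than fixed in advance. Production autotuners search far larger spaces (loop orders, unrolling, vectorization, threading) and generate compiled code rather than interpreted Python.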

Metrics

2 Record Views

1 citation in Scopus

Details

Title: Automated performance tuning
Creators: Jeremy Johnson - Drexel University
Publication Details: Proceedings of the 4th International Workshop on Parallel and Symbolic Computation, pp. 20-21
Conference: 4th International Workshop on Parallel and Symbolic Computation
Series: PASCO '10
Publisher: Association for Computing Machinery (ACM)
Number of pages: 1
Resource Type: Conference proceeding
Language: English
Academic Unit: Computer Science
Scopus ID: 2-s2.0-77956255147
Other Identifier: 991019174740004721
