Conference proceeding
High performance implementation of the TFT
Proceedings of the 39th International Symposium on symbolic and algebraic computation, pp 328-334
23 Jul 2014
Abstract
This paper reports on a high-performance implementation of the truncated Fourier transform (TFT). A general Cooley-Tukey like algorithm for the TFT is developed that allows the implementation to automatically adapt to the memory hierarchy. Then the algorithm introduces a small relaxation for larger transform sizes which trades off slightly higher arithmetic cost for improved data flow which allows full vectorization and parallelization. The implementation is automatically derived and tuned using the SPIRAL system for code generation and adaptation. The resulting arbitrary-size TFT library smooths out the staircase performance associated with power-of-two modular FFT implementations while retaining the performance associated with state-of-the-art FFT libraries. This provides significant performance improvement over approaches that pad to the next power of two even when using high-performance FFT libraries.
Metrics
16 Record Views
7 citations in Scopus
Details
- Title
- High performance implementation of the TFT
- Creators
- Lingchuan Meng - Drexel UniversityJeremy Johnson - Drexel UniversityJamila Danielle Johnson - Bennett S. LeBow College of Business (1999-)
- Publication Details
- Proceedings of the 39th International Symposium on symbolic and algebraic computation, pp 328-334
- Conference
- 39th International Symposium on symbolic and algebraic computation, 39th
- Series
- ISSAC '14
- Publisher
- Association for Computing Machinery (ACM)
- Number of pages
- 1
- Resource Type
- Conference proceeding
- Language
- English
- Academic Unit
- Bennett S. LeBow College of Business; Computer Science
- Scopus ID
- 2-s2.0-84920035681
- Other Identifier
- 991019173851204721