Logo image
High performance implementation of the inverse TFT
Conference proceeding

High performance implementation of the inverse TFT

Lingchuan Meng, Jeremy Johnson and Jamila Danielle Johnson
Proceedings of the 2015 International Workshop on parallel symbolic computation, v 10, pp 87-94
10 Jul 2015

Abstract

The inverse truncated Fourier transform (ITFT) is a key component in the fast polynomial and large integer algorithms introduced by van der Hoeven. This paper reports a high performance implementation of the ITFT which poses additional challenges compared to that of the forward transform. A general-radix variant of the ITFT algorithm is developed to allow the implementation to automatically adapt to the memory hierarchy. Then a parallel ITFT algorithm is developed that trades off small arithmetic cost for full vectorization and improved multi-threaded parallelism. The algorithms are automatically generated and tuned to produce an arbitrary-size ITFT library. The new algorithms and the implementation smooths out the staircase performance associated with power-of-two modular FFT implementations, and provide significant performance improvement over zero-padding approaches even when high-performance FFT libraries are used.

Metrics

8 Record Views
2 citations in Scopus

Details

Logo image