ParaMonte Fortran 2.0.0 Parallel Monte Carlo and Machine Learning LibrarySee the latest version documentation.
pm_fftnr Module Reference

This module contains procedures and generic interfaces for computing the Discrete Fourier Transform of a real or complex sequence using radix-2 Cooley–Tukey Fast-Fourier Transform.
More...

## Data Types

interface  getFFTF
Generate and return the Forward Fourier Transform (a.k.a. Fourier Analysis) of a periodic sequence of type complex or real of arbitrary kind parameter. More...

interface  getFFTI
Generate and return the Inverse (normalized by 2 / size(data)) Fourier Transform of a periodic sequence of type complex or real of arbitrary kind parameter. More...

interface  getFFTR
Generate and return the Reverse (unnormalized) Fourier Transform of a periodic sequence of type complex or real of arbitrary kind parameter. More...

interface  setFFTF
Return the Forward Fourier Transform of a periodic sequence of type complex or real of arbitrary kind parameter. More...

interface  setFFTI
Return the Inverse (normalized by 2 / size(data)) Fourier Transform of a periodic sequence of type complex or real of arbitrary kind parameter. More...

interface  setFFTR
Return the Reverse (unnormalized) Fourier Transform of a periodic sequence of type complex or real of arbitrary kind parameter. More...

## Variables

character(*, SK), parameter MODULE_NAME = "@pm_fftnr"

## Detailed Description

This module contains procedures and generic interfaces for computing the Discrete Fourier Transform of a real or complex sequence using radix-2 Cooley–Tukey Fast-Fourier Transform.

The discrete Fourier transform (DFT) converts a finite sequence of equally-spaced samples of a function into a same-length sequence of equally-spaced samples of the discrete-time Fourier transform (DTFT), which is a complex-valued function of frequency.

Since DFT deals with a finite amount of data, it can be implemented in computers by numerical algorithms or even dedicated hardware.
These implementations usually employ efficient fast Fourier transform (FFT) algorithms, so much so that the terms FFT and DFT are used interchangeably.

The DFT transforms a sequence of $$N$$ complex numbers $$\{ x_j \} := x_0, x_1, \ldots, x_{N − 1}$$ into another sequence of complex numbers, $$\{z_k\} := z_0, z_1, \ldots, z_{N - 1}$$ which is defined by,

\begin{eqnarray} \large x_k &=& \sum_{j=0}^{N-1} z_j \cdot e^{-\frac {i 2\pi}{N}jk} ~, \\ &=& \sum_{j=0}^{N-1} z_j \cdot \left[\cos\left(\frac{2\pi}{N}jk\right) - i \cdot \sin\left(\frac{2 \pi}{N}jk\right)\right], \end{eqnarray}

where $$i$$ represents the imaginary unit.
The naive evaluation of the DFT is a matrix-vector multiplication that costs $$\mathcal{O}(n^2)$$ operations for $$N$$ data-points.
Fast Fourier Transform (FFT) algorithms use a divide-and-conquer strategy to factorize the matrix into smaller sub-matrices, corresponding to the integer factors of the length $$N$$.
If $$N$$ can be factorized into a product of integers $$f_1 f_2 \ldots f_m$$, then the DFT can be computed in $$\mathcal{O}(n \sum f_i)$$ operations.
For a radix-2 FFT this gives an operation count of $$\mathcal{O}(n \log_2 n)$$.

For every Forward DFT (setFFTF), which expresses the data sequence in terms of frequency coefficients), there is a corresponding Reverse (Backward) DFT.
The Reverse DFT (setFFTR) takes the output of the Forward DFT and returns the original data sequence given to the Forward DFT, multiplied by the length of the sequence,

\begin{eqnarray} \large N z_j &=& \sum_{k=0}^{N-1} x_k \cdot e^{\frac {i 2\pi}{N}jk} ~, \\ &=& \sum_{k=0}^{N-1} x_k \cdot \left[\cos\left(\frac{2\pi}{N}jk\right) + i \cdot \sin\left(\frac{2 \pi}{N}jk\right)\right], \end{eqnarray}

A third definition, the Inverse DFT (setFFTI), further normalizes the Reverse DFT and returns,

\begin{eqnarray} \large z_j &=& \frac{1}{N} \sum_{k=0}^{N-1} x_k \cdot e^{\frac {i 2\pi}{N}jk} ~, \\ &=& \frac{1}{N} \sum_{k=0}^{N-1} x_k \cdot \left[\cos\left(\frac{2\pi}{N}jk\right) + i \cdot \sin\left(\frac{2 \pi}{N}jk\right)\right], \end{eqnarray}

such that the resulting normalized Reverse (Backward) DFT or Inverse DFT becomes a true inverse of the Forward DFT.
The Reverse DFT becomes relevant when the overall scale of the result is unimportant.
In such cases, it is convenient to use the Reverse (Backward) DFT instead of the Inverse DFT to avoid the redundant divisions in the normalization step.
A 64-bit real division is typically on the order of $$\sim20$$ CPU cycles on the contemporary hardware.
Avoiding the redundant normalizations can therefore become a significant saving if the DFT is to computed repeatedly for an array of millions of elements.

# The Cooley–Tukey algorithm

The Cooley–Tukey algorithm, named after J. W. Cooley and John Tukey, is the most common fast Fourier transform (FFT) algorithm.
It re-expresses the discrete Fourier transform (DFT) of an arbitrary composite size $$N = N_{1} N_{2}$$ in terms of $$N_1$$ smaller DFTs of sizes $$N_2$$, recursively, to reduce the computation time to $$\mathcal{O}(N log N)$$ for highly composite $$N$$ (smooth numbers).
Because of the algorithm importance, specific variants and implementation styles have become known by their own names, as described below.
Because the Cooley–Tukey algorithm breaks the DFT into smaller DFTs, it can be combined arbitrarily with any other algorithm for the DFT.
For example, The Rader or Bluestein algorithm can be used to handle large prime factors that cannot be decomposed by Cooley–Tukey, or the prime-factor algorithm can be exploited for greater efficiency in separating out relatively prime factors.

The algorithm, along with its recursive application, was invented by Carl Friedrich Gauss.
Cooley and Tukey independently rediscovered and popularized it 160 years later.

## Intuition

The Cooley–Tukey algorithms recursively re-express a DFT of a composite size $$N = N_1 N_2$$ as:

1. Perform $$N_1$$ DFTs of size $$N_2$$.
2. Multiply by complex roots of unity (often called the twiddle factors).
3. Perform $$N_2$$ DFTs of size $$N_1$$.

Typically, either $$N_1$$ or $$N_2$$ is a small factor (not necessarily prime), called the radix (which can differ between stages of the recursion).
If $$N_1$$ is the radix, it is called a decimation in time (DIT) algorithm, whereas if $$N_2$$ is the radix, it is decimation in frequency (DIF) algorithm, also called the Sande–Tukey algorithm.

A radix-2 decimation-in-time (DIT) FFT is the simplest and most common form of the Cooley–Tukey algorithm, although highly optimized Cooley–Tukey implementations typically use other forms of the algorithm some of which are implemented in pm_fftpack.
Radix-2 DIT divides a DFT of size N into two interleaved DFTs (hence the name radix-2) of size $$N / 2$$ with each recursive stage.
Consider again the discrete Fourier transform (DFT) defined by the formula:

$$X_{k}=\sum_{n=0}^{N-1}x_{n}e^{-{\frac {2\pi i}{N}}nk},$$

where $$k$$ is an integer ranging from $$0$$ to $$N − 1$$.
Radix-2 DIT first computes the DFTs of the even-indexed inputs $$(x_{2m}=x_{0},x_{2},\ldots ,x_{N-2})$$ and of the odd-indexed inputs $$(x_{2m+1}=x_{1},x_{3},\ldots ,x_{N-1})$$, and then combines those two results to produce the DFT of the whole sequence.
This idea can then be performed recursively to reduce the overall runtime to $$\mathcal{O}(N log N)$$.
This simplified form assumes that $$N$$ is a power of two;
since the number of sample points N can usually be chosen freely by the application (e.g. by changing the sample rate or window, zero-padding, etcetera), this is often not an important restriction.
The radix-2 DIT algorithm rearranges the DFT of the function $$x_{n}$$ into two parts: a sum over the even-numbered indices $$n = {2m}$$ and a sum over the odd-numbered indices $$n = {2m+1}$$:

$${\begin{matrix}X_{k}&=&\sum \limits_{m=0}^{N/2-1}x_{2m}e^{-{\frac {2\pi i}{N}}(2m)k}+\sum \limits_{m=0}^{N/2-1}x_{2m+1}e^{-{\frac {2\pi i}{N}}(2m+1)k}\end{matrix}}$$

One can factor a common multiplier $$e^{-{\frac {2\pi i}{N}}k}$$ out of the second sum, as shown in the equation below.
It is then clear that the two sums are the DFT of the even-indexed part $$x_{{2m}}$$ and the DFT of odd-indexed part $$x_{{2m+1}}$$ of the function $$x_{n}$$.
Denote the DFT of the Even-indexed inputs $$x_{{2m}}$$ by $$E_{k}$$ and the DFT of the Odd-indexed inputs $$x_{2m+1}$$ by $$O_{k}$$ and we obtain:

$$\begin{matrix}X_{k} = \underbrace{\sum \limits_{m=0}^{N/2-1}x_{2m}e^{-{\frac {2\pi i}{N/2}}mk}}_{\mathrm {DFT\;of\;even-indexed\;part\;of\;} x_{n}}{} + e^{-{\frac {2\pi i}{N}}k} \underbrace{\sum \limits_{m=0}^{N/2-1}x_{2m+1}e^{-{\frac {2\pi i}{N/2}}mk}}_{\mathrm {DFT\;of\;odd-indexed\;part\;of\;} x_{n}} = E_{k}+e^{-{\frac {2\pi i}{N}}k}O_{k}\qquad {\text{ for }}k=0,\dots ,{\frac {N}{2}}-1.\end{matrix}$$

Note that the equalities hold for $$k = 0,\dots ,N-1$$ but the crux is that $$E_{k}$$ and $$O_{k}$$ are calculated in this way for $$k=0, \dots, {\frac {N}{2}}-1$$ only.
Thanks to the periodicity of the complex exponential, $$X_{k+{\frac {N}{2}}}$$ is also obtained from $$E_{k}$$ and $$O_{k}$$:

\begin{aligned} X_{k+{\frac {N}{2}}} &=\sum \limits_{m=0}^{N/2-1}x_{2m}e^{-{\frac {2\pi i}{N/2}}m(k+{\frac {N}{2}})}+e^{-{\frac {2\pi i}{N}}(k+{\frac {N}{2}})}\sum \limits_{m=0}^{N/2-1}x_{2m+1}e^{-{\frac {2\pi i}{N/2}}m(k+{\frac {N}{2}})} \\ &=\sum \limits_{m=0}^{N/2-1}x_{2m}e^{-{\frac {2\pi i}{N/2}}mk}e^{-2\pi mi}+e^{-{\frac {2\pi i}{N}}k}e^{-\pi i}\sum \limits_{m=0}^{N/2-1}x_{2m+1}e^{-{\frac {2\pi i}{N/2}}mk}e^{-2\pi mi} \\ &=\sum \limits_{m=0}^{N/2-1}x_{2m}e^{-{\frac {2\pi i}{N/2}}mk}-e^{-{\frac {2\pi i}{N}}k}\sum \limits_{m=0}^{N/2-1}x_{2m+1}e^{-{\frac {2\pi i}{N/2}}mk}\\ &=E_{k}-e^{-{\frac {2\pi i}{N}}k}O_{k} \end{aligned}

We can rewrite $$X_k$$ and $$X_{k+{\frac {N}{2}}}$$ as:

$$\begin{matrix} X_{k} &=& E_{k}+e^{-{\frac {2\pi i}{N}}{k}}O_{k}\\X_{k+{\frac {N}{2}}} &=& E_{k}-e^{-{\frac {2\pi i}{N}}{k}}O_{k} \end{matrix}$$

This result, expressing the DFT of length $$N$$ recursively in terms of two DFTs of size $$N/2$$, is the core of the radix-2 DIT fast Fourier transform.
The algorithm gains its speed by re-using the results of intermediate computations to compute multiple DFT outputs.
Note that final outputs are obtained by a $$+/−$$ combination of $$E_{k}$$ and $$O_{k}\exp(-2\pi ik/N)$$, which is simply a size-2 DFT (sometimes called a butterfly in this context);
when this is generalized to larger radices below, the size-2 DFT is replaced by a larger DFT (which itself can be evaluated with an FFT).

Note
The routines of this module provide a radix-2 implementation of the Cooley–Tukey algorithm, meaning that the can be used to compute the forward and reverse FFT of real and complex data sequences whose length is a power of 2.
This requires a potentially zero-padded sequence (with trailing zeros) such that the length of the input sequence is always a power of $$2$$.
One of the most famous such implementation is that of Numerical Recipes), require a potentially-padded sequence (with trailing zeros) The mixed-radix algorithms as implemented in pm_fftpack are generally faster than the radix-2 algorithms at the cost of requiring an extra memory storage of the same length as the input sequence.
Developer Remark:
The naming of this module is in honor of the Numerical Recipes book in Fortran that contains one of the most widely used implementations of radix-2 algorithm.
pm_fftpack.
Numerical Recipes in Fortran, Press et al. 1992.
Test:
test_pm_fftnr

Final Remarks

If you believe this algorithm or its documentation can be improved, we appreciate your contribution and help to edit this page's documentation and source file on GitHub.

1. If you use any parts or concepts from this library to any extent, please acknowledge the usage by citing the relevant publications of the ParaMonte library.
2. If you regenerate any parts/ideas from this library in a programming environment other than those currently supported by this ParaMonte library (i.e., other than C, C++, Fortran, MATLAB, Python, R), please also ask the end users to cite this original ParaMonte library.

This software is available to the public under a highly permissive license.
Help us justify its continued development and maintenance by acknowledging its benefit to society, distributing it, and contributing to it.

Author:
Amir Shahmoradi, September 1, 2017, 12:00 AM, Institute for Computational Engineering and Sciences (ICES), The University of Texas at Austin

## ◆ MODULE_NAME

 character(*, SK), parameter pm_fftnr::MODULE_NAME = "@pm_fftnr"

Definition at line 190 of file pm_fftnr.F90.