In this notebook, we introduce the discrete Fourier transform (DFT) and its basic properties. We then study the fast Fourier transform (FFT), which is an efficient algorithm to evaluate the DFT. In most parts, we closely follow Section 2.4 of [Müller, FMP, Springer 2015].
An important concept for understanding the Fourier transform is the inner product for a complex vector space $\mathbb{C}^N$ for $N \in \mathbb{N}$. Given two complex vectors $x, y \in \mathbb{C}^N$, the inner product between $x$ and $y$ is defined as follows:
$$ \langle x | y \rangle := \sum_{n=0}^{N-1} x(n) \overline{y(n)}.$$The absolute value of the inner product may be interpreted as a measure of similarity between $x$ and $y$: if $x$ and $y$ point to the same direction (i.e., $x$ and $y$ are similar), the inner product $|\langle x | y \rangle|$ is large. If $x$ and $y$ are orthogonal (i.e., $x$ and $y$ are dissimilar), the inner product $|\langle x | y \rangle|$ is zero.
Note that when using the function np.vdot
to compute the inner product, the complex conjugate is performed on the first argument. Therefore, for computing $\langle x | y \rangle$ as defined above, one has to call np.vdot(y, x)
.
import numpy as np
from numba import jit
x = np.array([ 1.0, 1j, 1.0 + 1.0j ])
y = np.array([ 1.1, 1j, 0.9 + 1.1j ])
print('Vectors of high similarity:', np.abs(np.vdot(y, x)))
x = np.array([ 1.0, 1j, 1.0 + 1j ])
y = np.array([ 1.1, -1j, 0.1 ])
print('Vectors of low similarity:', np.abs(np.vdot(y, x)))
Let $x\in \mathbb{C}^N$ be a vector of length $N\in\mathbb{N}$. In the music signal context, $x$ can be interpreted as a discrete signal with samples $x(0), x(1), ..., x(N-1)$. Note that we start indexing with the index $0$. The discrete Fourier transform (DFT) of $x$ is defined by:
$$ X(k) := \sum_{n=0}^{N-1} x(n) \exp(-2 \pi i k n / N) $$for $k \in [0:N-1]$. The vector $X\in\mathbb{C}^N$ can be interpreted as frequency representation of the time-domain signal $x$. To obtain a geometric interpretation of the DFT, we define the vector $\mathbf{u}_k\in\mathbb{C}^N$ by
$$\mathbf{u}_k(n) := \exp(2 \pi i k n / N) = \cos(2 \pi k n / N) + i \sin(2 \pi k n / N)$$for each $k \in [0:N-1]$. This vector can be regarded as a sampled version of the exponential function of frequency $k/N$. Then, the DFT can be expressed as inner products
$$ X(k) := \sum_{n=0}^{N-1} x(n) \overline{\mathbf{u}_k} = \langle x | \mathbf{u}_k \rangle$$of the signal $x$ and the sampled exponential functions $\mathbf{u}_k$. The absolute value $|X(k)|$ indicates the degree of similarity between the signal $x$ and $\mathbf{u}_k$.
In the case that $x\in \mathbb{R}^N$ is a real-valued vector (which is always the case for our music signal scenario), we obtain:
$$ X(k) := \langle x |\mathrm{Re}(\mathbf{u}_k) \rangle - i\langle x | \mathrm{Im}(\mathbf{u}_k) \rangle $$The following plot shows an example signal $x$ compared with functions $\overline{\mathbf{u}_k}$ for two different frequency parameters $k$. The real and imaginary part of $\overline{\mathbf{u}_k}$ are shown in red and blue, respectively.
from matplotlib import pyplot as plt
%matplotlib inline
N = 64
n = np.arange(N)
k = 3
x = np.cos(2 * np.pi * (k * n / N) + (1.2*np.random.rand(N) - 0.0))
plt.figure(figsize=(10, 5))
plt.subplot(2, 1, 1)
plt.plot(n, x, 'k', marker='.', markersize='10', linewidth=2.0, label='$x$')
plt.xlabel('Time (samples)')
k = 3
u_k_real = np.cos(2 * np.pi * k * n / N)
u_k_imag = -np.sin(2 * np.pi * k * n / N)
u_k = u_k_real + u_k_imag*1j
sim_complex = np.vdot(u_k, x)
sim_abs = np.abs(sim_complex)
plt.title(r'Signal $x$ and some $u_k$ (k=3) having high similarity: Re($X(k)$) = %0.2f, Im($X(k)$) = %0.2f, $|X(k)|$=%0.2f'%(sim_complex.real,sim_complex.imag,sim_abs))
plt.plot(n, u_k_real, 'r', marker='.', markersize='5',
linewidth=1.0, linestyle=':', label='$\mathrm{Re}(\overline{\mathbf{u}}_k)$');
plt.plot(n, u_k_imag, 'b', marker='.', markersize='5',
linewidth=1.0, linestyle=':', label='$\mathrm{Im}(\overline{\mathbf{u}}_k)$');
plt.legend()
plt.subplot(2, 1, 2)
plt.plot(n, x, 'k', marker='.', markersize='10', linewidth=2.0, label='$x$')
plt.xlabel('Time (samples)')
k = 5
u_k_real = np.cos(2 * np.pi * k * n / N)
u_k_imag = -np.sin(2 * np.pi * k * n / N)
u_k = u_k_real + u_k_imag*1j
sim_complex = np.vdot(u_k, x)
sim_abs = np.abs(sim_complex)
plt.title(r'Signal $x$ and some $u_k$ (k=5) having low similarity: Re($X(k)$) = %0.2f, Im($X(k)$) = %0.2f, $|X(k)|$=%0.2f'%(sim_complex.real,sim_complex.imag,sim_abs))
plt.plot(n, u_k_real, 'r', marker='.', markersize='5',
linewidth=1.0, linestyle=':', label='$\mathrm{Re}(\overline{\mathbf{u}}_k)$');
plt.plot(n, u_k_imag, 'b', marker='.', markersize='5',
linewidth=1.0, linestyle=':', label='$\mathrm{Im}(\overline{\mathbf{u}}_k)$');
plt.legend()
plt.tight_layout()
Being a linear operator $\mathbb{C}^N \to \mathbb{C}^N$, the DFT can be expressed by some $N\times N$-matrix. This leads to the famous DFT matrix $\mathrm{DFT}_N \in \mathbb{C}^{N\times N}$ matrix, which is given by
$$\mathrm{DFT}_N(n, k) = \mathrm{exp}(-2 \pi i k n / N)$$for $n\in[0:N-1]$ and $k\in[0:N-1]$. Let $\rho_N:=\exp(2 \pi i / N)$ be the primitive $N^\mathrm{th}$ root of unity as defined in the FMP notebook of the exponential function. Then
$$\sigma_N:= \overline{\rho_N} = \mathrm{exp}(-2 \pi i / N)$$also defines a primitive $N^\mathrm{th}$ root of unity. From the properties of exponential functions, one obtains that
$$ \sigma_N^{kn} = \mathrm{exp}(-2 \pi i / N)^{kn} = \mathrm{exp}(-2 \pi i k n / N)$$From this, one obtains:
$$ \mathrm{DFT}_N = \begin{pmatrix} 1 & 1 & 1 & \dots & 1 \\ 1 & \sigma_N & \sigma_N^2 & \dots & \sigma_N^{N-1} \\ 1 & \sigma_N^2 & \sigma_N^4 & \dots & \sigma_N^{2(N-1)} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & \sigma_N^{N-1} & \sigma_N^{2(N-1)} & \dots & \sigma_N^{(N-1)(N-1)} \\ \end{pmatrix} $$In the following visualization, the real and imaginary part of $\mathrm{DFT}_N$ are shown, where the values are encoded by suitable colors. Note that the $k^\mathrm{th}$ row of $\mathrm{DFT}_N$ corresponds to the vector $\mathbf{u}_k$ as defined above.
@jit(nopython=True)
def generate_matrix_dft(N, K):
"""Generates a DFT (discrete Fourier transfrom) matrix
Notebook: C2/C2_DFT-FFT.ipynb
Args:
N (int): Number of samples
K (int): Number of frequency bins
Returns:
dft (np.ndarray): The DFT matrix
"""
dft = np.zeros((K, N), dtype=np.complex128)
for n in range(N):
for k in range(K):
dft[k, n] = np.exp(-2j * np.pi * k * n / N)
return dft
N = 32
dft_mat = generate_matrix_dft(N, N)
plt.figure(figsize=(12, 4))
plt.subplot(1, 2, 1)
plt.title('$\mathrm{Re}(\mathrm{DFT}_N)$')
plt.imshow(np.real(dft_mat), origin='lower', cmap='seismic', aspect='equal')
plt.xlabel('Time index $n$')
plt.ylabel('Frequency index $k$')
plt.colorbar()
plt.subplot(1, 2, 2)
plt.title('$\mathrm{Im}(\mathrm{DFT}_N)$')
plt.imshow(np.imag(dft_mat), origin='lower', cmap='seismic', aspect='equal')
plt.xlabel('Time index $n$')
plt.ylabel('Frequency index $k$')
plt.colorbar()
plt.tight_layout()
We now write a function that computes the discrete Fourier transform $X = \mathrm{DFT}_N \cdot x$:
@jit(nopython=True)
def dft(x):
"""Compute the disrcete Fourier transfrom (DFT)
Notebook: C2/C2_DFT-FFT.ipynb
Args:
x (np.ndarray): Signal to be transformed
Returns:
X (np.ndarray): Fourier transform of x
"""
x = x.astype(np.complex128)
N = len(x)
dft_mat = generate_matrix_dft(N, N)
return np.dot(dft_mat, x)
N = 128
n = np.arange(N)
k = 10
x = np.cos(2 * np.pi * (k * n / N) + 2 * (np.random.rand(N) - 0.5))
X = dft(x)
plt.figure(figsize=(10, 3))
plt.subplot(1, 2, 1)
plt.title('$x$')
plt.plot(x, 'k')
plt.xlabel('Time (index $n$)')
plt.subplot(1, 2, 2)
plt.title('$|X|$')
plt.plot(np.abs(X), 'k')
plt.xlabel('Frequency (index $k$)')
plt.tight_layout()
Next, we discuss the famous fast Fourier transform (FFT), which is a fast algorithm to compute the DFT. The FFT algorithm was originally found by Gauss in about 1805 and then rediscovered by Cooley and Tukey in 1965. The FFT algorithm is based on the observation that applying a DFT of even size $N=2M$ can be expressed in terms of applying two DFTs of half the size $M$. It exploits the fact that there are algebraic relations between the entries $\sigma_N^{kn} = \mathrm{exp}(-2 \pi i / N)^{kn}$ of DFT matrices. In particular, one has
$$\sigma_M = \sigma_N^2$$In the FFT algorithm, one computes the DFT of the even-indexed and the uneven-indexed entries of $x$:
\begin{align} (A(0), \dots, A(N/2-1)) &= \mathrm{DFT}_{N/2} \cdot (x(0), x(2), x(4), \dots, x(N-2))\\ (B(0), \dots, B(N/2-1)) &= \mathrm{DFT}_{N/2} \cdot (x(1), x(3), x(5), \dots, x(N-1)) \end{align}With these two DFTs of size $N/2$, one can compute the full DFT of size $N$ via:
\begin{eqnarray} C(k) &=& \sigma_N^k \cdot B(k)\\ X(k) &=& A(k) + C(k)\\ X(N/2 + k) &=& A(k) - C(k)\\ \end{eqnarray}for $k \in [0: N/2 - 1]$. The numbers $\sigma_N^k$ are also called twiddle factors. If $N$ is a power of two, this idea can be applied recursively until one reaches the computation of $\mathrm{DFT}_{1}$ (the case $N=1$), which is simply multiplication by one (i.e. just returning the signal of length $N=1$). For further details, we refer to Section 2.4.3 of [Müller, FMP, Springer 2015]) (see also Table 2.1).
In the following code, there is a function twiddle
that computes twiddle factors and a function fft
that implements the FFT algorithm.
@jit(nopython=True)
def twiddle(N):
"""Generate the twiddle factors used in the computation of the fast Fourier transform (FFT)
Notebook: C2/C2_DFT-FFT.ipynb
Args:
N (int): Number of samples
Returns:
sigma (np.ndarray): The twiddle factors
"""
k = np.arange(N // 2)
sigma = np.exp(-2j * np.pi * k / N)
return sigma
@jit(nopython=True)
def fft(x):
"""Compute the fast Fourier transform (FFT)
Notebook: C2/C2_DFT-FFT.ipynb
Args:
x (np.ndarray): Signal to be transformed
Returns:
X (np.ndarray): Fourier transform of x
"""
x = x.astype(np.complex128)
N = len(x)
log2N = np.log2(N)
assert log2N == int(log2N), 'N must be a power of two!'
X = np.zeros(N, dtype=np.complex128)
if N == 1:
return x
else:
this_range = np.arange(N)
A = fft(x[this_range % 2 == 0])
B = fft(x[this_range % 2 == 1])
C = twiddle(N) * B
X[:N//2] = A + C
X[N//2:] = A - C
return X
N = 16
n = np.arange(N)
k = 4
x = np.cos(2 * np.pi * (k * n / N) + 2 * (np.random.rand(N) - 0.5))
X_via_dft = dft(x)
X_via_fft = fft(x)
plt.figure(figsize=(10, 3))
plt.subplot(1, 2, 1)
plt.title('$x$')
plt.plot(x, 'k', marker='.', markersize=12)
plt.xlabel('Time (index $n$)')
plt.subplot(1, 2, 2)
plt.title('$|X|$')
plt.plot(np.abs(X_via_dft), 'k', marker='.', markersize=18, label='dft')
plt.plot(np.abs(X_via_fft), linestyle='--', color='pink', marker='.', markersize=6, label='fft')
plt.xlabel('Frequency (index $k$)')
plt.legend()
plt.tight_layout()
The FFT reduces the overall number of operations from the order of $N^2$ (needed when computing the usual matrix–vector product $\mathrm{DFT}_N \cdot x$) to the order of $N\log_2N$. The savings are enormous. For example, using $N=2^{10}=1024$, the FFT requires roughly $N\log_2N=10240$ instead of $N^2=1048576$ operations in the naive approach.
Using the module timeit
, which provides a simple way to time small bits of Python code, the following code compares the running time when using the naive approach and the FFT:
N = 512
n = np.arange(N)
x = np.sin(2 * np.pi * 5 * n / N )
print('Timing for DFT: ', end='')
%timeit dft(x)
print('Timing for FFT: ', end='')
%timeit fft(x)
Let us now compare the running times by successively increasing the size $N$ (this may take a couple of seconds):
import timeit
Ns = [2 ** n for n in range(5, 11)]
times_dft = []
times_fft = []
execuctions = 5
for N in Ns:
n = np.arange(N)
x = np.sin(2 * np.pi * 5 * n / N )
time_dft = timeit.timeit(lambda: dft(x), number=execuctions) / execuctions
time_fft = timeit.timeit(lambda: fft(x), number=execuctions) / execuctions
times_dft.append(time_dft)
times_fft.append(time_fft)
plt.figure(figsize=(10, 4))
plt.plot(Ns, times_dft, '-xk', label='DFT')
plt.plot(Ns, times_fft, '-xr', label='FFT')
plt.xticks(Ns)
plt.legend()
plt.grid()
plt.xlabel('$N$')
plt.ylabel('Runtime (seconds)');
The dft
and fft
functions have been included into libfmp
. In the following code cell, we call those libfmp
functions and time them again using the module timeit
.
import sys
sys.path.append('..')
import libfmp.c2
N = 16
n = np.arange(N)
k = 4
x = np.cos(2 * np.pi * (k * n / N) + 2 * (np.random.rand(N) - 0.5))
X_via_dft = libfmp.c2.dft(x)
X_via_fft = libfmp.c2.fft(x)
plt.figure(figsize=(10, 3))
plt.subplot(1, 2, 1)
plt.title('$x$')
plt.plot(x, 'k', marker='.', markersize=12)
plt.xlabel('Time (index $n$)')
plt.subplot(1, 2, 2)
plt.title('$|X|$')
plt.plot(np.abs(X_via_dft), 'k', marker='.', markersize=18, label='dft')
plt.plot(np.abs(X_via_fft), linestyle='--', color='pink', marker='.', markersize=6, label='fft')
plt.xlabel('Frequency (index $k$)')
plt.legend()
plt.tight_layout()
N = 512
n = np.arange(N)
x = np.sin(2 * np.pi * 5 * n / N )
print('Timing for DFT: ', end='')
%timeit libfmp.c2.dft(x)
print('Timing for FFT: ', end='')
%timeit libfmp.c2.fft(x)