Overview and Learning Objectives
The Fourier transform is one of the most important tools for a wide range of engineering and computer science applications. The general idea of Fourier analysis is to decompose a given signal into a weighted superposition of sinusoidal functions. Since these functions possess an explicit physical meaning regarding their frequencies, the decomposition is typically more accessible for subsequent processing steps than the original signal. Assuming that you are familiar with the Fourier transform and its applications in signal processing, we review in this unit the discrete variant of the Fourier transform known as Discrete Fourier Transform (DFT). We define the inner product that allows for comparing two vectors (e.g., discrete-time signals of finite length). The DFT can be thought of as comparing a given signal of finite length with a specific set of exponential signals (a complex variant of sinusoidal signals), each comparison yielding a complex-valued Fourier coefficient. Then, using suitable visualizations, we show how you can interpret the amplitudes and phases of these coefficients. Recall that one can express the DFT as a complex-valued square matrix. We show how separately plotting the real and imaginary parts leads to beautiful and insightful images. Applying a DFT boils down to computing a matrix–vector product, which we implement via the standard NumPy function np.dot
. Since the number of operations for computing a DFT via a simple matrix–vector product is quadratic in the input length, the runtime of this approach becomes problematic with increasing length. This issue is exactly where the fast Fourier transform (FFT) comes into the game. We present this famous divide-and-conquer algorithm and provide a Python implementation. Furthermore, we compare the runtime behavior between the FFT implementation and the naive DFT implementation. We will further deepen your understanding of the Fourier transform by considering further examples and visualization in the exercises. In Exercise 1, you will learn how to interpret and plot frequency indices in a physically meaningful way. In Exercise 2, we discuss the issue of loosing time information when applying the Fourier transform, which is the main motivation for the short-time Fourier transform. In Exercise 3, you will apply the DFT to a chirp signal, which yields another illustrative example of the DFT's properties. Finally, in Exercise 4, we will invite you to explore the relationship between the DFT and its inverse. Again, an overarching goal of this unit is to apply and deepen your Python programming skills within the context of a central topic for signal processing.
Inner Product¶
In this notebook, we consider discrete-time (DT) signals of finite length N∈N, which we represent as vector
x=(x(0),x(1),...,x(N−1))⊤∈RN
with samples x(n)∈RN for n∈[0:N−1]. Note that ⊤ indicates the transpose of a vector, thus converting a row vector into a column vector. Furthermore, note that we start indexing with the index 0 (thus adapting our mathematical notation to Python conventions). A general concept for comparing two vectors (or signals) is the inner product. Given two vectors x,y∈RN, the inner product between x and y is defined as follows:
⟨x|y⟩:=N−1∑n=0x(n)y(n).
The absolute value of the inner product may be interpreted as a measure of similarity between x and y. If x and y are similar (i.e., if they point to more or less the same direction), the inner product |⟨x|y⟩| is large. If x and y are dissimilar (i.e., if x and y are more or less orthogonal to each other), the inner product |⟨x|y⟩| is close to zero.
One can extend this concept to complex-valued vectors x,y∈CN, where the inner product is defined as
⟨x|y⟩:=N−1∑n=0x(n)¯y(n).
In the case of real-valued signals, the complex conjugate does not play any role and the definition of the complex-valued inner product reduces to the real-valued one. In the following code cell, we give some examples.
np.vdot
to compute the inner product. However, opposed to the mathematical convention that conjugates the second argument, this function applies complex conjugation on the first argument. Therefore, for computing ⟨x|y⟩ as defined above, one has to call np.vdot(y, x)
.
In the following, we generate and visualize three signals x1, x2, x3. Then, we compute and discuss different inner products using the signals.
import numpy as np
from matplotlib import pyplot as plt
import libpcp.signal
%matplotlib inline
Fs = 64
dur = 1
x1, t = libpcp.signal.generate_example_signal(Fs=Fs, dur=dur)
x2, t = libpcp.signal.generate_sinusoid(dur=dur, Fs=Fs, amp=1, freq=2, phase=0.3)
x3, t = libpcp.signal.generate_sinusoid(dur=dur, Fs=Fs, amp=1, freq=6, phase=0.1)
def plot_inner_product(ax, t, x, y, color_x='k', color_y='r', label_x='x', label_y='y'):
"""Plot inner product
Notebook: PCP_09_dft.ipynb
Args:
ax: Axis handle
t: Time axis
x: Signal x
y: Signal y
color_x: Color of signal x (Default value = 'k')
color_y: Color of signal y (Default value = 'r')
label_x: Label of signal x (Default value = 'x')
label_y: Label of signal y (Default value = 'y')
"""
ax.plot(t, x, color=color_x, linewidth=1.0, linestyle='-', label=label_x)
ax.plot(t, y, color=color_y, linewidth=1.0, linestyle='-', label=label_y)
ax.set_xlim([0, t[-1]])
ax.set_ylim([-1.5, 1.5])
ax.set_xlabel('Time (seconds)')
ax.set_ylabel('Amplitude')
sim = np.vdot(y, x)
ax.set_title(r'$\langle$ %s $|$ %s $\rangle = %.1f$' % (label_x, label_y, sim))
ax.legend(loc='upper right')
plt.figure(figsize=(8, 5))
ax = plt.subplot(2, 2, 1)
plot_inner_product(ax, t, x1, x1, color_x='k', color_y='k', label_x='$x_1$', label_y='$x_1$')
ax = plt.subplot(2, 2, 2)
plot_inner_product(ax, t, x1, x2, color_x='k', color_y='r', label_x='$x_1$', label_y='$x_2$')
ax = plt.subplot(2, 2, 3)
plot_inner_product(ax, t, x1, x3, color_x='k', color_y='b', label_x='$x_1$', label_y='$x_3$')
ax = plt.subplot(2, 2, 4)
plot_inner_product(ax, t, x2, x3, color_x='r', color_y='b', label_x='$x_2$', label_y='$x_3$')
plt.tight_layout()
In the above example, one can make the following observations:
- The signal x1 is similar to itself, leading to a large value of ⟨x1|x1⟩=40.0.
- The overall course of the signal x1 strongly correlates with the sinusoid x2, which is reflected by a relatively large value of ⟨x1|x2⟩=29.9.
- There are some finer oscillations of x1 that are captured by x3, leading to a still noticeable value of ⟨x1|x3⟩=14.7.
- The two sinusoids x2 and x3 are more or less uncorrelated, which is revealed by the value of ⟨x2|x3⟩≈0.
In other words, the above comparison reveals that the signal x1 has a strong signal component of 2 Hz (frequency of x2) and 6 Hz (frequency of x3). Measuring correlations between an arbitrary signal and sinusoids of different frequencies is exactly the idea of performing a Fourier (or spectral) analysis.
Definition of DFT¶
Let x∈CN be a vector of length N∈N. The discrete Fourier transform (DFT) of x is defined by:
X(k):=N−1∑n=0x(n)exp(−2πikn/N)
for k∈[0:N−1]. The vector X∈CN can be interpreted as a frequency representation of the time-domain signal x. To obtain a geometric interpretation of the DFT, we define the vector ek∈CN with real part ck=Re(ek) and imaginary part sk=Im(ek) by
ek(n):=exp(2πikn/N)=cos(2πikn/N)+isin(2πikn/N)=ck(n)+isk(n)
for each k∈[0:N−1].
This vector can be regarded as a sampled version of the exponential function of frequency k/N. Using inner products, the DFT can be expressed as
X(k)=N−1∑n=0x(n)¯ek(n)=⟨x|ek⟩,
thus measuring the similarity between the signal x and the sampled exponential functions ek. The absolute value |X(k)| indicates the degree of similarity between the signal x and ek. In the case that x∈RN is a real-valued vector (which is typically the case for audio signals), we obtain:
X(k)=⟨x|Re(ek)⟩−i⟨x|Im(ek)⟩=⟨x|ck⟩−i⟨x|sk⟩
The following plot shows an example signal x compared with functions ¯ek for various frequency parameters k. The real and imaginary part of ¯ek are shown in red and blue, respectively.
def plot_signal_e_k(ax, x, k, show_e=True, show_opt=False):
"""Plot signal and k-th DFT sinusoid
Notebook: PCP_09_dft.ipynb
Args:
ax: Axis handle
x: Signal
k: Index of DFT
show_e: Shows cosine and sine (Default value = True)
show_opt: Shows cosine with optimal phase (Default value = False)
"""
N = len(x)
time_index = np.arange(N)
ax.plot(time_index, x, 'k', marker='.', markersize='10', linewidth=2.0, label='$x$')
plt.xlabel('Time (samples)')
e_k = np.exp(2 * np.pi * 1j * k * time_index / N)
c_k = np.real(e_k)
s_k = np.imag(e_k)
X_k = np.vdot(e_k, x)
plt.title(r'k = %d: Re($X(k)$) = %0.2f, Im($X(k)$) = %0.2f, $|X(k)|$=%0.2f' %
(k, X_k.real, X_k.imag, np.abs(X_k)))
if show_e is True:
ax.plot(time_index, c_k, 'r', marker='.', markersize='5',
linewidth=1.0, linestyle=':', label='$\mathrm{Re}(\overline{\mathbf{u}}_k)$')
ax.plot(time_index, s_k, 'b', marker='.', markersize='5',
linewidth=1.0, linestyle=':', label='$\mathrm{Im}(\overline{\mathbf{u}}_k)$')
if show_opt is True:
phase_k = - np.angle(X_k) / (2 * np.pi)
cos_k_opt = np.cos(2 * np.pi * (k * time_index / N - phase_k))
d_k = np.sum(x * cos_k_opt)
ax.plot(time_index, cos_k_opt, 'g', marker='.', markersize='5',
linewidth=1.0, linestyle=':', label='$\cos_{k, opt}$')
plt.grid()
plt.legend(loc='lower right')
N = 64
x, t = libpcp.signal.generate_example_signal(Fs=N, dur=1)
plt.figure(figsize=(8, 15))
for k in range(1, 8):
ax = plt.subplot(7, 1, k)
plot_signal_e_k(ax, x, k=k)
plt.tight_layout()
DFT Phase¶
At first sight, the DFT may be a bit confusing: Why is a real-valued signal x compared with a complex-valued sinusoid ek? What does the resulting complex-valued Fourier coefficient
ck:=X(k):=⟨x|Re(ek)⟩−i⟨x|Im(ek)⟩.
encode? To understand this, we represent the complex number ck in form of its polar representation
ck=|ck|⋅exp(iγk),
where γk is the angle (given in radians). Furthermore, let cosk,φ:[0:N−1]→R be a sampled sinusoid with frequency parameter k and phase φ∈[0,1), defined by
cosk,φ(n)=cos(2π(kn/N−φ))
for n∈[0,N−1]. Defining φk:=−γk2π, one obtains the following remarkable property of the Fourier coefficient ck:
|ck|=maxφ∈[0,1)⟨x|cosk,φ⟩,φk=argmaxφ∈[0,1)⟨x|cosk,φ⟩.
In other words, the phase φk maximizes the correlation between x and all possible sinusoids cosk,φ with φ∈[0,1). Furthermore, the magnitude |ck| yields this maximal value. Thus, computing a single correlation between x and the complex-valued function ek (which real part coincides with cosk,0, and its imaginary part with cosk,0.25) solves an optimization problem. In the following code cell, we demonstrate this optimality property, where the cosk,φ with optimal phase φ=φk is shown in green.
plt.figure(figsize=(8, 15))
for k in range(1, 8):
ax = plt.subplot(7, 1, k)
plot_signal_e_k(ax, x, k=k, show_e=False, show_opt=True)
plt.tight_layout()
DFT Matrix¶
Being a linear operator CN→CN, the DFT can be expressed by some N×N-matrix. This leads to the famous DFT matrix DFTN∈CN×N matrix, which is given by
DFTN(n,k)=exp(−2πikn/N)
for n∈[0:N−1] and k∈[0:N−1]. Let ρN:=exp(2πi/N) be the primitive Nth root of unity. Then
σN:=¯ρN=exp(−2πi/N)
also defines a primitive Nth root of unity. From the properties of exponential functions, one obtains that
σknN=exp(−2πi/N)kn=exp(−2πikn/N)
From this, one obtains:
DFTN=(111…11σNσ2N…σN−1N1σ2Nσ4N…σ2(N−1)N⋮⋮⋮⋱⋮1σN−1Nσ2(N−1)N…σ(N−1)(N−1)N)
In the following visualization, the real and imaginary part of DFTN are shown, where the values are encoded by suitable colors. Note that the kth row of DFTN corresponds to the vector ek as defined above.
def generate_matrix_dft(N, K):
"""Generate a DFT (discete Fourier transfrom) matrix
Notebook: PCP_09_dft.ipynb
Args:
N: Number of samples
K: Number of frequency bins
Returns:
dft: The DFT matrix
"""
dft = np.zeros((K, N), dtype=np.complex128)
time_index = np.arange(N)
for k in range(K):
dft[k, :] = np.exp(-2j * np.pi * k * time_index / N)
return dft
N = 32
dft_matrix = generate_matrix_dft(N, N)
plt.figure(figsize=(10, 4))
plt.subplot(1, 2, 1)
plt.title('$\mathrm{Re}(\mathrm{DFT}_N)$')
plt.imshow(np.real(dft_matrix), origin='lower', cmap='seismic', aspect='equal')
plt.xlabel('Time (sample, index $n$)')
plt.ylabel('Frequency (index $k$)')
plt.colorbar()
plt.subplot(1, 2, 2)
plt.title('$\mathrm{Im}(\mathrm{DFT}_N)$')
plt.imshow(np.imag(dft_matrix), origin='lower', cmap='seismic', aspect='equal')
plt.xlabel('Time (samples, index $n$)')
plt.ylabel('Frequency (index $k$)')
plt.colorbar()
plt.tight_layout()
We now write a function that computes the discrete Fourier transform X=DFTN⋅x of a signal x∈CN. We apply the function from above sampled at N=64 time points. The peaks of the magnitude Fourier transform |X| correspond to the main frequency components the signal is composed of. Note that the magnitude Fourier transform is symmetrical around the center. Why? For the interpretation of the time and frequency axis, see also Exercise 1: Interpretation of Frequency Indices
def dft(x):
"""Compute the discete Fourier transfrom (DFT)
Notebook: PCP_09_dft.ipynb
Args:
x: Signal to be transformed
Returns:
X: Fourier transform of x
"""
x = x.astype(np.complex128)
N = len(x)
dft_mat = generate_matrix_dft(N, N)
return np.dot(dft_mat, x)
N = 64
x, t = libpcp.signal.generate_example_signal(Fs=N, dur=1)
X = dft(x)
def plot_signal_dft(t, x, X, ax_sec=False, ax_Hz=False, freq_half=False, figsize=(10, 2)):
"""Plotting function for signals and its magnitude DFT
Notebook: PCP_09_dft.ipynb
Args:
t: Time axis (given in seconds)
x: Signal
X: DFT
ax_sec: Plots time axis in seconds (Default value = False)
ax_Hz: Plots frequency axis in Hertz (Default value = False)
freq_half: Plots only low half of frequency coefficients (Default value = False)
figsize: Size of figure (Default value = (10, 2))
"""
N = len(x)
if freq_half is True:
K = N // 2
X = X[:K]
else:
K = N
plt.figure(figsize=figsize)
ax = plt.subplot(1, 2, 1)
ax.set_title('$x$ with $N=%d$' % N)
if ax_sec is True:
ax.plot(t, x, 'k', marker='.', markersize='3', linewidth=0.5)
ax.set_xlabel('Time (seconds)')
else:
ax.plot(x, 'k', marker='.', markersize='3', linewidth=0.5)
ax.set_xlabel('Time (samples)')
ax.grid()
ax = plt.subplot(1, 2, 2)
ax.set_title('$|X|$')
if ax_Hz is True:
Fs = 1 / (t[1] - t[0])
ax_freq = Fs * np.arange(K) / N
ax.plot(ax_freq, np.abs(X), 'k', marker='.', markersize='3', linewidth=0.5)
ax.set_xlabel('Frequency (Hz)')
else:
ax.plot(np.abs(X), 'k', marker='.', markersize='3', linewidth=0.5)
ax.set_xlabel('Frequency (index)')
ax.grid()
plt.tight_layout()
plt.show()
plot_signal_dft(t, x, X)
plot_signal_dft(t, x, X, ax_sec=True, ax_Hz=True)
plot_signal_dft(t, x, X, ax_sec=True, ax_Hz=True, freq_half=True)
Fast Fourier Transform (FFT)¶
Next, we discuss the famous fast Fourier transform (FFT), which is a fast algorithm to compute the DFT. The FFT algorithm was originally found by Gauss in about 1805 and then rediscovered by Cooley and Tukey in 1965. The FFT algorithm is based on the observation that applying a DFT of even size N=2M can be expressed in terms of applying two DFTs of half the size M. It exploits the fact that there are algebraic relations between the entries σknN=exp(−2πi/N)kn of DFT matrices. In particular, one has
σM=σ2N
In the FFT algorithm, one computes the DFT of the even-indexed and the uneven-indexed entries of x:
(A(0),…,A(N/2−1))=DFTN/2⋅(x(0),x(2),x(4),…,x(N−2))(B(0),…,B(N/2−1))=DFTN/2⋅(x(1),x(3),x(5),…,x(N−1))
With these two DFTs of size N/2, one can compute the full DFT of size N via:
C(k)=σkN⋅B(k)X(k)=A(k)+C(k)X(N/2+k)=A(k)−C(k)
for k∈[0:N/2−1]. The numbers σkN are also called twiddle factors. If N is a power of two, this idea can be applied recursively until one reaches the computation of DFT1 (the case N=1), which is simply multiplication by one (i.e. just returning the signal of length N=1). For further details, we refer to Section 2.4.3 of [Müller, FMP, Springer 2015]) (see also Table 2.1).
In the following code, we provide a function fft
that implements the FFT algorithm. We test the function fft
by comparing its output with the one when applying the dft
on a test signal x
. For the comparison of result matrices, we use the NumPy functions np.array_equal
and np.allclose
.
def fft(x):
"""Compute the fast Fourier transform (FFT)
Notebook: PCP_09_dft.ipynb
Args:
x: Signal to be transformed
Returns:
X: Fourier transform of x
"""
x = x.astype(np.complex128)
N = len(x)
log2N = np.log2(N)
assert log2N == int(log2N), 'N must be a power of two!'
X = np.zeros(N, dtype=np.complex128)
if N == 1:
return x
else:
this_range = np.arange(N)
A = fft(x[this_range % 2 == 0])
B = fft(x[this_range % 2 == 1])
range_twiddle_k = np.arange(N // 2)
sigma = np.exp(-2j * np.pi * range_twiddle_k / N)
C = sigma * B
X[:N//2] = A + C
X[N//2:] = A - C
return X
N = 64
x, t = libpcp.signal.generate_example_signal(Fs=N, dur=1)
X_via_dft = dft(x)
X_via_fft = fft(x)
X_via_fft_numpy = np.fft.fft(x)
is_equal = np.array_equal(X_via_dft, X_via_fft)
is_equal_tol = np.allclose(X_via_dft, X_via_fft)
is_equal_tol_np = np.allclose(X_via_dft, X_via_fft_numpy)
print('Equality test for dft(x) and fft(x) using np.array_equal: ', is_equal)
print('Equality test for dft(x) and fft(x) using np.allclose: ', is_equal_tol)
print('Equality test for dft(x) and np.fft.fft(x) using np.allclose:', is_equal_tol_np)
dft
and fft
implementations do not yield the same result (due to numerical issues). However, the results are numerically very close, which is verified by the test using np.allclose
.
The FFT reduces the overall number of operations from the order of N2 (needed when computing the usual matrix–vector product DFTN⋅x) to the order of Nlog2N. The savings are enormous. For example, using N=210=1024, the FFT requires roughly Nlog2N=10240 instead of N2=1048576 operations in the naive approach. Using the module timeit
, which provides a simple way to time small bits of Python code, the following code compares the running time when using the naive approach and the FFT. Furthermore, we compare the running time with the highly optimized NumPy implementation np.fft.fft
.
import timeit
rep = 3
for N in [256, 512, 1024, 2048, 4096]:
time_index = np.arange(N)
x = np.sin(2 * np.pi * time_index / N )
t_DFT = 1000 * timeit.timeit(lambda: dft(x), number=rep)/rep
t_FFT = timeit.timeit(lambda: fft(x), number=rep*5)/(rep*5)
t_FFT_np = timeit.timeit(lambda: np.fft.fft(x), number=rep*100)/(rep*100)
print(f'Runtime (ms) for N = {N:4d} : DFT {t_DFT:10.2f}, FFT {t_FFT:.5f}, FFT_np {t_FFT_np:.8f}')
Exercises and Results¶
import libpcp.dft
show_result = True
Given a dimension N∈N, the DFTN transform a vector x∈CN into another vector X∈CN. Assuming that x represents a time-domain signal sampled with a sampling rate Fs, one can associate the index n∈[0:N−1] of the sample x(n) with the physical time point t=n/Fs given in seconds. In case of the vector X, the index k∈[0:N−1] of the coefficient X(k) can be associated to a physical frequency value
ω=k⋅FsN.
Furthermore, using a real-valued signal x∈RN, the upper part of X∈CN becomes redundant, and it suffices to consider the first K coefficients with K=N/2.
- Find explanations why these properties apply.
- Find out how the function
plot_signal_dft
uses these properties to convert and visualize the time and frequency axes. - Using the signal
x, t = libpcp.signal.generate_example_signal(Fs=64, dur=2)
, plot the signal and its magnitude Fourier transform once using axes given in indices and once using axes given in physical units (seconds, Hertz). Discuss the results. - Do the same for the signal
x, t = libpcp.signal.generate_example_signal(Fs=32, dur=2)
. What is going wrong and why?
#<solution>
# Your Solution
#</solution>
libpcp.dft.exercise_freq_index(show_result=show_result)
The Fourier transform yields frequency information that is averaged over the entire time axis. However, the information on when these frequencies occur is hidden in the transform. To demonstrate this phenomenon, construct the following two different signals defined on a common time axis [0,T] with T given in seconds (e.g., T=6 sec).
- A superposition of two sinusoids f1+f2 defined over the entire time interval [0,T], where the first sinusoid f1 has a frequency ω1=1 Hz and an amplitude of 1, while the second sinusoid f2 has a frequency ω2=5 Hz and an amplitude of 0.5.
- A concatenation of two sinusoids, where f1 (specified as before) is now defined only on the subinterval [0,T/2], and f2 is defined on the subinterval [T/2,T].
Sample the interval [0,T] to obtain N samples (use np.linspace
), with N∈N being power of two (e.g., N=256). Define DT-signals of the superposition and the concatenation and compute the DFT for each of the signals. Plot the signals as well as the resulting magnitude Fourier transforms and discuss the result.
#<solution>
# Your Solution
#</solution>
libpcp.dft.exercise_missing_time(show_result=show_result)
The function f(t)=sin(πt2) defines a chirp signal (also called sweep signal), in which the frequency increases with time. The instantaneous frequency ωt of the chirp signal at time t is the derivate of the sinusoid's argument divided by 2π, thus ωt=t.
- Let [t0,t1] be a time interval (given in seconds) with $0\leq t_0
generate_chirp that outputs a sampled chirp signal x
over the interval [t0,t1] with N samples (usenp.linspace
). - Compute the DFT of
x
for various input parameters t0, t1, and N. Plot the chirp signal as well as the resulting magnitude Fourier transform. Discuss the result.
#<solution>
# Your Solution
#</solution>
libpcp.dft.exercise_chirp(show_result=show_result)
The discrete Fourier transform given by the matrix DFTN∈CN×N is an invertible operation, given by the inverse DFT matrix DFT−1N.
- There is an explicit relation between DFTN and its inverse DFT−1N. Which one?
- Write a function
generate_matrix_dft_inv
that explicitly generates DFT−1N. - Check your function by computing DFTN⋅DFT−1N and DFT−1N⋅DFTN (using
np.matmul
) and comparing these products with the identity matrix (usingnp.eye
andnp.allclose
). - Furthermore, compute the inverse DFT by using
np.linalg.inv
. Compare the result with your function usingnp.allclose
. - Similar to
fft
, implement a fast inverse Fourier transformfft_inv
#<solution>
# Your Solution
#</solution>
libpcp.dft.exercise_inverse(show_result=show_result)