Why Your FFT Frequencies Are Wrong: A NumPy Linspace Guide
Struggling with incorrect frequency peaks from your FFT? Your frequency axis is likely the culprit. This guide demystifies FFT bins and shows how to use NumPy's linspace and fftfreq to get accurate results every time.
Dr. Alex Carter
A computational physicist and Python expert specializing in scientific computing and data analysis.
You’ve done everything right. You captured your signal, loaded it into a NumPy array, and ran it through the Fast Fourier Transform using `np.fft.fft`. A beautiful plot emerges, showing a clear, sharp peak exactly where you expected it. But when you go to label the frequency, something’s off. The number you calculate doesn’t match the reality of your signal. Is it 50.1 Hz instead of 50? Or maybe it’s a completely nonsensical value? If this sounds familiar, you’re not alone. It’s one of the most common stumbling blocks in digital signal processing.
The magic of the FFT is its ability to decompose a signal from the time domain into its constituent frequencies. It’s an incredibly powerful tool, but it comes with a crucial caveat: the raw output of an FFT doesn’t give you frequencies in Hertz. It gives you amplitudes at discrete “frequency bins.” The real work, and the source of most errors, lies in correctly mapping these integer bins to the actual, physical frequencies they represent.
In this guide, we’ll demystify this process. We’ll explore why your intuition might be leading you astray and show you how to use NumPy’s tools, including `linspace`, to build a correct frequency axis. More importantly, we’ll reveal the gold-standard function that makes this process foolproof. Let’s get those frequencies right, once and for all.
What is the FFT *Actually* Giving You?
When you call `np.fft.fft(my_signal)` on a signal of length `N`, you get back a NumPy array of `N` complex numbers. It’s tempting to think of this as a direct frequency representation, but it's more abstract than that. Here’s the breakdown:
- The Index: The index of each element in the output array, from `0` to `N-1`, is called a frequency bin.
- The Value: The complex number at each bin holds two pieces of information: its magnitude tells you the strength (amplitude) of that frequency component, and its angle tells you the phase offset.
The key takeaway is that the bin index `k` is not a frequency in Hertz. It’s just a placeholder. To find the real frequency, you need to know two things about your original signal: the total number of samples (`N`) and the sampling rate (`fs`, in Hz).
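As a minimal sketch of the distinction between a bin index and a frequency, the snippet below transforms a single 5 Hz cosine. The sampling rate, length, and tone frequency are values assumed purely for illustration.

```python
import numpy as np

fs = 100                      # sampling rate in Hz (assumed for this sketch)
N = 200                       # number of samples (assumed)
t = np.arange(N) / fs         # sample times in seconds
x = np.cos(2 * np.pi * 5 * t) # a 5 Hz test tone

X = np.fft.fft(x)             # N complex values, one per bin
magnitudes = np.abs(X)        # strength of each bin
phases = np.angle(X)          # phase offset of each bin, in radians

# Strongest bin in the first half of the output
k = int(np.argmax(magnitudes[: N // 2]))
freq_hz = k * fs / N          # map the bin index to Hertz

print(X.shape, X.dtype)       # N entries, complex dtype
print(k)                      # 10  (the bin index, not a frequency)
print(freq_hz)                # 5.0 (the frequency in Hz)
```

The peak sits at bin 10, and only after scaling by `fs / N` does it become 5 Hz.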
The Common Mistake: Misinterpreting the Bins
The most frequent error arises from an intuitive but flawed calculation. One might reason: "My signal has `N` points and was sampled at `fs` Hz. So, the frequency resolution is `fs / N`. I'll just multiply each bin index by that resolution."
This leads to code like this:
# The common approach: fine for the first half, misleading for the second
import numpy as np
N = len(signal)  # `signal` is your sampled data as a 1-D array
sampling_rate = 1000 # in Hz
# This creates an array from 0 up to (but not including) sampling_rate
frequency_axis_wrong = np.arange(N) * (sampling_rate / N)
# The peak at index k is then interpreted as k * (sampling_rate / N) Hz.
While this seems logical, it hides a major problem. For a real-valued signal, the FFT magnitude spectrum is symmetrical. The highest frequency a sampled signal can represent unambiguously is the Nyquist frequency, which is `sampling_rate / 2`. The bins in the second half of the FFT output actually hold the negative frequencies. Treating the whole axis as a linear progression from 0 to `sampling_rate` will give you correct values for the first half, but labels above the Nyquist frequency for the second half, which are misleading.
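To see the problem concretely, here is a small sketch using a single 50 Hz sine sampled at 1000 Hz (both values are assumptions chosen for illustration). The transform shows two equal peaks, and the naive axis labels the second one far above the Nyquist frequency.

```python
import numpy as np

fs = 1000                          # sampling rate in Hz (assumed)
N = 1000                           # one second of data (assumed)
t = np.arange(N) / fs
x = np.sin(2 * np.pi * 50 * t)     # a single 50 Hz tone

mag = np.abs(np.fft.fft(x))
naive_axis = np.arange(N) * (fs / N)

# Indices of the two largest bins, sorted in ascending order
peak_bins = np.sort(np.argsort(mag)[-2:])

print(peak_bins)                   # bins 50 and 950
print(naive_axis[peak_bins])       # labelled 50 Hz and 950 Hz
# 950 Hz is above Nyquist (500 Hz); that bin really holds -50 Hz:
print(naive_axis[peak_bins[1]] - fs)
```

The signal contains no 950 Hz component. The second peak is the mirror image of the first, sitting at the negative frequency.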
The Manual Fix: Building a Frequency Axis with `np.linspace`
This brings us to the title of our post. Can we use `np.linspace` to fix this? Yes, by being clever with its arguments. The function `np.linspace(start, stop, num, endpoint=True)` generates an array of evenly spaced numbers over a specified interval. The key is the `endpoint` argument.
The `N` frequency bins of an FFT correspond to frequencies from 0 Hz up to, but not including, the sampling rate `fs`. The frequency `fs` itself is excluded because, for sampled data, it is indistinguishable from 0 Hz: the spectrum repeats with period `fs`. Therefore, we want `N` points in the interval `[0, fs)`.
This is achieved perfectly by setting `endpoint=False`:
# The correct manual approach
N = len(signal)
sampling_rate = 1000 # in Hz
# Generates N points from 0 up to (but not including) sampling_rate
frequency_axis_correct = np.linspace(0, sampling_rate, N, endpoint=False)
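This creates the same array as the `np.arange` method shown before (up to floating-point rounding), but it makes the intention clearer: we are defining an interval and the number of points we want within it. What it fixes is the spacing: with the default `endpoint=True`, the step becomes `fs / (N - 1)` instead of `fs / N`, and every label drifts slightly. However, it still leaves us with the problem of interpreting the second half of the array, which represents negative frequencies.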
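The effect of `endpoint` is easy to check numerically. The sketch below, with an assumed sampling rate and length, compares the two settings and shows how the default produces the kind of "50.05 instead of 50" error described at the start.

```python
import numpy as np

fs = 1000   # sampling rate in Hz (assumed)
N = 1000    # number of samples (assumed)

good = np.linspace(0, fs, N, endpoint=False)  # N points in [0, fs)
bad = np.linspace(0, fs, N)                   # endpoint=True by default: [0, fs]

step_good = good[1] - good[0]   # fs / N       = 1.0 Hz
step_bad = bad[1] - bad[0]      # fs / (N - 1), slightly more than 1 Hz

print(step_good, step_bad)
print(good[50])                 # 50.0
print(bad[50])                  # about 50.05, a small but real labelling error
print(np.allclose(good, np.arange(N) * (fs / N)))  # True
```

The error grows with the bin index, so peaks near the top of the spectrum are mislabelled the most.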
Step-by-Step Example: From Signal to Correct Frequencies
Let's make this concrete. We'll generate a signal with two known frequencies (40 Hz and 90 Hz) and see if we can recover them accurately.
import numpy as np
import matplotlib.pyplot as plt
# 1. Signal Parameters
sampling_rate = 500 # Hz
duration = 4 # seconds
N = sampling_rate * duration # Total samples
# 2. Create the time axis and the signal
# Use endpoint=False for the time axis as well!
t = np.linspace(0, duration, N, endpoint=False)
signal = 1.5 * np.sin(2 * np.pi * 40 * t) + 0.8 * np.sin(2 * np.pi * 90 * t)
# 3. Compute the FFT
yf = np.fft.fft(signal)
# 4. Create the frequency axis using our linspace method
xf = np.linspace(0, sampling_rate, N, endpoint=False)
# 5. Plot the results (only the positive frequency part)
# We only need the first N/2 points due to symmetry
N_half = N // 2
plt.figure(figsize=(12, 6))
# Normalize the amplitude
plt.plot(xf[:N_half], 2.0/N * np.abs(yf[:N_half]))
plt.grid()
plt.xlabel("Frequency (Hz)")
plt.ylabel("Amplitude")
plt.title("FFT of a Two-Tone Signal")
plt.show()
If you run this code, you'll see two sharp peaks located precisely at 40 Hz and 90 Hz. Success! We correctly generated the full frequency axis and then plotted the first half to visualize the positive frequencies we care about.
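If you would rather read the peaks off numerically than from the plot, something like the following works for this clean two-tone case. It rebuilds the same signal so it can run on its own. Picking the two largest bins with `argsort` is a simplification assumed here and would not be robust for noisy data.

```python
import numpy as np

sampling_rate = 500                      # Hz
duration = 4                             # seconds
N = sampling_rate * duration
t = np.linspace(0, duration, N, endpoint=False)
signal = 1.5 * np.sin(2 * np.pi * 40 * t) + 0.8 * np.sin(2 * np.pi * 90 * t)

yf = np.fft.fft(signal)
xf = np.linspace(0, sampling_rate, N, endpoint=False)

# One-sided amplitude spectrum (positive-frequency half only)
amp = 2.0 / N * np.abs(yf[: N // 2])

# Indices of the two strongest bins, in ascending frequency order
top2 = np.sort(np.argsort(amp)[-2:])
peak_freqs = xf[top2]
peak_amps = amp[top2]

print(peak_freqs)   # frequencies of the two peaks, in Hz
print(peak_amps)    # their recovered amplitudes
```

Because both tones complete a whole number of cycles in 4 seconds, they land exactly on bins and the recovered amplitudes match the 1.5 and 0.8 used to build the signal.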
The Pro Move: Why `np.fft.fftfreq` is Your Best Friend
While building the frequency axis manually with `linspace` is a great way to understand the underlying mechanics, NumPy provides a dedicated function that is more robust, less error-prone, and purpose-built for this exact task: `np.fft.fftfreq`.
This function does all the heavy lifting for you, including handling the negative frequencies correctly. Its signature is `np.fft.fftfreq(n, d=1.0)`, where `n` is the number of samples and `d` is the sample spacing (the inverse of the sampling rate, `1 / fs`).
Let's see it in action:
# The best and most robust way
from numpy.fft import fft, fftfreq
N = len(signal)
sampling_rate = 500
yf = fft(signal)
# The magic function!
x_pro = fftfreq(N, d=1/sampling_rate)
# The output of fftfreq follows the same bin order as np.fft.fft:
# [0, f1, f2, ..., -f_nyquist, ..., -f2, -f1]
# (for even N, the Nyquist bin is reported as -fs/2)
# We can use np.fft.fftshift() to reorder it for plotting if we want to see the classic -fs/2 to +fs/2 view.
Using `fftfreq` removes any ambiguity. It returns an array where the first half contains the non-negative frequencies and the second half contains the negative frequencies, in the exact order that corresponds to the output of `np.fft.fft`. No more guesswork. This is the idiomatic NumPy way.
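A tiny example makes the ordering visible. The length and sampling rate below are assumed, chosen small enough that the whole array fits on one line.

```python
import numpy as np

N = 8       # number of samples (assumed, kept tiny for readability)
fs = 8.0    # sampling rate in Hz (assumed)

freqs = np.fft.fftfreq(N, d=1 / fs)
print(freqs)     # [ 0.  1.  2.  3. -4. -3. -2. -1.]

# fftshift moves the negative half to the front for plotting
shifted = np.fft.fftshift(freqs)
print(shifted)   # [-4. -3. -2. -1.  0.  1.  2.  3.]

# Apply the same shift to the spectrum so the two stay aligned
x = np.cos(2 * np.pi * 2 * np.arange(N) / fs)   # 2 Hz tone
spectrum_shifted = np.fft.fftshift(np.fft.fft(x))
```

Note that with an even `N` the Nyquist bin appears as `-4`, not `+4`. Shifting both the axis and the spectrum with `fftshift` gives a monotonically increasing x-axis that plots cleanly.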
Comparison: `linspace` vs. `arange` vs. `fftfreq`
Let's summarize the options in a table.
Method | Syntax Example | Use Case | Gotcha |
---|---|---|---|
`np.linspace` | `linspace(0, fs, N, endpoint=False)` | Good for understanding the mechanics. Explicitly defines the interval. | You must remember `endpoint=False` and manually handle the negative frequency half. |
`np.arange` | `arange(N) * (fs / N)` | Also works for manual creation, emphasizes the step size (frequency resolution). | Scaling integer bins as shown is safe, but calling `arange` with a non-integer step can yield an unexpected number of elements. Also requires manual handling of the spectrum's second half. |
`np.fft.fftfreq` | `fftfreq(N, d=1/fs)` | The recommended method. Purpose-built for FFTs. | The output array order (with negative frequencies at the end) can be surprising if you're not expecting it. |
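The "manual handling" in the first two rows amounts to subtracting `fs` from every label at or above the Nyquist frequency. The helper below is a hypothetical name introduced for this sketch, not a NumPy function. It does that fold and is checked against `fftfreq` for an even and an odd length.

```python
import numpy as np

def signed_axis(N, fs):
    """Hypothetical helper: manual frequency axis with negative frequencies."""
    manual = np.arange(N) * (fs / N)          # same values as the linspace axis
    # Labels at or above Nyquist belong to the negative half of the spectrum
    return np.where(manual >= fs / 2, manual - fs, manual)

fs = 500  # sampling rate in Hz (assumed)
for N in (2000, 2001):                        # one even and one odd length
    reference = np.fft.fftfreq(N, d=1 / fs)
    print(N, np.allclose(signed_axis(N, fs), reference))
```

Both lengths agree with `fftfreq`, which is a good argument for simply calling `fftfreq` and skipping the manual step.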
Handling Nyquist and Symmetrical Spectra
As mentioned, for a real-valued input signal (which is almost always the case), the FFT spectrum is symmetrical around the DC component (0 Hz). More precisely, it is conjugate-symmetric: the magnitude of the component at frequency `+f` is the same as at `-f`, while the phase has the opposite sign.
This is why we typically only care about and plot the first half of the FFT output, from bin `0` to `N/2`. This corresponds to the frequencies from 0 Hz up to the Nyquist frequency (`fs / 2`).
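The symmetry can be verified directly. This sketch uses random real-valued data (the seed and length are arbitrary choices) and checks that bin `k` is the complex conjugate of bin `N - k`.

```python
import numpy as np

rng = np.random.default_rng(0)       # arbitrary seed for reproducibility
x = rng.standard_normal(16)          # any real-valued signal will do
N = len(x)

X = np.fft.fft(x)
k = np.arange(1, N)                  # skip DC, which has no mirror partner

# Conjugate symmetry: X[k] == conj(X[N - k]) for real input
conjugate_symmetric = np.allclose(X[k], np.conj(X[N - k]))

# Consequence: the magnitudes mirror each other
magnitudes_mirror = np.allclose(np.abs(X[1:]), np.abs(X[1:][::-1]))

print(conjugate_symmetric, magnitudes_mirror)   # True True
```

The second half of the output therefore carries no information that the first half does not already contain.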
Pro Tip: Use `rfft` for Real Signals
Since this symmetry is guaranteed, NumPy provides an optimized set of functions for real signals: `np.fft.rfft` and `np.fft.rfftfreq`. These functions are faster because they only compute the non-negative half of the spectrum (`N//2 + 1` bins, from 0 Hz up to and including Nyquist when `N` is even), saving you both computation time and the manual step of slicing the array. For real-valued signals, these are the functions you should be using.
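Here is the earlier two-tone example rewritten with the real-input functions, as a sketch. The signal parameters are the same ones used above, and the amplitude scaling is the same simple `2/N` factor.

```python
import numpy as np

sampling_rate = 500                       # Hz
duration = 4                              # seconds
N = sampling_rate * duration
t = np.arange(N) / sampling_rate
signal = 1.5 * np.sin(2 * np.pi * 40 * t) + 0.8 * np.sin(2 * np.pi * 90 * t)

yr = np.fft.rfft(signal)                        # N//2 + 1 complex bins
xr = np.fft.rfftfreq(N, d=1 / sampling_rate)    # matching non-negative axis

# Same 2/N scaling as before. Strictly, the DC and Nyquist bins
# should not be doubled, which does not matter for this signal.
amp = 2.0 / N * np.abs(yr)

print(len(yr))              # 1001, i.e. N//2 + 1
print(xr[0], xr[-1])        # 0.0 and the Nyquist frequency, 250.0
print(xr[np.argmax(amp)])   # 40.0, the strongest tone
```

No slicing and no negative frequencies to think about: `xr` and `amp` can go straight into `plt.plot`.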
Conclusion: FFTs You Can Finally Trust
That nagging feeling that your FFT plots might be wrong can undermine your entire analysis. The root of the problem is almost always a misunderstanding of what the FFT returns: not frequencies, but amplitudes in bins that you must map to frequencies.
While you can build a correct frequency axis manually using `np.linspace(0, sampling_rate, N, endpoint=False)`, this requires careful handling of the spectrum's symmetrical nature. The most direct, reliable, and idiomatic NumPy solution is to let the library do the work for you. By embracing `np.fft.fftfreq` (or `np.fft.rfftfreq` for real signals), you eliminate ambiguity and ensure your results are accurate every time.
Now you can analyze your signals with confidence, knowing your frequency plots are not just beautiful—they're correct.