FM Synthesis Part 2

Chris Tralie

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import IPython.display as ipd
import librosa
import librosa.display

Now we're going to explore in a little more detail what happens when we change the modulation frequency. Here's the equation for FM synthesis again, for reference.

$y(t) = \cos( 2 \pi f_c t + I \sin(2 \pi f_m t)) $

  • $f_c$: The center frequency or "carrier frequency" of vibrato

  • $f_m$: The modulation frequency, or how quickly we're going back and forth around the center

  • $I$: Modulation index: the ratio of the peak frequency deviation to $f_m$
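
To make the role of $I$ concrete, here is a minimal sketch (not part of the original notebook) that differentiates the phase inside the cosine to get the instantaneous frequency, $f_c + I f_m \cos(2 \pi f_m t)$. The slow, vibrato-like values $f_m = 5$ and $I = 4$ are assumptions chosen for illustration; with them the frequency should swing by about $I f_m = 20$ Hz to either side of $f_c$.

```python
import numpy as np

sr = 22050
t = np.arange(sr*2)/sr
fc = 440   # carrier frequency (Hz)
fm = 5     # illustrative slow modulation frequency (Hz), an assumption
I = 4      # illustrative modulation index, an assumption

# Phase inside the cosine of the FM equation
phase = 2*np.pi*fc*t + I*np.sin(2*np.pi*fm*t)
# Instantaneous frequency is the time derivative of the phase, divided by 2*pi
inst_freq = np.gradient(phase, t)/(2*np.pi)
# Largest excursion away from the carrier, to compare against I*fm
peak_deviation = np.max(np.abs(inst_freq - fc))
print(peak_deviation, I*fm)
```

Doubling $I$ while holding $f_m$ fixed should double the swing, which is one way to read the definition of the modulation index.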

We're also going to take a sneak peek at a tool called a "spectrogram," which we'll talk about more in the next unit. A spectrogram plots the strength of different frequencies over time in our audio. It's a more powerful technique than the zero crossings counting we looked at last week, though it takes heavier machinery to describe.

What we see is quite interesting: if we choose $f_m = f_c$, then we actually get harmonics at integer multiples of $f_c$.

In [2]:
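sr = 22050  # sample rate (Hz)
t = np.arange(sr*2)/sr  # two seconds of sample times
fc = 440  # carrier frequency (Hz)
fm = 440  # modulation frequency (Hz), equal to the carrier here
I = 2  # modulation index
y = np.cos(2*np.pi*fc*t + I*np.sin(2*np.pi*fm*t))

# Spectrogram: short-time Fourier transform magnitudes, converted to decibels
D = librosa.amplitude_to_db(np.abs(librosa.stft(y)), ref=np.max)
librosa.display.specshow(np.abs(D), y_axis='linear', x_axis='time', sr=sr)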

ipd.Audio(y, rate=sr)
Out[2]:

If we then choose $f_m = kf_c$ for some integer $k > 1$, the frequencies are spaced $k f_c$ apart, skipping the harmonics in between. The sound becomes "brighter" with the higher harmonics.

In [7]:
sr = 22050
t = np.arange(sr*2)/sr
fc = 440
fm = 2*fc
I = 2
y = np.cos(2*np.pi*fc*t + I*np.sin(2*np.pi*fm*t))

D = librosa.amplitude_to_db(np.abs(librosa.stft(y)), ref=np.max)
librosa.display.specshow(np.abs(D), y_axis='linear', x_axis='time', sr=sr)

ipd.Audio(y, rate=sr)
Out[7]:

Something pretty interesting also happens if we choose odd $k$. For $k = 3$, for example, we do get frequencies spaced $3f_c$ apart starting from the base frequency, but we also get harmonics that are $f_c$ above each of these. This is because negative frequencies get reflected back around 0. But regardless of the math there, the take-home is that we can generate lots of different and complex sets of harmonics all in the same framework.
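
One way to check this picture without reading it off a spectrogram is to take an FFT of exactly one second of the signal, so that each bin is 1 Hz wide, and list which multiples of $f_c$ carry significant energy. This is a sketch rather than part of the original notebook: the helper name `strong_harmonics` and the 10% threshold are assumptions made for illustration. The FM components sit at $|f_c + n f_m|$ for integer $n$, so for $k = 3$ the positive side gives harmonics 1, 4, 7, ... and the reflected negative side fills in 2, 5, 8, ...

```python
import numpy as np

def strong_harmonics(fc, fm, I, sr=22050, rel_thresh=0.1):
    """Hypothetical helper: multiples of fc whose FFT magnitude is at least
    rel_thresh times the strongest peak."""
    t = np.arange(sr)/sr  # exactly one second, so FFT bins are 1 Hz apart
    y = np.cos(2*np.pi*fc*t + I*np.sin(2*np.pi*fm*t))
    mag = np.abs(np.fft.rfft(y))
    freqs = np.fft.rfftfreq(len(y), d=1/sr)
    # Keep only the bins that stand out relative to the largest one
    strong = freqs[mag >= rel_thresh*mag.max()]
    return np.round(strong/fc, 3)

print(strong_harmonics(440, 3*440, I=2))
```

Changing the second argument to `2*440` should leave only odd multiples of $f_c$, since the reflected components then land on top of the unreflected ones.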

In [3]:
sr = 22050
t = np.arange(sr*2)/sr
fc = 440
fm = 3*fc
I = 2
y = np.cos(2*np.pi*fc*t + I*np.sin(2*np.pi*fm*t))

D = librosa.amplitude_to_db(np.abs(librosa.stft(y)), ref=np.max)
librosa.display.specshow(np.abs(D), y_axis='linear', x_axis='time', sr=sr)
plt.ylim([0, 4000])

ipd.Audio(y, rate=sr)
Out[3]:

In general, the frequencies are spaced $f_m$ apart on either side of $f_c$. So if we choose the ratio between $f_m$ and $f_c$ to be a non-integer, then we hear some very interesting inharmonic sounds. In other words, we end up with a bunch of frequencies that are not integer multiples of each other. A good example of a real-world object that sounds like this is a bell.
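
The spacing rule can be checked with arithmetic alone. The sketch below lists the frequencies $|f_c + n f_m|$ as multiples of $f_c$, once for an integer ratio and once for the $\sqrt{2}$ ratio used in the next cell. The helper `fm_partials` and the cutoff `n_max=3` are hypothetical choices for illustration, not something from the original notebook.

```python
import numpy as np

def fm_partials(fc, fm, n_max=3):
    """Frequencies |fc + n*fm| for n = -n_max..n_max, sorted, duplicates removed.

    Negative frequencies are folded back to positive ones with abs().
    """
    n = np.arange(-n_max, n_max + 1)
    return np.unique(np.round(np.abs(fc + n*fm), 6))

fc = 440
harmonic = fm_partials(fc, 2*fc)/fc             # integer ratio
inharmonic = fm_partials(fc, np.sqrt(2)*fc)/fc  # non-integer ratio
print("fm = 2*fc:       ", harmonic)
print("fm = sqrt(2)*fc: ", np.round(inharmonic, 3))
```

In the first case every frequency is an odd multiple of $f_c$; in the second, only the carrier itself lands on a whole number.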

In [8]:
sr = 22050
t = np.arange(sr*2)/sr
fc = 440
fm = fc*np.sqrt(2)
I = 2
y = np.cos(2*np.pi*fc*t + I*np.sin(2*np.pi*fm*t))

D = librosa.amplitude_to_db(np.abs(librosa.stft(y)), ref=np.max)
librosa.display.specshow(np.abs(D), y_axis='linear', x_axis='time', sr=sr)

ipd.Audio(y, rate=sr)
Out[8]:

Plucked String

If we recall our simple plucked string model from module 3, we remember that we set up a base frequency and all of its harmonics supported by the string geometry, but that the higher harmonics died out faster than the lower ones. We modeled this with an exponential decay, with a faster decay rate for higher frequencies.
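
As a rough illustration of how much faster the higher harmonics fade, the sketch below solves for the time at which each harmonic's envelope has dropped by 60 dB. The decay rate of 2 per second matches the cell that follows; the 60 dB target is an arbitrary choice made here for illustration.

```python
import numpy as np

rate = 2.0                     # decay rate of the first harmonic, as in np.exp(-2*t)
harmonics = np.arange(1, 11)   # harmonic numbers 1..10
drop_db = 60                   # illustrative target drop in decibels (an assumption)

# Harmonic i has envelope exp(-rate*i*t); solve exp(-rate*i*t) = 10**(-drop_db/20)
t_drop = np.log(10**(drop_db/20))/(rate*harmonics)
for i, tt in zip(harmonics, t_drop):
    print(f"harmonic {i:2d}: down {drop_db} dB after {tt:.2f} s")
```

Because the decay rate scales with the harmonic number, the tenth harmonic reaches the same level in one tenth of the time the first harmonic takes.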

In [5]:
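sr = 44100
t = np.arange(sr*2)/sr
f0 = 440  # base frequency (Hz)
y = np.zeros_like(t)
decay = np.exp(-2*t)  # envelope of the first harmonic
for i in range(1, 11):
    # Harmonic i decays as exp(-2*i*t), so higher harmonics fade faster
    y += np.cos(2*np.pi*f0*i*t)*(decay)**i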

D = librosa.amplitude_to_db(np.abs(librosa.stft(y)), ref=np.max)
librosa.display.specshow(np.abs(D), y_axis='linear', x_axis='time', sr=sr)
plt.ylim([0, 5000])
    
ipd.Audio(y, rate=sr)
Out[5]:

We can do something very similar with FM synthesis simply by changing the modulation index over time! As we noticed in the last video, a higher modulation index generally introduces more harmonics. So we can start with a higher modulation index and gradually lower it over time, so that only the lower frequencies are left. This boils down to replacing $I$ with a numpy array that varies with time.

(Also note that we keep $f_m = f_c$ so that the harmonics will occur in all integer ratios with $f_c$)

In [10]:
sr = 22050
t = np.arange(sr)/sr
fc = 440
fm = 440
I = 5*np.exp(-4*t)  # modulation index decays from 5 toward 0
y = np.cos(2*np.pi*fc*t + I*np.sin(2*np.pi*fm*t))

plt.figure(figsize=(10, 5))
plt.subplot(121)
plt.plot(t, I)
plt.title("Modulation index")
plt.subplot(122)
D = librosa.amplitude_to_db(np.abs(librosa.stft(y)), ref=np.max)
librosa.display.specshow(np.abs(D), y_axis='linear', x_axis='time', sr=sr)
plt.tight_layout()

ipd.Audio(y, rate=sr)
Out[10]:
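
For experimenting with other settings, the cell above can be wrapped into a small function. This is only a sketch: the name `fm_pluck` and its parameters (`ratio`, `I0`, `decay`, `dur`) are hypothetical and not part of the original notebook. The defaults reproduce the values used above.

```python
import numpy as np

def fm_pluck(fc, ratio=1.0, I0=5.0, decay=4.0, dur=1.0, sr=22050):
    """Hypothetical helper: FM tone whose modulation index decays over time.

    fc: carrier frequency (Hz); ratio: fm/fc;
    the modulation index follows I0*exp(-decay*t).
    """
    t = np.arange(int(sr*dur))/sr
    I = I0*np.exp(-decay*t)  # time-varying modulation index
    fm = ratio*fc
    return np.cos(2*np.pi*fc*t + I*np.sin(2*np.pi*fm*t))

y_harmonic = fm_pluck(440)                          # same settings as the cell above
y_bell = fm_pluck(440, ratio=np.sqrt(2), dur=2.0)   # non-integer ratio
```

Either array can be passed to `ipd.Audio(..., rate=22050)` in the notebook to listen to the result.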

Now we are well on our way to creating instrument sounds with FM synthesis!
