The spectrogram is a fundamental tool in the study of acoustic communication in vertebrates. It is a visual representation of sound in which the variation in energy (or power spectral density) is shown across both the frequency and time domains. Spectrograms allow us to explore acoustic variation in our study systems visually, making it easy to distinguish structural differences at small temporal and spectral scales that our ears cannot detect.

We will use the seewave package and its sample data:

library(seewave)

# load examples
data(tico)

data(orni)

Fourier transformation

In order to understand the information contained in a spectrogram, it is necessary to understand, at least briefly, the Fourier transform. Put simply, this is a mathematical transformation that detects periodicity in time series, identifying the different frequencies that compose them and their relative energy. It is therefore said to transform signals from the time domain to the frequency domain.

To better understand how it works, we can simulate time series composed of pre-defined frequencies. In this example we simulate 3 frequencies and join them in a single time series:

# sampling rate (Hz)
f <- 11025

# time sequence
t <- seq(1/f, 1, length.out=f)

# period
pr <- 1/440
w0 <- 2 * pi/pr

# frequency 1 (440 Hz)
h1 <- 5 * cos(w0*t)

plot(h1[1:75], type = "l", col = "blue", xlab = "Time (samples)", ylab = "Amplitude (no units)")

# frequency 2 (880 Hz)
h2 <- 10 * cos(2 * w0 * t)

plot(h2[1:75], type = "l", col = "blue", xlab = "Time (samples)", ylab = "Amplitude (no units)")

# frequency 3 (1320 Hz)
h3 <- 15 * sin(3 * w0 * t)

plot(h3[1:75], type = "l", col = "blue", xlab = "Time (samples)", ylab = "Amplitude (no units)")

This is what the union of the three frequencies looks like:

H0 <- 0.5 + h1 + h2 + h3

plot(H0[1:75], type = "l", col = "blue", xlab = "Time (samples)", ylab = "Amplitude (no units)")

Now we can apply the Fourier transform to this time series and graph the frequencies detected using a periodogram:

fspc <- Mod(fft(H0))

plot(fspc, type="h", col="blue",
xlab="Frequency (Hz)",
ylab="Amplitude (no units)")

abline(v = f / 2, lty = 2)

text(x = (f / 2) + 1650, y = 8000, "Nyquist Frequency")

We can zoom in on the frequencies below the Nyquist frequency:

plot(fspc[1:(length(fspc) / 2)], type="h", col="blue",
xlab="Frequency (Hz)",
ylab="Amplitude (no units)")
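The spectrum can also be read numerically. The following is a minimal sketch, not part of the original workflow, that rebuilds the same simulated series and recovers the component frequencies and amplitudes from the Fourier transform. The object names (`fs`, `sig`, `freq_hz`, `peak_freqs`, `peak_amps`) are ours and chosen only for illustration. It assumes the series lasts exactly 1 s, so that each FFT bin is 1 Hz wide:

```r
# rebuild the simulated series (same values as above)
fs <- 11025                               # sampling rate (Hz)
tt <- seq(1 / fs, 1, length.out = fs)     # 1 s of time stamps
w0 <- 2 * pi * 440                        # angular frequency of the 440 Hz component
sig <- 0.5 + 5 * cos(w0 * tt) + 10 * cos(2 * w0 * tt) + 15 * sin(3 * w0 * tt)

n <- length(sig)
spc <- Mod(fft(sig))                      # magnitude of each FFT bin

# frequency (Hz) represented by each FFT bin
freq_hz <- (seq_len(n) - 1) * fs / n

# keep only bins above 0 Hz (the offset) and below the Nyquist frequency
keep <- freq_hz > 0 & freq_hz < fs / 2

# the three largest bins are the three simulated components
top <- order(spc[keep], decreasing = TRUE)[1:3]
peak_freqs <- sort(freq_hz[keep][top])

# rescale FFT magnitudes to the amplitudes of the original sinusoids
peak_amps <- 2 * spc[keep][match(peak_freqs, freq_hz[keep])] / n

peak_freqs
peak_amps
```

`peak_freqs` should contain the three simulated frequencies (440, 880 and 1320 Hz) and `peak_amps` should approximate the amplitudes used in the simulation (5, 10 and 15).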

This diagram (taken from Sueur 2018) summarizes the process we just simulated:

From Sueur 2018

The periodogram next to the spectrogram of these simulated sounds looks like this:

 

From the Fourier transformation to the spectrogram

Spectrograms are constructed from the spectral decomposition of discrete time segments of amplitude values. Each time segment (or window) becomes a column of spectral density values across a frequency range. Take for example this simple modulated sound, which goes up and down in frequency:

 

If we divide the sound into 10 segments and make periodograms for each of them we can see this pattern in the frequencies:

Spectrogram resolution

This animation shows in a very simple way the logic behind spectrograms: if we calculate Fourier transforms for short time segments along a sound (i.e. amplitude changes over time) and concatenate them, we can visualize how frequencies vary over time.
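The same logic can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the signal is a simulated rising tone (not the sound shown in the figures), the windows do not overlap, and the object names (`sig`, `stft`, `peak_khz`, etc.) are hypothetical. Functions such as `spectro()` handle windowing, scaling and plotting in a more complete way:

```r
# simulated test signal: a 1 s tone rising linearly from 500 to 2000 Hz
fs <- 8000                         # sampling rate (Hz)
tt <- (0:(fs - 1)) / fs            # time of each sample (s)
sig <- sin(2 * pi * (500 * tt + 750 * tt^2))

wl <- 512                          # window length (samples)
n_win <- floor(length(sig) / wl)   # number of non-overlapping windows
starts <- (seq_len(n_win) - 1) * wl + 1

# Hann window to reduce spectral leakage at the segment edges
hann <- 0.5 - 0.5 * cos(2 * pi * (0:(wl - 1)) / (wl - 1))

# one amplitude spectrum per window, keeping frequencies below Nyquist
stft <- sapply(starts, function(s) {
  seg <- sig[s:(s + wl - 1)] * hann
  Mod(fft(seg))[1:(wl / 2)]
})

freq_khz <- (0:(wl / 2 - 1)) * fs / wl / 1000   # frequency of each row (kHz)
time_s <- (starts - 1 + wl / 2) / fs            # center time of each column (s)

# each column of 'stft' is a periodogram; side by side they form
# a (coarse) spectrogram
image(x = time_s, y = freq_khz, z = t(stft),
      col = gray((64:0) / 64),
      xlab = "Time (s)", ylab = "Frequency (kHz)")

# dominant frequency in each window
peak_khz <- freq_khz[apply(stft, 2, which.max)]
```

Each column of `stft` corresponds to one time window and each row to one frequency bin, so `peak_khz` traces the rising frequency of the simulated tone across windows.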

Overlap

When frequency spectra are combined to produce a spectrogram, the frequency and amplitude modulations look stepped rather than gradual:

 

There are several “tricks” to smooth out the contours of highly modulated signals in a spectrogram, but the main and most common one is window overlap. Overlap reuses a percentage of the amplitude samples of one window to calculate the next window. For example, a window size of 512 points divides the example sound into 15 segments:

A 50% overlap generates windows that share 50% of the amplitude values with the adjacent windows. This has the visual effect of making modulations much more gradual:

Overlap also increases (somewhat artificially) the number of time windows, without changing the frequency resolution. In this example, the number of time windows is doubled:

Therefore, the greater the overlap the greater the smoothing of the contours of the sounds:

Spectrogram overlap

This is how the number of windows increases as a function of overlap for this particular sound:
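The relationship between overlap and number of windows can be approximated with a small helper. This is a sketch under assumptions: `n_windows()` is a hypothetical function written for this example (it is not part of seewave), the sound length of 8000 samples is made up, and the exact count produced by a given package may differ slightly depending on how it handles the last window:

```r
# number of analysis windows for a sound of 'n_samples' samples,
# a window length 'wl' (samples) and an overlap 'ovlp' (%)
n_windows <- function(n_samples, wl, ovlp = 0) {
  # hop (in samples) between the start of consecutive windows
  step <- max(1, round(wl * (1 - ovlp / 100)))
  floor((n_samples - wl) / step) + 1
}

n_samples <- 8000                     # hypothetical sound length (samples)
ovlp <- c(0, 25, 50, 75, 90, 99)      # overlap values to compare (%)

n_win <- sapply(ovlp, function(o) n_windows(n_samples, wl = 512, ovlp = o))

data.frame(overlap = ovlp, windows = n_win)

plot(ovlp, n_win, type = "b", col = "blue",
     xlab = "Overlap (%)", ylab = "Number of time windows")
```

With these assumed values, 0% overlap gives 15 windows and 50% overlap gives 30. Because one Fourier transform is computed per window, the number of windows is also a rough proxy for computing time.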

 

This increase in spectrogram sharpness does not come without a cost. The greater the number of time windows, the greater the number of Fourier transforms to compute and, therefore, the longer the process takes. This graph shows the increase in duration as a function of the number of windows on my computer:

 

It is necessary to take this cost into account when producing spectrograms of long sound files (> 1 min).

 

Limitations

However, there is a trade-off between resolution in the two domains: the higher the frequency resolution, the lower the time resolution. The following animation shows, for the sound of the previous example, how frequency resolution decreases as time resolution increases:

Spectrogram resolution 2

 

This is the relationship between frequency resolution and time resolution for the example signal:
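The trade-off can also be worked out numerically. The sketch below is only illustrative and makes two simplifying assumptions: frequency resolution is taken as the spacing between frequency bins (sampling rate divided by window length) and time resolution as the duration of one window without overlap. The sampling rate of 22050 Hz matches the `f` value used for `tico` in the examples below; the window lengths and object names are ours:

```r
fs <- 22050                             # sampling rate (Hz)
wl <- c(128, 256, 512, 1024, 2048)      # window lengths to compare (samples)

res <- data.frame(
  wl = wl,
  freq_res_hz = fs / wl,                # spacing between frequency bins (Hz)
  time_res_ms = 1000 * wl / fs          # duration of one window (ms)
)

res

# longer windows: finer frequency bins but coarser time steps
plot(res$time_res_ms, res$freq_res_hz, type = "b", col = "blue", log = "xy",
     xlab = "Time resolution (ms)", ylab = "Frequency resolution (Hz)")
```

Under these definitions the product of the two resolutions (in Hz and seconds) is always 1, so improving one necessarily worsens the other.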

 

 

Creating spectrograms in R

There are several R packages with functions that produce spectrograms in the graphical device. This chart (taken from Sueur 2018) summarizes the functions and their arguments:

Spectrogram functions

 

We will focus on making spectrograms using the spectro() function of seewave:

tico2 <- cutw(tico, from = 0.55, to = 0.9, output = "Wave")

spectro(tico2,  f = 22050, wl = 512, ovlp = 90,
        collevels = seq(-40, 0, 0.5),
        flim = c(2, 6), scale = FALSE)

 

Exercise

  • How can I increase the overlap between time windows?

  • How much longer does it take to create a 99%-overlap spectrogram compared to a 5%-overlap spectrogram?

  • What does the argument ‘collevels’ do? Increase the range and look at the spectrogram.

  • What do the ‘flim’ and ‘tlim’ arguments determine?

  • Run the examples that come in the spectro() function documentation

 

Almost all components of a spectrogram in seewave can be modified. We can add scales:

spectro(tico2,  f = 22050, wl = 512, ovlp = 90,
        collevels = seq(-40, 0, 0.5),
        flim = c(2, 6), scale = TRUE)

Change the color palette:

spectro(tico2,  f = 22050, wl = 512, ovlp = 90,
        collevels = seq(-40, 0, 0.5),
        flim = c(2, 6), scale = TRUE,
        palette = reverse.cm.colors)

spectro(tico2,  f = 22050, wl = 512, ovlp = 90,
        collevels = seq(-40, 0, 0.5),
        flim = c(2, 6), scale = TRUE,
        palette = reverse.gray.colors.1)

Remove the vertical lines:

spectro(tico2,  f = 22050, wl = 512, ovlp = 90,
        collevels = seq(-40, 0, 0.5),
        flim = c(2, 6), scale = TRUE,
        palette = reverse.gray.colors.1,
        grid = FALSE)

Add oscillograms (waveforms):

spectro(tico2,  f = 22050, wl = 512, ovlp = 90,
        collevels = seq(-40, 0, 0.5),
        flim = c(2, 6), scale = TRUE,
        palette = reverse.gray.colors.1,
        grid = FALSE, 
        osc = TRUE)

Use contours instead of colors:

blanc <- colorRampPalette("white")

spectro(tico2, contlevels=seq(-30, 0, 4),
cont=TRUE, colcont=temp.colors(8),
palette=blanc, scale=FALSE,  flim = c(2, 6))

 

Exercise

  • Change the color of the oscillogram to green

  • These are some of the color palettes that suit the gradients in spectrograms well:

Spectrogram palettes

From Sueur 2018

 

Use at least 3 palettes to generate the “tico2” spectrogram

 

  • Change the relative height of the oscillogram so that it corresponds to 1/6 of the height of the spectrogram

  • Change the relative width of the amplitude scale so that it corresponds to 1/8 of the spectrogram width

  • What does the “zp” argument do? (hint: try zp = 100 and notice the effect on the spectrogram)

  • Which value of “wl” (window size) generates smoother spectrograms for the example “orni” object?

 

The spectrogram() function of the soundgen package produces spectrograms slightly different from those of other packages:

library(soundgen)
## Loading required package: shinyBS
spectrogram(x = as.numeric(tico2@left), samplingRate = tico2@samp.rate, windowLength = 30, overlap = 90, ylim = c(2, 6))

 

It also allows you to use spectral derivatives to produce spectrograms (similar to the program Sound Analysis Pro):

spectrogram(x = as.numeric(tico2@left), 
            samplingRate = tico2@samp.rate, windowLength = 30,
            overlap = 90, method = "spectralDerivative", 
            ylim = c(2, 6))

 

It has a large number of arguments that allow modification of the color and “resolution”. For instance, we can change the brightness:

spectrogram(x = as.numeric(tico2@left), 
            samplingRate = tico2@samp.rate, 
            windowLength = 30, overlap = 90, 
            ylim = c(2, 6), brightness = -0.1)

 

Or apply smoothing in frequency and time (‘smoothFreq’ and ‘smoothTime’):

spectrogram(x = as.numeric(tico2@left), 
            samplingRate = tico2@samp.rate, 
            windowLength = 30, overlap = 90, ylim = c(2, 6), 
            smoothFreq = 5,  smoothTime = 5)

 

The monitoR package provides the viewSpec() function to generate spectrograms:

library(monitoR)

viewSpec(tico2, main = NA, frq.lim = c(2, 6), ovlp = 90)

 

The arguments are very similar to those of spectro() from seewave, since viewSpec() uses that function internally.

Other options are specgram() of signal:

library(signal)

specgram(tico2@left, n = 512, Fs = tico2@samp.rate, overlap = round(512 * 0.9))

 

spectrogram() from phonTools:

library(phonTools)

phonTools::spectrogram(tico2@left, fs = tico2@samp.rate, maxfreq = 6000, windowlength = round(length(tico2@left) / 512))

 

powspec() of tuneR does not generate the display; it only calculates the spectrogram (i.e. the matrix of amplitude values in time and frequency). To visualize it, use the image() function and add the axes manually:

library(tuneR)

# compute spectrogram
ps <- powspec(tico2@left, sr = tico2@samp.rate,
wintime = 512 / tico2@samp.rate, steptime = 0.25 * 512 / tico2@samp.rate)

# normalize
ps <- ps / max(ps)

# convert to dB
ps <- 10*log10(ps)

# plot
image(t(ps), col = gray((512:0) / 512),
xlab = "Time (s)", ylab = "Frequency (Hz)", # axes labels
axes=FALSE)

# add axes manually
time <- round(seq(0, duration(tico2), length=5), 1)

frequency <- round(seq(tico2@samp.rate/512, tico2@samp.rate/2, length=5))

axis(side=1, at=seq(0, 1,length=5), labels = time)

axis(side=2, at=seq(0, 1,length=5), labels = frequency)

 

References

  1. Sueur J, Aubin T, Simonis C. 2008. Equipment review: seewave, a free modular tool for sound analysis and synthesis. Bioacoustics 18(2):213–226.

  2. Sueur J. 2018. Sound Analysis and Synthesis with R.


 

Session information

## R version 4.0.1 (2020-06-06)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 18362)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=English_United States.1252 
## [2] LC_CTYPE=English_United States.1252   
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C                          
## [5] LC_TIME=English_United States.1252    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] tuneR_1.3.3       phonTools_0.2-2.1 signal_0.7-6      soundgen_1.7.0   
## [5] shinyBS_0.61      seewave_2.1.6    
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.4.6    knitr_1.28      magrittr_1.5    MASS_7.3-51.6  
##  [5] lattice_0.20-41 xtable_1.8-4    R6_2.4.1        rlang_0.4.6    
##  [9] fastmap_1.0.1   stringr_1.4.0   tools_4.0.1     grid_4.0.1     
## [13] xfun_0.14       htmltools_0.5.0 yaml_2.2.1      digest_0.6.25  
## [17] shiny_1.5.0     monitoR_1.0.7   later_1.1.0.1   promises_1.1.1 
## [21] evaluate_0.14   mime_0.9        rmarkdown_2.3   stringi_1.4.6  
## [25] compiler_4.0.1  httpuv_1.5.4    zoo_1.8-8