R packages exist for analysing almost any type of data, providing an analytical interface to functions used for a wide range of purposes. Sound recordings are a very rich source of data, and the tuneR package can run many different analyses on them.

This post is an interesting example.

Although Mariah Carey is responsible for what is widely held to be the worst Christmas song of all time, it turns out from this detailed analysis that she has a very respectable vocal range, although Whitney Houston is outstanding. (The best Christmas song is, of course, Fairytale of New York, although an honourable kitschy mention might go to Slade.)

The same sort of analysis could be applied in a range of ecological contexts for analysing and classifying bird song, bat calls or insect chirps.

So, I was quite interested in reproducing the figures in the article in a different context.

The Mariah Carey analysis focuses only on vocal range. The range of dominant frequencies sampled from a song with guitars, bass and drums will depend heavily on the instrumentation. However, the pattern that emerges may provide a useful signature of the song’s structure and general feel.

Choosing some contrasting examples

The Arctic Monkeys’ album AM is a highly sophisticated, original collection of songs which feature both strong bass lines and falsetto backing vocals. Although the only instruments used are guitar, bass and vocals, different elements combine to form the melodic structure of the songs.

The Velvet Underground also wrote very sophisticated songs. However, their underlying musical structures were much simpler and relied heavily on rhythm guitar to provide a more monotonous backdrop to the narrative vocals.

Analysing the waveform

The article on Mariah Carey uses some clever R code, but it does not explain the logic of the analysis very clearly.

The first step in the analysis is to read in a sound file and convert one or both of the stereo tracks to a single WAV waveform.

library(tuneR)  ## Provides readMP3(), extractWave(), mono() and periodogram()
a <- dir(pattern = "mp3")  ## Find all the mp3 files in the current directory
x <- a[2]  ## Choose one of them for illustration
stereoMP3File <- readMP3(x)  ## Read it in and convert it to a Wave object
## Most files are stereo with a right and a left channel. This is potentially confusing, so either one channel has to be chosen or the two combined.
wavFile <- extractWave(stereoMP3File, interact = FALSE)
if (nchannel(wavFile) > 1) {
  wavFile <- mono(wavFile, "both")
}
plot(wavFile)  ## Plot the amplitude of the wave

This shows the amplitude of the wave, but it is not useful as it stands for statistical pattern matching.
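
To see what combining the two channels means, here is a minimal base-R sketch on made-up numbers. It assumes, as I read the tuneR documentation, that mono(wavFile, "both") averages the left and right channels sample by sample. The vectors left and right are hypothetical stand-ins for real audio, not anything taken from the songs.

```r
## Illustrative only: two short hypothetical channels held as plain numeric vectors
left  <- c(0.2, 0.6, -0.4, 0.0)
right <- c(0.4, 0.2, -0.8, 0.2)

## Sample-by-sample average of the two channels (assumed to mirror mono(..., "both"))
both <- (left + right) / 2
print(both)
```

Averaging rather than summing keeps the combined signal within the same amplitude range as the two inputs.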

Sampling from the wav file

So, the plot shows the absolute amplitude of the wave. To convert that to frequencies we need to take time slices of a given width. This is the next step.

perioWav <- periodogram(wavFile, width = 4096)

As it stands this is still not particularly useful. We can see peaks of frequencies, but there is no clear interpretation. To start to interpret the data we need to compare it with other sources.
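
To make the idea of a time slice concrete, here is a minimal base-R sketch of finding the dominant frequency in a single slice of 4096 samples. It is not tuneR's implementation, only an illustration of the principle using fft(). The 44100 Hz sampling rate and the pure 440 Hz test tone are assumptions chosen for the example.

```r
## Illustrative only: base R, no audio files or tuneR needed
samp_rate <- 44100                  # assumed sampling rate in samples per second
width     <- 4096                   # slice width, matching the periodogram() call
t         <- (0:(width - 1)) / samp_rate
slice     <- sin(2 * pi * 440 * t)  # one slice of a hypothetical pure 440 Hz tone

## Power in each frequency bin of the slice
power <- Mod(fft(slice))^2
bins  <- 1:(width / 2)              # bins up to the Nyquist frequency, skipping 0 Hz
freqs <- bins * samp_rate / width   # frequency in Hz represented by each bin

dominant   <- freqs[which.max(power[bins + 1])]  # fft() stores bin k at index k + 1
resolution <- samp_rate / width                  # spacing between neighbouring bins in Hz
print(c(dominant = dominant, resolution = resolution))
```

With these settings neighbouring bins are roughly 10.8 Hz apart, so the estimate lands on the bin nearest to 440 Hz rather than on 440 Hz exactly. A wider slice gives finer frequency resolution at the cost of coarser resolution in time.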

Extracting the patterns

So, the raw data from a single waveform are not particularly informative as they stand. We need a way of comparing them with other waveforms.

The key step in the analysis is to look not just at the frequencies of wave oscillations within sample slices for one song, but to form a table holding the “frequencies of the frequencies” for many songs. In other words, we look at the proportion of time within the total waveform in which each note was dominant. We can do that in R for the AM and VU songs by applying a version of the function used in the Mariah Carey analysis to a folder containing songs from both albums.


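## Assumed wrapper: song_freqs is a hypothetical name, and the song and freq
## columns are inferred from the plotting code further on
song_freqs <- function(x) {
  stereoMP3File <- readMP3(x)
  wavFile <- extractWave(stereoMP3File, interact = FALSE)
  if (nchannel(wavFile) > 1) {
    wavFile <- mono(wavFile, "both")
  }
  perioWav <- periodogram(wavFile, width = 4096)
  freqWav <- FF(perioWav)  ## Fundamental frequency estimate for each time slice
  data.frame(song = x, freq = freqWav)
}
d <- do.call(rbind, lapply(dir(pattern = "mp3"), song_freqs))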
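
The “frequencies of the frequencies” idea can be sketched in base R on a synthetic signal, without any mp3 files. This is only an illustration of the principle, not the method tuneR uses inside FF(). The test signal, the 44100 Hz sampling rate and the helper names tone and dominant_freq are all assumptions made up for the example.

```r
## Illustrative only: base R, synthetic signal
samp_rate <- 44100   # assumed sampling rate
width     <- 4096    # slice width, as in the periodogram() call

## Hypothetical test signal: 30 slices of a 220 Hz tone, then 10 slices of a 440 Hz tone
tone <- function(freq, n_slices) {
  sin(2 * pi * freq * (0:(n_slices * width - 1)) / samp_rate)
}
signal <- c(tone(220, 30), tone(440, 10))

## Dominant frequency (Hz) of one slice, taken from the largest FFT bin
dominant_freq <- function(slice) {
  power <- Mod(fft(slice))^2
  bins  <- 1:(length(slice) / 2)
  which.max(power[bins + 1]) * samp_rate / length(slice)
}

## One column per slice, then one dominant frequency per column
dom <- apply(matrix(signal, nrow = width), 2, dominant_freq)

## Proportion of slices in which each (rounded) frequency was dominant
props <- prop.table(table(round(dom)))
print(props)
```

The lower tone fills three quarters of the slices, so the bin nearest to 220 Hz takes three quarters of the total. For a real song the same table spreads over many frequencies, and that spread is what the density plots summarise.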




I really love what comes next in the analysis, as the plotted results look so much like the iconic cover of Joy Division’s Unknown Pleasures album.

I’ve added a red line to separate AM tracks from the VU tracks.


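library(ggplot2)
library(ggridges)  ## Provides geom_density_ridges()
d$song <- gsub("[[:digit:]]", "", d$song)  ## Strip digits (such as track numbers) from the file names

ggplot(d, aes(x = freq, y = song)) +
  geom_density_ridges() +
  geom_hline(yintercept = 10, col = "red")  ## Separates the AM tracks from the VU tracks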
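
The gsub() call used for the labels removes every digit, not only a leading track number. The sketch below shows the effect on two hypothetical file names, since I am only guessing at how the files are actually named on disk, alongside a possible alternative that strips just the leading number and the extension. The alternative is my suggestion, not part of the original analysis.

```r
## Hypothetical file names, made up for illustration
files <- c("01 First Song.mp3", "06 No.1 Party Anthem.mp3")

## The pattern used above: removes all digits, including the 3 in "mp3"
stripped_all <- gsub("[[:digit:]]", "", files)

## Possible alternative: drop the extension, then only the leading track number
stripped_lead <- sub("^[[:digit:]]+[[:space:]]*", "",
                     tools::file_path_sans_ext(files))

print(stripped_all)
print(stripped_lead)
```

The first version is fine for labelling a plot, but it would alter any song title that itself contains a number.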