Song similarity using dynamic time warping

This short tutorial shows how to use the dfts function in warbleR to compare acoustics signals using dynamic time warping (DTW).

First load these packages:

library(dtw)
library(vegan)
library(pbapply)
library(parallel)
library(warbleR)

and load example data from warbleR

# dir.create(file.path(getwd(),"temp"))
# setwd(file.path(getwd(),"temp"))
 
data(list = c( "Phae.long1", "Phae.long2","Phae.long3", "Phae.long4","manualoc.df"))

writeWave(Phae.long1, "Phae.long1.wav")
writeWave(Phae.long2, "Phae.long2.wav")
writeWave(Phae.long3, "Phae.long3.wav") 
writeWave(Phae.long4, "Phae.long4.wav")

These recordings all come from long-billed hermits with different song types.

Now we can run the dfts function on this recordings:

tsLBH<-dfts(X=manualoc.df,length.out = 40,threshold = 4, img=T,ovlp = 90, bp = c(2,9))

## Calculating dominant frequency measurements:
## all done!

This function creates image files showing the dominant frequency overlaid on the spectrograms. The files are saved in the working directory

the output of the function is the time series of dom frequency for each selection:

head(tsLBH)

##      sound.files selec  dfreq-1  dfreq-2  dfreq-3  dfreq-4  dfreq-5
## 1 Phae.long1.wav     1 7.119141 7.282527 7.338867 7.382812 7.382812
## 2 Phae.long1.wav     2 7.207031 7.250977 7.392954 7.426758 7.467323
## 3 Phae.long1.wav     3 4.833984 4.833984 6.072341 4.921875 7.576623
## 4 Phae.long2.wav     1 5.229492 5.581055 5.581055 5.581055 5.617112
## 5 Phae.long2.wav     2 5.097656 5.141602 5.141602 5.161884 5.185547
## 6 Phae.long3.wav     1 4.921875 5.280198 5.888672 6.375451 6.591797
##    dfreq-6  dfreq-7  dfreq-8  dfreq-9 dfreq-10 dfreq-11 dfreq-12 dfreq-13
## 1 7.521409 5.229492 5.291466 5.444712 5.523588 5.537109 5.581055 8.055514
## 2 7.558594 7.589017 7.646484 5.496544 5.516827 8.305664 8.305664 5.712891
## 3 7.690430 7.734375 7.797476 8.000300 7.998047 8.170448 8.393555 6.730394
## 4 5.625000 5.827825 6.427284 6.328125 6.328125 6.459961 6.471229 6.503906
## 5 5.185547 5.185547 5.800781 5.841346 6.010367 6.088116 6.138822 6.152344
## 6 6.372070 6.372070 6.372070 6.372070 6.419396 6.514047 6.635742 6.669546
##   dfreq-14 dfreq-15 dfreq-16 dfreq-17 dfreq-18 dfreq-19 dfreq-20 dfreq-21
## 1 6.679688 6.679688 6.615460 6.503906 6.547852 6.480243 6.372070 6.372070
## 2 5.844727 6.635742 6.679688 6.679688 6.480243 6.497145 6.294321 6.108398
## 3 6.591797 6.591797 6.591797 6.512921 6.503906 6.470102 6.108398 5.495418
## 4 5.947266 5.668945 5.668945 6.164739 7.426758 7.426758 6.835186 6.767578
## 5 6.152344 6.226713 6.635742 6.716872 6.730394 6.679688 6.679688 6.683068
## 6 6.723633 6.713492 6.679688 6.679688 6.679688 6.723633 6.723633 6.777719
##   dfreq-22 dfreq-23 dfreq-24 dfreq-25 dfreq-26 dfreq-27 dfreq-28 dfreq-29
## 1 6.064453 5.185547 5.101037 4.411433 6.873498 6.811523 6.811523 6.811523
## 2 6.108398 6.108398 5.317383 5.317383 4.996244 7.075195 6.977163 6.899414
## 3 5.405273 5.405273 5.302734 6.885892 6.873498 6.811523 6.811523 6.811523
## 4 6.929838 7.294922 7.389573 7.426758 7.426758 7.001953 6.246995 6.328125
## 5 6.926457 7.264498 6.723633 6.632362 6.591797 6.591797 6.899414 6.899414
## 6 6.811523 6.416016 6.416016 6.652644 6.855469 6.679688 6.679688 6.679688
##   dfreq-30 dfreq-31 dfreq-32 dfreq-33 dfreq-34 dfreq-35 dfreq-36 dfreq-37
## 1 6.968149 7.075195 7.081956 7.207031 7.207031 6.987305 6.988431 7.098858
## 2 6.899414 6.899414 6.912936 7.075195 7.250977 7.250977 7.250977 7.250977
## 3 6.941106 7.119141 7.119141 7.119141 7.294922 7.294922 7.294922 7.193510
## 4 6.328125 6.520808 7.307317 7.250977 7.250977 7.250977 7.148438 6.679688
## 5 6.899414 6.899414 6.899414 6.899414 6.547852 6.547852 6.547852 6.524189
## 6 6.710111 7.007587 7.031250 7.031250 7.004207 6.987305 6.987305 7.000826
##   dfreq-38 dfreq-39 dfreq-40
## 1 7.185622 7.207031 7.207031
## 2 7.217172 7.207031 7.250977
## 3 7.119141 7.163086 7.207031
## 4 7.405349 6.408128 6.020508
## 5 6.503906 7.433519 7.602539
## 6 7.031250 7.605919 7.778320

We can run the DTW analysis to compare these time series. We do this using the function dist with the method= “DTW” (need the dtw package to do this)

dm<-dist(tsLBH[,3:ncol(tsLBH)],tsLBH[,3:ncol(tsLBH)],meth="DTW",step=asymmetric)

Let’s see if the dissimilarity from dtw represents the acoutic differences. First we need a binary matrix representing same recording with 0s, and different recording with 1s. The following functions does exactly that:

recid<-function(x) {
  for(i in 1:ncol(x))
  {
    for(j in 1:length(x[,i])){
      # j=1
      # i=1
      if(sapply(strsplit(as.character(colnames(x)), "/",fixed=T), "[[", 1)[j]==sapply(strsplit(as.character(colnames(x)), "/",fixed=T), "[[", 1)[i]) x[j,i]<-0
      if(sapply(strsplit(as.character(colnames(x)), "/",fixed=T), "[[", 1)[j]!=sapply(strsplit(as.character(colnames(x)), "/",fixed=T), "[[", 1)[i]) x[j,i]<-1      
    }
  }
  return(x)}

dm<-as.matrix(dm)

colnames(dm)<-rownames(dm)<-paste(tsLBH$sound.files,tsLBH$selec,sep="/")

recmat<-recid(dm)

these 2 matrices can be compared with a mantel test:

mantel(dm,as.dist(recmat),permutations = 1000)

## 
## Mantel statistic based on Pearson's product-moment correlation 
## 
## Call:
## mantel(xdis = dm, ydis = as.dist(recmat), permutations = 1000) 
## 
## Mantel statistic r: 0.5674 
##       Significance: 0.000999 
## 
## Upper quantiles of permutations (null model):
##   90%   95% 97.5%   99% 
## 0.154 0.217 0.271 0.384 
## Permutation: free
## Number of permutations: 1000

As you can see there is a strong association between song type variation and acoustic similarity measured by means of DTW!!!

Now, what is the effect of the number of measurements of dominant frequency across the signal in the overal power of the method?

We can test that by assesing the mantel statistic for different number of measurements:

l<-pbsapply(seq(10,500,by = 10), function(x)
{
  tsLBH<-dfts(X=manualoc.df,length.out = x,threshold = 4, img=F,ovlp = 90, bp = c(2,9))

dm<-dist(tsLBH[,3:ncol(tsLBH)],tsLBH[,3:ncol(tsLBH)],meth="DTW",step=asymmetric)

dm<-as.matrix(dm)

colnames(dm)<-rownames(dm)<-paste(tsLBH$sound.files,tsLBH$selec,sep="/")

recmat<-recid(dm)

m1<-mantel(dm,as.dist(recmat),permutations = 10)
return(m1$statistic)
})

This is the effect of number of measurements on the mantel correlation

plot(seq(10,500,by = 10),l,type="l", col="blue", lwd=2, xlab = "number of measurements",
     ylab= "Mantel statistic")

and what about its performance compare to a more standar method like measuring a bunch of acoustic parameters? :

span<-specan(manualoc.df)

dspan<-dist(span[,3:ncol(span)],method = "euclidean",diag = T,upper = T)

mantel(dspan,as.dist(recmat),permutations = 10000)

## Measuring acoustic parameters:
## all done!

## 
## Mantel statistic based on Pearson's product-moment correlation 
## 
## Call:
## mantel(xdis = dspan, ydis = as.dist(recmat), permutations = 10000) 
## 
## Mantel statistic r: 0.4149 
##       Significance: 0.0011999 
## 
## Upper quantiles of permutations (null model):
##   90%   95% 97.5%   99% 
## 0.150 0.205 0.253 0.310 
## Permutation: free
## Number of permutations: 10000

Looks like DTW represents the acoustic variation a little better.

Similarity from DTW and acoustic parameters is also correlated:

mantel(dm,dspan,permutations = 10000)

## 
## Mantel statistic based on Pearson's product-moment correlation 
## 
## Call:
## mantel(xdis = dm, ydis = dspan, permutations = 10000) 
## 
## Mantel statistic r: 0.4071 
##       Significance: 0.015798 
## 
## Upper quantiles of permutations (null model):
##   90%   95% 97.5%   99% 
## 0.198 0.298 0.372 0.437 
## Permutation: free
## Number of permutations: 10000

That is it!

Song similarity using dynamic time warping

Marcelo Araya-Salas

October 3, 2015