Quantifying degradation in synthesized sounds

baRulho:quantifying habitat-induced degradation of (animal) acoustic signals

Published

October 9, 2024

0.1 Load package

Code
library(baRulho)
library(warbleR)
library(ggplot2)
library(cowplot)
library(grid)

0.2 Synthetize sounds

Create synthesized sounds to be used for making the master sound file for playback experiments:

Code
synth_data <-
    synth_sounds(
        replicates = 3, # number of replicates for each unique combination of varying features
        frequencies = seq(0.5, 10, length.out = 20),
        durations = c(0.2, 0.1),
        am = TRUE, # amplitude modulation
        fm = TRUE, # frequency modulation
        sig2 = 0.8, # frequency modulation parameter
        shuffle = TRUE # randomize the position of sounds 
    )

The output is of class data frame and extended selection table (warbleR package format, here printed as a data frame):

Code
head(synth_data, 10)
sound.files selec start end bottom.freq top.freq frequency duration frequency.modulation amplitude.modulation treatment replicate sound.id
synthetic_sound_68 1 0.05 0.25 7.000 11.000 9 0.2 fm am dur:0.2;freq:9;fm;am 1 dur:0.2;freq:9;fm;am_1
synthetic_sound_70 1 0.05 0.25 1.990 6.023 4 0.2 fm am dur:0.2;freq:4;fm;am 1 dur:0.2;freq:4;fm;am_1
synthetic_sound_72 1 0.05 0.25 0.020 3.000 1 0.2 fm am dur:0.2;freq:1;fm;am 1 dur:0.2;freq:1;fm;am_1
synthetic_sound_119 1 0.05 0.25 4.984 9.009 7 0.2 fm am dur:0.2;freq:7;fm;am 1 dur:0.2;freq:7;fm;am_1
synthetic_sound_78 1 0.05 0.25 7.450 11.500 9.5 0.2 fm am dur:0.2;freq:9.5;fm;am 1 dur:0.2;freq:9.5;fm;am_1
synthetic_sound_120 1 0.05 0.25 2.486 6.688 4.5 0.2 fm am dur:0.2;freq:4.5;fm;am 1 dur:0.2;freq:4.5;fm;am_1
synthetic_sound_86 1 0.05 0.25 0.784 5.000 3 0.2 fm am dur:0.2;freq:3;fm;am 1 dur:0.2;freq:3;fm;am_1
synthetic_sound_78 2 0.05 0.25 7.450 11.500 9.5 0.2 fm am dur:0.2;freq:9.5;fm;am 2 dur:0.2;freq:9.5;fm;am_2
synthetic_sound_33 1 0.05 0.25 0.100 4.000 2 0.2 fm am dur:0.2;freq:2;fm;am 1 dur:0.2;freq:2;fm;am_1
synthetic_sound_93 1 0.05 0.25 4.343 8.500 6.5 0.2 fm am dur:0.2;freq:6.5;fm;am 1 dur:0.2;freq:6.5;fm;am_1

1 Create master sound file

This step puts all sounds together into a single sound file:

Code
master_annotations <- master_sound_file(X = synth_data, # synthesized sound data
                  file.name = "master", # name of the sound file
                  gap.duration = 0.2) # duration of silence in between sounds

The output file is saved in the current working directory (can be modified using argument ‘path’). A similar file was used for the playback experiments detailed in the paper. The following section shows how to access the test (re-recorded) files.

These are the annotations for the sounds in the master sound files:

Code
head(master_annotations)
sound.files selec start end bottom.freq top.freq sound.id
master.wav 1 1.000000 2.000000 1.333333 2.666667 start_marker
master.wav 2 2.050000 2.250023 7.875000 8.805000 dur:0.2;freq:9;fm;am_1
master.wav 3 2.300023 2.500045 3.208000 4.069000 dur:0.2;freq:4;fm;am_1
master.wav 4 2.550045 2.750068 0.422000 1.223000 dur:0.2;freq:1;fm;am_1
master.wav 5 2.800068 3.000091 6.905000 7.917000 dur:0.2;freq:7;fm;am_1
master.wav 6 3.050091 3.250113 8.417000 9.416000 dur:0.2;freq:9.5;fm;am_1

1.1 Download data

This code downloads the test files. Make sure there are no other sound files in the output directory. The files were re-recorded during a transmission experiment at 10, 30, 65 and 100 m:

Code
path_to_files <- "PATH_TO_FILES"  # add folder path to keep master and test files

# directory path where supplementary files have been saved

options(sound.files.path = path_to_files)

download.file("https://figshare.com/ndownloader/files/41905809", destfile = file.path(path_to_files,
    "degrad_exp_files.zip"))

unzip(file.path(path_to_files, "degrad_exp_files.zip"), exdir = file.path(path_to_files))

2 Find markers

The code below finds the position of the start and end markers in the test files:

Code
# directory path where supplementary files have been saved
options(sound.files.path = path_to_files)

markers_in_tests <- find_markers(X = master_annotations)  # annotations of sounds in master file

head(markers_in_tests)
sound.files selec start end scores marker time.mismatch
trnsc1_100m_closed.wav 1 6.049203 7.049203 0.2228059 start_marker 0.0300544
trnsc1_100m_closed.wav 2 257.700686 258.700686 0.2727442 end_marker NA
trnsc1_100m_open.wav 3 29.502300 30.502300 0.7150524 start_marker 0.0131017
trnsc1_100m_open.wav 4 281.136830 282.136830 0.6083450 end_marker NA
trnsc1_10m_closed.wav 5 39.452253 40.452253 0.7046298 start_marker 0.0114997
trnsc1_10m_closed.wav 6 291.085182 292.085182 0.8364759 end_marker NA

The column ‘time.mismatch’ compares the time difference between the two templates on test-files against that in the master sound file. In a perfect marker detection the value must be 0, meaning that the time in between markers in the master and test files is exactly the same. In this case the average mismatch is of 14 ms and the highest of 32 ms:

Code
# average mismatch
mean(markers_in_tests$time.mismatch, na.rm = TRUE)
[1] 0.01438455
Code
# maximum mismatch
max(markers_in_tests$time.mismatch, na.rm = TRUE)
[1] 0.03171009

Modifying detection parameters as spectrogram type (‘type’ argument), time window overlap (‘ovlp’ argument) and hop size (‘hop.size’ argument) can be adjusted in order to improve precision. Note that for aligning all other sounds only the marker with the highest correlation will be used. Therefore the time mismatch is likely to be lower in the aligned test sounds.

3 Align sounds

Once we know the position of markers we can compute the position for all other sounds in the test files (i.e. align):

Code
aligned_tests <-
    align_test_files(X = master_annotations, # annotations of sounds in master file
                     Y = markers_in_tests) # position of markers in test files
                    

head(aligned_tests)
sound.files selec start end bottom.freq top.freq sound.id marker
trnsc1_100m_closed.wav 1 6.060315 7.060315 1.333333 2.666667 start_marker end_marker
trnsc1_100m_closed.wav 2 7.110315 7.310338 7.875000 8.805000 dur:0.2;freq:9;fm;am_1 end_marker
trnsc1_100m_closed.wav 3 7.360338 7.560361 3.208000 4.069000 dur:0.2;freq:4;fm;am_1 end_marker
trnsc1_100m_closed.wav 4 7.610361 7.810383 0.422000 1.223000 dur:0.2;freq:1;fm;am_1 end_marker
trnsc1_100m_closed.wav 5 7.860383 8.060406 6.905000 7.917000 dur:0.2;freq:7;fm;am_1 end_marker
trnsc1_100m_closed.wav 6 8.110406 8.310429 8.417000 9.416000 dur:0.2;freq:9.5;fm;am_1 end_marker

Alignments can be manually adjusted with the function manual_realign(). he function generates a multipanel graph with the spectrogram of the master sound file in top of that from test sound files, highlighting the position of correspondent test sounds on both in order to simplify assessing and adjusting their alignment. Spectrograms include the first few seconds of the sound files (controlled by ‘duration’) which is usually enough to tell the precision of the alignment. The lower spectrogram shows a series of ‘buttons’ that users can click on to control if the test sound file spectrogram (low panel) needs to be moved to the left (“<”) or right (“>”). Users can also reset the spectrogram to its original position (‘reset’), move on to the next sound file in ‘X’ (test sound file annotations) or stop the process (stop button). The function returns an object similar to the input object ‘X’ in which the start and end of the sounds have been adjusted:

Code
realigned_tests <- manual_realign(X = aligned_tests, Y = master_annotations,
    flim = c(0, 6))

manual_realign

This plot shows the aligned re-recorded sounds for transect 1:

Fourier spectrograms of test recordings from the third experimental transect in the two habitat types (columns) and four distances (rows). The dotted vertical lines highlight the detected position of sounds computed by the functions find_markers and align_test_files.

Similar images can be generated for each test sound file with the function ‘plot_align_sounds()’.

4 Measuring degradation

Must degradation metrics involve comparing tests sounds that were recorded at different distances from the speaker, to their reference, which is typically recorded at 1m. Hence, a column indicating the distance at which each sound was recorded is needed. In this case the recording distance can be extracted from the sound file name:

Code
# get distance
realigned_tests$distance <- sapply(strsplit(realigned_tests$sound.files,
    "_"), "[[", 2)

# make it a numeric column
realigned_tests$distance <- as.numeric(gsub("m", "", realigned_tests$distance))

Once the distance is included in the annotations degradation metrics can be obtained with few lines of code. First the function `set_reference_sounds()’ is used to define the reference for each test sound:

Code
test_data <- set_reference_sounds(realigned_tests, method = 1)

For instance the following code computes excess attenuation, signal-to-noise ratio, blur ratio and tail-to-signal ratio:

Code
# get degradation measures
degrad_df <- blur_ratio(test_data)

degrad_df <- excess_attenuation(degrad_df)

degrad_df <- signal_to_noise_ratio(degrad_df, 
                                   mar = 0.1) # mar = margin to measure noise

degrad_df <- tail_to_signal_ratio(degrad_df, mar = 0.1) # mar = margin to measure tail

# or as a pipe
degrad_df <- test_data |>
    excess_attenuation() |>
    signal_to_noise_ratio(mar = 0.1) |> # mar = margin to measure noise
    blur_ratio() |>
    tail_to_signal_ratio(mar = 0.1) # mar = margin to measure tail

head(degrad_df) # print the first 6 rows
sound.files selec start end bottom.freq top.freq sound.id marker distance reference excess.attenuation signal.to.noise.ratio blur.ratio tail.to.signal.ratio
trnsc1_100m_closed.wav 2 7.110315 7.310338 7.875 8.805 dur:0.2;freq:9;fm;am_1 end_marker 100 trnsc1_1m_open.wav-2 -8.999975 0.7685672 0.2920241 -0.9995171
trnsc1_100m_closed.wav 3 7.360338 7.560361 3.208 4.069 dur:0.2;freq:4;fm;am_1 end_marker 100 trnsc1_1m_open.wav-3 -10.829675 0.6384587 0.2976129 0.1901690
trnsc1_100m_closed.wav 4 7.610361 7.810383 0.422 1.223 dur:0.2;freq:1;fm;am_1 end_marker 100 trnsc1_1m_open.wav-4 -26.216040 4.5246601 0.1418792 2.4441865
trnsc1_100m_closed.wav 5 7.860383 8.060406 6.905 7.917 dur:0.2;freq:7;fm;am_1 end_marker 100 trnsc1_1m_open.wav-5 -15.191104 -1.3816010 0.2667528 0.0593048
trnsc1_100m_closed.wav 6 8.110406 8.310429 8.417 9.416 dur:0.2;freq:9.5;fm;am_1 end_marker 100 trnsc1_1m_open.wav-6 -6.041307 -0.2574492 0.2059137 -1.2052163
trnsc1_100m_closed.wav 7 8.360429 8.560451 4.171 4.839 dur:0.2;freq:4.5;fm;am_1 end_marker 100 trnsc1_1m_open.wav-7 -12.278579 -0.7122037 0.2213225 1.6697592

The package can also generate images to inspect the patterns of degradation on different acoustic dimension. The function ‘plot_degradation()’ produces JPEG files with a mosaic of visual representations of sounds (spectrograms, power spectrum and amplitude envelopes) for each test sound and correspondent reference sound

Code
options(dest.path = "DIRECTORY_TO_SAVE_IMAGE_FILES")
plot_degradation(X = test_data)

Example of the output image files from plot_degradation() showing the spectrogram, amplitude envelope and power spectrum for each test sound and its correspondent reference arranged by sound ID and transect.

The function plot_blur_ratio() can help explore more closely the effects of degradation on signal structure. It generates image files (in ‘jpeg’ format) for each possible blur ratio estimation in the test sound data. The image files show the spectrograms of both sounds and the overlaid energy distribution (either amplitude envelopes or power spectrum, see argument ‘type’) as probability mass functions (PMF). The output graphs highlight the mismatch between the compared distribution which represent the estimated blur ratio returned by either blur_ratio() or spectrum_blur_ratio().

Amplitude envelope blur ratio (or simply blur ratio):

Code
plot_blur_ratio(X = test_data[test_data$sound.id == "dur:0.1;freq:2.5;fm;am_3",
    ], ovlp = 95, dest.path = tempdir())

Example of the output image files from plot_blur_ratio() showing the overlaid amplitude envelope of the test sound and its reference, the mismatch between the two, and the spectrograms for both Power spectrum blur ratio:

Code
plot_blur_ratio(X = test_data, type = "spectrum", ovlp = 95, dest.path = tempdir())

Example of the output image files from plot_blur_ratio() (when type = ‘spectrum’) showing the overlaid power spectrum of the test sound and its reference, the mismatch between the two, and the spectrograms for both

Spectrograms are shown within the frequency range of the reference sound and also show vertical lines with the start and end of sounds.

Plotting degradation metrics against distance can give us a sense of how well each of them quantify aspects of degradation as this is expected to increase with distance:

Code
## prepare data add habitat strucvture
degrad_df$habitat.structure <- ifelse(grepl("closed", degrad_df$sound.files),
    "closed", "open")
names(degrad_df)

# select subset of 3 kHz tonal flat sounds and aggregate by
# distance mean
agg_degrad <- aggregate(cbind(blur.ratio, spectrum.blur.ratio, excess.attenuation,
    envelope.correlation, spectrum.correlation, cross.correlation,
    signal.to.noise.ratio, tail.to.signal.ratio) ~ distance + habitat.structure,
    degrad_df[degrad_df$distance > 1 & grepl("freq:3;tonal;flat",
        degrad_df$sound.id), ], mean)

# and SD
agg_degrad_sd <- aggregate(cbind(blur.ratio, spectrum.blur.ratio,
    excess.attenuation, envelope.correlation, spectrum.correlation,
    cross.correlation, signal.to.noise.ratio, tail.to.signal.ratio) ~
    distance + habitat.structure, degrad_df[degrad_df$distance > 1 &
    grepl("freq:5;tonal;flat", degrad_df$sound.id), ], sd)

# stack to use with ggplot2
stck_degrad <- stack(agg_degrad)
stck_degrad$distance <- as.numeric(stck_degrad$values[stck_degrad$ind ==
    "distance"])
stck_degrad$habitat.structure <- stck_degrad$values[stck_degrad$ind ==
    "habitat.structure"]
stck_degrad <- stck_degrad[-1:-16, ]
stck_degrad_sd <- stack(agg_degrad_sd)[-1:-16, ]
stck_degrad$sd <- as.numeric(stck_degrad_sd$values)
stck_degrad$values <- as.numeric(stck_degrad$values)
stck_degrad$ind <- gsub("\\.", " ", stck_degrad$ind)
stck_degrad$ind <- gsub(" to ", "-to-", stck_degrad$ind)

# plot
gg <- ggplot(data = stck_degrad, aes(x = distance, y = values, color = habitat.structure)) +
    geom_point(size = 3) + geom_errorbar(aes(ymin = values - sd, ymax = values +
    sd), width = 0) + geom_line(aes(colour = habitat.structure, group = habitat.structure)) +
    scale_color_viridis_d(alpha = 0.6, begin = 0.3, end = 0.8) + facet_wrap(~ind,
    ncol = 2, scales = "free_y") + theme_classic() + scale_x_continuous(breaks = unique(stck_degrad$distance),
    labels = unique(stck_degrad$distance)) + labs(color = "Habitat\nstructure",
    x = "Distance (m)", y = "Values")

gg

Mean degradation metrics (+/- s.d.) by distance and habitat structure types

Finally, users can explore the variation in background noise across re-recorded files with the function noise_profile(), which estimates full spectrum sound pressure levels (i.e. noise profiles) of the ambient noise for annotations or entire sound files:

Code
# compute noise profile only for 10 m files
noise_prof <- noise_profile(X = test_data[test_data$distance == 10,
    ], mar = 0.01, noise.ref = "adjacent", averaged = TRUE)

# get the original sound file name
noise_prof$org.sound.files <- sapply(strsplit(noise_prof$sound.files,
    ".wav"), "[[", 1)

# aggregate by frequency and original sound file
agg_noise_prof <- aggregate(amp ~ freq + org.sound.files, noise_prof,
    mean)

# add habitat structure and transect info
agg_noise_prof$habitat.structure <- ifelse(grepl("closed", agg_noise_prof$org.sound.files),
    "closed", "open")
agg_noise_prof$transect <- sapply(strsplit(agg_noise_prof$org.sound.files,
    "_"), "[[", 1)

# plot
ggplot(data = agg_noise_prof, aes(y = amp, x = freq, col = habitat.structure,
    lty = transect, group = org.sound.files)) + geom_point(size = 1,
    color = "gray") + geom_line(linewidth = 1.4) + scale_color_viridis_d(begin = 0.2,
    end = 0.8, alpha = 0.5) + labs(x = "Frequency (kHz)", y = "Amplitude (dBA)",
    color = "Habitat\nstructure", lty = "Transect") + coord_flip() +
    theme_classic()


Session information

R version 4.4.1 (2024-06-14)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 22.04.4 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0 
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=es_CR.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=es_CR.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=es_CR.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=es_CR.UTF-8 LC_IDENTIFICATION=C       

time zone: America/Costa_Rica
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] ggplot2_3.5.1      viridis_0.6.5      viridisLite_0.4.2  tidyr_1.3.1       
 [5] baRulho_2.1.2      ohun_1.0.2         warbleR_1.1.32     NatureSounds_1.0.4
 [9] seewave_2.2.3      tuneR_1.4.7        rprojroot_2.0.4    formatR_1.14      
[13] knitr_1.48         kableExtra_1.4.0   remotes_2.5.0     

loaded via a namespace (and not attached):
 [1] gtable_0.3.5       rjson_0.2.23       xfun_0.48          htmlwidgets_1.6.4 
 [5] vctrs_0.6.5        tools_4.4.1        bitops_1.0-9       generics_0.1.3    
 [9] parallel_4.4.1     tibble_3.2.1       proxy_0.4-27       fansi_1.0.6       
[13] pkgconfig_2.0.3    KernSmooth_2.23-24 checkmate_2.3.2    lifecycle_1.0.4   
[17] farver_2.1.2       compiler_4.4.1     stringr_1.5.1      brio_1.1.5        
[21] munsell_0.5.1      htmltools_0.5.8.1  class_7.3-22       RCurl_1.98-1.16   
[25] yaml_2.3.10        pillar_1.9.0       MASS_7.3-61        classInt_0.4-10   
[29] Deriv_4.1.6        tidyselect_1.2.1   digest_0.6.37      stringi_1.8.4     
[33] purrr_1.0.2        sf_1.0-17          dplyr_1.1.4        labeling_0.4.3    
[37] fastmap_1.2.0      grid_4.4.1         colorspace_2.1-1   cli_3.6.3         
[41] magrittr_2.0.3     utf8_1.2.4         e1071_1.7-16       withr_3.0.1       
[45] scales_1.3.0       backports_1.5.0    rmarkdown_2.28     Sim.DiffProc_4.9  
[49] signal_1.8-1       igraph_2.0.3       gridExtra_2.3      png_0.1-8         
[53] pbapply_1.7-2      evaluate_1.0.0     dtw_1.23-1         fftw_1.0-9        
[57] testthat_3.2.1.1   rlang_1.1.4        Rcpp_1.0.13        glue_1.8.0        
[61] DBI_1.2.3          xml2_1.3.6         svglite_2.1.3      rstudioapi_0.16.0 
[65] jsonlite_1.8.9     R6_2.5.1           units_0.8-5        systemfonts_1.1.0