Quantifying degradation in synthesized sounds
baRulho:quantifying habitat-induced degradation of (animal) acoustic signals
0.1 Load package
0.2 Synthetize sounds
Create synthesized sounds to be used for making the master sound file for playback experiments:
Code
synth_data <-
synth_sounds(
replicates = 3, # number of replicates for each unique combination of varying features
frequencies = seq(0.5, 10, length.out = 20),
durations = c(0.2, 0.1),
am = TRUE, # amplitude modulation
fm = TRUE, # frequency modulation
sig2 = 0.8, # frequency modulation parameter
shuffle = TRUE # randomize the position of sounds
)
The output is of class data frame and extended selection table (warbleR package format, here printed as a data frame):
sound.files | selec | start | end | bottom.freq | top.freq | frequency | duration | frequency.modulation | amplitude.modulation | treatment | replicate | sound.id |
---|---|---|---|---|---|---|---|---|---|---|---|---|
synthetic_sound_68 | 1 | 0.05 | 0.25 | 7.000 | 11.000 | 9 | 0.2 | fm | am | dur:0.2;freq:9;fm;am | 1 | dur:0.2;freq:9;fm;am_1 |
synthetic_sound_70 | 1 | 0.05 | 0.25 | 1.990 | 6.023 | 4 | 0.2 | fm | am | dur:0.2;freq:4;fm;am | 1 | dur:0.2;freq:4;fm;am_1 |
synthetic_sound_72 | 1 | 0.05 | 0.25 | 0.020 | 3.000 | 1 | 0.2 | fm | am | dur:0.2;freq:1;fm;am | 1 | dur:0.2;freq:1;fm;am_1 |
synthetic_sound_119 | 1 | 0.05 | 0.25 | 4.984 | 9.009 | 7 | 0.2 | fm | am | dur:0.2;freq:7;fm;am | 1 | dur:0.2;freq:7;fm;am_1 |
synthetic_sound_78 | 1 | 0.05 | 0.25 | 7.450 | 11.500 | 9.5 | 0.2 | fm | am | dur:0.2;freq:9.5;fm;am | 1 | dur:0.2;freq:9.5;fm;am_1 |
synthetic_sound_120 | 1 | 0.05 | 0.25 | 2.486 | 6.688 | 4.5 | 0.2 | fm | am | dur:0.2;freq:4.5;fm;am | 1 | dur:0.2;freq:4.5;fm;am_1 |
synthetic_sound_86 | 1 | 0.05 | 0.25 | 0.784 | 5.000 | 3 | 0.2 | fm | am | dur:0.2;freq:3;fm;am | 1 | dur:0.2;freq:3;fm;am_1 |
synthetic_sound_78 | 2 | 0.05 | 0.25 | 7.450 | 11.500 | 9.5 | 0.2 | fm | am | dur:0.2;freq:9.5;fm;am | 2 | dur:0.2;freq:9.5;fm;am_2 |
synthetic_sound_33 | 1 | 0.05 | 0.25 | 0.100 | 4.000 | 2 | 0.2 | fm | am | dur:0.2;freq:2;fm;am | 1 | dur:0.2;freq:2;fm;am_1 |
synthetic_sound_93 | 1 | 0.05 | 0.25 | 4.343 | 8.500 | 6.5 | 0.2 | fm | am | dur:0.2;freq:6.5;fm;am | 1 | dur:0.2;freq:6.5;fm;am_1 |
1 Create master sound file
This step puts all sounds together into a single sound file:
The output file is saved in the current working directory (can be modified using argument ‘path’). A similar file was used for the playback experiments detailed in the paper. The following section shows how to access the test (re-recorded) files.
These are the annotations for the sounds in the master sound files:
sound.files | selec | start | end | bottom.freq | top.freq | sound.id |
---|---|---|---|---|---|---|
master.wav | 1 | 1.000000 | 2.000000 | 1.333333 | 2.666667 | start_marker |
master.wav | 2 | 2.050000 | 2.250023 | 7.875000 | 8.805000 | dur:0.2;freq:9;fm;am_1 |
master.wav | 3 | 2.300023 | 2.500045 | 3.208000 | 4.069000 | dur:0.2;freq:4;fm;am_1 |
master.wav | 4 | 2.550045 | 2.750068 | 0.422000 | 1.223000 | dur:0.2;freq:1;fm;am_1 |
master.wav | 5 | 2.800068 | 3.000091 | 6.905000 | 7.917000 | dur:0.2;freq:7;fm;am_1 |
master.wav | 6 | 3.050091 | 3.250113 | 8.417000 | 9.416000 | dur:0.2;freq:9.5;fm;am_1 |
1.1 Download data
This code downloads the test files. Make sure there are no other sound files in the output directory. The files were re-recorded during a transmission experiment at 10, 30, 65 and 100 m:
Code
path_to_files <- "PATH_TO_FILES" # add folder path to keep master and test files
# directory path where supplementary files have been saved
options(sound.files.path = path_to_files)
download.file("https://figshare.com/ndownloader/files/41905809", destfile = file.path(path_to_files,
"degrad_exp_files.zip"))
unzip(file.path(path_to_files, "degrad_exp_files.zip"), exdir = file.path(path_to_files))
2 Find markers
The code below finds the position of the start and end markers in the test files:
sound.files | selec | start | end | scores | marker | time.mismatch |
---|---|---|---|---|---|---|
trnsc1_100m_closed.wav | 1 | 6.049203 | 7.049203 | 0.2228059 | start_marker | 0.0300544 |
trnsc1_100m_closed.wav | 2 | 257.700686 | 258.700686 | 0.2727442 | end_marker | NA |
trnsc1_100m_open.wav | 3 | 29.502300 | 30.502300 | 0.7150524 | start_marker | 0.0131017 |
trnsc1_100m_open.wav | 4 | 281.136830 | 282.136830 | 0.6083450 | end_marker | NA |
trnsc1_10m_closed.wav | 5 | 39.452253 | 40.452253 | 0.7046298 | start_marker | 0.0114997 |
trnsc1_10m_closed.wav | 6 | 291.085182 | 292.085182 | 0.8364759 | end_marker | NA |
The column ‘time.mismatch’ compares the time difference between the two templates on test-files against that in the master sound file. In a perfect marker detection the value must be 0, meaning that the time in between markers in the master and test files is exactly the same. In this case the average mismatch is of 14 ms and the highest of 32 ms:
[1] 0.01438455
[1] 0.03171009
Modifying detection parameters as spectrogram type (‘type’ argument), time window overlap (‘ovlp’ argument) and hop size (‘hop.size’ argument) can be adjusted in order to improve precision. Note that for aligning all other sounds only the marker with the highest correlation will be used. Therefore the time mismatch is likely to be lower in the aligned test sounds.
3 Align sounds
Once we know the position of markers we can compute the position for all other sounds in the test files (i.e. align):
sound.files | selec | start | end | bottom.freq | top.freq | sound.id | marker |
---|---|---|---|---|---|---|---|
trnsc1_100m_closed.wav | 1 | 6.060315 | 7.060315 | 1.333333 | 2.666667 | start_marker | end_marker |
trnsc1_100m_closed.wav | 2 | 7.110315 | 7.310338 | 7.875000 | 8.805000 | dur:0.2;freq:9;fm;am_1 | end_marker |
trnsc1_100m_closed.wav | 3 | 7.360338 | 7.560361 | 3.208000 | 4.069000 | dur:0.2;freq:4;fm;am_1 | end_marker |
trnsc1_100m_closed.wav | 4 | 7.610361 | 7.810383 | 0.422000 | 1.223000 | dur:0.2;freq:1;fm;am_1 | end_marker |
trnsc1_100m_closed.wav | 5 | 7.860383 | 8.060406 | 6.905000 | 7.917000 | dur:0.2;freq:7;fm;am_1 | end_marker |
trnsc1_100m_closed.wav | 6 | 8.110406 | 8.310429 | 8.417000 | 9.416000 | dur:0.2;freq:9.5;fm;am_1 | end_marker |
Alignments can be manually adjusted with the function manual_realign()
. he function generates a multipanel graph with the spectrogram of the master sound file in top of that from test sound files, highlighting the position of correspondent test sounds on both in order to simplify assessing and adjusting their alignment. Spectrograms include the first few seconds of the sound files (controlled by ‘duration’) which is usually enough to tell the precision of the alignment. The lower spectrogram shows a series of ‘buttons’ that users can click on to control if the test sound file spectrogram (low panel) needs to be moved to the left (“<”) or right (“>”). Users can also reset the spectrogram to its original position (‘reset’), move on to the next sound file in ‘X’ (test sound file annotations) or stop the process (stop button). The function returns an object similar to the input object ‘X’ in which the start and end of the sounds have been adjusted:
This plot shows the aligned re-recorded sounds for transect 1:
Similar images can be generated for each test sound file with the function ‘plot_align_sounds()’.
4 Measuring degradation
Must degradation metrics involve comparing tests sounds that were recorded at different distances from the speaker, to their reference, which is typically recorded at 1m. Hence, a column indicating the distance at which each sound was recorded is needed. In this case the recording distance can be extracted from the sound file name:
Once the distance is included in the annotations degradation metrics can be obtained with few lines of code. First the function `set_reference_sounds()’ is used to define the reference for each test sound:
For instance the following code computes excess attenuation, signal-to-noise ratio, blur ratio and tail-to-signal ratio:
Code
# get degradation measures
degrad_df <- blur_ratio(test_data)
degrad_df <- excess_attenuation(degrad_df)
degrad_df <- signal_to_noise_ratio(degrad_df,
mar = 0.1) # mar = margin to measure noise
degrad_df <- tail_to_signal_ratio(degrad_df, mar = 0.1) # mar = margin to measure tail
# or as a pipe
degrad_df <- test_data |>
excess_attenuation() |>
signal_to_noise_ratio(mar = 0.1) |> # mar = margin to measure noise
blur_ratio() |>
tail_to_signal_ratio(mar = 0.1) # mar = margin to measure tail
head(degrad_df) # print the first 6 rows
sound.files | selec | start | end | bottom.freq | top.freq | sound.id | marker | distance | reference | excess.attenuation | signal.to.noise.ratio | blur.ratio | tail.to.signal.ratio |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
trnsc1_100m_closed.wav | 2 | 7.110315 | 7.310338 | 7.875 | 8.805 | dur:0.2;freq:9;fm;am_1 | end_marker | 100 | trnsc1_1m_open.wav-2 | -8.999975 | 0.7685672 | 0.2920241 | -0.9995171 |
trnsc1_100m_closed.wav | 3 | 7.360338 | 7.560361 | 3.208 | 4.069 | dur:0.2;freq:4;fm;am_1 | end_marker | 100 | trnsc1_1m_open.wav-3 | -10.829675 | 0.6384587 | 0.2976129 | 0.1901690 |
trnsc1_100m_closed.wav | 4 | 7.610361 | 7.810383 | 0.422 | 1.223 | dur:0.2;freq:1;fm;am_1 | end_marker | 100 | trnsc1_1m_open.wav-4 | -26.216040 | 4.5246601 | 0.1418792 | 2.4441865 |
trnsc1_100m_closed.wav | 5 | 7.860383 | 8.060406 | 6.905 | 7.917 | dur:0.2;freq:7;fm;am_1 | end_marker | 100 | trnsc1_1m_open.wav-5 | -15.191104 | -1.3816010 | 0.2667528 | 0.0593048 |
trnsc1_100m_closed.wav | 6 | 8.110406 | 8.310429 | 8.417 | 9.416 | dur:0.2;freq:9.5;fm;am_1 | end_marker | 100 | trnsc1_1m_open.wav-6 | -6.041307 | -0.2574492 | 0.2059137 | -1.2052163 |
trnsc1_100m_closed.wav | 7 | 8.360429 | 8.560451 | 4.171 | 4.839 | dur:0.2;freq:4.5;fm;am_1 | end_marker | 100 | trnsc1_1m_open.wav-7 | -12.278579 | -0.7122037 | 0.2213225 | 1.6697592 |
The package can also generate images to inspect the patterns of degradation on different acoustic dimension. The function ‘plot_degradation()’ produces JPEG files with a mosaic of visual representations of sounds (spectrograms, power spectrum and amplitude envelopes) for each test sound and correspondent reference sound
The function plot_blur_ratio()
can help explore more closely the effects of degradation on signal structure. It generates image files (in ‘jpeg’ format) for each possible blur ratio estimation in the test sound data. The image files show the spectrograms of both sounds and the overlaid energy distribution (either amplitude envelopes or power spectrum, see argument ‘type’) as probability mass functions (PMF). The output graphs highlight the mismatch between the compared distribution which represent the estimated blur ratio returned by either blur_ratio()
or spectrum_blur_ratio()
.
Amplitude envelope blur ratio (or simply blur ratio):
Power spectrum blur ratio:
Spectrograms are shown within the frequency range of the reference sound and also show vertical lines with the start and end of sounds.
Plotting degradation metrics against distance can give us a sense of how well each of them quantify aspects of degradation as this is expected to increase with distance:
Code
## prepare data add habitat strucvture
degrad_df$habitat.structure <- ifelse(grepl("closed", degrad_df$sound.files),
"closed", "open")
names(degrad_df)
# select subset of 3 kHz tonal flat sounds and aggregate by
# distance mean
agg_degrad <- aggregate(cbind(blur.ratio, spectrum.blur.ratio, excess.attenuation,
envelope.correlation, spectrum.correlation, cross.correlation,
signal.to.noise.ratio, tail.to.signal.ratio) ~ distance + habitat.structure,
degrad_df[degrad_df$distance > 1 & grepl("freq:3;tonal;flat",
degrad_df$sound.id), ], mean)
# and SD
agg_degrad_sd <- aggregate(cbind(blur.ratio, spectrum.blur.ratio,
excess.attenuation, envelope.correlation, spectrum.correlation,
cross.correlation, signal.to.noise.ratio, tail.to.signal.ratio) ~
distance + habitat.structure, degrad_df[degrad_df$distance > 1 &
grepl("freq:5;tonal;flat", degrad_df$sound.id), ], sd)
# stack to use with ggplot2
stck_degrad <- stack(agg_degrad)
stck_degrad$distance <- as.numeric(stck_degrad$values[stck_degrad$ind ==
"distance"])
stck_degrad$habitat.structure <- stck_degrad$values[stck_degrad$ind ==
"habitat.structure"]
stck_degrad <- stck_degrad[-1:-16, ]
stck_degrad_sd <- stack(agg_degrad_sd)[-1:-16, ]
stck_degrad$sd <- as.numeric(stck_degrad_sd$values)
stck_degrad$values <- as.numeric(stck_degrad$values)
stck_degrad$ind <- gsub("\\.", " ", stck_degrad$ind)
stck_degrad$ind <- gsub(" to ", "-to-", stck_degrad$ind)
# plot
gg <- ggplot(data = stck_degrad, aes(x = distance, y = values, color = habitat.structure)) +
geom_point(size = 3) + geom_errorbar(aes(ymin = values - sd, ymax = values +
sd), width = 0) + geom_line(aes(colour = habitat.structure, group = habitat.structure)) +
scale_color_viridis_d(alpha = 0.6, begin = 0.3, end = 0.8) + facet_wrap(~ind,
ncol = 2, scales = "free_y") + theme_classic() + scale_x_continuous(breaks = unique(stck_degrad$distance),
labels = unique(stck_degrad$distance)) + labs(color = "Habitat\nstructure",
x = "Distance (m)", y = "Values")
gg
Finally, users can explore the variation in background noise across re-recorded files with the function noise_profile()
, which estimates full spectrum sound pressure levels (i.e. noise profiles) of the ambient noise for annotations or entire sound files:
Code
# compute noise profile only for 10 m files
noise_prof <- noise_profile(X = test_data[test_data$distance == 10,
], mar = 0.01, noise.ref = "adjacent", averaged = TRUE)
# get the original sound file name
noise_prof$org.sound.files <- sapply(strsplit(noise_prof$sound.files,
".wav"), "[[", 1)
# aggregate by frequency and original sound file
agg_noise_prof <- aggregate(amp ~ freq + org.sound.files, noise_prof,
mean)
# add habitat structure and transect info
agg_noise_prof$habitat.structure <- ifelse(grepl("closed", agg_noise_prof$org.sound.files),
"closed", "open")
agg_noise_prof$transect <- sapply(strsplit(agg_noise_prof$org.sound.files,
"_"), "[[", 1)
# plot
ggplot(data = agg_noise_prof, aes(y = amp, x = freq, col = habitat.structure,
lty = transect, group = org.sound.files)) + geom_point(size = 1,
color = "gray") + geom_line(linewidth = 1.4) + scale_color_viridis_d(begin = 0.2,
end = 0.8, alpha = 0.5) + labs(x = "Frequency (kHz)", y = "Amplitude (dBA)",
color = "Habitat\nstructure", lty = "Transect") + coord_flip() +
theme_classic()
Session information
R version 4.4.1 (2024-06-14)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 22.04.4 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=es_CR.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=es_CR.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=es_CR.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=es_CR.UTF-8 LC_IDENTIFICATION=C
time zone: America/Costa_Rica
tzcode source: system (glibc)
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] ggplot2_3.5.1 viridis_0.6.5 viridisLite_0.4.2 tidyr_1.3.1
[5] baRulho_2.1.2 ohun_1.0.2 warbleR_1.1.32 NatureSounds_1.0.4
[9] seewave_2.2.3 tuneR_1.4.7 rprojroot_2.0.4 formatR_1.14
[13] knitr_1.48 kableExtra_1.4.0 remotes_2.5.0
loaded via a namespace (and not attached):
[1] gtable_0.3.5 rjson_0.2.23 xfun_0.48 htmlwidgets_1.6.4
[5] vctrs_0.6.5 tools_4.4.1 bitops_1.0-9 generics_0.1.3
[9] parallel_4.4.1 tibble_3.2.1 proxy_0.4-27 fansi_1.0.6
[13] pkgconfig_2.0.3 KernSmooth_2.23-24 checkmate_2.3.2 lifecycle_1.0.4
[17] farver_2.1.2 compiler_4.4.1 stringr_1.5.1 brio_1.1.5
[21] munsell_0.5.1 htmltools_0.5.8.1 class_7.3-22 RCurl_1.98-1.16
[25] yaml_2.3.10 pillar_1.9.0 MASS_7.3-61 classInt_0.4-10
[29] Deriv_4.1.6 tidyselect_1.2.1 digest_0.6.37 stringi_1.8.4
[33] purrr_1.0.2 sf_1.0-17 dplyr_1.1.4 labeling_0.4.3
[37] fastmap_1.2.0 grid_4.4.1 colorspace_2.1-1 cli_3.6.3
[41] magrittr_2.0.3 utf8_1.2.4 e1071_1.7-16 withr_3.0.1
[45] scales_1.3.0 backports_1.5.0 rmarkdown_2.28 Sim.DiffProc_4.9
[49] signal_1.8-1 igraph_2.0.3 gridExtra_2.3 png_0.1-8
[53] pbapply_1.7-2 evaluate_1.0.0 dtw_1.23-1 fftw_1.0-9
[57] testthat_3.2.1.1 rlang_1.1.4 Rcpp_1.0.13 glue_1.8.0
[61] DBI_1.2.3 xml2_1.3.6 svglite_2.1.3 rstudioapi_0.16.0
[65] jsonlite_1.8.9 R6_2.5.1 units_0.8-5 systemfonts_1.1.0