Statistical Electroacoustic Sonic Analyzation

Introduction to Sonic Analyzation

The Goal: Electroacoustic analysis bridges the gap between subjective musical listening and objective data science.
The Process: By converting audio signals into quantifiable datasets, producers can analyze frequency balances, dynamic ranges, and stereo field imaging.
The Application: This statistical approach allows for precise equalization, compression, and mastering in digital audio workstations.

The Math Behind Sound

To analyze an audio signal, we must convert it from the time domain (how the amplitude changes over time) to the frequency domain (which frequencies are present, and how strongly).

The Discrete Fourier Transform (DFT) makes this possible. Here is the mathematical foundation:

\[X_k = \sum_{n=0}^{N-1} x_n e^{-i 2\pi k n / N}\]

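\(X_k\) is the complex coefficient for frequency bin \(k\); its magnitude and angle give that bin's amplitude and phase.
\(x_n\) is the \(n\)-th time domain sample.
\(N\) is the total number of samples, so \(k\) runs from \(0\) to \(N-1\).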
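
As a minimal sketch of the formula above, the following R code evaluates the sum directly for a short test signal and compares the result with base R's fft(), which computes the same transform with a faster algorithm. The helper name naive_dft, the 64 Hz sampling rate, and the 8 Hz test tone are assumptions chosen for illustration.

```r
# Direct evaluation of X_k = sum_{n=0}^{N-1} x_n * exp(-i 2 pi k n / N).
# naive_dft is a hypothetical helper written for this illustration.
naive_dft <- function(x) {
  N <- length(x)
  n <- 0:(N - 1)
  sapply(0:(N - 1), function(k) sum(x * exp(-2i * pi * k * n / N)))
}

# Assumed test signal: 1 second of an 8 Hz sine sampled at 64 Hz
fs <- 64
time_s <- (0:(fs - 1)) / fs
x <- sin(2 * pi * 8 * time_s)

X <- naive_dft(x)

# Base R's fft() uses the same sign convention, so the two should agree
max_diff <- max(Mod(X - fft(x)))

# Bin k corresponds to frequency k * fs / N; for real input only the
# first half of the bins is unique
half <- length(x) / 2
freqs <- (0:(half - 1)) * fs / length(x)
peak_freq <- freqs[which.max(Mod(X[1:half]))]
```

Here max_diff should be at the level of floating-point rounding, and peak_freq should land on the 8 Hz bin. The direct sum costs on the order of N squared operations, which is why fft() is preferred for real audio.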

Statistical Metrics in Audio Dynamics

When mastering a track, peak level matters less than perceived loudness. A common first approximation of loudness is the Root Mean Square (RMS) level.

RMS is the square root of the signal's mean power over a window. It tracks the perceived loudness of sustained sounds more closely than peak level does, although it ignores the ear's frequency-dependent sensitivity:

\[x_{\mathrm{RMS}} = \left( \frac{1}{n}\sum_{i=1}^{n} x_i^2 \right)^{1/2}\]

\(x_i\) represents the individual amplitude values.
\(n\) represents the total number of values in the window.
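
The RMS formula above can be sketched in a few lines of R. The helper name rms, the 440 Hz test tone, and the 44.1 kHz sampling rate are assumptions for illustration. The crest factor (peak level minus RMS level, in dB) is included as one common way to compare the two measures.

```r
# rms is a hypothetical helper implementing the formula above
rms <- function(x) sqrt(mean(x^2))

# Assumed test signal: 1 second of a full-scale 440 Hz sine at 44.1 kHz
fs <- 44100
time_s <- (0:(fs - 1)) / fs
x <- sin(2 * pi * 440 * time_s)

peak_level <- max(abs(x))
rms_level <- rms(x)   # close to 1 / sqrt(2) for a sine wave

# Express both levels in decibels relative to full scale (amplitude 1.0)
peak_db <- 20 * log10(peak_level)
rms_db <- 20 * log10(rms_level)

# Crest factor: the gap between peak and RMS level, in dB
crest_db <- peak_db - rms_db   # about 3 dB for a sine wave
```

In practice the same function would be applied to short windows of a track rather than to the whole signal, giving a loudness curve over time.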

Visualizing Audio Waveforms

This slide displays the R code used to generate our first plot.

# Load ggplot2 for plotting
library(ggplot2)

# Fix the random seed so the simulated noise is reproducible
set.seed(42)

# Generating a simulated audio waveform (10 Hz sine wave + Gaussian noise)
time <- seq(0, 1, length.out = 500)
amplitude <- sin(2 * pi * 10 * time) + rnorm(500, mean = 0, sd = 0.2)
audio_data <- data.frame(Time = time, Amplitude = amplitude)

# Plotting the waveform
waveform <- ggplot(audio_data, aes(x = Time, y = Amplitude)) +
  geom_line(color = "steelblue") +
  theme_minimal() +
  labs(title = "Simulated Audio Waveform", x = "Time (s)", y = "Amplitude")

The Waveform Output

Here is the ggplot2 visualization generated by the code on the previous slide.

Frequency Density Distribution

Using a density plot of the sample amplitudes, we can see how often each amplitude level occurs within an audio file. A heavily compressed track piles its samples up near the maximum level, while a dynamically rich track spreads them across a wider range of levels.
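
One way to sketch this comparison is to plot the amplitude densities of a simulated signal before and after heavy limiting. Everything here is an assumption for illustration: the swelling 10 Hz test signal, the gain of 4, and the use of hard clipping at 0.5 as a crude stand-in for a real compressor or limiter.

```r
library(ggplot2)

set.seed(42)  # reproducible noise

# Assumed "dynamic" signal: a 10 Hz sine whose level swells over time, plus noise
n <- 5000
time_s <- seq(0, 1, length.out = n)
dynamic <- time_s * sin(2 * pi * 10 * time_s) + rnorm(n, mean = 0, sd = 0.05)

# Crude stand-in for heavy limiting: boost the gain, then hard-clip to [-0.5, 0.5]
limited <- pmax(pmin(4 * dynamic, 0.5), -0.5)

density_data <- data.frame(
  Amplitude = c(dynamic, limited),
  Signal = rep(c("Dynamic", "Heavily limited"), each = n)
)

# Overlaid kernel density estimates of the sample amplitudes
density_plot <- ggplot(density_data, aes(x = Amplitude, fill = Signal)) +
  geom_density(alpha = 0.5) +
  theme_minimal() +
  labs(title = "Amplitude Density", x = "Amplitude", y = "Density")

# Fraction of samples sitting at the clipping ceiling
share_at_ceiling <- mean(abs(limited) >= 0.5)
```

The limited signal should show spikes at both edges of the plot, where clipped samples accumulate, while the dynamic signal peaks near zero and tapers off toward the extremes.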

3D Spectrogram

A spectrogram visualizes sound by mapping time, frequency, and amplitude simultaneously: in the 3D version, amplitude is drawn as the height of a surface over the time and frequency axes.
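
A spectrogram of this kind can be sketched with a short-time Fourier transform in base R, with plotly drawing the surface. The rising test tone, the 256-sample Hann window, and the 128-sample hop are assumptions for illustration, and the plotting step is skipped if plotly is not installed.

```r
# Assumed test signal: a tone gliding from 500 Hz to 1500 Hz, 2 seconds at 8 kHz
fs <- 8000
time_s <- (0:(2 * fs - 1)) / fs
x <- sin(2 * pi * (500 * time_s + 250 * time_s^2))

# Short-time Fourier transform: window each frame, then take its FFT
win_len <- 256
hop <- 128
hann <- 0.5 - 0.5 * cos(2 * pi * (0:(win_len - 1)) / win_len)
starts <- seq(1, length(x) - win_len + 1, by = hop)

# Rows = frequency bins, columns = time frames
spec <- sapply(starts, function(s) {
  frame <- x[s:(s + win_len - 1)] * hann
  Mod(fft(frame))[1:(win_len / 2)]
})
spec_db <- 20 * log10(spec + 1e-12)   # small offset avoids log10(0)

# Axis values: bin center frequencies and frame center times
freq_hz <- (0:(win_len / 2 - 1)) * fs / win_len
frame_time <- (starts - 1 + win_len / 2) / fs

# Optional 3D surface, drawn only if plotly is installed
if (requireNamespace("plotly", quietly = TRUE)) {
  spectrogram_plot <- plotly::plot_ly(
    x = frame_time, y = freq_hz, z = spec_db, type = "surface"
  )
}
```

The window length sets the trade-off: longer windows resolve frequency more finely but blur changes over time, and shorter windows do the opposite.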

Conclusion

Data-Driven Production: Leveraging statistical tools like ggplot2 and plotly allows producers to see what they hear.
Math as a Tool: Understanding the math behind concepts like the DFT (computed in practice with the Fast Fourier Transform, FFT) and RMS supports better decision-making during the mixing process.
Future Work: Further analysis could involve applying these scripts to actual .wav or .mp3 files exported from a DAW.