HW3 R Notebook

Setup

This neat block of R code will try to load each of the libraries we need (installing them if they’re not already installed).

if(!require(tidyverse)){
    install.packages("tidyverse")
    library(tidyverse)
}

if(!require(ggplot2)){
    install.packages("ggplot2")
    library(ggplot2)
}

if(!require(ggthemes)){
    install.packages("ggthemes")
    library(ggthemes)
}

if(!require(ggsci)){
    install.packages("ggsci")
    library(ggsci)
}

if(!require(caret)){
    install.packages('caret')
    library(caret)
}

theme_set(theme_tufte(ticks=TRUE))

Can listeners use anticipatory coarticulatory information to identify deleted vowels?

Imagine the consonant/vowel sequence CV1ˈCV2; the first vowel quality is [ə] and the second vowel is stressed. In class, we’ve discussed that, due to vowel-to-vowel coarticulation, the formant frequencies of V1 are influenced in systematic ways by the formant frequencies of V2. The goal of this assignment is to determine whether listeners (i.e., you) can use the coarticulatory information in V1 to identify a deleted V2.

Figure 1: Example Vowel to Vowel coarticulatory influence in perception

Figure 1 plots results of a previous iteration of this experiment. It was generated in Microsoft Excel and contains data for 7 people, 2 of whom had acquired English as a second language. In this figure, the x-axis categories are the deleted vowel qualities from the excised V2. Each bar represents listeners’ responses to that stimulus: height conveys proportion, and fill pattern indicates which vowel quality the participant indicated they heard. Note that bars sum to 1 within each original (deleted) vowel context.

Run the (Praat) Experiment

You already downloaded the .zip file and expanded it. In Praat, open the familiarization ieaou 3 times.aiff file and the V-to-V Perception.praat script file. When read into Praat, the attached V-to-V Perception experiment file will present 5 repetitions each of 30 edited stimuli in a randomized order, for a total of 150 stimulus presentations. You need to run the experiment on yourself. The test must be taken over decent headphones, in a quiet place, with minimal distractions; come to the phonetics lab if you don’t have headphones, a quiet space, or working Praat.

First, please listen to the familiarization series (Sound familiarization_ieaou_3_times in the Praat objects window), which consists of a series of 5 [ə]’s excised from /i e a o u/ contexts (in that order; 3 repetitions of the set of 5, for a total of 15 tokens). Listen closely. Do this a few times if you need to. Then select and run the actual test (there are 150 trials with a break every 50 if you want it). Your task is to identify which V2 vowel has been deleted when you hear only [ə] (the original V1 vowel). Important: be careful not to quit Praat until you’ve extracted your results and saved the data (see next step) or you’ll lose your data.
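As a rough illustration of the trial counts described above, the sketch below builds a shuffled trial list in base R. The stimulus labels are hypothetical (patterned on the labels parsed later in this notebook), and the figure of 6 items per vowel is an assumption inferred from 30 stimuli divided across 5 vowel contexts. Praat handles the actual randomization; this only shows the arithmetic.

```r
# Sketch only: 5 vowel contexts x 6 items each (6 is an assumption: 30 / 5)
vowels <- c("i", "e", "a", "o", "u")
items  <- 1:6

# Hypothetical stimulus labels of the form "@b<vowel><item>-cut"
stimuli <- as.vector(outer(vowels, items,
                           function(v, k) paste0("@b", v, k, "-cut")))

set.seed(1)                                # only to make the sketch repeatable
trials <- sample(rep(stimuli, times = 5))  # 5 repetitions of each, shuffled

length(stimuli)       # number of distinct stimuli
length(trials)        # number of stimulus presentations
table(table(trials))  # how many stimuli occur how many times
```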

Analyze Your Data

After running the experiment, select Extract results from the Praat objects window, then highlight the Results MFC V-to-V Perception object and select Collect to Table. Then highlight Table allResults and, from the Save pull-down menu, select `Save to tab-delimited file`. Save the resulting text file in the experiment folder as YourLastname-YourFirstname.txt (name it exactly like this; it is part of what you’ll hand in).

Next we need to load the data into a data frame so we can manipulate, visualize, and analyze it.

# this code will load any txt files in the directory into the v2v df
v2v.txt <- list.files(pattern = "\\.txt$")

v2v <- v2v.txt %>%
  set_names() %>%
  map_dfr(read_delim, .id = "file", col_select=c(subject, stimulus, response, reactionTime)) %>%
  # turn stimulus into levels
  extract(stimulus, into = c("deletedV", "itemN"), "^@b([aeiou])(\\d)-cut$", remove = FALSE) %>%
  # flag as correct/incorrect
  mutate(matching = 1 * (deletedV == response))  %>%
  # strip .txt extension from file column
  mutate( file = stringr::str_replace(file, "\\.txt$", "" )) 

# coerce strings of characters as factors with vowel quality order
v2v$response <- factor(v2v$response, levels = c("i", "e", "a", "o", "u"))
v2v$deletedV <- factor(v2v$deletedV, levels = c("i", "e", "a", "o", "u"))
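To see what the stimulus-parsing step does, here is a minimal base R sketch that applies the same regular expression to a few made-up stimulus labels. The labels are hypothetical; your own file may contain different names. The notebook itself uses tidyr's extract(), which this sketch only approximates.

```r
# Same pattern as in the loading code: "@b", one vowel, one digit, "-cut"
pattern <- "^@b([aeiou])(\\d)-cut$"

# Hypothetical stimulus labels, for illustration only
stim <- c("@bi1-cut", "@bo4-cut", "@bu6-cut")

# regexec() returns the full match plus each capture group;
# this assumes every label matches the pattern
parts <- do.call(rbind, regmatches(stim, regexec(pattern, stim)))

parsed <- data.frame(stimulus = parts[, 1],
                     deletedV = parts[, 2],
                     itemN    = parts[, 3])

# Order the vowels by quality rather than alphabetically
parsed$deletedV <- factor(parsed$deletedV, levels = c("i", "e", "a", "o", "u"))
parsed
```

Note that a label that fails to match would break the rbind() step in this sketch, so it is worth checking the stimulus column of your results file by eye.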

Response to V1 by V2

The following code will replicate in R the Excel plot above (Figure 1).

v2v %>%
    count(response, deletedV) %>%
    # group by deleted vowel so the bars sum to 1 within each context, as in Figure 1
    group_by(deletedV) %>%
        mutate(s = sum(n)) %>%
        mutate(Proportion = n/s) %>% 
    ggplot(aes(x = deletedV, y = Proportion, fill = response)) +
    geom_col(position = position_dodge(preserve="single")) +
    scale_fill_observable()

In this figure, the x-axis lists the deleted vowel qualities (the V2), the y-axis shows the proportion of your responses to the auditory stimulus (the V1), and fill color indicates which vowel you responded with. Typically the ‘correct’ response will be the most common response to each stimulus, but there is also typically a second pattern: when there is a mismatch between the deleted V and the participant’s response, the mismatches tend to share a phonetic feature. Do you see any such pattern in your results?
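If you want numbers to go with the plot, the following sketch shows one way to compute response proportions within each deleted-vowel context, accuracy per vowel, and a comparison against the 20% chance level. It uses randomly generated toy data (the data frame toy is hypothetical), so the values it prints say nothing about real listeners; substitute your own v2v data frame to get your results. The exact binomial test is one reasonable choice here, not the only one.

```r
vowels <- c("i", "e", "a", "o", "u")

# Toy data: 30 trials per deleted vowel, with purely random responses
set.seed(42)
toy <- data.frame(
  deletedV = factor(rep(vowels, each = 30), levels = vowels),
  response = factor(sample(vowels, 150, replace = TRUE), levels = vowels)
)
toy$matching <- as.integer(as.character(toy$deletedV) == as.character(toy$response))

# Proportion of each response within each deleted-vowel context (rows sum to 1)
by_context <- prop.table(table(deletedV = toy$deletedV, response = toy$response),
                         margin = 1)
round(by_context, 2)

# Accuracy for each deleted vowel, and overall
acc_by_vowel <- tapply(toy$matching, toy$deletedV, mean)
overall <- mean(toy$matching)

# One-sided exact binomial test of overall accuracy against chance (0.2)
binom.test(sum(toy$matching), nrow(toy), p = 0.2, alternative = "greater")
```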

Confusion Matrix

Miller & Nicely (1955, 341) analyzed perceptual confusions among (some) English consonants and found that their participants confused some consonants more than others at different signal-to-noise ratios (SNR). In one particular listening condition (+6 dB SNR, which is quite noisy), presented over a frequency band of 200–6500 Hz, listeners frequently confused [f] and [θ]. This confusability was presented in a “confusion matrix”, sometimes called a “confusability matrix”. Here is a sample of their Table V:

A portion of Miller and Nicely’s (1955) Table V (“-” marks an empty cell)

       f    θ    s    ʃ
f    207   57    -    -
θ     71  142    3    -
s      1    7  232    2
ʃ      -    -    1  239
This table shows that when listeners heard [f] (top row), they responded [f] 207 times and [θ] 57 times. The next row shows listeners’ responses to [θ], which they heard 71 times as [f], 142 times as [θ], and 3 times as [s]. Presented below is R code, using the caret and ggplot2 libraries, for generating a confusion matrix from your deleted V responses. It shows the same data as the bar chart in the previous section, but arranged in a way that helps us think about those data differently.

First we can generate a small text table along with some summary statistics:

# caret puts its first argument in the rows and its second in the columns,
# so the dimension names (dnn) are given in that same order
cm <- confusionMatrix(v2v$deletedV, v2v$response, dnn = c("DeletedV", "Response"))
cm
Confusion Matrix and Statistics

        Response
DeletedV  i  e  a  o  u
       i 15 11  2  0  2
       e  8 14  7  1  0
       a  2  7 13  5  3
       o  1  1  2 19  7
       u  0  0  4 14 12

Overall Statistics
                                          
               Accuracy : 0.4867          
                 95% CI : (0.4043, 0.5695)
    No Information Rate : 0.26            
    P-Value [Acc > NIR] : 2.305e-09       
                                          
                  Kappa : 0.3583          
                                          
 Mcnemar's Test P-Value : NA              

Statistics by Class:

                     Class: i Class: e Class: a Class: o Class: u
Sensitivity            0.5769  0.42424  0.46429   0.4872   0.5000
Specificity            0.8790  0.86325  0.86066   0.9009   0.8571
Pos Pred Value         0.5000  0.46667  0.43333   0.6333   0.4000
Neg Pred Value         0.9083  0.84167  0.87500   0.8333   0.9000
Prevalence             0.1733  0.22000  0.18667   0.2600   0.1600
Detection Rate         0.1000  0.09333  0.08667   0.1267   0.0800
Detection Prevalence   0.2000  0.20000  0.20000   0.2000   0.2000
Balanced Accuracy      0.7280  0.64375  0.66247   0.6940   0.6786

Now let’s plot the confusion matrix with ggplot.

plt <- as.data.frame(cm$table)
# reverse the response levels so the y-axis reads i to u from top to bottom
plt$Response <- factor(plt$Response, levels=rev(levels(plt$Response)))

# plot the confusion matrix in a geom_tile()
ggplot(plt, aes(DeletedV, Response, fill= Freq)) +
        geom_tile() + geom_text(aes(label=Freq)) +
        scale_fill_gradient(low="white", high="#6666CC") +
        labs(x = "Deleted V", y = "Response") +
        # these next two lines are here to sort the identities on 
        # the diagonal from upper left to lower right
        scale_x_discrete(labels=c("i","e","a","o","u"), position="top") +
        scale_y_discrete(labels=c("u","o","a","e","i"))

One nice property of this view over the bar chart above is that here we can really clearly see where there were no confusions alongside the cells with numerous confusions (did you notice that some response bars were missing entirely in the first figure?).
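To demystify a few of the summary statistics in the caret printout, the sketch below recomputes Accuracy, Kappa, and Sensitivity by hand from the counts in the example output above. The counts are typed in exactly as printed, with rows corresponding to the first argument passed to confusionMatrix() and columns to the second. The variable names are my own, and the formulas are the standard definitions (Cohen's kappa; sensitivity as the diagonal count over the column total), which is how I understand caret to compute them.

```r
vowels <- c("i", "e", "a", "o", "u")

# Counts copied from the example confusion matrix, row by row
counts <- matrix(
  c(15, 11,  2,  0,  2,
     8, 14,  7,  1,  0,
     2,  7, 13,  5,  3,
     1,  1,  2, 19,  7,
     0,  0,  4, 14, 12),
  nrow = 5, byrow = TRUE, dimnames = list(vowels, vowels)
)

n <- sum(counts)

# Accuracy: proportion of trials on the diagonal
accuracy <- sum(diag(counts)) / n

# Cohen's kappa: agreement beyond what the row and column totals predict
expected    <- sum(rowSums(counts) * colSums(counts)) / n^2
cohen_kappa <- (accuracy - expected) / (1 - expected)

# Sensitivity per class: diagonal count divided by its column total
sensitivity <- diag(counts) / colSums(counts)

round(c(accuracy = accuracy, kappa = cohen_kappa), 4)
round(sensitivity, 4)
```

The rounded values should agree with the Accuracy, Kappa, and Sensitivity lines in the printout, which is a useful sanity check that you are reading the table the same way caret does.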

Answer the following Questions

I recommend you read through all of the questions before answering.

  1. Chance performance on this task is 20% correct (i.e., given 5 response options, if you responded completely randomly, you would be correct, on average, 1 out of 5 times). To what extent were you successful in identifying the deleted vowels? In answering this question, you should consider not only your percentage accuracy on a given vowel (e.g., How often did you respond /i/ when the original context vowel was /i/?), but also the articulatory-acoustic properties of vowels. (10pts)
Note

Your answer could go here!

  2. This is a difficult task and my experience is that there is considerable across-listener variation in perception. The graph in Figure 1 gives the averaged results for 7 listeners with varying language backgrounds and experience with English. In articulatory terms, what vowel properties emerged as particularly salient for these listeners (i.e., in what respects were they consistently performing above chance level)? (15pts)
Note

Your answer could go here!

  3. For this exercise, we won’t analyze the acoustic characteristics of the [ə]s that you identified, so this question is a thought experiment: Given your response to (2), what acoustic (e.g., F1, F2) effects of coarticulation on [ə] would you expect to emerge (or not to emerge) in an acoustic analysis? Briefly explain your answer. (15pts)
Note

Your answer could go here!

What to hand in:

  1. Your answers to the 3 questions in the previous section (preferably in PDF format).

  2. The working plot of your results (just the output of this Rmd notebook is fine). This is worth 10 pts on its own and is necessary for answering question 1.

  3. Your results text file (as a separate text file, attached to your Canvas submission). You get 10pts just for including this file (or lose 10 for excluding it).

References

Beddor, Patrice Speeter, James D Harnsberger, and Stephanie Lindemann. 2002. “Language-Specific Patterns of Vowel-to-Vowel Coarticulation: Acoustic Structures and Their Perceptual Correlates.” Journal of Phonetics 30 (4): 591–627.
Magen, Harriet S. 1997. “The Extent of Vowel-to-Vowel Coarticulation in English.” Journal of Phonetics 25 (2): 187–205.
Manuel, Sharon Y, and Rena A Krakow. 1984. “Universal and Language Particular Aspects of Vowel-to-Vowel Coarticulation.” Haskins Laboratories Status Report on Speech Research 77 (78): 69–78.
Miller, George A., and Patricia E. Nicely. 1955. “An Analysis of Perceptual Confusions Among Some English Consonants.” The Journal of the Acoustical Society of America 27 (2): 338–52. https://doi.org/10.1121/1.1907526.