Question 1

1A

# use read.delim() to read txt files to df's
gb.df <- read.delim('goldrickBlumstein.txt')
bach.df <- read.delim('bach.txt')

1B

# use $ and mean() to access fields in df
(mean.voiced <- mean(gb.df$VOT[gb.df$OnsetVoicing=='voiced']))
## [1] 22.66587
(mean.voiceless <- mean(gb.df$VOT[gb.df$OnsetVoicing=='voiceless']))
## [1] 65.2869

Voiced onsets have a much shorter mean VOT (about 22.7) than voiceless onsets (about 65.3), a nearly threefold difference.
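The comparison above can also be done in one call with tapply(), which computes a statistic for each level of a grouping variable. This is a minimal sketch on a made-up data frame: toy.df and its values are hypothetical stand-ins for gb.df, not the real data.

```r
# Toy data frame standing in for gb.df; the values are invented for illustration.
toy.df <- data.frame(
  OnsetVoicing = c('voiced', 'voiced', 'voiceless', 'voiceless'),
  VOT = c(20, 24, 60, 70)
)

# tapply() splits VOT by OnsetVoicing and applies mean() to each group.
group.means <- tapply(toy.df$VOT, toy.df$OnsetVoicing, mean)
print(group.means)

# The same number via logical subsetting, as in the answer above.
mean.voiced.toy <- mean(toy.df$VOT[toy.df$OnsetVoicing == 'voiced'])
```

With more than two groups, tapply() avoids writing one subsetting line per group.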

1C

# use $ and <- to specify and add a new field
gb.df$NVOT <- gb.df$VOT/gb.df$VowelLength
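Because division in R is vectorized, the assignment above computes one ratio per row. A small sketch, assuming a hypothetical toy.df with invented values, shows what the new column contains:

```r
# Toy data frame standing in for gb.df; the values are invented for illustration.
toy.df <- data.frame(VOT = c(20, 60), VowelLength = c(200, 240))

# Element-wise division: row i of NVOT is VOT[i] / VowelLength[i].
toy.df$NVOT <- toy.df$VOT / toy.df$VowelLength
print(toy.df)
```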

1D

# use $ and mean() to compare Duration for notes at or above vs. below middle C (Note 60)
(mean.aboveC <- mean(bach.df$Duration[bach.df$Note >= 60]))
## [1] 1.007399
(mean.belowC <- mean(bach.df$Duration[bach.df$Note < 60]))
## [1] 0.9338945

These two means are much closer to each other than the VOT means in 1B: notes at or above middle C are only slightly longer on average (about 1.01 vs. 0.93).
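An alternative to two separate subsetting calls is to label each row by register and let aggregate() compute both means at once. This is only a sketch: toy.bach, the Register column, and all values are hypothetical and not taken from bach.txt.

```r
# Hypothetical stand-in for bach.df; these notes and durations are invented.
toy.bach <- data.frame(
  Note = c(55, 59, 60, 72),
  Duration = c(1, 0.5, 1, 2)
)

# Label each row by register; 60 is the middle C cutoff used above.
toy.bach$Register <- ifelse(toy.bach$Note >= 60, 'at.or.above', 'below')

# aggregate() returns one row per register with the mean Duration.
register.means <- aggregate(Duration ~ Register, data = toy.bach, FUN = mean)
print(register.means)
```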

Question 2

2A

# use hist() and subset data
hist(gb.df$VOT[gb.df$OnsetVoicing=='voiceless'])

  1. The bulk of the data is symmetrically centered around the mean.
  2. The data has one mode, around which the data is centered.
  3. These values are limited by human physiology. It is impossible to switch from a consonant to a vowel in 0 milliseconds, although the switch can be very rapid. It is also difficult to sustain a consonant for long before switching to a vowel, though consonants can be sustained slightly.
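The symmetry and single-mode observations above can be checked numerically rather than by eye. The sketch below assumes a made-up vector x in place of the voiceless VOT values. It compares the mean with the median and uses hist() with plot = FALSE to locate the fullest bin.

```r
# Invented values standing in for gb.df$VOT[gb.df$OnsetVoicing=='voiceless'].
x <- c(50, 55, 60, 62, 65, 65, 68, 70, 75, 80)

# For roughly symmetric data the mean and median should be close.
center <- c(mean = mean(x), median = median(x))
print(center)

# plot = FALSE returns the bin counts and midpoints without drawing anything.
h <- hist(x, breaks = seq(40, 90, by = 10), plot = FALSE)

# The midpoint of the bin with the highest count approximates the mode.
modal.mid <- h$mids[which.max(h$counts)]
print(modal.mid)
```

The bin width is a choice: a different breaks argument can change how many peaks appear, so it is worth trying more than one.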

2B

# use hist() and subset data
hist(gb.df$VowelLength[gb.df$OnsetVoicing=='voiced'])

  1. Some of the data is symmetric around the center. However, there is a large mass at the lower end.
  2. Although it is difficult to call the lower mass an additional mode, it is clear from the histogram that the data has multiple sources.
  3. A vowel cannot have length 0, otherwise it does not exist. I assume there is also a physical lower limit below which a human cannot produce a sound. A vowel can be sustained for as long as a speaker can expel air.

2C

# use hist() and subset data
hist(gb.df$NVOT[gb.df$OnsetVoicing=='voiceless'])