Due before class on Wednesday, 1/14/15. Ideally, submit via RPubs as outlined in HW1. If you prefer, you can also use knitr to compile to .pdf and submit as an email attachment. As a last resort, if you can’t get knitr to work, send me your .R or .Rmd source file. Please don’t submit a .pdf with code.

Tips

Exercises

  1. If you postponed it last week, do exercises 1-4 from Baayen ch.2.
  2. Levy ch.1, pp.34-6, exercise 2.2, modified: “Give an example in words (involving language understanding, and not one that was specifically discussed in class) where two events \(A\) and \(B\) are conditionally independent given some state of knowledge \(C\), but when another piece of knowledge \(D\) is learned, \(A\) and \(B\) lose conditional independence.”
  3. Levy ch.1, pp.34-6, exercise 2.3:
  4. Re-do both parts of the previous exercise using a simulation in R instead of mathematical reasoning. That is, think carefully about the generative process by which this example proceeds, and write a program that implements a model of this process and uses it to generate many draws of 3 letters. Use the sample() function, and take the proportion of samples that have the desired property as your best estimate of the probability. Hints: you can learn about sample() by typing ?sample into the console. Think carefully about what vector you are sampling from. You may fill in letters for which Levy does not specify frequencies in 2.5.2 with a generic ‘other’ value. Also think carefully about whether to set sample(..., replace = TRUE) or sample(..., replace = FALSE) when answering each sub-question.
  5. Levy ch.1, pp.34-6, exercise 2.10: “For adult female native speakers of American English, the distribution of first-formant frequencies for the vowel [E] is reasonably well modeled as a normal distribution with mean 608Hz and standard deviation 77.5Hz. What is the probability that the first-formant frequency of an utterance of [E] for a randomly selected adult female native speaker of American English will be between 555Hz and 697Hz?” Show the R code that you used to calculate the answer. (Hint: look back at Monday’s class notes, specifically the part about using R to find the cumulative probability of continuous distributions.)
  6. Design a simulation implementing Pearl’s rain/sprinkler/wet grass example as discussed in class. Make sure that the samples you generate assign a truth-value to each variable (rain, sprinkler, and wet grass) and that they have the assumed dependency structure: rain and sprinkler are uncaused, each occurring with some fixed probability (say, both are flip(.3)), and wet grass occurs if and only if either it rained or the sprinkler was on. You’ll want to begin your code by defining the ‘coin flip’ function.
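# flip(p) returns TRUE with probability p
flip = function(p) runif(1,0,1) < p
# sim(i) generates one sample from the model
sim = function(i) {
  rain = ...
  sprinkler = ...
  ...
  return(...) # what you return should be a non-trivial vector
}
# run the simulation 1000 times
samples = sapply(1:1000, FUN=sim)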
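
The R tools mentioned in the exercises above can be tried out on toy inputs first. The sketch below is only an illustration under stated assumptions: the vector `toy`, its weights `toy_prob`, the function `toy_sim`, and the fixed seed are all hypothetical and are not taken from Levy or from the exercises, so it does not answer any of them. It shows how `replace` changes what sample() can return, how a probability over an interval is a difference of two pnorm() values, and what shape sapply() gives back when the simulated function returns a named vector.

```r
set.seed(1)  # fixed seed for reproducibility (an assumption, not a course requirement)

# 1. sample(): toy items with made-up weights (NOT Levy's letter frequencies)
toy = c("a", "b", "other")
toy_prob = c(0.2, 0.3, 0.5)
n = 10000
# each row is one draw of 2 items; replace = TRUE lets an item recur within a draw
draws = t(replicate(n, sample(toy, size = 2, replace = TRUE, prob = toy_prob)))
p_first_a = mean(draws[, 1] == "a")  # proportion of draws whose first item is "a"

# with replace = FALSE the two items in a draw are always distinct
no_repl = t(replicate(n, sample(toy, size = 2, replace = FALSE)))
any_repeat = any(no_repl[, 1] == no_repl[, 2])

# 2. pnorm(): P(lo < X < hi) = F(hi) - F(lo), here for a standard normal
p_within_1sd = pnorm(1, mean = 0, sd = 1) - pnorm(-1, mean = 0, sd = 1)

# 3. sapply() over a function that returns a named logical vector
flip = function(p) runif(1, 0, 1) < p
toy_sim = function(i) {
  x = flip(.5)
  y = flip(.5)
  c(x = x, y = y, both = x && y)
}
toy_samples = sapply(1:1000, FUN = toy_sim)  # 3 x 1000 matrix, one column per sample
p_est = rowMeans(toy_samples)                # estimated probability of each variable
# conditional estimate: among samples where x is TRUE, how often y is TRUE
p_y_given_x = mean(toy_samples["y", toy_samples["x", ]])
```

Because each sample is a column, rowMeans() gives marginal estimates, and indexing the columns by one row gives conditional estimates. The same pattern should carry over to any simulation whose function returns one truth-value per variable.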