# x <-
# y <-
# print(_____)
Construct a list of all even numbers between 1 and 100, and assign this to the variable ‘even’. Construct a list of 100 numbers, starting with 1, such that the difference between consecutive numbers is three, and assign this to the variable ‘threes’. Construct a list of 100 equally spaced numbers between 0 and 2, and assign this to the variable ‘x’.
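As a minimal sketch of one way to build these sequences, the `seq` function can be called with either a step size (`by`) or a target length (`length.out`). Note that in R these objects are atomic vectors rather than `list` objects; the sketch assumes that "list" here means a vector.

```r
# All even numbers between 1 and 100
even <- seq(2, 100, by = 2)

# 100 numbers starting at 1 with a difference of 3 between neighbours
threes <- seq(1, by = 3, length.out = 100)

# 100 equally spaced numbers between 0 and 2 (both endpoints included)
x <- seq(0, 2, length.out = 100)

length(even)
range(threes)
```

Checking `length()` and `range()` after each assignment is a quick way to confirm that a sequence has the intended size and endpoints.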
Define a function that computes the square of the input value, call this function square:
Plot trigonometric functions sine and cosine for values of x between 0 and 10. Also plot the ‘square’ function for input values between 0 and 2.
In this problem we will simulate two important experiments: coin tosses and die rolls.
We will explore empirical probability of events using random samples. The two important functions to understand are “sample” and “replicate”.
We use the sample function to set up our experiment, and the replicate function to repeat it as many times as we want.
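The pattern described above can be sketched as follows. The fair-die experiment, the seed, and the number of repetitions are illustrative assumptions, not part of the assignment.

```r
set.seed(1)  # fixed seed so the run is reproducible (illustrative choice)

# One experiment: draw a single value from 1..6, each equally likely
one_draw <- sample(1:6, size = 1)

# Repeat the same experiment 1000 times; replicate re-evaluates the
# expression on every repetition, so each draw is a fresh random sample
draws <- replicate(1000, sample(1:6, size = 1))

# Empirical probability of an event = proportion of repetitions in which it occurs
mean(draws == 6)
```

Because `draws == 6` is a logical vector, taking its `mean` gives the fraction of `TRUE` values, which is the empirical probability of the event.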
Set up a function, called ‘coin_toss’, that simulates a single toss of a coin with \(P(H) = p\in (0,1)\). Note that the function will take input \(p\), and will output the outcome of a single toss of the coin. It might help to have the function return 1 if the coin lands H, and return 0 otherwise.
Set up a function, called ‘die_roll’, that simulates a single roll of a six-sided die with a given probability distribution. Note that the function will take as input a vector \(p\) with the six probabilities associated with the respective faces of the die, and will output the outcome of a single roll of the die.
Set up the experiment where we roll two fair 6-sided dice independently, and calculate the sum of the numbers that appear.
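One possible sketch of the two functions uses the `prob` argument of `sample` to weight the outcomes. The coin function is named `coin_toss` here, and the input check inside `die_roll` is an added assumption rather than a requirement of the problem.

```r
# Single toss of a coin with P(H) = p; returns 1 for heads, 0 for tails
coin_toss <- function(p) {
  sample(c(1, 0), size = 1, prob = c(p, 1 - p))
}

# Single roll of a six-sided die; p holds the probabilities of faces 1..6
die_roll <- function(p) {
  stopifnot(length(p) == 6)  # assumed sanity check on the input
  sample(1:6, size = 1, prob = p)
}

coin_toss(0.8)
die_roll(rep(1/6, 6))
```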
Use the replicate function to repeat this experiment 10000 times, store the output of the experiment in a variable “sum_die”.
Use “sum_die” to calculate the empirical probability of getting the sums 2, 3, 4, …, 12. List these empirical probabilities into a table, using the ‘table’ function.
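A minimal sketch of this experiment is shown here. The helper name `two_dice_sum` and the seed are hypothetical choices; the theoretical values come from counting the \(6 - |s - 7|\) equally likely pairs that add up to \(s\).

```r
set.seed(1)

# One experiment: roll two fair dice independently and add the faces
two_dice_sum <- function() {
  sum(sample(1:6, size = 2, replace = TRUE))
}

sum_die <- replicate(10000, two_dice_sum())

# Empirical probabilities; the factor levels keep all sums 2..12 in the table
empirical <- table(factor(sum_die, levels = 2:12)) / length(sum_die)

# Theoretical values: P(sum = s) = (6 - |s - 7|) / 36 for s = 2, ..., 12
theoretical <- (6 - abs(2:12 - 7)) / 36

comparison <- rbind(empirical = as.vector(empirical), theoretical = theoretical)
colnames(comparison) <- 2:12
round(comparison, 3)
```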
Compare your answers with the theoretical values.
Plot the following: relative frequency of getting sum equal to 8 as a function of number of trials, for this experiment.
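One way to get the relative frequency after every trial is a running count divided by the trial index. The sketch below regenerates the sums so that it runs on its own; the seed and the dashed reference line at the theoretical value \(5/36\) are illustrative additions.

```r
set.seed(1)
n_trials <- 10000
sum_die <- replicate(n_trials, sum(sample(1:6, size = 2, replace = TRUE)))

# Relative frequency of {sum = 8} after 1, 2, ..., n_trials trials
rel_freq <- cumsum(sum_die == 8) / seq_len(n_trials)

plot(seq_len(n_trials), rel_freq, type = "l",
     xlab = "Number of trials", ylab = "Relative frequency of sum = 8")
abline(h = 5 / 36, lty = 2)  # theoretical probability, for reference
```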
What do you notice about this graph as the number of trials increases? YOUR ANSWER:
Suppose an experiment involves rolling a 6-sided die 5 times, the die has the following distribution: \[P(1) = P(2) = P(3) = P(4)= P(5)=0.1, \quad P(6) = 0.5\]
Suppose \(E\) is the event that the sum of all the die rolls is 13, and \(G\) is the event that the maximum of the 5 rolls is at most 4.
Set up a function ‘E_happens’ that rolls the die 5 times and returns 1 if E happens, and returns 0 otherwise.
Set up a function ‘G_happens’ that rolls the die 5 times and returns 1 if G happens, and returns 0 otherwise.
Use the two functions you constructed, ‘E_happens’ and ‘G_happens’, and the ‘replicate’ function, to calculate the empirical approximations to \(P(E)\) and \(P(G)\).
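A possible sketch of the two indicator functions and their empirical probabilities follows. The names `p_die`, `p_E_hat`, and `p_G_hat`, the seed, and the choice of 10000 repetitions are assumptions made for illustration.

```r
set.seed(1)
p_die <- c(0.1, 0.1, 0.1, 0.1, 0.1, 0.5)  # distribution given in the problem

# Returns 1 if the sum of 5 rolls is 13, and 0 otherwise
E_happens <- function() {
  rolls <- sample(1:6, size = 5, replace = TRUE, prob = p_die)
  as.integer(sum(rolls) == 13)
}

# Returns 1 if the maximum of 5 rolls is at most 4, and 0 otherwise
G_happens <- function() {
  rolls <- sample(1:6, size = 5, replace = TRUE, prob = p_die)
  as.integer(max(rolls) <= 4)
}

# The mean of the 0/1 indicators is the empirical probability of the event
p_E_hat <- mean(replicate(10000, E_happens()))
p_G_hat <- mean(replicate(10000, G_happens()))
c(E = p_E_hat, G = p_G_hat)
```

Each function draws its own five rolls, so the two estimates come from independent sets of experiments.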
What are the approximate probabilities? YOUR ANSWER:
Suppose we toss a coin with \(P(H) = 0.8\), \(n\) times independently. Let \(X\) be the random variable that counts the number of heads in \(n\) tosses.
Set up a function that simulates the random variable \(X\) when we toss the coin two times.
Use the above function to calculate the empirical probability distribution table for \(X\). Compare this with the theoretical probability distribution of \(X\).
Calculate empirical estimates for the expected value and the variance of \(X\). Compare these with the actual theoretical calculations.
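The two-toss case can be sketched as follows. The function name `count_heads` is hypothetical, and the comparison uses the fact that the number of heads in \(n\) independent tosses is Binomial\((n, 0.8)\), so R's `dbinom` gives the theoretical table.

```r
set.seed(1)

# Number of heads in n independent tosses of a coin with P(H) = p
count_heads <- function(n, p = 0.8) {
  sum(sample(c(1, 0), size = n, replace = TRUE, prob = c(p, 1 - p)))
}

x_sims <- replicate(10000, count_heads(2))

# Empirical distribution next to the theoretical Binomial(2, 0.8) pmf
empirical <- as.vector(table(factor(x_sims, levels = 0:2))) / length(x_sims)
theoretical <- dbinom(0:2, size = 2, prob = 0.8)
comparison <- rbind(empirical, theoretical)
colnames(comparison) <- 0:2
round(comparison, 3)

# Empirical mean and variance; theory gives np = 1.6 and np(1 - p) = 0.32
mean(x_sims)
var(x_sims)
```

Changing the argument in `count_heads(2)` and the `levels` and `size` values accordingly adapts the same sketch to other numbers of tosses.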
Set up a function that simulates the random variable \(X\) when we toss the coin three times.
Use the above function to calculate the empirical probability distribution table for \(X\). Compare this with the theoretical probability distribution of \(X\).
Calculate empirical estimates for the expected value and the variance of \(X\). Compare these with the actual theoretical calculations.
Set up a function that simulates the random variable \(X\) when we toss the coin four times.
Use the above function to calculate the empirical probability distribution table for \(X\). Compare this with the theoretical probability distribution of \(X\).
Calculate empirical estimates for the expected value and the variance of \(X\). Compare these with the actual theoretical calculations.
Suppose we toss a coin with \(P(H) = 0.8\) until we get \(r\) heads.
Set up an experiment that counts the number of tails until the first head. You will need to use a ‘while’ loop.
Run 10000 simulations of this experiment, and use these to find an estimate for the expected number of tails until the first head. Match this value with the theoretical/calculated value.
Set up an experiment that counts the number of tails until the second head. Run 10000 simulations of this experiment, and use these to find an estimate for the expected number of tails until the second head. Match this value with the theoretical/calculated value.
Set up an experiment that counts the number of tails until the third head. Run 10000 simulations of this experiment, and use these to find an estimate for the expected number of tails until the third head. Match this value with the theoretical/calculated value.
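A single function with a ‘while’ loop can cover all three experiments by taking the required number of heads as an argument. This is only a sketch: the name `tails_until_heads` and the seed are hypothetical. For comparison, the number of tails before the \(r\)-th head has expected value \(r(1-p)/p\).

```r
set.seed(1)

# Count the tails observed before the r-th head, for a coin with P(H) = p
tails_until_heads <- function(r = 1, p = 0.8) {
  heads <- 0
  tails <- 0
  while (heads < r) {
    toss <- sample(c(1, 0), size = 1, prob = c(p, 1 - p))
    if (toss == 1) {
      heads <- heads + 1
    } else {
      tails <- tails + 1
    }
  }
  tails
}

tails_first <- replicate(10000, tails_until_heads(r = 1))
mean(tails_first)  # compare with (1 - p) / p = 0.25

tails_second <- replicate(10000, tails_until_heads(r = 2))
mean(tails_second)  # compare with 2 * (1 - p) / p = 0.5
```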
Recall the hit-and-run example from the asynchronous lecture and HW4. We noted that the Bayes’ probabilities are calculated using the following formula \[ P(B|W) = \frac{pq}{pq + (1-p)(1-q)},\] where \(P(B) = q\) and \(P(W|B) = p\) (check notes). We also noted in HW4 that if \(P(W|B)=p > \frac{1}{2}\) we will have \[ P(B|W) > P(B).\]
In this problem, we will explore how changing the level of Witness-Reliability, that is \(p\), will affect the chance of getting a Blue Cab driver arrested.
Recall that \(P(B)\) is the probability that the Blue cab is at fault with no added information. We will now run multiple iterations of Bayes’ rule, and at every iteration, we will update the value of \(P(B)\) with the value of \(P(B|W)\) from the previous iteration. From our experience with the homework, if \(p > \frac{1}{2}\), we must have \(P(B) \longrightarrow 1\) as the number of iterations increases (intuitively, the probability that the Blue cab is at fault will go to 1 as the number of “reliable witnesses” who say that “the Blue Cab is at fault” increases).
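One way to see why the iteration behaves this way is to rewrite the update in terms of odds. The following short derivation uses only the formula above, with \(q_0\) denoting the initial value of \(q\) and \(q_n\) the value after \(n\) updates (this notation is introduced here for illustration): \[ \frac{P(B|W)}{1 - P(B|W)} = \frac{pq}{(1-p)(1-q)} = \frac{p}{1-p}\cdot\frac{q}{1-q}, \qquad\text{so}\qquad \frac{q_n}{1-q_n} = \left(\frac{p}{1-p}\right)^{n}\frac{q_0}{1-q_0}.\] Each witness multiplies the odds by \(\frac{p}{1-p}\), which is greater than 1 exactly when \(p > \frac{1}{2}\); in that case the odds grow without bound and \(q_n \longrightarrow 1\).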
Write a function called “bayes” that takes input \(q, p\) and outputs \(P(B|W)\) (use the formula above)
bayes <- function(q, p){
return(p*q/(p*q + (1-p)*(1-q)))
}
bayes(0.01, 0.8)
## [1] 0.03883495
In this part, we will assume that we are using testimony of reliable witnesses with fixed reliability \(p\).
Initialize \(p=0.6\), \(q=0.01\), and q_vals = c(), the empty vector. Run a for loop (i in 1:40); inside it, update q at every iteration to \(q = \text{bayes}(q,p)\), and concatenate the vector q_vals with this new value of q.
Construct a plot with x-axis 1:40 and y-axis q_vals.
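The loop and plot described above can be sketched like this. The `bayes` function is repeated from earlier so that the block runs on its own; the plot type and axis labels are illustrative choices.

```r
# Posterior P(B|W) from prior q = P(B) and witness reliability p = P(W|B)
bayes <- function(q, p) {
  p * q / (p * q + (1 - p) * (1 - q))
}

p <- 0.6
q <- 0.01
q_vals <- c()

for (i in 1:40) {
  q <- bayes(q, p)         # the posterior becomes the next prior
  q_vals <- c(q_vals, q)   # append the updated value
}

plot(1:40, q_vals, type = "b", xlab = "Iteration", ylab = "P(B)")
```

Rerunning the same loop with different starting values of `p` and `q` only requires changing the two initialization lines.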
Repeat the above experiment with \(p=0.4\), \(q=0.99\).
Suppose the police decide to arrest the Blue cab driver if \(P(B|W) \ge 0.9\).
Suppose we initialize \(p=0.6\) (witness reliability) and \(q=0.01\) (proportion of Blue cabs in the city). How many witnesses with reliability level \(p\) would we need for the police to consider arresting the Blue cab driver?
You will need a counter variable initialized to zero and a while loop inside which you update \(q\) using the bayes function, and update the counter by 1 at every iteration of the while loop.
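The counter-and-loop pattern can be sketched as follows. The variable name `n_witnesses` is a hypothetical choice, and `bayes` is repeated so the block is self-contained.

```r
# Posterior P(B|W) from prior q = P(B) and witness reliability p = P(W|B)
bayes <- function(q, p) {
  p * q / (p * q + (1 - p) * (1 - q))
}

p <- 0.6
q <- 0.01
n_witnesses <- 0  # counter, initialized to zero

# Keep adding witnesses until the arrest threshold of 0.9 is reached
while (q < 0.9) {
  q <- bayes(q, p)
  n_witnesses <- n_witnesses + 1
}

n_witnesses
q
```

The loop condition is the negation of the arrest rule \(P(B|W) \ge 0.9\), so the loop stops at the first witness for which the rule is satisfied. This loop terminates only because \(p > \frac{1}{2}\); with \(p \le \frac{1}{2}\) the threshold would never be reached from \(q = 0.01\).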
In this problem we will work with two types of witnesses: (1) reliable witnesses with fixed reliability level \(p_1 > \frac{1}{2}\) and (2) unreliable witnesses with fixed reliability level \(p_2 < \frac{1}{2}\).
Initialize \(p_1=0.6\), \(p_2 = 0.45\), \(q=0.01\), and q_vals = c(), the empty vector. Run a for loop (i in 1:90); inside it, update q at every even iteration to \(q = \text{bayes}(q, p_2)\) and at every odd iteration to \(q = \text{bayes}(q, p_1)\) (you will need to use an “if” command), and concatenate the vector q_vals with the new value of q at every iteration.
Construct a plot with x-axis 1:90 and y-axis q_vals.
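A sketch of the alternating update uses the remainder operator `%%` to tell even iterations from odd ones. As before, `bayes` is repeated so the block runs on its own, and the plot settings are illustrative.

```r
# Posterior P(B|W) from prior q = P(B) and witness reliability p = P(W|B)
bayes <- function(q, p) {
  p * q / (p * q + (1 - p) * (1 - q))
}

p1 <- 0.6    # reliable witness
p2 <- 0.45   # unreliable witness
q <- 0.01
q_vals <- c()

for (i in 1:90) {
  if (i %% 2 == 0) {
    q <- bayes(q, p2)   # even iteration: unreliable witness
  } else {
    q <- bayes(q, p1)   # odd iteration: reliable witness
  }
  q_vals <- c(q_vals, q)
}

plot(1:90, q_vals, type = "b", xlab = "Iteration", ylab = "P(B)")
```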
What do you notice? (Compared to case with single witness)
Suppose the police decide to arrest the Blue cab driver if \(P(B|W) \ge 0.9\).
Initialize \(p_1=0.6\), \(p_2 = 0.45\), \(q=0.01\). How many witnesses will we need if we work sequentially with one reliable witness followed by one unreliable witness?
This problem can be generalized to choosing \(p\) from any given probability distribution on \((0,1)\). We will be working with the Beta distribution later in the semester, which can be an interesting candidate from which to sample our values for \(p\) (that is, witness reliability).
In this problem we will work with two types of witnesses: (1) two reliable witnesses with fixed reliability levels \(p_1, p_2 > \frac{1}{2}\) and (2) two unreliable witnesses with fixed reliability levels \(p_3, p_4 < \frac{1}{2}\).
Initialize \(q=0.01\). We will sample \(p\) from the vector c(0.6, 0.8, 0.45, 0.3), and suppose that choosing a witness with any of these witness-reliabilities is equally likely.
Now let q_vals = c(), the empty vector, and run a for loop (i in 1:100); inside it, sample p from the vector c(0.6, 0.8, 0.45, 0.3), update q to \(q = \text{bayes}(q, p)\), and concatenate the vector q_vals with the new value of q at every iteration.
Construct a plot with x-axis 1:100 and y-axis q_vals.
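The randomly chosen witness can be sketched with a call to `sample` inside the loop. The vector name `reliabilities` and the seed are illustrative assumptions, and `bayes` is repeated so the block is self-contained.

```r
set.seed(1)

# Posterior P(B|W) from prior q = P(B) and witness reliability p = P(W|B)
bayes <- function(q, p) {
  p * q / (p * q + (1 - p) * (1 - q))
}

reliabilities <- c(0.6, 0.8, 0.45, 0.3)
q <- 0.01
q_vals <- c()

for (i in 1:100) {
  p <- sample(reliabilities, size = 1)  # each reliability equally likely
  q <- bayes(q, p)
  q_vals <- c(q_vals, q)
}

plot(1:100, q_vals, type = "b", xlab = "Iteration", ylab = "P(B)")
```

Since the path of `q_vals` is now random, running the block with several different seeds gives a better picture than a single run.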
What do you notice? How does this compare with the earlier cases?
Suppose the police decide to arrest the Blue cab driver if \(P(B|W) \ge 0.9\).
Initialize \(q=0.01\). We will sample \(p\) from the vector c(0.6, 0.8, 0.45, 0.3), and suppose that choosing a witness with any of these witness-reliabilities is equally likely. How many witnesses will we need if we sample uniformly from these four types of witnesses at every step?
In this problem we will assume that the witness reliability of the people in the city follows the Beta distribution with parameters \(\alpha\), \(\beta\). Choose \(\alpha, \beta\) in such a way that the expected value of this Beta distribution is greater than .65.
Initialize \(q=0.01\) and we will choose \(p\) by sampling the Beta distribution with parameters \(\alpha, \beta\) chosen above.
Now let q_vals = c(), the empty vector, and run a for loop (i in 1:100); inside it, sample p from \(\text{Beta}(\alpha, \beta)\), update q to \(q = \text{bayes}(q, p)\), and concatenate the vector q_vals with the new value of q at every iteration.
Construct a plot with x-axis 1:100 and y-axis q_vals.
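For the Beta-distributed reliabilities, a sketch using R's `rbeta` might look like this. The values \(\alpha = 7\), \(\beta = 3\) are only an illustrative assumption: they give mean \(\frac{\alpha}{\alpha+\beta} = 0.7 > 0.65\), and any other pair meeting the condition would work equally well. The names `alpha` and `beta_par` and the seed are also hypothetical.

```r
set.seed(1)

# Posterior P(B|W) from prior q = P(B) and witness reliability p = P(W|B)
bayes <- function(q, p) {
  p * q / (p * q + (1 - p) * (1 - q))
}

alpha <- 7      # assumed shape parameters; mean = 7 / (7 + 3) = 0.7
beta_par <- 3
q <- 0.01
q_vals <- c()

for (i in 1:100) {
  p <- rbeta(1, shape1 = alpha, shape2 = beta_par)  # one random reliability
  q <- bayes(q, p)
  q_vals <- c(q_vals, q)
}

plot(1:100, q_vals, type = "b", xlab = "Iteration", ylab = "P(B)")
```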
What do you notice? How does this compare with the earlier cases?
Suppose the police decide to arrest the Blue cab driver if \(P(B|W) \ge 0.9\).
Initialize \(q=0.01\) and we will choose \(p\) by sampling the Beta distribution with parameters \(\alpha, \beta\) chosen above. How many witnesses will we need to have in this setting?