1 Basic R manipulation.

In the code segment below, assign the value 2 to variable x, and the value 6 to variable y. Then compute the sum ‘x+y’:

# x <- 
# y <- 
# print(_____)

Construct a vector of all even numbers between 1 and 100, and assign it to the variable ‘even’. Construct a vector of 100 numbers, starting with 1, such that the difference between consecutive numbers is three, and assign it to the variable ‘threes’. Construct a vector of 100 equally spaced numbers between 0 and 2, and assign it to the variable x.

Define a function that computes the square of the input value, and call this function ‘square’.

Plot the trigonometric functions sine and cosine for values of x between 0 and 10. Also plot the ‘square’ function for input values between 0 and 2.

2 Simulating Experiments

In this problem we will simulate two important experiments: coin tosses and die rolls.

We will explore the empirical probability of events using random samples. The two important functions to understand are “sample” and “replicate”.

We use the sample function to set up our experiment, and the replicate function to repeat it as many times as we want.
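As a rough illustration of how the two functions fit together, the sketch below uses a made-up urn experiment. The urn, the function name `draw_ball`, and the seed are assumptions chosen for illustration; they are not part of the assignment.

```r
set.seed(1)  # assumed seed, only so the sketch is reproducible

# One run of a toy experiment: draw a single ball from an urn.
# `prob` gives the chance of each outcome, in the same order as `x`.
draw_ball <- function() {
  sample(x = c("red", "green", "blue"), size = 1, prob = c(0.5, 0.3, 0.2))
}

# replicate() re-evaluates the expression each time, so every draw is fresh.
draws <- replicate(1000, draw_ball())

# Empirical probability of an event = fraction of runs in which it occurred.
mean(draws == "red")
```

The same pattern (a function for one run, `replicate` for many runs, `mean` of a logical vector for the empirical probability) should carry over to the coin and die experiments.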

2.1 Functions for Coin Toss and Die Rolls

Set up a function, called ‘coin_toss’, that simulates a single toss of a coin with \(P(H) = p\in (0,1)\). The function takes \(p\) as input and outputs the outcome of a single toss of the coin. It might help to have the function return 1 if the coin lands H, and return 0 otherwise.

Set up a function, called ‘die_roll’, that simulates a single roll of a six-sided die with a given probability distribution. The function takes as input a vector \(p\) of the six probabilities associated with the respective faces of the die, and outputs the outcome of a single roll of the die.
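One possible shape for these two functions, named `coin_toss` and `die_roll` here, is sketched below. The input checks with `stopifnot` are an added assumption, not something the problem asks for.

```r
# Single toss: returns 1 for heads (probability p) and 0 for tails.
coin_toss <- function(p) {
  stopifnot(p > 0, p < 1)  # assumed sanity check on the input
  sample(c(1, 0), size = 1, prob = c(p, 1 - p))
}

# Single roll: p[k] is the probability of face k, for k = 1, ..., 6.
die_roll <- function(p) {
  stopifnot(length(p) == 6, abs(sum(p) - 1) < 1e-8)  # assumed sanity check
  sample(1:6, size = 1, prob = p)
}

coin_toss(0.8)          # one toss of a biased coin
die_roll(rep(1/6, 6))   # one roll of a fair die
```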

3 Experiments involving Coin Tosses and Die Rolls

3.1 Die Rolls

Set up the experiment where we roll two fair 6-sided dice independently and calculate the sum of the numbers that appear.

Use the replicate function to repeat this experiment 10000 times, and store the output of the experiment in a variable “sum_die”.

Use “sum_die” to calculate the empirical probability of getting each of the sums 2, 3, 4, …, 12. List these empirical probabilities in a table, using the ‘table’ function.

Compare your answers with the theoretical values.
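For the comparison, the theoretical values can be obtained by enumerating all 36 equally likely outcomes. The sketch below is one way to do that; the variable names (`sums`, `theory`, `sim`, `empirical`) are assumptions, and it runs its own small simulation so that it stands alone.

```r
# All 36 equally likely (first die, second die) pairs and their sums.
sums <- outer(1:6, 1:6, "+")

# Theoretical distribution of the sum: number of pairs per sum, over 36.
theory <- table(sums) / 36
round(theory, 4)

# Empirical counterpart from 10000 simulated pairs of fair rolls.
sim <- replicate(10000, sum(sample(1:6, size = 2, replace = TRUE)))
empirical <- table(factor(sim, levels = 2:12)) / length(sim)

# Largest absolute gap between the empirical and theoretical tables.
max(abs(as.numeric(empirical) - as.numeric(theory)))
```

Using `factor(sim, levels = 2:12)` keeps a column for every possible sum even if one of them never shows up in the simulation.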

Plot the relative frequency of getting a sum equal to 8 as a function of the number of trials in this experiment.
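A running relative frequency can be computed with `cumsum`. The following is a minimal sketch under the assumption that the experiment is simulated with `sample(1:6, 2, replace = TRUE)`; the seed and the name `rel_freq` are illustrative choices.

```r
set.seed(2)  # assumed seed, only for reproducibility
n_trials <- 10000
sum_die <- replicate(n_trials, sum(sample(1:6, size = 2, replace = TRUE)))

# Running relative frequency: (number of 8s among the first n trials) / n.
rel_freq <- cumsum(sum_die == 8) / seq_len(n_trials)

plot(seq_len(n_trials), rel_freq, type = "l",
     xlab = "Number of trials", ylab = "Relative frequency of sum = 8")
abline(h = 5 / 36, lty = 2)  # theoretical value P(sum = 8) = 5/36
```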

What do you notice about this graph as the number of trials increases? YOUR ANSWER:

3.2 Calculating a Die Roll Probability

Suppose an experiment involves rolling a 6-sided die 5 times, where the die has the following distribution: \[P(1) = P(2) = P(3) = P(4)= P(5)=0.1, \quad P(6) = 0.5\]

Suppose \(E\) is the event that the sum of all the die rolls is 13, and \(G\) is the event that the maximum of the 5 rolls is at most 4.

Set up a function ‘E_happens’ that rolls the die 5 times and returns 1 if E happens, and returns 0 otherwise.

Set up a function ‘G_happens’ that rolls the die 5 times and returns 1 if G happens, and returns 0 otherwise.

Use the two functions you constructed, ‘E_happens’ and ‘G_happens’, and the ‘replicate’ function, to calculate the empirical approximations to \(P(E)\) and \(P(G)\).

What are the approximate probabilities? YOUR ANSWER:

3.3 Coin Tosses

Suppose we toss a coin with \(P(H) = 0.8\) independently \(n\) times. Let \(X\) be the random variable that counts the number of heads in \(n\) tosses.

3.3.1 n = 2

Set up a function that simulates the random variable \(X\) when we toss the coin two times.

Use the above function to calculate the empirical probability distribution table for \(X\). Compare this with the theoretical probability distribution of \(X\).

Calculate empirical estimates for the expected value and the variance of \(X\). Compare these with the actual theoretical calculations.
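The sketch below shows one way to line up empirical and theoretical values for \(n = 2\), using the fact that the number of heads in \(n\) independent tosses is binomial. The helper name `toss_n` and the other variable names are assumptions made for illustration.

```r
n <- 2
p <- 0.8

# Simulate X: number of heads (coded as 1) in n independent tosses.
toss_n <- function(n, p) {
  sum(sample(c(1, 0), size = n, replace = TRUE, prob = c(p, 1 - p)))
}
x_sim <- replicate(10000, toss_n(n, p))

# Empirical versus theoretical distribution table.
empirical <- table(factor(x_sim, levels = 0:n)) / length(x_sim)
theory <- dbinom(0:n, size = n, prob = p)
rbind(empirical = as.numeric(empirical), theory = theory)

# Empirical versus theoretical mean and variance.
c(mean = mean(x_sim), theory = n * p)
c(var = var(x_sim), theory = n * p * (1 - p))
```

Note that `var` divides by the number of samples minus one, which makes little difference with 10000 samples. Changing `n` should cover the cases with more tosses.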

3.3.2 n = 3

Set up a function that simulates the random variable \(X\) when we toss the coin three times.

Use the above function to calculate the empirical probability distribution table for \(X\). Compare this with the theoretical probability distribution of \(X\).

Calculate empirical estimates for the expected value and the variance of \(X\). Compare these with the actual theoretical calculations.

3.3.3 n = 4

Set up a function that simulates the random variable \(X\) when we toss the coin four times.

Use the above function to calculate the empirical probability distribution table for \(X\). Compare this with the theoretical probability distribution of \(X\).

Calculate empirical estimates for the expected value and the variance of \(X\). Compare these with the actual theoretical calculations.

3.4 Coin Tosses with Infinite Sample Space.

Suppose we toss a coin with \(P(H) = 0.8\) until we get \(r\) heads.

3.4.1 r=1

Set up an experiment that counts the number of tails until the first head. You will need to use a ‘while’ loop.

Run 10000 simulations of this experiment, and use these to find an estimate for the expected number of tails until the first head. Compare this value with the theoretical (calculated) value.
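A minimal sketch of the ‘while’ loop idea follows. It assumes each toss is simulated with `runif(1) < p` for heads (an illustrative choice; `sample` would work equally well), and the function name `tails_before_first_head` is hypothetical.

```r
# Count the tails that occur before the first head.
tails_before_first_head <- function(p) {
  tails <- 0
  # runif(1) >= p happens with probability 1 - p, i.e. the toss is a tail.
  while (runif(1) >= p) {
    tails <- tails + 1
  }
  tails
}

sims <- replicate(10000, tails_before_first_head(0.8))

mean(sims)        # empirical estimate
(1 - 0.8) / 0.8   # theoretical mean of this geometric count: (1 - p) / p
```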

3.4.2 r = 2

Set up an experiment that counts the number of tails until the second head. Run 10000 simulations of this experiment, and use these to find an estimate for the expected number of tails until the second head. Compare this value with the theoretical (calculated) value.

3.4.3 r = 3

Set up an experiment that counts the number of tails until the third head. Run 10000 simulations of this experiment, and use these to find an estimate for the expected number of tails until the third head. Compare this value with the theoretical (calculated) value.
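For the theoretical values, one way to reason is to split the count into independent geometric pieces. In this sketch \(T_r\) denotes the number of tails before the \(r\)-th head and \(G_i\) the number of tails between consecutive heads; both symbols are introduced here and do not appear in the assignment.

```latex
T_r = G_1 + G_2 + \cdots + G_r, \qquad P(G_i = k) = (1-p)^k\, p, \quad k = 0, 1, 2, \ldots

E[G_i] = \sum_{k=0}^{\infty} k\,(1-p)^k\, p = \frac{1-p}{p}
\quad\Longrightarrow\quad
E[T_r] = \frac{r\,(1-p)}{p}

% With p = 0.8: E[T_1] = 0.25, \; E[T_2] = 0.5, \; E[T_3] = 0.75
```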

4 Bayes’s Theorem and Witnesses

Recall the hit-and-run example from the asynchronous lecture and HW4. We noted that the Bayes probabilities are calculated using the formula \[ P(B|W) = \frac{pq}{pq + (1-p)(1-q)},\] where \(P(B) = q\) and \(P(W|B) = p\) (check notes). We also noted in HW4 that if \(P(W|B)=p > \frac{1}{2}\), then \[ P(B|W) > P(B).\]

In this problem, we will explore how changing the level of Witness-Reliability, that is \(p\), will affect the chance of getting a Blue Cab driver arrested.

Recall that \(P(B)\) is the probability that the Blue cab is at fault with no added information. We will now run multiple iterations of Bayes’ rule, and at every iteration we will update the value of \(P(B)\) with the value of \(P(B|W)\) from the previous iteration. From our experience with the homework, if \(p > \frac{1}{2}\), we must have \(P(B) \longrightarrow 1\) as the number of iterations increases (intuitively, the probability that the Blue cab is at fault goes to 1 as the number of “reliable witnesses” who say that “the Blue Cab is at fault” increases).
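One way to see why the iteration behaves this way is to rewrite the formula above in terms of odds. The subscripted \(q_n\), for the value of \(P(B)\) after \(n\) updates starting from \(q_0\), is notation introduced here for illustration.

```latex
\frac{P(B|W)}{1 - P(B|W)} = \frac{pq}{(1-p)(1-q)} = \frac{p}{1-p} \cdot \frac{q}{1-q}

\frac{q_n}{1 - q_n} = \left(\frac{p}{1-p}\right)^{n} \frac{q_0}{1 - q_0}
```

Each update multiplies the odds by \(p/(1-p)\), which exceeds 1 exactly when \(p > \frac{1}{2}\). In that case the odds grow without bound and \(q_n \to 1\); when \(p < \frac{1}{2}\) the odds shrink toward 0 instead.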

Write a function called “bayes” that takes inputs \(q, p\) and outputs \(P(B|W)\) (use the formula above).

bayes <- function(q, p){
  return(p*q/(p*q + (1-p)*(1-q)))
}

bayes(0.01, 0.8)

## [1] 0.03883495

4.1 Reliable Witnesses with fixed probability

In this part, we will assume that we are using testimony of reliable witnesses with fixed reliability \(p\).

4.1.1 Initial Plots

Initialize \(p=0.6\), \(q=0.01\), and q_vals = c(), the empty vector. Run a for loop (i in 1:40); inside it, update q at every iteration to \(q = \text{bayes}(q,p)\), and concatenate the new value of q onto the vector q_vals.

Construct a plot with x-axis 1:40 and y-axis q_vals.

Repeat the above experiment with \(p=0.4\), \(q=0.99\).
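The loop described above could be wrapped in a small helper so that both parameter settings reuse it. This is only a sketch: `iterate_bayes`, `q_up`, and `q_down` are hypothetical names, and overlaying both curves on one plot is an illustrative choice.

```r
bayes <- function(q, p) {
  p * q / (p * q + (1 - p) * (1 - q))
}

# Hypothetical helper: apply n_iter successive updates, recording q each time.
iterate_bayes <- function(q, p, n_iter) {
  q_vals <- c()
  for (i in 1:n_iter) {
    q <- bayes(q, p)
    q_vals <- c(q_vals, q)
  }
  q_vals
}

q_up <- iterate_bayes(q = 0.01, p = 0.6, n_iter = 40)
q_down <- iterate_bayes(q = 0.99, p = 0.4, n_iter = 40)

plot(1:40, q_up, type = "b", ylim = c(0, 1), xlab = "Iteration", ylab = "q")
points(1:40, q_down, pch = 2)  # second experiment, drawn as triangles
```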

4.1.2 How many reliable Witnesses needed?

Suppose the police decide to arrest the Blue cab driver if \(P(B|W) \ge 0.9\).

Suppose we initialize \(p=0.6\) (witness reliability) and \(q=0.01\) (proportion of Blue cabs in the city). How many witnesses with reliability level \(p\) would we need for the police to consider arresting the Blue cab driver?

You will need a counter variable initialized to zero and a while loop inside which you update \(q\) using the bayes function, and update the counter by 1 at every iteration of the while loop.
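The counter-and-while pattern just described might look like the sketch below. The variable name `count` is an assumption, and `bayes` is redefined only so the block runs on its own.

```r
bayes <- function(q, p) {
  p * q / (p * q + (1 - p) * (1 - q))
}

p <- 0.6     # witness reliability
q <- 0.01    # initial P(B)
count <- 0   # number of witnesses used so far

# Each pass through the loop represents one more witness testimony.
while (q < 0.9) {
  q <- bayes(q, p)
  count <- count + 1
}

count  # witnesses needed to reach the arrest threshold
q      # value of P(B|W) once the threshold is reached
```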

4.2 Working with two types of witnesses

In this problem we will work with two types of witnesses: (1) reliable witnesses with fixed reliability level \(p_1 > \frac{1}{2}\) and (2) unreliable witnesses with fixed reliability level \(p_2 < \frac{1}{2}\).

4.2.1 Initial Plots

Initialize \(p_1=0.6\), \(p_2 = 0.45\), \(q=0.01\), and q_vals = c(), the empty vector. Run a for loop (i in 1:90); inside it, update q at every even iteration to \(q = \text{bayes}(q,p_2)\) and at every odd iteration to \(q = \text{bayes}(q,p_1)\) (you will need to use an “if” statement), and concatenate the new value of q onto the vector q_vals at every iteration.

Construct a plot with x-axis 1:90 and y-axis q_vals.
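A sketch of the alternating update, using the remainder operator `%%` to tell even iterations from odd ones. The final line, which computes the net factor applied to the odds by one reliable plus one unreliable testimony, is an added illustration rather than part of the problem.

```r
bayes <- function(q, p) p * q / (p * q + (1 - p) * (1 - q))

p1 <- 0.6    # reliable witness
p2 <- 0.45   # unreliable witness
q <- 0.01
q_vals <- c()

for (i in 1:90) {
  if (i %% 2 == 0) {
    q <- bayes(q, p2)   # even iteration: unreliable witness
  } else {
    q <- bayes(q, p1)   # odd iteration: reliable witness
  }
  q_vals <- c(q_vals, q)
}

# Net factor applied to the odds by one reliable plus one unreliable testimony.
(p1 / (1 - p1)) * (p2 / (1 - p2))
```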

What do you notice, compared to the case with a single type of witness?

4.2.2 How many witnesses needed?

Suppose the police decide to arrest the Blue cab driver if \(P(B|W) \ge 0.9\).

Initialize \(p_1=0.6\), \(p_2 = 0.45\), \(q=0.01\). How many witnesses will we need if we work sequentially, alternating one reliable witness followed by one unreliable witness?

This problem can be generalized to choosing \(p\) from any given probability distribution on \((0,1)\). We will be working with the Beta distribution later in the semester, which is an interesting candidate for sampling our values of \(p\) (that is, witness reliability).

4.3 Working with four types of witnesses

In this problem we will work with four types of witnesses: (1) two types of reliable witnesses with fixed reliability levels \(p_1, p_2 > \frac{1}{2}\) and (2) two types of unreliable witnesses with fixed reliability levels \(p_3, p_4 < \frac{1}{2}\).

4.3.1 Initial Plots

Initialize \(q=0.01\). We will sample \(p\) from the vector c(0.6, 0.8, 0.45, 0.3), and suppose that choosing a witness with any of these witness-reliabilities is equally likely.

Now let q_vals = c(), the empty vector, and run a for loop (i in 1:100). Inside it, sample p from the vector c(0.6, 0.8, 0.45, 0.3), update q using the bayes function, and concatenate the new value of q onto the vector q_vals at every iteration.

Construct a plot with x-axis 1:100 and y-axis q_vals.
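Sampling the reliability at each step could be sketched as follows. The seed and the name `reliabilities` are assumptions; `sample` with no `prob` argument picks each entry with equal probability.

```r
bayes <- function(q, p) p * q / (p * q + (1 - p) * (1 - q))

set.seed(3)  # assumed seed, only for reproducibility
reliabilities <- c(0.6, 0.8, 0.45, 0.3)

q <- 0.01
q_vals <- c()
for (i in 1:100) {
  p <- sample(reliabilities, size = 1)  # each type is equally likely
  q <- bayes(q, p)
  q_vals <- c(q_vals, q)
}

range(q_vals)  # smallest and largest value of q seen along the path
```

Because the path is random, rerunning without the seed gives a different curve each time, so it may be worth plotting several runs before drawing conclusions.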

What do you notice? How does this compare with the earlier cases?

4.3.2 How many witnesses needed?

Suppose the police decide to arrest the Blue cab driver if \(P(B|W) \ge 0.9\).

Initialize \(q=0.01\). We will sample \(p\) from the vector c(0.6, 0.8, 0.45, 0.3), and suppose that choosing a witness with any of these witness-reliabilities is equally likely. How many witnesses will we need if we sample uniformly from these four types of witnesses at every step?

4.4 BONUS: Working with witnesses coming from a Beta distribution

In this problem we will assume that the witness reliability of the people in the city follows the Beta distribution with parameters \(\alpha\), \(\beta\). Choose \(\alpha, \beta\) in such a way that the expected value of this Beta distribution is greater than 0.65.

4.4.1 Initial Plots

Initialize \(q=0.01\). We will choose \(p\) by sampling from the Beta distribution with the parameters \(\alpha, \beta\) chosen above.

Now let q_vals = c(), the empty vector, and run a for loop (i in 1:100). Inside it, sample p from \(\text{Beta}(\alpha, \beta)\), update q using the bayes function, and concatenate the new value of q onto the vector q_vals at every iteration.

Construct a plot with x-axis 1:100 and y-axis q_vals.
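A sketch with one possible parameter choice is given below. The values 7 and 3 are an assumption picked only because the mean of a Beta distribution is \(\alpha/(\alpha+\beta)\), which gives 0.7 here; any pair with mean above 0.65 would do. The names `alpha_par` and `beta_par` are hypothetical and avoid masking R’s built-in `beta` function.

```r
bayes <- function(q, p) p * q / (p * q + (1 - p) * (1 - q))

set.seed(4)     # assumed seed, only for reproducibility
alpha_par <- 7  # assumed shape parameters (illustrative choice)
beta_par <- 3
alpha_par / (alpha_par + beta_par)  # mean of the Beta distribution

q <- 0.01
q_vals <- c()
for (i in 1:100) {
  p <- rbeta(1, shape1 = alpha_par, shape2 = beta_par)  # one witness's reliability
  q <- bayes(q, p)
  q_vals <- c(q_vals, q)
}

range(q_vals)
```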

What do you notice? How does this compare with the earlier cases?

4.4.2 How many witnesses needed?

Suppose the police decide to arrest the Blue cab driver if \(P(B|W) \ge 0.9\).

Initialize \(q=0.01\). We will choose \(p\) by sampling from the Beta distribution with the parameters \(\alpha, \beta\) chosen above. How many witnesses will we need in this setting?

Experiments, Conditional Probability, and Bayes’ Theorem

Your Name and id
