Question 1. Consider the Binomial distribution with \(n = 24\) and \(p = .9\) that is used to model the number of correctly received bits on a satellite link that transmits data in \(24\)-bit blocks.

Quesion 1 (a) Plot the probability density function and cumulative distribution function for the number of correctly received bits

#Answer 1a
plot(dbinom(1:24, 24, .9),
     type='h',
     main="Binomial Distribution",
     ylab="Probability",
     xlab="Num of Correctly Received Bits")

plot(pbinom(1:24, 24, .9),
     type='s',
     main="Cumulative Distribution",
     ylab="Probability",
     xlab="Num of Correctly Received Bits")

Quesion 1 (b) What is the mean number of correctly received bits? What is the standard deviation?

#Answer 1b
#(make sure that what you print make sense to me) - for example to print mean you can use cat command once you done with the calculation and have value in the variable
#variable_name=10 
#cat("Mean number of correctly received bits ",  variable_name)

mean = 24*0.9
cat("Mean Number of Correctly Received Bits", mean)

## Mean Number of Correctly Received Bits 21.6

SD = sqrt(24*0.9*(1-0.9))


cat("Standard Deviation of Correctly Received Bits", SD)

## Standard Deviation of Correctly Received Bits 1.469694

Quesion 1 (c) What is the probability of more than \(3\)-bit errors in the block of \(24\)?

#Answer 1c
partc = pbinom(20, 24, 0.9)
cat("Probability of more than 3-bit errors", partc)

## Probability of more than 3-bit errors 0.2142622

Quesion 1 (d) What is the median of this distribution? What can you interpret from the median value?

#Answer 1d
median = qbinom(0.5, 24, 0.9, lower.tail=TRUE)
cat("Median of distribution", median)

## Median of distribution 22

cat("Interpretation", "Half of the data points are less than or equal to 22 (correct bits), and half of the data points are greater than or equal to 22(correct bits).")

## Interpretation Half of the data points are less than or equal to 22 (correct bits), and half of the data points are greater than or equal to 22(correct bits).

Quesion 1 (e) What is the 60th quantile of this distribution? What can you interpret from this value?

#Answer 1e
ans = qbinom(0.6, 24, 0.9)
cat("60th quantile of distribution", ans)

## 60th quantile of distribution 22

#Interpretation
cat("Interpretation", "60 percent of the data falls below or at 22(correct bits)")

## Interpretation 60 percent of the data falls below or at 22(correct bits)

Question2. The Public Service Answering Point (PSAP) in San Francisco employs \(19\) operators in \(8\)-hour shifts to process \(911\) calls. There are at least \(5\) operators always answering calls. The number of calls processed per operator can be modeled with a Poisson random variable with rate \(\lambda_0 =20\) calls per hour.

Quesion 2 (a) What is the probability an operator can process 20 calls in an hour? Repeat for 30 calls in an hour?

#Answer 2a
ans = dpois(20, 20)
cat("Probability an operator can process 20 calls in an hour", ans)

## Probability an operator can process 20 calls in an hour 0.08883532

ans2 = dpois(30, 20)
cat("Probability an operator can process 30 calls in an hour", ans2)

## Probability an operator can process 30 calls in an hour 0.008343536

Quesion 2 (b) Given that 240 calls occurred in an hour, what is the probability that the ten operators can process them all assuming they are split equally among the operators?

#Answer 2b
ans3 = (dpois(24, 20))^10
cat("Probability that the ten operators can process all calls", ans3)

## Probability that the ten operators can process all calls 2.892317e-13

Quesion 2 (c) Now, If the aggregate call rate per hour \(\lambda_c\) , is measured by \(\lambda_c = 85\) calls per hour, what is the probability of more than \(100\) calls in an hour

#Answer 2c
ans4 = 1-ppois(100, 85)
cat("Probability of more than 100 calls in an hour", ans4)

## Probability of more than 100 calls in an hour 0.04934533

Question 3. For the two random number generator below A and B(don’t forget to add your R code)

- [A] \(Z_i = (9Z_{i-1} + 1) \mod 16\) with \(Z_0 = 5\).

- [B] \(Z_i = (7Z_{i-1} + 3) \mod 32\) with \(Z_0 = 10\),

Quesion 3 (a) Compute \(Z_i\) and \(U_i\) for values of \(i\) until a number is repeated, what is the period of both the generator? Provide your comments about the period of both RGN

#Answer 3a
lcg = function(a, c, m, rlength, seed){
  z = rep(0, rlength)
  z[1] = seed
  for(i in 1:(rlength-1)){
    z[i+1] = (a*z[i]+c)%%m
  }
  U = z/m
  return(list(z=z,U=U))
}

#A
A = lcg(9, 1, 16, 17, 5)
cat("Zi for A", A$z)

## Zi for A 5 14 15 8 9 2 3 12 13 6 7 0 1 10 11 4 5

cat("Ui for A", A$U)

## Ui for A 0.3125 0.875 0.9375 0.5 0.5625 0.125 0.1875 0.75 0.8125 0.375 0.4375 0 0.0625 0.625 0.6875 0.25 0.3125

cat("period of A", "16")

## period of A 16

#B
B = lcg(7, 3, 32, 9, 10)
cat("Zi for B", B$z)

## Zi for B 10 9 2 17 26 25 18 1 10

cat("Ui for B", B$U)

## Ui for B 0.3125 0.28125 0.0625 0.53125 0.8125 0.78125 0.5625 0.03125 0.3125

cat("period of B", "8")

## period of B 8

Quesion 3 (b) Which of these parameters effect the period of LCG – \(a\), \(b\), \(Z_0\)

#Answer 
cat("Answer", "a(the multiplier) certainly affects the period of an LCG and b(the increment) can as well, but a affects it much more")

## Answer a(the multiplier) certainly affects the period of an LCG and b(the increment) can as well, but a affects it much more

Quesion 3 (c) For both generators plot a scatter diagram of the Zi values 1 apart. What are your observations from these plots? What do you think about the property of randomness of these generators?

#Answer
lcg = function(a, c, m, rlength, seed){
  z = rep(0, rlength)
  z[1] = seed
  for(i in 1:(rlength-1)){
    z[i+1] = (a*z[i]+c)%%m
  }
  U = z/m
  return(list(z=z,U=U))
}

A = lcg(9, 1, 16, 30, 5)
Za = A$z
AZ1 = Za[1:29]
AZ2 = Za[2:30]
plot(AZ1, AZ2, main = "Scatter Plot of Generator A Zi Values", xlab="Zi of Generate A")
axis(1, at=1:30)

cat("Observation of A", "Generator A does not seem to properly generate random numbers as there seems to be a pattern to its generated numbers.")

## Observation of A Generator A does not seem to properly generate random numbers as there seems to be a pattern to its generated numbers.

B = lcg(7, 3, 32, 30, 10)
Zb = B$z
BZ1 = Zb[1:29]
BZ2 = Zb[2:30]
plot(BZ1, BZ2, main = "Scatter Plot of Generator B Zi Values", xlab="Zi of Generator B")
axis(1, at=1:30)

cat("Observation of B", "Generate B looks to be better at generating random numbers than A as the values are more sporatic, but some in some areas we can still see a slight pattern, so it might not be the best option to generate random numbers.")

## Observation of B Generate B looks to be better at generating random numbers than A as the values are more sporatic, but some in some areas we can still see a slight pattern, so it might not be the best option to generate random numbers.

Quesion 3 (d) Randomness of default random generator of R – Run runif command to generate 100 random numbers and plot the scatter diagram of these numbers ( values 1 apart) and discuss your observations about randomness of this random generator.

#Answer
numbers = runif(100)
plot(numbers)
axis(1, at = 1:100)

cat("Observation", "This generator looks to be good at generating random numbers as I do not see any pattern between the values. It is much better at generating random numbers than Generators A and B.")

## Observation This generator looks to be good at generating random numbers as I do not see any pattern between the values. It is much better at generating random numbers than Generators A and B.

Quesion 3 (e) Compute the mean value of \(U_i\) across the period

#Answer
lcg = function(a, c, m, rlength, seed){
  z = rep(0, rlength)
  z[1] = seed
  for(i in 1:(rlength-1)){
    z[i+1] = (a*z[i]+c)%%m
  }
  U = z/m
  return(list(z=z,U=U))
}

#A
A = lcg(9, 1, 16, 20, 5)
UvalsA = A$U

UvalsA = UvalsA[1:16]

sumA = sum(UvalsA)

avgA = sumA/16
cat("mean value of Ui of A across the period", avgA)

## mean value of Ui of A across the period 0.46875

#B
B = lcg(7, 3, 32, 20, 10)
UvalsB = B$U

UvalsB = UvalsB[1:8]

sumB = sum(UvalsB)

avgB = sumB/8

cat("mean value of Ui of B across the period", avgB)

## mean value of Ui of B across the period 0.421875

Quesion 3 (f) By providing a plot of density (histogram) discuss the uniformity of both of the generators

#Answer
lcg = function(a, c, m, rlength, seed){
  z = rep(0, rlength)
  z[1] = seed
  for(i in 1:(rlength-1)){
    z[i+1] = (a*z[i]+c)%%m
  }
  U = z/m
  return(list(z=z,U=U))
}

#A
A = lcg(9, 1, 16, 5000, 5)
Ha = A$U
hist(Ha, main = "Histogram of A", xlab = "Ui of A")

#B
B = lcg(7, 3, 32, 5000, 10)
Hb = B$U
hist(Hb, main = "Histogram of B", xlab = "Ui of B")

#observation
cat("Observation", "The histogram of A is uniform in most areas, but there are a few values where there are no frequencies at all, so we can't say the distribution is actually uniform. The histogram of B is not uniform either and is more sporatic, as some values have high frequencies and many values have no frequencies.")

## Observation The histogram of A is uniform in most areas, but there are a few values where there are no frequencies at all, so we can't say the distribution is actually uniform. The histogram of B is not uniform either and is more sporatic, as some values have high frequencies and many values have no frequencies.

Question 4. Using the inverse transform method

Question 4.(a) Develop an algorithm for the random variable with cumulative distribution function F(x) below.

\[ F(x) = 1 - e^{-(x/\lambda)^k} \]

where \(x \geq 0\), \(\lambda \geq 0\), and \(k \geq 0\).

# Answer 4a
# This might be hard to print using R - you can write the step in your notebook and then include image here / or you can include as seperate file when upload


#using lambda = 1
lambda = 1

#using k = 5
k = 5

#generate Ui from U[0,1] RNG
u = runif(3, 0, 1)

#set seed for runif 
set.seed(0)

#solve for x from the inverse transform function
x = ((-log(1-u))^(1/k))*lambda

#random values generated
cat("random values generated:", x)

## random values generated: 1.112257 0.9457742 0.9280298

Quesion 4 (b) Use the first three Ui values from part (a) and generator [A] of previous problem to create 3 values from the random variable when, \(\lambda = 1\), \(𝑘 = 5\)

#Answer
set.seed(0)
lambda = 1
k = 5

#same Ui from part a because set to same seed
u = runif(3, 0, 1)
x = ((-log(1-u))^(1/k))*lambda
cat("3 values from the random variable", x)

## 3 values from the random variable 1.178172 0.790447 0.8581587

Quesion 4 (c) sing R generate 10,000 values from your algorithm when = 1 , 𝑘 = 5 and plot a histogram of the density. Discuss what insights you obtain when you look at the plot generated here vs the plot you generated in question 3 (e)

#Answer
set.seed(0)
lambda = 1
k = 5
u = runif(10000, 0, 1)
x = ((-log(1-u))^(1/k))*lambda
hist(x, freq=FALSE, main = "Histogram of Density")

cat("Insights", "This histogram exhibits a normal curve, while the scatter plot from question 3 part d is uniformly distributed. This generator does not generate uniform random values, while the other one (runif) used in 3d does.")

## Insights This histogram exhibits a normal curve, while the scatter plot from question 3 part d is uniformly distributed. This generator does not generate uniform random values, while the other one (runif) used in 3d does.

Question 5. Develop a Monte Carlo simulation in R that counts the number of uniform [0,1] random numbers that must be summed to get a sum greater than 1. Run a single simulation with n = 10,000 times and find the mean of the number of counts. Do you think this number looks somewhat familiar. Every student might get slightly different value any ideas why?

#Answer
set.seed(0)
n=10000
count = 0
total_count = 0
sum = 0
for(i in 1:n){
  while(sum<=1){
    sum = sum + runif(1)
    count = count + 1
  }
  total_count = total_count + count
  count = 0
  sum = 0
}

mean = total_count/n
cat("Mean of Number of Counts", mean)

## Mean of Number of Counts 2.7245

cat("Other Questions", "This number is close to Euler's Number. Every student might get slightly different values of means since we could all choose different seeds, so the number of uniform random numbers that must be added together to get a sum greater than 1 could vary between trials in the different seeds.")

## Other Questions This number is close to Euler's Number. Every student might get slightly different values of means since we could all choose different seeds, so the number of uniform random numbers that must be added together to get a sum greater than 1 could vary between trials in the different seeds.

Question 6. For our specific random variable, we know that its PDF is defined by the function \(f(x) = 10x(1-x)\). Utilize the accept-reject algorithm to draw samples from this distribution. Take a look at Lecture 10 slide 8 and 9

#Answer
accept.reject = function(f, c, g, rg, n){
  n.accepts = 0
  result.sample = rep(NA, n)
  while(n.accepts<n){
    x.star = rg(1)
    u = runif(1, 0, 1)
    if(u<f(x.star)/(c*g(x.star))){
      n.accepts = n.accepts+1
      result.sample[n.accepts]=x.star
    }
  }
  result.sample
}

f = function(x)10*x*(1-x)
g = function(x)x/x
rg = function(n)runif(n,0,1)
c = 3  #f(0.5) = 2.5 --> 3
vals = accept.reject(f, c, g, rg, 10000)
hist(vals, xlab = "Values", breaks=30, freq=FALSE, main="Sample vs True Density of this Acceptance-Rejection Algorithm")

Homework 3

Question 1. Consider the Binomial distribution with \(n = 24\) and \(p = .9\) that is used to model the number of correctly received bits on a satellite link that transmits data in \(24\)-bit blocks.

Quesion 1 (a) Plot the probability density function and cumulative distribution function for the number of correctly received bits

Quesion 1 (b) What is the mean number of correctly received bits? What is the standard deviation?

Quesion 1 (c) What is the probability of more than \(3\)-bit errors in the block of \(24\)?

Quesion 1 (d) What is the median of this distribution? What can you interpret from the median value?

Quesion 1 (e) What is the 60th quantile of this distribution? What can you interpret from this value?

Quesion 2 (a) What is the probability an operator can process 20 calls in an hour? Repeat for 30 calls in an hour?

Quesion 2 (b) Given that 240 calls occurred in an hour, what is the probability that the ten operators can process them all assuming they are split equally among the operators?

Quesion 2 (c) Now, If the aggregate call rate per hour \(\lambda_c\) , is measured by \(\lambda_c = 85\) calls per hour, what is the probability of more than \(100\) calls in an hour

Question 3. For the two random number generator below A and B(don’t forget to add your R code)

- [A] \(Z_i = (9Z_{i-1} + 1) \mod 16\) with \(Z_0 = 5\).

- [B] \(Z_i = (7Z_{i-1} + 3) \mod 32\) with \(Z_0 = 10\),

Quesion 3 (a) Compute \(Z_i\) and \(U_i\) for values of \(i\) until a number is repeated, what is the period of both the generator? Provide your comments about the period of both RGN

Quesion 3 (b) Which of these parameters effect the period of LCG – \(a\), \(b\), \(Z_0\)

Quesion 3 (c) For both generators plot a scatter diagram of the Zi values 1 apart. What are your observations from these plots? What do you think about the property of randomness of these generators?

Quesion 3 (d) Randomness of default random generator of R – Run runif command to generate 100 random numbers and plot the scatter diagram of these numbers ( values 1 apart) and discuss your observations about randomness of this random generator.

Quesion 3 (e) Compute the mean value of \(U_i\) across the period

Quesion 3 (f) By providing a plot of density (histogram) discuss the uniformity of both of the generators

Question 4. Using the inverse transform method

Question 4.(a) Develop an algorithm for the random variable with cumulative distribution function F(x) below.

Quesion 4 (b) Use the first three Ui values from part (a) and generator [A] of previous problem to create 3 values from the random variable when, \(\lambda = 1\), \(𝑘 = 5\)

Quesion 4 (c) sing R generate 10,000 values from your algorithm when = 1 , 𝑘 = 5 and plot a histogram of the density. Discuss what insights you obtain when you look at the plot generated here vs the plot you generated in question 3 (e)

Question 6. For our specific random variable, we know that its PDF is defined by the function \(f(x) = 10x(1-x)\). Utilize the accept-reject algorithm to draw samples from this distribution. Take a look at Lecture 10 slide 8 and 9