Question 1. Consider the Binomial distribution with \(n = 24\) and \(p
= .9\) that is used to model the number of correctly received
bits on a satellite link that transmits data in \(24\)-bit blocks.
Quesion 1 (a) Plot the probability
density function and cumulative distribution function for the number of
correctly received bits
#Answer 1a
# Install and load necessary libraries
install.packages("ggplot2")
trying URL 'https://cran.rstudio.com/bin/macosx/big-sur-arm64/contrib/4.3/ggplot2_3.4.3.tgz'
Content type 'application/x-gzip' length 3337790 bytes (3.2 MB)
==================================================
downloaded 3.2 MB
The downloaded binary packages are in
/var/folders/5k/z62m95450wjd282d5sh635lw0000gn/T//Rtmp3XujC8/downloaded_packages
library(ggplot2)
# Define parameters
n <- 24
p <- 0.9
x <- 0:n
# Calculate PDF and CDF
pdf <- dbinom(x, n, p)
cdf <- pbinom(x, n, p)
# Plot PDF
ggplot(data.frame(x=x, y=pdf), aes(x=x, y=y)) +
geom_bar(stat="identity") +
labs(title="Probability Density Function", x="Number of correctly received bits", y="Probability") +
theme_minimal()

# Plot CDF
ggplot(data.frame(x=x, y=cdf), aes(x=x, y=y)) +
geom_line() +
labs(title="Cumulative Distribution Function", x="Number of correctly received bits", y="Cumulative Probability") +
theme_minimal()

Quesion 1 (b) What is the mean number of
correctly received bits? What is the standard deviation?
#Answer 1b
#(make sure that what you print make sense to me) - for example to print mean you can use cat command once you done with the calculation and have value in the variable
#variable_name=10
#cat("Mean number of correctly received bits ", variable_name)
n <- 24
p <- 0.9
mean_val <- n * p
std_dev <- sqrt(n * p * (1 - p))
cat("Mean number of correctly received bits:", mean_val, "\n")
Mean number of correctly received bits: 21.6
cat("Standard deviation of correctly received bits:", std_dev, "\n")
Standard deviation of correctly received bits: 1.469694
Quesion 1 (c) What is the probability of
more than \(3\)-bit errors in the block
of \(24\)?
#Answer 1c
n <- 24
k <- 3
p <- .1
prob <- 1 - pbinom(k, n, p)
print(prob)
[1] 0.2142622
Quesion 1 (e) What is the 60th quantile
of this distribution? What can you interpret from this
value?
#Answer 1e
n <- 24
p <- 0.9
quantile_60 <- qbinom(0.6, n, p)
cat("60th quantile of correctly received bits:", quantile_60, "\n")
60th quantile of correctly received bits: 22
print("It means that 60% of the data is below this point")
[1] "It means that 60% of the data is below this point"
# Question2. The Public Service Answering Point (PSAP) in San Francisco employs $19$ operators in $8$-hour shifts to process $911$ calls. There are at least $5$ operators always answering calls. The number of calls processed per operator can be modeled with a Poisson random variable with rate $\lambda_0 =20$ calls per hour.
### <span style='color: grey;'>Quesion 2 (a) What is the probability an operator can process 20 calls in an hour? Repeat for 30 calls in an hour? </span>
# Question2. The Public Service Answering Point (PSAP) in San Francisco employs $19$ operators in $8$-hour shifts to process $911$ calls. There are at least $5$ operators always answering calls. The number of calls processed per operator can be modeled with a Poisson random variable with rate $\lambda_0 =20$ calls per hour.
### <span style='color: grey;'>Quesion 2 (a) What is the probability an operator can process 20 calls in an hour? Repeat for 30 calls in an hour? </span>
#Answer 2a
lambda_0 <- 20
p20calls <- dpois(20, lambda=lambda_0)
p30calls <- dpois(30, lambda=lambda_0)
cat("20calls:",p20calls)
20calls: 0.08883532
cat("30calls:",p30calls)
30calls: 0.008343536
Quesion 2 (b) Given that 240 calls
occurred in an hour, what is the probability that the ten operators can
process them all assuming they are split equally among the
operators?
#Answer 2b
lambda_0 <- 20
prob_one_operator <- dpois(24, lambda_0)
prob_ten_operators <- prob_one_operator
cat("probability that the ten operators can process them all",prob_ten_operators)
probability that the ten operators can process them all 0.05573456
Quesion 2 (c) Now, If the aggregate call
rate per hour \(\lambda_c\) , is
measured by \(\lambda_c = 85\) calls
per hour, what is the probability of more than \(100\) calls in an hour
#Answer
lambda_c <- 85
lessthan <- ppois(100, lambda=lambda_c)
morethan <- 1 - lessthan
morethan
[1] 0.04934533
Question 3. For the two random number generator below A and B(don’t
forget to add your R code)
- [A] \(Z_i = (9Z_{i-1} + 1) \mod
16\) with \(Z_0 = 5\).
- [B] \(Z_i = (7Z_{i-1} + 3) \mod
32\) with \(Z_0 = 10\),
Quesion 3 (b) Which of these parameters
effect the period of LCG – \(a\), \(b\), \(Z_0\)
#Answer
cat("Answer a,b and Z", "")
Answer a,b and Z
Quesion 3 (c) For both generators plot a
scatter diagram of the Zi values 1 apart. What are your observations
from these plots? What do you think about the property of randomness of
these generators?
#Answer
# Scatter diagram for Generator A
plot(Z_values_A[-length(Z_values_A)], Z_values_A[-1], main="Scatter Diagram for Generator A", xlab="Zi", ylab="Zi+1", pch=19, col="blue")

# Scatter diagram for Generator B
plot(Z_values_B[-length(Z_values_B)], Z_values_B[-1], main="Scatter Diagram for Generator B", xlab="Zi", ylab="Zi+1", pch=19, col="red")

cat("If the scatter diagram forms a distinct pattern (like lines or grids), then the generator might not be producing very random numbers. A more random scattering, where there doesn't seem to be a recognizable pattern, indicates a better level of randomness from the generator.
The quality of randomness is judged based on how unpredictable the sequences are. The more dispersed and pattern-less the scatter diagram, the better the randomness. If there's a discernible pattern in the scatter plot, then the sequences from that generator are more predictable, reducing its quality as a random number generator.")
If the scatter diagram forms a distinct pattern (like lines or grids), then the generator might not be producing very random numbers. A more random scattering, where there doesn't seem to be a recognizable pattern, indicates a better level of randomness from the generator.
The quality of randomness is judged based on how unpredictable the sequences are. The more dispersed and pattern-less the scatter diagram, the better the randomness. If there's a discernible pattern in the scatter plot, then the sequences from that generator are more predictable, reducing its quality as a random number generator.
Quesion 3 (d) Randomness of default
random generator of R – Run runif command to generate 100 random numbers
and plot the scatter diagram of these numbers ( values 1 apart) and
discuss your observations about randomness of this random generator.
#Answer
random_numbers <- runif(100)
plot(random_numbers[-100], random_numbers[-1],col="yellow", xlab="Value at i", ylab="Value at i+1", main="Scatter Diagram of Random Numbers (1 Apart)")

NA
NA
NA
Quesion 3 (e) Compute the mean value of
\(U_i\) across the period
#Answer
Z <- 5 # Starting value
Z_values_A <- c(Z)
m_A <- 16 # Modulus
# Generate sequence until a repetition is found
while (TRUE) {
Z <- (9 * Z + 1) %% m_A
if (Z %in% Z_values_A) {
break
}
Z_values_A <- c(Z_values_A, Z)
}
# Computing U_i values and mean for Generator A
U_values_A <- Z_values_A / m_A
mean_U_A <- mean(U_values_A)
cat("Mean A",mean_U_A)
Mean A 0.46875
Z <- 10 # Starting value
Z_values_B <- c(Z)
m_B <- 32 # Modulus
# Generate sequence until a repetition is found
while (TRUE) {
Z <- (7 * Z + 3) %% m_B
if (Z %in% Z_values_B) {
break
}
Z_values_B <- c(Z_values_B, Z)
}
# Computing U_i values and mean for Generator B
U_values_B <- Z_values_B / m_B
mean_U_B <- mean(U_values_B)
cat("Mean B",mean_U_B)
Mean B 0.421875
Question 6. For our specific random variable, we know that its PDF
is defined by the function \(f(x) =
10x(1-x)\). Utilize the accept-reject algorithm to draw samples
from this distribution. Take a look at Lecture 10 slide 8 and 9
#Answer
f <- function(x) {
return(10*x*(1-x))
}
# Using uniform distribution as proposal distribution
g <- function(x) {
return(1)
}
# Finding c
x_values <- seq(0, 1, 0.01)
c <- max(f(x_values))
accept_reject_sampling <- function(n) {
samples <- numeric(n)
count <- 0
while(count < n) {
X <- runif(1)
U <- runif(1)
if(U <= f(X) / (c * g(X))) {
count <- count + 1
samples[count] <- X
}
}
return(samples)
}
# Drawing samples
samples <- accept_reject_sampling(10000)
hist(samples, main="Histogram of Samples", xlab="x", breaks=10, col="lightblue", border="black")

---
title: " <center><span style='color: blue;'> Homework 3 </span></center>"
output: html_notebook
---


# Question 1.  Consider the Binomial distribution with $n = 24$ and $p = .9$ that is used to model the number of correctly received bits on a satellite link that transmits data in $24$-bit blocks.

### <span style='color: grey;'>Quesion 1 (a) Plot the probability density function and cumulative distribution function for the number of correctly received bits</span>

```{r}
#Answer 1a

install.packages("ggplot2")
library(ggplot2)

# Define parameters
n <- 24
p <- 0.9
x <- 0:n

# Calculate PDF and CDF
pdf <- dbinom(x, n, p)
cdf <- pbinom(x, n, p)

# Plot PDF
ggplot(data.frame(x=x, y=pdf), aes(x=x, y=y)) +
  geom_bar(stat="identity") +
  labs(title="Probability Density Function", x="Number of correctly received bits", y="Probability") +
  theme_minimal()

# Plot CDF
ggplot(data.frame(x=x, y=cdf), aes(x=x, y=y)) +
  geom_line() +
  labs(title="Cumulative Distribution Function", x="Number of correctly received bits", y="Cumulative Probability") +
  theme_minimal()

```

### <span style='color: grey;'>Quesion 1 (b) What is the mean number of correctly received bits? What is the standard deviation?</span>
```{r}
#Answer 1b
#(make sure that what you print make sense to me) - for example to print mean you can use cat command once you done with the calculation and have value in the variable
#variable_name=10 
#cat("Mean number of correctly received bits ",  variable_name)

n <- 24
p <- 0.9


mean_val <- n * p
std_dev <- sqrt(n * p * (1 - p))


cat("Mean number of correctly received bits:", mean_val, "\n")
cat("Standard deviation of correctly received bits:", std_dev, "\n")


```

### <span style='color: grey;'>Quesion 1 (c) What is the probability of more than $3$-bit errors in the block of $24$?</span>
```{r}
#Answer 1c
n <- 24  
k <- 3   
p <- .1  

prob <- 1 - pbinom(k, n, p)

print(prob)


```


### <span style='color: grey;'>Quesion 1 (d) What is the median of this distribution? What can you interpret from the median value? </span>
```{r}
#Answer 1d

n <- 24
p <- 0.9


median_val <- qbinom(0.5, n, p)


cat("Median number of correctly received bits:", median_val, "\n")

print("It is the middle value of Bits recieved")

```


### <span style='color: grey;'>Quesion 1 (e)  What is the 60th quantile of this distribution? What can you interpret from this value?</span>
```{r}
#Answer 1e

n <- 24
p <- 0.9


quantile_60 <- qbinom(0.6, n, p)


cat("60th quantile of correctly received bits:", quantile_60, "\n")


print("It means that 60% of the data is below this point")


# Question2. The Public Service Answering Point (PSAP) in San Francisco employs $19$ operators in $8$-hour shifts to process $911$ calls. There are at least $5$ operators always answering calls. The number of calls processed per operator can be modeled with a Poisson random variable with rate $\lambda_0 =20$ calls per hour.

### <span style='color: grey;'>Quesion 2 (a) What is the probability an operator can process 20 calls in an hour? Repeat for 30 calls in an hour? </span>
```

```{r}
# Question2. The Public Service Answering Point (PSAP) in San Francisco employs $19$ operators in $8$-hour shifts to process $911$ calls. There are at least $5$ operators always answering calls. The number of calls processed per operator can be modeled with a Poisson random variable with rate $\lambda_0 =20$ calls per hour.

### <span style='color: grey;'>Quesion 2 (a) What is the probability an operator can process 20 calls in an hour? Repeat for 30 calls in an hour? </span>
#Answer 2a

lambda_0 <- 20


p20calls <- dpois(20, lambda=lambda_0)


p30calls <- dpois(30, lambda=lambda_0)


cat("20calls:",p20calls)
cat("30calls:",p30calls)


```


### <span style='color: grey;'>Quesion 2 (b) Given that 240 calls occurred in an hour, what is the probability that the ten operators can process them all assuming they are split equally among the operators?</span>
```{r}
#Answer 2b
lambda_0 <- 20
prob_one_operator <- dpois(24, lambda_0)
prob_ten_operators <- prob_one_operator

cat("probability that the ten operators can process them all",prob_ten_operators)


```


### <span style='color: grey;'>Quesion 2 (c) Now, If the aggregate call rate per hour $\lambda_c$ , is measured by $\lambda_c = 85$ calls per hour, what is the probability of more than $100$ calls in an hour</span>
```{r}
#Answer 
lambda_c <- 85
lessthan <- ppois(100, lambda=lambda_c)
morethan <- 1 - lessthan
morethan


```

# Question 3. For the two random number generator below A and B(don’t forget to add your R code)
### - [A] \(Z_i = (9Z_{i-1} + 1) \mod 16\) with \(Z_0 = 5\).
### - [B] \(Z_i = (7Z_{i-1} + 3) \mod 32\) with \(Z_0 = 10\),


### <span style='color: grey;'>Quesion 3 (a) Compute $Z_i$ and $U_i$ for values of $i$ until a number is repeated, what is the period of both the generator? Provide your comments about the period of both RGN </span>

```{r}
#Answer 3a
# Generator A
Z_0_A <- 5
Z_values_A <- c(Z_0_A)
U_values_A <- c(Z_0_A / 16)

while(TRUE) {
  Z_next_A <- (9 * Z_values_A[length(Z_values_A)] + 1) %% 16
  if(Z_next_A %in% Z_values_A) break
  Z_values_A <- append(Z_values_A, Z_next_A)
  U_values_A <- append(U_values_A, Z_next_A / 16)
}


cat("Period of A ",period_A)

# Generator B
Z_0_B <- 10
Z_values_B <- c(Z_0_B)
U_values_B <- c(Z_0_B / 32)

while(TRUE) {
  Z_next_B <- (7 * Z_values_B[length(Z_values_B)] + 3) %% 32
  if(Z_next_B %in% Z_values_B) break
  Z_values_B <- append(Z_values_B, Z_next_B)
  U_values_B <- append(U_values_B, Z_next_B / 32)
}


cat(" ,Period of B ",period_B)


```


### <span style='color: grey;'>Quesion 3 (b) Which of these parameters effect the period of LCG – $a$, $b$, $Z_0$ </span>
```{r}
#Answer 
cat("Answer a,b and Z", "")
```


### <span style='color: grey;'>Quesion 3 (c) For both generators plot a scatter diagram of the Zi values 1 apart. What are your observations from these plots? What do you think about the property of randomness of these generators? </span>
```{r}
#Answer
# Scatter diagram for Generator A
plot(Z_values_A[-length(Z_values_A)], Z_values_A[-1], main="Scatter Diagram for Generator A", xlab="Zi", ylab="Zi+1", pch=19, col="blue")

# Scatter diagram for Generator B
plot(Z_values_B[-length(Z_values_B)], Z_values_B[-1], main="Scatter Diagram for Generator B", xlab="Zi", ylab="Zi+1", pch=19, col="red")

cat("If the scatter diagram forms a distinct pattern (like lines or grids), then the generator might not be producing very random numbers. A more random scattering, where there doesn't seem to be a recognizable pattern, indicates a better level of randomness from the generator.
The quality of randomness is judged based on how unpredictable the sequences are. The more dispersed and pattern-less the scatter diagram, the better the randomness. If there's a discernible pattern in the scatter plot, then the sequences from that generator are more predictable, reducing its quality as a random number generator.")

```


### <span style='color: grey;'>Quesion 3 (d) Randomness of default random generator of R – Run runif command to generate 100 random numbers and plot the scatter diagram of these numbers ( values 1 apart) and discuss your observations about randomness of this random generator. </span>
```{r}
#Answer

random_numbers <- runif(100)


plot(random_numbers[-100], random_numbers[-1],col="yellow", xlab="Value at i", ylab="Value at i+1", main="Scatter Diagram of Random Numbers (1 Apart)")



```


### <span style='color: grey;'>Quesion 3 (e) Compute the mean value of $U_i$ across the period</span>
```{r}
#Answer
Z <- 5  
Z_values_A <- c(Z)
m_A <- 16  


while (TRUE) {
  Z <- (9 * Z + 1) %% m_A
  if (Z %in% Z_values_A) {
    break
  }
  Z_values_A <- c(Z_values_A, Z)
}


U_values_A <- Z_values_A / m_A
mean_U_A <- mean(U_values_A)


cat("Mean A",mean_U_A)

Z <- 10  # Starting value
Z_values_B <- c(Z)
m_B <- 32  # Modulus


while (TRUE) {
  Z <- (7 * Z + 3) %% m_B
  if (Z %in% Z_values_B) {
    break
  }
  Z_values_B <- c(Z_values_B, Z)
}

# Computing U_i values and mean for Generator B
U_values_B <- Z_values_B / m_B
mean_U_B <- mean(U_values_B)


cat("Mean B",mean_U_B)


```


### <span style='color: grey;'>Quesion 3 (f) By providing a plot of density (histogram) discuss the uniformity of both of the generators </span>
```{r}
#Answer
# Plot histogram for U_values_A
hist(U_values_A, breaks=20, main="Histogram of U_i Values for Generator A", xlab="U_i", col="lightblue", probability=TRUE, border="black")

# Adding a density line
lines(density(U_values_A), col="blue", lwd=2)

# Plot histogram for U_values_B
hist(U_values_B, breaks=20, main="Histogram of U_i Values for Generator B", xlab="U_i", col="lightgreen", probability=TRUE, border="black")

# Adding a density line
lines(density(U_values_B), col="green", lwd=2)


```

# Question 4. Using the inverse transform method

### <span style='color: grey;'>Question 4.(a) Develop an algorithm for the random variable with cumulative distribution function F(x) below.</span>

\[ F(x) = 1 - e^{-(x/\lambda)^k} \]

where \( x \geq 0 \), \( \lambda \geq 0 \), and \( k \geq 0 \).

```{r}
# Answer 4a
# This might be hard to print using R - you can write the step in your notebook and then include image here / or you can include as seperate file when upload


if (!require(IRdisplay, quietly = TRUE)) {
  
  install.packages("IRdisplay")
  
  
  library(IRdisplay)
} else {
  
  library(IRdisplay)
}

inverse_transform_method <- function(lambda, k, n) {
  # n is the number of random numbers you want to generate
  
  
  U <- runif(n)
  
  # Apply the inverse transform
  x <- lambda * (-log(1 - U))^(1/k)
  
  return(x)
}


random_numbers <- inverse_transform_method(lambda = 1, k = 2, n = 1000)
hist(random_numbers, main="Histogram of Generated Random Numbers", xlab="x", breaks=50, col="lightblue", border="black")


# Embed an image
#display_png(file = "example.png")


```


### <span style='color: grey;'>Quesion 4 (b) Use the first three Ui values from part (a) and generator [A] of previous problem to create 3 values from the random variable when, $\lambda = 1$, $𝑘 = 5$</span>
```{r}
#Answer

Z <- 5  
Z_values_A <- c(Z)
m_A <- 16  

for(i in 1:2) {
  Z <- (9 * Z + 1) %% m_A
  Z_values_A <- c(Z_values_A, Z)
}


U_values_A <- Z_values_A / m_A


lambda <- 1
k <- 5
x_values <- lambda * (-log(1 - U_values_A[1:3]))^(1/k)

print(x_values)


```


### <span style='color: grey;'>Quesion 4 (c) sing R generate 10,000 values from your algorithm when  \lambda = 1 , 𝑘 = 5  and plot a histogram of the density. Discuss what insights you obtain when you look at the plot generated here vs the plot you generated in question 3 (e) </span>
```{r}
#Answer

generate_values <- function(lambda, k, n) {
  U <- runif(n)  # Generate n uniformly distributed random numbers
  
 
  x <- lambda * (-log(1 - U))^(1/k)
  return(x)
}


lambda <- 1
k <- 5
n <- 10000
generated_values <- generate_values(lambda, k, n)

# Plot the histogram
hist(generated_values, main="Histogram for Inverse Transform Generated Values", xlab="x", breaks=50, probability=TRUE, col="lightblue", border="black", ylim=c(0, 0.6))

# Adding a density line
lines(density(generated_values), col="blue", lwd=2)
cat("The histogram from 4c, shaped by the exponential-like distribution
F(x), likely shows a pronounced skew towards lower values, diverging from the more evenly spread values of the generator in 3(e), which aimed for uniformity.If the histogram from 3(e) had relatively consistent bar heights, it'd indicate the LCG's success at approximating a uniform distribution; in contrast, varying bar heights in 4c's histogram are expected given its non-uniform target distribution.")

```

# Question 5. Develop a Monte Carlo simulation in R that counts the number of uniform [0,1] random numbers that must be summed to get a sum greater than 1. Run a single simulation with n = 10,000 times and find the mean of the number of counts. Do you think this number looks somewhat familiar. Every student might get slightly different value any ideas why?
```{r}
#Answer
monte_carlo_simulation <- function() {
  sum <- 0
  count <- 0
  while(sum <= 1) {
    sum <- sum + runif(1) # Generate a random number between 0 and 1
    count <- count + 1
  }
  return(count)
}


n <- 10000
results <- replicate(n, monte_carlo_simulation())

# Finding the mean count
mean_count <- mean(results)
print(mean_count)

```

# Question 6. For our specific random variable, we know that its PDF is defined by the function $f(x) = 10x(1-x)$. Utilize the accept-reject algorithm to draw samples from this distribution. Take a look at Lecture 10 slide 8 and 9

```{r}
#Answer
f <- function(x) {
  return(10*x*(1-x))
}

# Using uniform distribution as proposal distribution
g <- function(x) {
  return(1)
}

# Finding c
x_values <- seq(0, 1, 0.01)
c <- max(f(x_values))

accept_reject_sampling <- function(n) {
  samples <- numeric(n)
  count <- 0
  while(count < n) {
    X <- runif(1)
    U <- runif(1)
    if(U <= f(X) / (c * g(X))) {
      count <- count + 1
      samples[count] <- X
    }
  }
  return(samples)
}

# Drawing samples
samples <- accept_reject_sampling(10000)
hist(samples, main="Histogram of Samples", xlab="x", breaks=10, col="lightblue", border="black")

```


### <span style='color: grey;'>Quesion 1 (b) </span>
```{r}
#Answer

```

