Joel Park
Choose independently two numbers B and C at random from the interval [0, 1] with uniform density. Prove that B and C are proper probability distributions. Note that the point (B,C) is then chosen at random in the unit square. Find the probability that:
(A) B + C < 1/2
Suppose that we choose two random real numbers in [0,1] and add them together. Let Z be the sum B + C. We will now derive expressions for the cumulative distribution function and the density function of Z.
Here we take for our sample space \(\Omega\) the unit square in \(\mathbb{R}^2\) with uniform density. A point \(\omega \in \Omega\) then consists of a pair \((x,y)\) of numbers chosen at random. Then \(0 \leq Z \leq 2\). Let \(E_Z\) denote the event that \(Z \leq z\). Let’s create a simulation and plot a histogram to determine the probability distribution and density.
Of note, this particular example appears as Example 2.14 in the textbook ‘Introduction to Probability’.
# Create 10k random numbers between 0 to 1 for both B and C
B <- runif(10000, min = 0, max = 1)
C <- runif(10000, min = 0, max = 1)
# Creating Z from B + C
Z <- B + C
# Plot the histogram
hist(Z,
main = "Histogram of Distribution of Z",
xlab = "Z",
border = "blue",
col = "green",
xlim=c(0,2.5),
ylim=c(0,1.2),
las = 1,
breaks = 10,
probability = TRUE)
# Drawing in the lines to create the density function curve
abline(0,1)
abline(2,-1)
The textbook works through the same example and arrives at a similar result.
As you can see, for \(0 \leq Z \leq 2\), the simulation above appears to take on a density function of:
\[f_Z(z) = \begin{cases} 0, & \text{if } z < 0 \\ z, & \text{if } 0 \leq z \leq 1 \\ 2-z, & \text{if } 1 \leq z \leq 2 \\ 0, & \text{if } 2 < z \\ \end{cases}\]
And to get the cumulative distribution function, we integrate the density function:
\[F_Z(z) = \begin{cases} 0, & \text{if } z < 0 \\ (1/2)z^2, & \text{if } 0 \leq z \leq 1 \\ 1-(1/2)(2-z)^2, & \text{if } 1 \leq z \leq 2 \\ 1, & \text{if } 2 < z \\ \end{cases}\]
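To make the integration step explicit, here is a short worked sketch of the two nonzero branches. It uses only the density \(f_Z\) stated above; the dummy variable \(t\) is introduced here for illustration.
\[F_Z(z) = \int_0^z t \, dt = \tfrac{1}{2}z^2 \quad \text{for } 0 \leq z \leq 1\]
\[F_Z(z) = \int_0^1 t \, dt + \int_1^z (2-t) \, dt = \tfrac{1}{2} + \left(2z - \tfrac{1}{2}z^2 - \tfrac{3}{2}\right) = 1 - \tfrac{1}{2}(2-z)^2 \quad \text{for } 1 \leq z \leq 2\]
Both branches give \(F_Z(1) = 1/2\), so the two pieces join continuously at \(z = 1\).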
Before we find the probability that B + C < 1/2, we must verify that \(f_Z(z)\) satisfies the definition of a probability density function, as noted below.
Criterion 1: \(f_Z(z) \geq 0\) for all \(z\). The function is zero outside \([0, 2]\) and nonnegative inside it, so this holds.
Criterion 2: The total area under the curve \(f_Z(z)\) equals 1, that is, \(\int_{-\infty}^{\infty} f_Z(z) \, dz = 1\).
# Define the integrated function
integrand1 <- function(x){x}
integrand2 <- function(x){2-x}
# Integrate the function from 0 to 1, and then 1 to 2
ans1 <- integrate(integrand1, lower = 0, upper = 1)
ans2 <- integrate(integrand2, lower = 1, upper = 2)
# Add them together
total <- ans1$value + ans2$value
# Compare with a tolerance, since integrate() returns a numerical approximation
print(paste0("Does the area under the curve of f(z) == 1? ", isTRUE(all.equal(total, 1))))
## [1] "Does the area under the curve of f(z) == 1? TRUE"
Criterion 3: If \(f_Z(z)\) is the PDF, then the probability that Z belongs to A, where A is some interval, is given by the integral of \(f_Z(z)\) over that interval. This holds as well.
Given that \(f_Z(z)\) fulfills the criteria, let’s find what the probability of B + C < 1/2 is.
# Using the integral function above
ans3 <- integrate(integrand1, lower = 0, upper = 1/2)
print(paste0("The probability of B + C < 1/2 is: ", ans3$value))
## [1] "The probability of B + C < 1/2 is: 0.125"
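As a geometric cross-check (a sketch that relies only on the uniform density on the unit square), the event B + C < 1/2 is the right triangle with vertices (0,0), (1/2,0), and (0,1/2), so its probability is the area of that triangle:
\[P(B + C < 1/2) = \tfrac{1}{2} \cdot \tfrac{1}{2} \cdot \tfrac{1}{2} = \tfrac{1}{8} = 0.125\]
This agrees with evaluating the distribution function directly, \(F_Z(1/2) = (1/2)(1/2)^2 = 1/8\).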
(B) BC < 1/2
Let us create the simulation.
# Create 10k random numbers between 0 to 1 for both B and C
B2 <- runif(10000, min = 0, max = 1)
C2 <- runif(10000, min = 0, max = 1)
# Creating Z from B2 * C2
Z2 <- B2 * C2
# Plot the histogram
histinfo <- hist(Z2,
main = "Histogram of Distribution of Z",
xlab = "Z",
border = "blue",
col = "green",
xlim=c(0,1.2),
las = 1,
breaks = 10,
probability = TRUE)
# Adding density curve to the histogram
lines(density(Z2))
Now, let’s plot its empirical cumulative distribution curve.
CDFcolor <- rgb(1,0,0)
plot(ecdf(Z2), col = CDFcolor, main="Empirical Cumulative Distribution of Z")
Though we do not have a formula as in part (A), we can certainly work with the simulation data to determine the probability that BC < 1/2. As noted above, we need to satisfy the definition of a probability density function.
For Criterion 1, the histogram above shows that the estimated density is nonnegative for \(0 \leq Z \leq 1\), so it is satisfied.
Criteria 2 and 3 go hand in hand. The area under the curve should equal 1. (In other words, the areas of all the histogram bars should sum to 1.)
# In the above function, I had broken down the histogram into 10 bins. Given that each bin width = 0.1, we need to multiply each height of the density by 0.1 and sum all of the figures up.
hist_density <- sum(histinfo$density * .1)
# Compare with a tolerance rather than exact floating-point equality
print(paste0("Does the area under the curve == 1? ", isTRUE(all.equal(hist_density, 1))))
## [1] "Does the area under the curve == 1? TRUE"
Given that all the criteria are satisfied, let’s take the first five bins, which cover \(0 \leq Z \leq 0.5\), and add up their areas. This gives an estimate of the probability that BC < 1/2.
hist_density_0.5 <- sum(histinfo$density[1:5]*.1)
print(paste0("BC < 1/2 probability: ", hist_density_0.5))
## [1] "BC < 1/2 probability: 0.8535"
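As a cross-check that does not depend on where the bin edges fall, the same probability can be estimated directly as the proportion of simulated products below 1/2. The following is a minimal sketch; the names `n`, `B_sim`, `C_sim`, `p_hat`, and `se_hat`, as well as the seed, are illustrative assumptions and not part of the analysis above.

```r
# Illustrative cross-check (names and seed are assumptions, not from the analysis above)
set.seed(2024)                       # fix the seed so the run is reproducible
n <- 10000
B_sim <- runif(n, min = 0, max = 1)
C_sim <- runif(n, min = 0, max = 1)

# Proportion of simulated pairs whose product falls below 1/2
p_hat <- mean(B_sim * C_sim < 1/2)

# Binomial standard error of that proportion
se_hat <- sqrt(p_hat * (1 - p_hat) / n)

print(paste0("Estimated P(BC < 1/2): ", p_hat,
             " (standard error about ", round(se_hat, 4), ")"))
```

The standard error gives a rough sense of how much the estimate would move from one run of 10,000 draws to the next.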
Before we continue to part (C), I would like to take a moment to look at this problem from a completely different view: a geometric one. Suppose that instead of B and C, we use points x and y, where x lies on the x axis and y lies on the y axis. Let’s solve the above inequality for y.
\[xy < 1/2 \rightarrow y < 0.5/x \hspace{0.2cm} | \hspace{0.2cm}0 < x,y \leq 1\]
Let us plot this graph out.
curve(0.5/x, 0, 1, col = "red", ylim = c(0,1))
abline(h=1, col = "violet")
abline(v=1)
abline(v=0.5, lty=2)
As you can see, the area under the curve that lies inside the unit square, with x and y both restricted to [0,1], is the probability that \(xy < 0.5\). We can split the region into two components. For x in [0,0.5], the curve lies above y = 1, so the whole strip with y in [0,1] counts, with area \(0.5 \cdot 1\). For x in [0.5,1], the area is \(\int_{0.5}^1 (0.5/x) \, dx\). Adding the two gives the probability that \(xy < 0.5\).
B_function <- function(x){0.5/x}
integrated_B <- integrate(B_function, 0.5,1)
area <- 0.5 + integrated_B$value
print(paste0("The Probability that xy (or BC) < 0.5: ", area))
## [1] "The Probability that xy (or BC) < 0.5: 0.846573590279973"
As you can see, this answer is very close to the one I obtained from the simulation.
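The integral also has a closed form, which gives a quick way to confirm the numerical value. This is a worked sketch using the same two-part split of the region:
\[P(xy < 1/2) = \frac{1}{2} + \int_{1/2}^{1} \frac{1}{2x} \, dx = \frac{1}{2} + \frac{1}{2}\left(\ln 1 - \ln \tfrac{1}{2}\right) = \frac{1}{2} + \frac{\ln 2}{2} \approx 0.8466\]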
(C) |B - C| < 1/2
Let us create the simulation.
# Create 10k random numbers between 0 to 1 for both B and C
B3 <- runif(10000, min = 0, max = 1)
C3 <- runif(10000, min = 0, max = 1)
# Creating Z from |B3 - C3|
Z3 <- abs(B3 - C3)
# Let's plot this out
histinfo3 <- hist(Z3,
main = "Histogram of Distribution of Z",
xlab = "Z",
border = "blue",
col = "yellow",
xlim=c(-.2,1.2),
las = 1,
breaks = 10,
probability = TRUE)
# Draw the proposed density function
abline(2,-2)
In fact, geometrically, we can determine the density function from the plot above.
\[f_Z(z) = \begin{cases} 0, & \text{if } z < 0 \\ 2-2z, & \text{if } 0 \leq z \leq 1 \\ 0, & \text{if } 1 < z \\ \end{cases}\]
And to get the cumulative distribution function, we will integrate the above function.
\[F_Z(z) = \begin{cases} 0, & \text{if } z < 0 \\ 2z-z^2, & \text{if } 0 \leq z \leq 1 \\ 1, & \text{if } 1 < z \\ \end{cases}\]
Now to verify that the function \(f_Z(z)\) satisfies all three criteria.
\(f_Z(z)\) is nonnegative everywhere. Satisfied.
The area under the curve \(f_Z(z)\) == 1:
integrand3 <- function(x){2-2*x}
total1 <- integrate(integrand3, lower = 0, upper = 1)
# Compare with a tolerance rather than exact floating-point equality
print(paste0("Does the area under the curve of f(z) == 1? ", isTRUE(all.equal(total1$value, 1))))
## [1] "Does the area under the curve of f(z) == 1? TRUE"
Now what is the probability that |B - C| < 1/2?
total2 <- integrate(integrand3, lower = 0, upper = 0.5)
print(paste0("What is the probability that |B - C| < 1/2? ", total2$value))
## [1] "What is the probability that |B - C| < 1/2? 0.75"
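The same value follows from the geometry of the unit square, which also explains where the linear density comes from. As a sketch: for \(0 \leq z \leq 1\), the event \(|B - C| \geq z\) consists of two corner triangles, each with legs of length \(1 - z\), so
\[P(|B - C| < z) = 1 - 2 \cdot \tfrac{1}{2}(1-z)^2 = 2z - z^2\]
Differentiating gives the density \(2 - 2z\), and setting \(z = 1/2\) gives
\[P(|B - C| < 1/2) = 1 - \left(\tfrac{1}{2}\right)^2 = \tfrac{3}{4} = 0.75\]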
(D) max{B,C} < 1/2
Suppose we set Z = max{B,C}.
# Create 10k random numbers between 0 to 1 for both B and C
B4 <- runif(10000, min = 0, max = 1)
C4 <- runif(10000, min = 0, max = 1)
# Creating Z from maximum pairwise from either B4 or C4
# Reference: https://stackoverflow.com/questions/19994543/how-can-i-take-pairwise-maximum-between-two-vectors
Z4 <- pmax(B4, C4)
# Creating a histogram for Z4
histinfo4 <- hist(Z4,
main = "Histogram of Distribution of Z",
xlab = "Z",
border = "blue",
col = "red",
xlim=c(-.2,1.2),
ylim=c(0, 2),
las = 1,
breaks = 10,
probability = TRUE)
# Creating a density function that would fit this curve
abline(0,2)
Judging from this simulation, the density function appears to be:
\[f_Z(z) = \begin{cases} 0, & \text{if } z < 0 \\ 2z, & \text{if } 0 \leq z \leq 1 \\ 0, & \text{if } 1 < z \\ \end{cases}\]
And to get the cumulative distribution function, we will integrate the above function.
\[F_Z(z) = \begin{cases} 0, & \text{if } z < 0 \\ z^2, & \text{if } 0 \leq z \leq 1 \\ 1, & \text{if } 1 < z \\ \end{cases}\]
Let’s verify the conditions:
integrand4 <- function(x){2*x}
integrated_D <- integrate(integrand4, 0, 1)
# Compare with a tolerance rather than exact floating-point equality
print(paste0("Area under the curve == 1? ", isTRUE(all.equal(integrated_D$value, 1))))
## [1] "Area under the curve == 1? TRUE"
What is the probability that max{B,C} < 1/2?
total3 <- integrate(integrand4, lower = 0, upper = 0.5)
print(paste0("What is the probability that max{B,C} < 1/2? ", total3$value))
## [1] "What is the probability that max{B,C} < 1/2? 0.25"
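This result can also be reached without the simulation, using only the independence of B and C. As a sketch: the maximum is below \(z\) exactly when both numbers are below \(z\), so for \(0 \leq z \leq 1\)
\[P(\max\{B,C\} < z) = P(B < z) \, P(C < z) = z^2\]
which matches the \(z^2\) branch of \(F_Z\) above, and at \(z = 1/2\) gives \((1/2)^2 = 1/4 = 0.25\).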
(E) min{B,C} < 1/2
# Create 10k random numbers between 0 to 1 for both B and C
B5 <- runif(10000, min = 0, max = 1)
C5 <- runif(10000, min = 0, max = 1)
# Creating Z from minimum pairwise from either B5 or C5
Z5 <- pmin(B5, C5)
# Creating a histogram for Z5
histinfo5 <- hist(Z5,
main = "Histogram of Distribution of Z",
xlab = "Z",
border = "blue",
col = "yellow",
xlim=c(-.2,1.2),
ylim=c(0,2),
las = 1,
breaks = 10,
probability = TRUE)
# Creating a density function that would fit this curve
abline(2,-2)
Judging from this simulation, the density function appears to be:
\[f_Z(z) = \begin{cases} 0, & \text{if } z < 0 \\ 2-2z, & \text{if } 0 \leq z \leq 1 \\ 0, & \text{if } 1 < z \\ \end{cases}\]
And to get the cumulative distribution function, we will integrate the above function.
\[F_Z(z) = \begin{cases} 0, & \text{if } z < 0 \\ 2z-z^2, & \text{if } 0 \leq z \leq 1 \\ 1, & \text{if } 1 < z \\ \end{cases}\]
Let’s verify the conditions:
integrand5 <- function(x){2-2*x}
integrated_E <- integrate(integrand5, 0, 1)
# Compare with a tolerance rather than exact floating-point equality
print(paste0("Area under the curve == 1? ", isTRUE(all.equal(integrated_E$value, 1))))
## [1] "Area under the curve == 1? TRUE"
What is the probability that min{B,C} < 1/2?
total4 <- integrate(integrand5, lower = 0, upper = 0.5)
print(paste0("What is the probability that min{B,C} < 1/2? ", total4$value))
## [1] "What is the probability that min{B,C} < 1/2? 0.75"
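As with the maximum, independence gives this value directly. As a sketch: the minimum is at least \(z\) exactly when both numbers are at least \(z\), so for \(0 \leq z \leq 1\)
\[P(\min\{B,C\} < z) = 1 - P(B \geq z) \, P(C \geq z) = 1 - (1-z)^2 = 2z - z^2\]
At \(z = 1/2\) this is \(1 - 1/4 = 3/4 = 0.75\), in agreement with the integral above.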