Continuous Probability Densities:
Original question:
- Make a function that’ll return two random real numbers in \([0,1]\) and record the first of these two numbers if their sum is less than 1.
- Run this procedure 10,000 times and make a bar graph of the result, breaking the interval \([0,1]\) into 10 intervals
- Compare the graph with the function \(f(x)=2-2x\) and show that there is a constant \(c\) such that the height of \(T\) at the \(x\)-coordinate value \(x\) is \(c\).
- Finally, show that \(\int_0^1 f(x)dx=1\) and determine how you might find the probability that the outcome is between .2 and .5?
Answer:
Below, I’ve created a function triangle_func
that uses a while
loop to to ensure that a number that meets the criteria is always returned. Also included is a sample output.
triangle_func <- function(i){
rand_a <- runif(1)
rand_b <- runif(1)
tot <- rand_a + rand_b
while(tot > 1){
rand_a <- runif(1)
rand_b <- runif(1)
tot <- rand_a + rand_b
}
pt <- data.frame( i = c(i), val = c(rand_a), stringsAsFactors = F)
return(pt)
}
triangle_func(1)
## i val
## 1 1 0.2224673
To run this procedure 10,000 times, I’ll create another function, my_sim
which basically constructs a large data frame from the individual results of the triangle_func
- I will also include a small sample output:
my_sim <- function(n){
c <- triangle_func(1)
if(n == 1){
return(c)
}else{
for(i in 2:n){
c <- rbind(c, triangle_func(i))
}
}
my_xb <- c(0, .1, .2, .3, .4, .5, .6, .7, .8, .9, 1)
my_xl <- c("0-.1", ".1-.2", ".2-.3", ".3-.4", ".4-.5",
".5-.6",".6-.7", ".7-.8", ".8-.9", ".9-1")
c$bin <- cut(c$val, breaks = my_xb, labels = my_xl) # add bin column
return(c)
}
my_sim(4)
## i val bin
## 1 1 0.4070053 .4-.5
## 2 2 0.1672384 .1-.2
## 3 3 0.4741705 .4-.5
## 4 4 0.1892686 .1-.2
Below, I’ve used my function to hold the 10,000 records in the variable my_data
.
# couple of small vectors for plotting later
my_data <- my_sim(10000) # create the sim for this problem.
Next, I’ll make a bar plot of the 10K results of the simulation:
my_plot <- ggplot(my_data, x = bin, y = val) +
geom_bar(aes(bin), fill= "black", alpha = .7)
my_plot
The last part asks for me to compare the function \(f(x)=2-2x\) with the values produced by sim and find a value for \(c\) such that \(cf(x) = T\). I found \(c\) by inspection. The small table below shows the results. Note that \(T\) comes from the simulation and is the % of occurrences of a given bin.
ggg <- my_data %>% select(val, bin) %>%
mutate_if(is.factor, as.character) %>%
group_by(bin) %>%
summarise(n = n()) %>% arrange(desc(n)) %>%
mutate(`T` = n/sum(n))
xs <- c(.1, .2, .3, .4, .5, .6, .7, .8, .9, 1) - .05
#length(xs)
combo <- cbind(ggg, data.frame(x_mid = xs, `fx` = (2-2*xs))) %>%
mutate(cfx = fx*(1/10))
bin | \(n\) | \(T\) | \(x_{midbin}\) | \(f(x)\) | \((1/10)f(x)\) |
---|---|---|---|---|---|
0-.1 | 1887 | 0.1887 | 0.05 | 1.9 | 0.19 |
.1-.2 | 1691 | 0.1691 | 0.15 | 1.7 | 0.17 |
.2-.3 | 1555 | 0.1555 | 0.25 | 1.5 | 0.15 |
.3-.4 | 1269 | 0.1269 | 0.35 | 1.3 | 0.13 |
.4-.5 | 1120 | 0.1120 | 0.45 | 1.1 | 0.11 |
.5-.6 | 909 | 0.0909 | 0.55 | 0.9 | 0.09 |
.6-.7 | 713 | 0.0713 | 0.65 | 0.7 | 0.07 |
.7-.8 | 463 | 0.0463 | 0.75 | 0.5 | 0.05 |
.8-.9 | 290 | 0.0290 | 0.85 | 0.3 | 0.03 |
.9-1 | 103 | 0.0103 | 0.95 | 0.1 | 0.01 |
As shown in the table above, I’ve set \(c=\frac{1}{10}\).
To show that \(\int_0^1 f(x)dx=1\), I’ll just integrate and solve:
\(\int_0^1 f(x)dx=F(1)-F(0) \implies\) \(\int_0^1 2-2xdx = 2(1)-(1)^2+c-(2(0)-(0)^2+c)=1\).
To find the prob between .2 and .5, I’d integrate or create random variables from the integrals to calculate similar to the above.