DiscussWk5_605

jbrnbrg

September 25, 2017

Continuous Probability Densities:

Original question:Ch2, Ex10, pg 64 (pdf)

  • Make a function that’ll return two random real numbers in \([0,1]\) and record the first of these two numbers if their sum is less than 1.
  • Run this procedure 10,000 times and make a bar graph of the result, breaking the interval \([0,1]\) into 10 intervals
  • Compare the graph with the function \(f(x)=2-2x\) and show that there is a constant \(c\) such that the height of \(T\) at the \(x\)-coordinate value \(x\) is \(c\).
  • Finally, show that \(\int_0^1 f(x)dx=1\) and determine how you might find the probability that the outcome is between .2 and .5?

Answer:

Below, I’ve created a function triangle_func that uses a while loop to to ensure that a number that meets the criteria is always returned. Also included is a sample output.

triangle_func <- function(i){
  rand_a <- runif(1)
  rand_b <- runif(1)
  tot <- rand_a + rand_b
  
  while(tot > 1){
      rand_a <- runif(1)
      rand_b <- runif(1)
      tot <- rand_a + rand_b
  }
  pt <- data.frame( i = c(i), val = c(rand_a), stringsAsFactors = F)
  return(pt)
}
triangle_func(1)
##   i       val
## 1 1 0.2224673

To run this procedure 10,000 times, I’ll create another function, my_sim which basically constructs a large data frame from the individual results of the triangle_func - I will also include a small sample output:

my_sim <- function(n){
  c <- triangle_func(1)
  
  if(n == 1){
    return(c)
  }else{
    for(i in 2:n){
      c <- rbind(c, triangle_func(i))
    }
  }
  my_xb <- c(0, .1, .2, .3, .4, .5, .6, .7, .8, .9, 1)
  my_xl <- c("0-.1", ".1-.2", ".2-.3", ".3-.4", ".4-.5", 
             ".5-.6",".6-.7", ".7-.8", ".8-.9", ".9-1")
  c$bin <- cut(c$val, breaks = my_xb, labels = my_xl)       # add bin column 
  
  return(c)
}
my_sim(4)
##   i       val   bin
## 1 1 0.4070053 .4-.5
## 2 2 0.1672384 .1-.2
## 3 3 0.4741705 .4-.5
## 4 4 0.1892686 .1-.2

Below, I’ve used my function to hold the 10,000 records in the variable my_data.

# couple of small vectors for plotting later
my_data <- my_sim(10000) # create the sim for this problem. 

Next, I’ll make a bar plot of the 10K results of the simulation:

my_plot <- ggplot(my_data, x = bin, y = val) +
  geom_bar(aes(bin), fill= "black", alpha = .7) 
my_plot

The last part asks for me to compare the function \(f(x)=2-2x\) with the values produced by sim and find a value for \(c\) such that \(cf(x) = T\). I found \(c\) by inspection. The small table below shows the results. Note that \(T\) comes from the simulation and is the % of occurrences of a given bin.

ggg <- my_data %>% select(val, bin) %>% 
  mutate_if(is.factor, as.character) %>% 
  group_by(bin) %>% 
  summarise(n = n()) %>% arrange(desc(n)) %>% 
  mutate(`T` = n/sum(n))

xs <- c(.1, .2, .3, .4, .5, .6, .7, .8, .9, 1) - .05
#length(xs)
combo <- cbind(ggg, data.frame(x_mid = xs, `fx` = (2-2*xs))) %>% 
  mutate(cfx = fx*(1/10))
Finding \(c\) by inspection
bin \(n\) \(T\) \(x_{midbin}\) \(f(x)\) \((1/10)f(x)\)
0-.1 1887 0.1887 0.05 1.9 0.19
.1-.2 1691 0.1691 0.15 1.7 0.17
.2-.3 1555 0.1555 0.25 1.5 0.15
.3-.4 1269 0.1269 0.35 1.3 0.13
.4-.5 1120 0.1120 0.45 1.1 0.11
.5-.6 909 0.0909 0.55 0.9 0.09
.6-.7 713 0.0713 0.65 0.7 0.07
.7-.8 463 0.0463 0.75 0.5 0.05
.8-.9 290 0.0290 0.85 0.3 0.03
.9-1 103 0.0103 0.95 0.1 0.01

As shown in the table above, I’ve set \(c=\frac{1}{10}\).

To show that \(\int_0^1 f(x)dx=1\), I’ll just integrate and solve:

\(\int_0^1 f(x)dx=F(1)-F(0) \implies\) \(\int_0^1 2-2xdx = 2(1)-(1)^2+c-(2(0)-(0)^2+c)=1\).

To find the prob between .2 and .5, I’d integrate or create random variables from the integrals to calculate similar to the above.