For uniform \((0,1)\) random variables \(U_1, U_2, \ldots\), define \(N=\min\left\{n:\sum_{i=1}^{n}U_i>1\right\}\).
That is, \(N\) is the number of random numbers that must be summed to exceed 1.
Estimate \(E(N)\), with standard errors, by generating 1,000, 2,000, 5,000, 10,000, and 100,000 values of \(N\), and check whether there are any patterns in the estimate and its standard error.
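A minimal R sketch of this simulation, assuming the \(U_i\) are independent draws from `runif`. The helper names `draw_N` and `estimate_EN` and the seed are illustrative choices, not part of the exercise.

```r
# Draw one value of N: count uniforms until the running sum exceeds 1.
draw_N <- function() {
  total <- 0
  n <- 0
  while (total <= 1) {
    total <- total + runif(1)
    n <- n + 1
  }
  n
}

# Estimate E(N) and its standard error from m simulated values of N.
estimate_EN <- function(m) {
  x <- replicate(m, draw_N())
  c(m = m, mean = mean(x), se = sd(x) / sqrt(m))
}

set.seed(1)  # arbitrary seed, for reproducibility only
sizes <- c(1000, 2000, 5000, 10000, 100000)
results <- t(sapply(sizes, estimate_EN))  # one row per sample size
print(results)
```

Because the standard error is `sd(x) / sqrt(m)`, it should shrink roughly in proportion to \(1/\sqrt{m}\) as the number of simulated values grows. That is one pattern to look for in the output.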
Compute the density function of \(N\), \(E(N)\), and \(Var(N)\).
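One possible route to these quantities is sketched here, assuming the \(U_i\) are independent; it is not necessarily the intended solution. Since \(N\) is integer-valued, its "density" is a probability mass function. The event \(\{N>n\}\) is the same as \(\{U_1+\cdots+U_n\le 1\}\), whose probability is the volume of the standard simplex, \(1/n!\).

\[
\begin{aligned}
P(N>n) &= \frac{1}{n!}, \qquad n=0,1,2,\ldots \\
P(N=n) &= \frac{1}{(n-1)!}-\frac{1}{n!} = \frac{n-1}{n!}, \qquad n=2,3,\ldots \\
E(N) &= \sum_{n=0}^{\infty}P(N>n) = \sum_{n=0}^{\infty}\frac{1}{n!} = e \\
E[N(N-1)] &= \sum_{n=2}^{\infty}\frac{n(n-1)(n-1)}{n!} = \sum_{n=2}^{\infty}\frac{n-1}{(n-2)!} = \sum_{k=0}^{\infty}\frac{k+1}{k!} = 2e \\
Var(N) &= E[N(N-1)] + E(N) - E(N)^2 = 3e - e^2 \approx 0.766
\end{aligned}
\]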
You think you know baseball? Do home run hitters have the highest batting averages? Think about this hypothesis, then analyze the following experience based on a half season with the Braves:
Batting Average: (.252, .305, .299, .303, .285, .191, .283)
Home Runs: (12, 6, 4, 15, 2, 2, 16)
Batting Average: (.272, .310, .266, .215, .211, .244, .320)
Home Runs: (6, 8, 10, 0, 3, 6, 7)
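One way to examine the hypothesis is a permutation test on the correlation between the two variables. The exercise does not prescribe this technique. The sketch assumes that each batting-average row pairs position by position with the home-run row that follows it, which gives 14 players. The names `ba`, `hr`, `obs`, and `perm` are illustrative.

```r
# Data as listed above; the player-by-player pairing is an assumption.
ba <- c(.252, .305, .299, .303, .285, .191, .283,
        .272, .310, .266, .215, .211, .244, .320)
hr <- c(12, 6, 4, 15, 2, 2, 16,
        6, 8, 10, 0, 3, 6, 7)

# Observed Pearson correlation between batting average and home runs.
obs <- cor(ba, hr)

# Permutation null: shuffle home runs to break any association with average.
set.seed(1)  # arbitrary seed, for reproducibility only
perm <- replicate(5000, cor(ba, sample(hr)))

# One-sided p-value for "more home runs go with a higher average".
p_value <- mean(perm >= obs)
print(c(correlation = obs, p_value = p_value))
```

With only 14 players the test has limited power, so a large p-value would not by itself show that the two quantities are unrelated.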
Similar to what Efron did in the law school data example, run the bootstrap simulation for 50, 100, …, 10,000 replications. Instead of using only the original 15 observations, we want to know whether the number of observations plays an important role. Randomly select 10, 15, 20, and 25 observations, and then check whether the bootstrap variance converges as the number of replications increases. (Note: You also need to compare your results with those of the population.)
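The experiment could be organized as in the following R sketch. It assumes the statistic is the sample correlation, as in Efron's example. The population `pop` is a synthetic, hypothetical stand-in of 82 pairs and should be replaced by the actual law school population. The grid `B_values` is illustrative, because the exercise leaves the intermediate replication counts unspecified.

```r
# Bootstrap variance of the sample correlation for a two-column data frame.
boot_var_cor <- function(dat, B) {
  x <- dat[[1]]
  y <- dat[[2]]
  n <- length(x)
  reps <- replicate(B, {
    idx <- sample.int(n, n, replace = TRUE)  # resample rows with replacement
    cor(x[idx], y[idx])
  })
  var(reps, na.rm = TRUE)  # na.rm guards against a degenerate resample
}

set.seed(1)  # arbitrary seed, for reproducibility only
# Hypothetical stand-in population (NOT the real law school data).
pop <- data.frame(x = rnorm(82))
pop$y <- 0.8 * pop$x + rnorm(82, sd = 0.6)

sizes <- c(10, 15, 20, 25)
B_values <- c(50, 100, 500, 1000, 5000, 10000)  # assumed grid

# Rows: number of replications B; columns: number of observations n.
out <- sapply(sizes, function(n) {
  samp <- pop[sample.int(nrow(pop), n), ]
  sapply(B_values, function(B) boot_var_cor(samp, B))
})
dimnames(out) <- list(B = B_values, n = sizes)
print(round(out, 4))

# Population benchmark: variance of the correlation over repeated
# samples of size n drawn without replacement from the population.
pop_var <- sapply(sizes, function(n) {
  var(replicate(2000, {
    i <- sample.int(nrow(pop), n)
    cor(pop$x[i], pop$y[i])
  }))
})
print(round(setNames(pop_var, sizes), 4))
```

Reading down a column of `out` shows whether the bootstrap variance settles as B grows. Comparing a column's final value with the matching entry of `pop_var` gives the comparison with the population.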
The block bootstrap can be used for prediction with dependent data. Use the built-in data “sunspot.year” in R, which can be modeled as an AR(2) process, and compare the predictions from the block bootstrap with those from the AR(2) model. As a check, you can hold out the final 10 observations as “testing” data.
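The exercise does not say how the block bootstrap should produce forecasts, so the R sketch below uses one simple scheme as an assumption. It resamples blocks of 10 consecutive first differences from the training series, cumulates each block from the last training value, and averages the resulting paths. The AR(2) forecast uses `arima` and `predict` from base R. The block length, the number of resamples `B`, and the RMSE comparison are illustrative choices.

```r
y <- as.numeric(sunspot.year)  # built-in yearly sunspot series
h <- 10                        # hold out the final 10 observations
n <- length(y)
train <- y[1:(n - h)]
test  <- y[(n - h + 1):n]

# AR(2) forecast of the held-out observations.
fit <- arima(train, order = c(2, 0, 0))
pred_ar <- as.numeric(predict(fit, n.ahead = h)$pred)

# Block bootstrap forecast (assumed scheme, see the text above):
# draw blocks of h consecutive differences and cumulate them
# starting from the last training value.
set.seed(1)  # arbitrary seed, for reproducibility only
d <- diff(train)
B <- 2000
starts <- sample.int(length(d) - h + 1, B, replace = TRUE)
last_value <- train[length(train)]
paths <- sapply(starts, function(s) last_value + cumsum(d[s:(s + h - 1)]))
pred_bb <- rowMeans(paths)  # point forecast: average over the B paths

# Compare both forecasts with the testing data.
rmse <- function(p) sqrt(mean((test - p)^2))
print(c(AR2 = rmse(pred_ar), block = rmse(pred_bb)))
```

Quantiles of the rows of `paths` could also serve as rough prediction intervals. Other block schemes, such as overlapping blocks of the series itself, may behave quite differently.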
If \(d\) is the minimum distance between any pair of \(n\) points uniformly distributed on the unit square, then \(n(n-1)d^2\sim Exp(\frac{2}{\pi})\), that is, approximately exponential with mean \(2/\pi\), provided that \(n\) is sufficiently large. Use R to check this result: First, write a function to produce \(n\) points uniformly distributed on the unit square. Then, write a function to calculate the smallest distance between any pair of the \(n\) points. Change the value of \(n\), perform the simulation, and comment on what you find.
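The two functions and the check could be sketched in R as follows. The names `gen_points`, `min_dist`, and `sim_stat`, the values of \(n\), and the number of repetitions are illustrative. The Kolmogorov-Smirnov test against an exponential distribution with rate \(\pi/2\) (mean \(2/\pi\)) is one possible way to compare the simulated values with the claimed limit, not the only one.

```r
# n points uniformly distributed on the unit square, one point per row.
gen_points <- function(n) matrix(runif(2 * n), ncol = 2)

# Smallest Euclidean distance between any pair of points.
min_dist <- function(pts) min(dist(pts))

# Simulate the statistic n(n-1)d^2 for `reps` independent point sets.
sim_stat <- function(n, reps = 2000) {
  replicate(reps, n * (n - 1) * min_dist(gen_points(n))^2)
}

set.seed(1)  # arbitrary seed, for reproducibility only
for (n in c(10, 50, 200)) {
  s <- sim_stat(n)
  ks <- ks.test(s, "pexp", rate = pi / 2)  # exponential with mean 2/pi
  cat(sprintf("n = %3d  mean = %.3f (theory %.3f)  KS p-value = %.3f\n",
              n, mean(s), 2 / pi, ks$p.value))
}
```

A histogram of `s` with the exponential density overlaid, or a Q-Q plot against `qexp`, would give a visual check alongside the test.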