The following problems are taken from the Chapter 20 exercises of Introduction to Modern Statistics, First Edition, by Mine Çetinkaya-Rundel and Johanna Hardin (https://openintro-ims.netlify.app/inference-two-means.html).

The following is a modified version of Question 6 from the book.

6. Lizards running, bootstrap interval. We have data on top speeds (in m/sec) measured on a laboratory race track for two species of lizards: Western fence lizard (Sceloporus occidentalis) and Sagebrush lizard (Sceloporus graciosus).

  1. Construct side-by-side boxplots of top_speed separated by lizard species (common_name). Be sure to label your plot. The data are stored in the lizard_run data set in the openintro package.
lizard_run %>%
  ggplot(aes(y = common_name, x = top_speed)) +
  geom_boxplot(fill = "navy") +
  # Label both axes; top speeds are recorded in m/sec
  labs(x = "Top speed (m/sec)", y = "Lizard species", title = "Top speed by lizard species") +
  theme_bw()

  1. Construct side-by-side histograms of top_speed separated by lizard species (common_name). Use a binwidth of 0.25. Be sure to label your plot. The data are stored in the lizard_run data set in the openintro package.
lizard_run %>%
  # after_stat(density) replaces the deprecated ..density.. notation
  ggplot(aes(x = top_speed, y = after_stat(density))) +
  geom_histogram(col = "lightgray", fill = "navy", binwidth = 0.25) +
  geom_density() +
  facet_grid(. ~ common_name) +
  labs(x = "Top speed (m/sec)", y = "Density", title = "Top speed by lizard species") +
  theme_bw()

  1. Calculate summary statistics for lizard top_speed separated by lizard species (common_name).
favstats(lizard_run$top_speed~lizard_run$common_name)
##   lizard_run$common_name  min     Q1 median     Q3  max     mean        sd  n
## 1       Sagebrush lizard 1.08 1.3875  1.650 1.8300 2.23 1.612692 0.3241056 26
## 2   Western fence lizard 1.52 1.8675  2.195 2.6275 3.38 2.314545 0.5554983 22
##   missing
## 1       0
## 2       0

The bootstrap distribution below describes the variability of the difference in means captured from 1,000 bootstrap samples of the lizard data (Adolph 1987).

  1. Why is the bootstrap distribution centered at -.70?

ANSWER: A bootstrap distribution is centered approximately at the observed sample statistic. Here that statistic is the difference in sample means, Sagebrush minus Western fence: 1.613 - 2.314 ≈ -0.70.
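
The centering can be checked by simulation. The sketch below is not the code that produced the bootstrap distribution in this exercise; it is a minimal base R illustration, and the helper name `boot_diff_means`, the seed, and the small example vectors are all hypothetical. It resamples each group with replacement, records the difference in means each time, and compares the center of those differences with the observed difference.

```r
# Hypothetical helper (not from the original analysis): bootstrap the
# difference in sample means, mean(x) - mean(y), resampling each group
# independently and with replacement.
boot_diff_means <- function(x, y, reps = 1000) {
  replicate(reps, {
    mean(sample(x, replace = TRUE)) - mean(sample(y, replace = TRUE))
  })
}

set.seed(2022)  # arbitrary seed, used only for reproducibility

# Illustrative values only; these are NOT the lizard measurements
x <- c(1.1, 1.4, 1.6, 1.7, 1.9, 2.2)
y <- c(1.5, 1.9, 2.2, 2.6, 3.0, 3.4)

boot_diffs <- boot_diff_means(x, y)
obs_diff   <- mean(x) - mean(y)

# The bootstrap distribution centers close to the observed difference
c(observed = obs_diff, bootstrap_center = mean(boot_diffs))
```

To apply the same idea to the lizard data, assuming the openintro package is loaded, `x` and `y` could be set to `lizard_run$top_speed[lizard_run$common_name == "Sagebrush lizard"]` and `lizard_run$top_speed[lizard_run$common_name == "Western fence lizard"]`.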

  1. Use the bootstrap distribution to construct a 90% confidence interval for the difference in average top speed for Sagebrush and Western Fence lizards.

INTERVAL: (1.613 - 2.314) ± 1.678 × 0.131 = -0.701 ± 0.220, which gives (-0.92, -0.48).
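
The arithmetic behind this interval can be reproduced in a few lines of R. This is only a sketch: the numbers are copied from the answer above, and reading 1.678 as the 90% multiplier and 0.131 as the standard error of the difference is an assumption, since the write-up does not say where either value comes from.

```r
# Values copied from the answer above; the roles assigned to them are assumed
point_estimate <- 1.613 - 2.314   # Sagebrush mean minus Western fence mean
multiplier     <- 1.678           # assumed: critical value for 90% confidence
std_error      <- 0.131           # assumed: standard error of the difference

# Interval = point estimate +/- multiplier * standard error
margin <- multiplier * std_error
ci_90  <- point_estimate + c(-1, 1) * margin

round(ci_90, 2)
```

Both endpoints are negative, so the interval excludes 0. A percentile interval would instead take the 5th and 95th percentiles of the bootstrap differences directly, for example `quantile(boot_diffs, c(0.05, 0.95))` for a hypothetical vector `boot_diffs` of bootstrap differences.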

  1. Interpret the interval in context.

ANSWER: We are 90% confident that the true difference in mean top speed (Sagebrush minus Western fence) lies between -0.92 and -0.48 m/sec. In other words, Sagebrush lizards are, on average, between 0.48 and 0.92 m/sec slower than Western fence lizards.

  1. Is it appropriate to construct a confidence interval using the formula method? Explain your answer.

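ANSWER: No. Neither sample size is above 30 (n = 26 for Sagebrush lizards and n = 22 for Western fence lizards), so the large-sample condition for the formula method is not met.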

Date and time completed: Mon Nov 28 09:47:50 2022