DAT 301 - HW #3

03/16/2025

library(knitr)

What is Interval Estimation?

Interval estimation is a statistical technique that estimates a population parameter by computing an interval expected to contain the parameter at a specified level of confidence. This interval is known as a confidence interval. The width of the interval reflects the precision of the estimate: narrower intervals indicate more precise estimates, while wider intervals indicate greater uncertainty. A key factor in determining the width is the margin of error, the half-width of the interval, which bounds the difference between the sample estimate and the true population parameter at the chosen confidence level. The margin of error grows with the variability in the data and with the confidence level, and shrinks as the sample size increases. In the following slides, I use R's built-in airquality dataset to provide examples of interval estimation.
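To make the link between confidence level and interval width concrete, here is a minimal sketch that computes t-based intervals for Ozone at three confidence levels. It assumes the complete-case subset of airquality (rows with any missing value dropped); the levels 0.90 and 0.99 and the names conf_levels and widths are illustrative choices, not part of the original analysis.

```r
data(airquality)
air_quality <- na.omit(airquality)  # assumption: complete cases only

# Illustrative confidence levels; only 0.95 is used elsewhere in the slides
conf_levels <- c(0.90, 0.95, 0.99)

# Width of the t-based interval for mean Ozone at each level
widths <- sapply(conf_levels, function(cl) {
  ci <- t.test(air_quality$Ozone, conf.level = cl)$conf.int
  ci[2] - ci[1]
})

print(data.frame(conf_level = conf_levels, width = round(widths, 3)))
```

The widths should increase with the confidence level: asking for more confidence from the same data costs precision.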

3D Plotly Plot: Confidence Intervals

##  95% Confidence Interval for Ozone: [35.840, 48.358] 
##  95% Confidence Interval for Solar.R: [167.656, 201.948] 
##  95% Confidence Interval for Wind: [9.270, 10.609]

ggplot2: Ozone Levels with 95% CI

Blue line represents the mean of the data.
Red lines represent the bounds of the 95% CI.

ggplot2: Solar Radiation with 95% CI

Blue line represents the mean of the data.
Red lines represent the bounds of the 95% CI.

ggplot2: Wind Speed with 95% CI

Blue line represents the mean of the data.
Red lines represent the bounds of the 95% CI.
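The code behind the three ggplot2 slides is not shown, so the following is only a sketch of how such a figure could be built. It assumes a histogram with vertical reference lines; the helper name plot_ci, the bin count, and the dashed line style are hypothetical choices rather than the original implementation.

```r
library(ggplot2)

data(airquality)
air_quality <- na.omit(airquality)  # assumption: complete cases only

# Hypothetical helper: histogram of one column with the sample mean (blue)
# and the bounds of the t-based confidence interval (red) marked
plot_ci <- function(data, column, conf_level = 0.95) {
  ci <- t.test(data[[column]], conf.level = conf_level)$conf.int
  ggplot(data, aes(x = .data[[column]])) +
    geom_histogram(bins = 20, fill = "grey80", color = "white") +
    geom_vline(xintercept = mean(data[[column]]), color = "blue") +
    geom_vline(xintercept = c(ci[1], ci[2]), color = "red",
               linetype = "dashed") +
    labs(x = column, y = "Count",
         title = sprintf("%s with %.0f%% CI", column, 100 * conf_level))
}

p <- plot_ci(air_quality, "Ozone")
print(p)
```

Calling plot_ci(air_quality, "Solar.R") or plot_ci(air_quality, "Wind") would produce the corresponding figures for the other two variables.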

Math using Latex: Sample Mean Formula

\[ \textbf{Formula} \\ \bar{x} = \frac{\sum_{i=1}^{n}x_i}{n} \]

\[ \textbf{Values} \\ \text {$x_i$ = $i$-th observation} \\ \text {n = sample size} \]

Math using Latex: Margin of Error Formula

\[ \textbf{Formula} \\ \text{ME} = t_{\alpha/2, n-1} \times \frac{s}{\sqrt{n}} \quad \]

\[ \textbf{Values} \\ \text {t$_{\alpha/2, n-1}$ = t-critical value} \\ \text {s = sample standard deviation} \\ \text {n = sample size} \]
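As a worked instance of the margin of error formula, the sketch below evaluates each term for Ozone. It assumes the complete-case subset of airquality and a 95% confidence level (alpha = 0.05); the variable names are illustrative.

```r
data(airquality)
ozone <- na.omit(airquality)$Ozone  # assumption: complete cases only

n <- length(ozone)                       # sample size
s <- sd(ozone)                           # sample standard deviation
alpha <- 0.05                            # 1 - confidence level
t_crit <- qt(1 - alpha / 2, df = n - 1)  # t-critical value

me <- t_crit * s / sqrt(n)               # margin of error

cat(sprintf("n = %d, s = %.3f, t = %.3f, ME = %.3f\n", n, s, t_crit, me))
```

The result should equal half the width of the Ozone interval reported in these slides, (48.358 - 35.840) / 2 ≈ 6.259.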

Math using Latex: CI Formula

\[ \textbf{Formula} \\ \text{CI} = (\bar{x} - \text{ME}, \bar{x} + \text{ME}) \]

\[ \textbf{Values} \\ \text {$\bar{x}$ = sample mean} \\ \text {ME = margin of error} \]
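The three formulas can be combined into one small function and checked against t.test(). This is a sketch under the same assumptions as the rest of the slides (complete cases, 95% level); mean_ci is a hypothetical helper name.

```r
data(airquality)
air_quality <- na.omit(airquality)  # assumption: complete cases only

# Hypothetical helper: t-based confidence interval for the mean of x
mean_ci <- function(x, conf_level = 0.95) {
  n <- length(x)
  me <- qt(1 - (1 - conf_level) / 2, df = n - 1) * sd(x) / sqrt(n)
  c(lower = mean(x) - me, upper = mean(x) + me)
}

for (col in c("Ozone", "Solar.R", "Wind")) {
  manual <- mean_ci(air_quality[[col]])
  builtin <- t.test(air_quality[[col]], conf.level = 0.95)$conf.int
  # The manual interval should agree with t.test() up to floating-point error
  stopifnot(isTRUE(all.equal(unname(manual), as.numeric(builtin))))
  cat(sprintf("%s: [%.3f, %.3f]\n", col, manual["lower"], manual["upper"]))
}
```

Agreement with t.test() is expected because a one-sample t.test() builds its interval from the same mean, standard error, and t-critical value.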

R Code for 3D Plotly Plot

library(dplyr)
library(plotly)
library(ggplot2)

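data(airquality)
# na.omit() drops every row with a missing value in any column, so all
# three intervals below are computed on the same 111 complete cases
air_quality <- na.omit(airquality)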

ozone_c <- t.test(air_quality$Ozone, conf.level=0.95)
solar_c <- t.test(air_quality$Solar.R, conf.level=0.95)
wind_c <- t.test(air_quality$Wind, conf.level=0.95)

cat(
  sprintf(" 95%% Confidence Interval for Ozone: [%.3f, %.3f] \n",
    ozone_c$conf.int[1], ozone_c$conf.int[2]),
  sprintf("95%% Confidence Interval for Solar.R: [%.3f, %.3f] \n",
    solar_c$conf.int[1], solar_c$conf.int[2]),
  sprintf("95%% Confidence Interval for Wind: [%.3f, %.3f]",
    wind_c$conf.int[1], wind_c$conf.int[2])
)

##  95% Confidence Interval for Ozone: [35.840, 48.358] 
##  95% Confidence Interval for Solar.R: [167.656, 201.948] 
##  95% Confidence Interval for Wind: [9.270, 10.609]

R Code for 3D Plotly Plot (continued)

# Use the pan tool to move the plot around and the rotate
# tools to view it from different angles.
# Marker color encodes Temp on the Viridis scale.
plot_ly(data = air_quality, x = ~Ozone, y = ~Solar.R, z = ~Wind,
        type = "scatter3d", mode = "markers",
        marker = list(color = ~Temp, colorscale = "Viridis", size = 5))