Homework 3

2024-10-20

Interval Estimation

Interval estimation uses a range of values, rather than a single number, to estimate a population parameter. Unlike point estimation, it provides a range that is likely to contain the true population parameter at a specified level of confidence.

Key Terms

Point Estimate: A single value used to estimate a population parameter (e.g., the sample mean as an estimate of the population mean).
Confidence Interval (CI): A range of values, computed from sample data, that is likely to contain the population parameter at a stated level of confidence.
Confidence Level: The long-run proportion of confidence intervals, each built by the same procedure from a new random sample, that would contain the true population parameter (e.g., 95%).
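The repeated-sampling meaning of the confidence level can be checked with a small simulation. The sketch below is illustrative only: the population mean, standard deviation, seed, and number of repetitions are assumptions chosen for the example, not values from this homework.

```r
# Hypothetical simulation: how often does a 95% t-interval contain the true mean?
set.seed(1)           # arbitrary seed, for reproducibility only
true_mean <- 65       # assumed population mean (illustrative)
true_sd   <- 10       # assumed population standard deviation (illustrative)
n         <- 30       # sample size per repetition
n_reps    <- 5000     # number of repeated samples

covered <- replicate(n_reps, {
  x  <- rnorm(n, mean = true_mean, sd = true_sd)   # draw a new sample
  ci <- t.test(x, conf.level = 0.95)$conf.int      # 95% CI from that sample
  ci[1] <= true_mean && true_mean <= ci[2]         # did the CI capture the truth?
})

coverage <- mean(covered)   # proportion of intervals containing true_mean
print(coverage)
```

With many repetitions, the printed proportion should be close to the nominal 0.95, which is what the confidence level describes.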

Math Equation: Confidence Interval Formula

The confidence interval for the population mean (when the population standard deviation is unknown) is given by:

\[ CI = \bar{x} \pm t_{\alpha/2, n-1} \cdot \frac{s}{\sqrt{n}} \]

Where:

- \(\bar{x}\) is the sample mean.
- \(t_{\alpha/2, n-1}\) is the critical value from the t-distribution.
- \(s\) is the sample standard deviation.
- \(n\) is the sample size.
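The formula can be applied term by term in R. This is a minimal sketch: the helper name `t_interval` and the eight values in `x` are made up for illustration, and the result is compared against the interval from R's built-in `t.test()`.

```r
# Manual t-interval for a mean, following CI = x_bar +/- t * s / sqrt(n).
# `t_interval` is a hypothetical helper, not part of base R.
t_interval <- function(x, conf_level = 0.95) {
  n      <- length(x)                       # sample size
  x_bar  <- mean(x)                         # sample mean
  s      <- sd(x)                           # sample standard deviation
  alpha  <- 1 - conf_level
  t_crit <- qt(1 - alpha / 2, df = n - 1)   # critical value t_{alpha/2, n-1}
  margin <- t_crit * s / sqrt(n)            # margin of error
  c(lower = x_bar - margin, upper = x_bar + margin)
}

x <- c(61.2, 70.5, 58.9, 66.4, 72.1, 63.0, 59.8, 68.7)  # made-up values

ci_manual  <- t_interval(x)
ci_builtin <- as.numeric(t.test(x, conf.level = 0.95)$conf.int)

print(ci_manual)
print(ci_builtin)
```

Both printed intervals should agree, since `t.test()` uses the same formula for a one-sample interval.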

Math Equation: P-value Calculation

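For a one-tailed (right-tailed) test, the P-value is the probability, under the null hypothesis, of a test statistic greater than the one actually observed:

\[ P = P(t > t_{observed}) \]

Where \(t_{observed}\) is the observed value of the test statistic and \(t\) follows a t-distribution with \(n-1\) degrees of freedom under the null hypothesis.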
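In R, this upper-tail probability comes from `pt()` with `lower.tail = FALSE`. The sketch below uses a hypothetical observed statistic and sample size. The two-sided P-value is included only for comparison.

```r
# Right-tailed P-value P(t > t_observed) for a t-distribution with n - 1 df.
t_observed <- 2.1   # hypothetical observed test statistic
n          <- 30    # hypothetical sample size

p_right <- pt(t_observed, df = n - 1, lower.tail = FALSE)

# Two-sided P-value for comparison: both tails beyond |t_observed|
p_two_sided <- 2 * pt(abs(t_observed), df = n - 1, lower.tail = FALSE)

print(p_right)
print(p_two_sided)
```

For a positive statistic, the two-sided P-value is exactly twice the right-tailed one, because the t-distribution is symmetric about zero.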

Example: Confidence Interval Calculation

Let’s compute a 95% confidence interval for the population mean using a one-sample t-test on a sample of 30 observations (df = 29).

## 
##  One Sample t-test
## 
## data:  data
## t = 36.027, df = 29, p-value < 2.2e-16
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  60.86573 68.19219
## sample estimates:
## mean of x 
##  64.52896
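The reported interval can be checked against the confidence interval formula using only the numbers printed above. This sketch assumes the test was run against a null mean of 0, as the output states, so that the t statistic equals the sample mean divided by its standard error. The variable names are illustrative.

```r
# Values read off the t-test output above
x_bar  <- 64.52896   # sample mean
t_stat <- 36.027     # reported t statistic (rounded in the printout)
df     <- 29         # degrees of freedom
n      <- df + 1     # sample size

se     <- x_bar / t_stat          # standard error, since t = (x_bar - 0) / se
s      <- se * sqrt(n)            # implied sample standard deviation
t_crit <- qt(0.975, df = df)      # critical value for a 95% interval
margin <- t_crit * se             # margin of error

ci <- c(lower = x_bar - margin, upper = x_bar + margin)
print(round(s, 2))
print(round(ci, 3))
```

The recomputed bounds should match the reported 60.86573 and 68.19219 up to the rounding of the printed t statistic.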

Plotly Plot: Confidence Interval Visualization

ggplot Example 1: Distribution of Sample Data

We use ggplot2 to visualize the distribution of the sample data.

ggplot Example 2: Confidence Interval for Mean

This plot shows the sample mean together with the 95% confidence interval for the population mean.
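The code for this second plot is not shown in the document, so the following is only one possible sketch of such a plot, not the original. It assumes simulated stand-in data (`sim_data`) and draws the sample mean as a point with an error bar spanning the interval returned by `t.test()`.

```r
library(ggplot2)

# Stand-in sample (hypothetical); replace with the real sample vector
set.seed(42)
sim_data <- rnorm(30, mean = 65, sd = 10)

# 95% confidence interval for the mean from a one-sample t-test
tt <- t.test(sim_data, conf.level = 0.95)
ci_df <- data.frame(
  label = "Sample",
  mean  = unname(tt$estimate),
  lower = tt$conf.int[1],
  upper = tt$conf.int[2]
)

# Point for the sample mean, error bar for the confidence interval
p <- ggplot(ci_df, aes(x = label, y = mean)) +
  geom_errorbar(aes(ymin = lower, ymax = upper), width = 0.2, color = "brown") +
  geom_point(size = 3) +
  labs(x = NULL, y = "Value", title = "95% Confidence Interval for the Mean")

print(p)
```

An error bar is a common choice for a single interval. `geom_pointrange()` would give a similar picture in a single layer.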

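R code for ggplot Example 1

library(ggplot2)

# `data` is the numeric sample vector used in the t-test above
ggplot(data = data.frame(x = data), aes(x = x)) +
  # Histogram scaled to density so it shares a y-axis with the density curve
  geom_histogram(aes(y = after_stat(density)), binwidth = 2, fill = "brown", alpha = 0.5) +
  # Kernel density estimate (linewidth replaces size for lines in ggplot2 >= 3.4.0)
  geom_density(color = "grey", linewidth = 1.5) +
  ggtitle("Distribution of Sample Data")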

Conclusion

Interval estimation provides a range of values that likely contains the true population parameter, giving a measure of reliability to the estimate. Confidence intervals are widely used in research and data analysis to quantify uncertainty.