Week 10 Discussion

Haiding Luo

2023-11-13

Part 1.

library(ggplot2)

# Grid of x values shared by every curve
x <- seq(-5, 5, length.out = 150)

# Standard normal density
data_normal <- data.frame(x = x, y = dnorm(x), distribution = 'Normal')

# t densities with increasing degrees of freedom
data_2 <- data.frame(x = x, y = dt(x, df = 2), distribution = 't (df=2)')
data_5 <- data.frame(x = x, y = dt(x, df = 5), distribution = 't (df=5)')
data_15 <- data.frame(x = x, y = dt(x, df = 15), distribution = 't (df=15)')
data_30 <- data.frame(x = x, y = dt(x, df = 30), distribution = 't (df=30)')
data_120 <- data.frame(x = x, y = dt(x, df = 120), distribution = 't (df=120)')

# Stack all curves into one long data frame for plotting
data_all <- rbind(data_normal, data_2, data_5, data_15, data_30, data_120)

ggplot(data_all, aes(x = x, y = y, color = distribution)) +
  geom_line() +
  theme_minimal() +
  labs(title = "Normal Distribution vs. t-Distributions",
       x = "Value",
       y = "Density")

Part 2.

library(ggplot2)
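set.seed(123)  # make the simulated sample reproducible

# Population mean and standard deviation
mu <- 108
sigma <- 7.2

# Simulate 1000 draws from Normal(mu, sigma)
data_values <- rnorm(n = 1000, mean = mu, sd = sigma)

# Standardize each value: subtract the mean, divide by the standard deviation
z_scores <- (data_values - mu) / sigma

# Stack raw values and z-scores into one long data frame for faceting
data_normal <- data.frame(Value = data_values, Type = 'Normal Distribution')
data_z <- data.frame(Value = z_scores, Type = 'Z Score Distribution')
data_all <- rbind(data_normal, data_z)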

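# Use the same number of bins in both panels: a fixed binwidth of 0.5 would
# give the raw values far more bins than the z-scores and distort the comparison
ggplot(data_all, aes(x = Value)) +
  geom_histogram(bins = 30, fill = "green", alpha = 0.7) +
  facet_wrap(~ Type, scales = "free") +
  theme_minimal() +
  labs(title = "Comparison of Normal Data and Z Score Distribution",
       x = "Value",
       y = "Count")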

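Based on the graph above, I think the two distributions have the same shape: converting to z-scores doesn't change the shape of the distribution. Standardizing is a linear transformation (subtract the mean, then divide by the standard deviation), so it only moves the center to 0 and rescales the spread to 1.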
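
The claim can also be checked numerically rather than only by eye. The following is a minimal sketch that re-creates the Part 2 sample and compares the raw values with their z-scores. The helper `skewness_of` is a hypothetical function written for this illustration, not part of any package. If the shape is unchanged, the correlation between the two vectors should be 1 and the skewness (a scale-free measure of shape) should match.

```r
# Re-create the simulated data from Part 2
set.seed(123)
mu <- 108
sigma <- 7.2
data_values <- rnorm(n = 1000, mean = mu, sd = sigma)
z_scores <- (data_values - mu) / sigma

# Hypothetical helper (illustration only): sample skewness,
# which does not depend on the location or scale of the data
skewness_of <- function(v) {
  centered <- v - mean(v)
  mean(centered^3) / mean(centered^2)^1.5
}

# A linear transformation with positive slope keeps every point's
# relative position, so the correlation should be 1
cor_raw_z <- cor(data_values, z_scores)

# The skewness should agree up to floating-point error
skew_raw <- skewness_of(data_values)
skew_z <- skewness_of(z_scores)

print(c(correlation = cor_raw_z, skew_raw = skew_raw, skew_z = skew_z))
```

Because the z-scores here use the population values `mu` and `sigma` rather than the sample mean and standard deviation, their sample mean and standard deviation will be close to, but not exactly, 0 and 1.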

Part 3.

The p-value in statistics measures how compatible the observed data are with the null hypothesis. It is a probability, ranging from 0 to 1: the probability of obtaining results at least as extreme as those observed, assuming the null hypothesis is true. A small p-value means the observed results would be unlikely if the null hypothesis were true, which casts doubt on that hypothesis. Conversely, a large p-value means the results are consistent with the null hypothesis, so there is insufficient evidence to reject it. Comparing the p-value with a chosen significance level (commonly 0.05) determines whether a result is called statistically significant.
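
As a small illustration of how a p-value is computed, here is a sketch of a two-sided z-test. It reuses the mean and standard deviation from Part 2 as the null hypothesis. The sample size `n` and the observed `sample_mean` are made-up values chosen only for this example, not results from real data.

```r
# Null hypothesis: the population mean is 108, with known sigma = 7.2
mu0 <- 108
sigma <- 7.2

# Assumed values, for illustration only
n <- 36
sample_mean <- 110.5

# Test statistic: how many standard errors the sample mean is from mu0
z_stat <- (sample_mean - mu0) / (sigma / sqrt(n))

# Two-sided p-value: probability of a z statistic at least this far
# from 0 in either direction, assuming the null hypothesis is true
p_value <- 2 * pnorm(-abs(z_stat))

print(c(z = z_stat, p_value = p_value))
```

With these assumed numbers the z statistic is a little above 2, so the p-value falls below 0.05 and the null hypothesis would be rejected at that significance level.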