Haiding Luo
2023-11-13
library(ggplot2)
x <- seq(-5, 5, length.out = 150)
data_normal <- data.frame(x = x, y = dnorm(x), distribution = 'Normal')
data_2 <- data.frame(x = x, y = dt(x, df = 2), distribution = 't (df=2)')
data_5 <- data.frame(x = x, y = dt(x, df = 5), distribution = 't (df=5)')
data_15 <- data.frame(x = x, y = dt(x, df = 15), distribution = 't (df=15)')
data_30 <- data.frame(x = x, y = dt(x, df = 30), distribution = 't (df=30)')
data_120 <- data.frame(x = x, y = dt(x, df = 120), distribution = 't (df=120)')
data_all <- rbind(data_normal, data_2, data_5, data_15, data_30, data_120)
ggplot(data_all, aes(x = x, y = y, color = distribution)) +
  geom_line() +
  theme_minimal() +
  labs(title = "Normal Distribution vs. t-Distributions",
       x = "Value",
       y = "Density")
library(ggplot2)
set.seed(123)
mu <- 108
sigma <- 7.2
data_values <- rnorm(n = 1000, mean = mu, sd = sigma)
z_scores <- (data_values - mu) / sigma
data_normal <- data.frame(Value = data_values, Type = 'Normal Distribution')
data_z <- data.frame(Value = z_scores, Type = 'Z Score Distribution')
data_all <- rbind(data_normal, data_z)
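ggplot(data_all, aes(x = Value)) +
  # Use the same number of bins in each panel: a fixed binwidth of 0.5 is
  # fine-grained for the raw values (sd = 7.2) but coarse for the z-scores (sd = 1),
  # which makes the two shapes harder to compare.
  geom_histogram(bins = 30, fill = "green", alpha = 0.7) +
  facet_wrap(~ Type, scales = "free") +
  theme_minimal() +
  # geom_histogram() plots counts by default, so the y-axis is labelled "Count".
  labs(title = "Comparison of Normal Data and Z Score Distribution",
       x = "Value",
       y = "Count")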
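Based on the graph above, I think the two distributions have the same shape. Computing a Z-score only subtracts the mean and divides by the standard deviation, which is a linear transformation: it moves the center to 0 and rescales the spread to 1, but it doesn’t change the shape of the distribution.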
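As a rough numerical check of this claim, the sketch below regenerates the same simulated data and compares two shape-related quantities before and after standardizing. The helper `sample_skewness` is a hypothetical function written for this illustration, not part of base R, and skewness is just one convenient summary of shape.

```r
# Minimal sketch: standardizing is a linear transformation, so shape is preserved.
set.seed(123)
mu <- 108
sigma <- 7.2
data_values <- rnorm(n = 1000, mean = mu, sd = sigma)
z_scores <- (data_values - mu) / sigma

# Hypothetical helper (assumption, not from the original): sample skewness,
# i.e. the third central moment divided by the second central moment to the 3/2.
sample_skewness <- function(v) {
  centered <- v - mean(v)
  mean(centered^3) / mean(centered^2)^1.5
}

# Raw values and z-scores are perfectly linearly related.
cor_raw_z <- cor(data_values, z_scores)

# Skewness is unchanged by shifting and by rescaling with a positive constant.
skew_raw <- sample_skewness(data_values)
skew_z <- sample_skewness(z_scores)

print(c(cor = cor_raw_z, skew_raw = skew_raw, skew_z = skew_z))
```

If the correlation is 1 and the two skewness values agree up to floating-point error, the z-scores are the original values on a different axis, which is what the two histogram panels suggest.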
The p-value measures how compatible the observed data are with a hypothesis, usually the null hypothesis. It is a probability between 0 and 1: the probability, assuming the null hypothesis is true, of obtaining results at least as extreme as those actually observed. A small p-value means such results would be unlikely if the null hypothesis were true, which casts doubt on that hypothesis. Conversely, a large p-value means the results are consistent with the null hypothesis, so there is insufficient evidence to reject it; it does not prove that the hypothesis is true. Comparing the p-value with a chosen significance level (commonly 0.05) is how statistical significance is decided in hypothesis testing.
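To make the definition concrete, here is a minimal sketch of a two-sided one-sample t-test. The sample of 30 simulated values and the null value `mu_0 = 105` are assumptions chosen only for illustration. The p-value is computed by hand from the t-distribution and then compared with the one reported by `t.test()`.

```r
# Minimal sketch: a two-sided p-value for H0: population mean equals mu_0.
set.seed(123)
# Assumed example data: 30 draws from a normal distribution (mean 108, sd 7.2).
sample_values <- rnorm(n = 30, mean = 108, sd = 7.2)
mu_0 <- 105  # hypothetical null value, chosen for illustration

n <- length(sample_values)
# t statistic: distance of the sample mean from mu_0 in standard-error units.
t_stat <- (mean(sample_values) - mu_0) / (sd(sample_values) / sqrt(n))

# Probability, under H0, of a t statistic at least as extreme as the one observed.
p_manual <- 2 * pt(-abs(t_stat), df = n - 1)

# The same test using the built-in function.
p_builtin <- t.test(sample_values, mu = mu_0)$p.value

print(c(t = t_stat, p_manual = p_manual, p_builtin = p_builtin))
```

The two p-values should agree, since `t.test()` uses the same t-distribution with `n - 1` degrees of freedom for a one-sample test.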