Normal Distribution
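# Draw 1000 values from a normal distribution with mean 100 and SD 15
norm_data <- rnorm(1000, mean = 100, sd = 15)
# Density-scaled histogram so the normal curve can be overlaid on the same axis
hist_1 <- hist(norm_data, prob = TRUE, main = "Histogram", xlab = "Value", ylab = "Density")
data_mean <- mean(norm_data)
data_std <- sd(norm_data)
# Normal density evaluated across the observed range, using the sample estimates
x_values_curve <- seq(min(norm_data), max(norm_data), length.out = 1000)
y_values_curve <- dnorm(x_values_curve, mean = data_mean, sd = data_std)
lines(x_values_curve, y_values_curve, col = "red", lwd = 2)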
3. Compute and report the sample mean and standard deviation. Compare
them to population parameters.
print(data_mean)
## [1] 100.2271
print(data_std)
## [1] 15.10953
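The population mean is 100 and the population standard deviation is 15. In the output printed above, the sample mean is 100.2271 and the sample standard deviation is 15.10953. Because no seed is set, these values change slightly each time the code is run. The sample statistics are not exactly equal to the population parameters, but they are very close: both differ by less than 1%. This is expected, because with a sample of 1000 the sample statistics tend to fall close to the parameters of the population they were drawn from (the law of large numbers). The Central Limit Theorem makes a related but different statement: as the sample size gets larger, the distribution of the sample mean gets closer to a normal distribution.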
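To judge how close "close" should be, the comparison can be set against the standard error of the mean. The following is a minimal sketch, not the original analysis: the seed value and the names `sample_data`, `se_mean`, `mean_diff`, and `sd_diff` are assumptions introduced here, so its numbers will differ from the output printed above.

```r
# Assumption: the seed is arbitrary and only makes this sketch reproducible.
set.seed(42)
n <- 1000
pop_mean <- 100
pop_sd <- 15
sample_data <- rnorm(n, mean = pop_mean, sd = pop_sd)

# Standard error of the mean: the typical distance between a sample mean and 100
se_mean <- pop_sd / sqrt(n)

# Differences between the sample estimates and the population parameters
mean_diff <- mean(sample_data) - pop_mean
sd_diff <- sd(sample_data) - pop_sd

cat("Standard error of the mean:", round(se_mean, 4), "\n")
cat("Sample mean - population mean:", round(mean_diff, 4), "\n")
cat("Sample SD - population SD:", round(sd_diff, 4), "\n")
```

For n = 1000 and a population SD of 15 the standard error is about 0.47, so the difference of 0.2271 between the printed sample mean and 100 is well within one standard error.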
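# P(X > 120), using hardcoded sample estimates (mean 100.431, SD 15.02863) rather than the population parameters
percent_1 <- 1 - pnorm(120, mean = 100.431, sd = 15.02863)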
print(percent_1)
## [1] 0.09643859
# P(90 < X < 110), using the population parameters (mean 100, SD 15)
percent_2 <- pnorm(110, mean = 100, sd = 15) - pnorm(90, mean = 100, sd = 15)
print(percent_2)
## [1] 0.4950149
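One way to sanity-check probabilities like these is to compare them with the proportion of simulated values that actually fall in each range. This is a sketch under stated assumptions: it regenerates its own data with an arbitrary seed, uses the population parameters for both probabilities, and introduces the names `sample_data`, `p_above_120`, `p_90_to_110`, `prop_above_120`, and `prop_90_to_110`, none of which appear in the original.

```r
# Assumption: the seed is arbitrary and only makes this sketch reproducible.
set.seed(42)
sample_data <- rnorm(1000, mean = 100, sd = 15)

# Theoretical probabilities under a normal distribution with mean 100 and SD 15
p_above_120 <- pnorm(120, mean = 100, sd = 15, lower.tail = FALSE)
p_90_to_110 <- pnorm(110, mean = 100, sd = 15) - pnorm(90, mean = 100, sd = 15)

# Empirical proportions: the share of simulated values in each range
prop_above_120 <- mean(sample_data > 120)
prop_90_to_110 <- mean(sample_data > 90 & sample_data < 110)

print(c(theoretical = p_above_120, empirical = prop_above_120))
print(c(theoretical = p_90_to_110, empirical = prop_90_to_110))
```

With 1000 draws, the empirical proportions should land within a few percentage points of the theoretical values, though they will not match exactly.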
qqnorm(norm_data)
qqline(norm_data)
Yes, the data appear to follow a normal distribution. The points on the Q-Q plot fall close to the reference line added by qqline(), which represents a theoretical normal distribution; small deviations at the tails are expected in a random sample.
unif_data <- runif(1000, min = 50, max = 150)
hist(unif_data, main = "Uniform Distribution", col = "lightgreen")
qqnorm(unif_data, main = "QQ Plot - Uniform")
qqline(unif_data, col = "pink")
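The visual reading of the two Q-Q plots can be supplemented with a numeric check. The sketch below applies the Shapiro-Wilk test (`shapiro.test()` from base R's stats package) to one normal and one uniform sample. This test is a swapped-in technique that the original analysis does not use, and the seed and the names `normal_sample`, `uniform_sample`, `sw_norm`, and `sw_unif` are assumptions made for illustration.

```r
# Assumption: the seed is arbitrary and only makes this sketch reproducible.
set.seed(42)
normal_sample <- rnorm(1000, mean = 100, sd = 15)
uniform_sample <- runif(1000, min = 50, max = 150)

# Shapiro-Wilk test: W near 1 is consistent with normality;
# a small p-value is evidence against normality.
sw_norm <- shapiro.test(normal_sample)
sw_unif <- shapiro.test(uniform_sample)

print(sw_norm)
print(sw_unif)
```

The uniform sample should give a noticeably lower W statistic and a very small p-value, which corresponds to the points bending away from the reference line at both ends of its Q-Q plot.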
Observing the Sampling Distribution:
As the sample size increases, the sampling distribution of the sample mean becomes approximately normal, regardless of the shape of the population distribution.
As the sample size increases, the spread (standard deviation) of the sampling distribution decreases: its standard error equals the population standard deviation divided by the square root of the sample size.
The center of the sampling distribution matches the population mean; in a simulation the two may differ by a few decimal places, but they agree closely. This is important because the Central Limit Theorem helps us make decisions: when the sample mean is a reliable estimate of the population mean, we can reach a conclusion about the population from the sample alone.
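The three observations above can be checked with a short simulation. This is a minimal sketch rather than the code behind the original graphs: it assumes the same uniform population (50 to 150), while the seed, the sample sizes 5, 30, and 100, the 5000 repetitions, and the names `sample_sizes`, `n_reps`, and `results` are all choices made for illustration.

```r
# Assumption: the seed is arbitrary and only makes this sketch reproducible.
set.seed(42)
pop_min <- 50
pop_max <- 150
pop_mean <- (pop_min + pop_max) / 2        # mean of a uniform distribution: 100
pop_sd <- (pop_max - pop_min) / sqrt(12)   # SD of a uniform distribution: about 28.87

sample_sizes <- c(5, 30, 100)
n_reps <- 5000

# For each sample size, draw n_reps samples and keep the mean of each one
results <- sapply(sample_sizes, function(n) {
  means <- replicate(n_reps, mean(runif(n, min = pop_min, max = pop_max)))
  c(n = n,
    center = mean(means),
    observed_sd = sd(means),
    theoretical_se = pop_sd / sqrt(n))
})

print(round(t(results), 3))
```

In the printed table, `center` should stay near 100 for every sample size, while `observed_sd` shrinks as n grows and tracks `theoretical_se` closely.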
Application of CLT:
Based on my graphs, the results support the Central Limit Theorem. Whether the population was normally or uniformly distributed, the sampling distribution of the mean became approximately normal regardless of the population's initial shape, as long as the sample size increased.
As stated above, the Central Limit Theorem approximation is only reliable when the sample size is large enough (a common rule of thumb is greater than 30). With a sample size of 5, assuming that the sampling distribution is normal could lead us to an incorrect decision about the question at hand, unless the population itself is normally distributed.
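The risk of a small sample is easiest to see with a skewed population. The sketch below is illustrative only: the exponential population, the seed, the 5000 repetitions, and the helper `skewness()` (a simple moment-based estimate written here, not a base R function) are all assumptions and are not part of the original analysis.

```r
# Assumption: the seed is arbitrary and only makes this sketch reproducible.
set.seed(42)

# Hypothetical helper: moment-based skewness (0 for a symmetric distribution)
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3

n_reps <- 5000

# Sample means from a right-skewed exponential population, for n = 5 and n = 30
means_n5 <- replicate(n_reps, mean(rexp(5, rate = 1)))
means_n30 <- replicate(n_reps, mean(rexp(30, rate = 1)))

skew_n5 <- skewness(means_n5)
skew_n30 <- skewness(means_n30)

cat("Skewness of sample means, n = 5: ", round(skew_n5, 3), "\n")
cat("Skewness of sample means, n = 30:", round(skew_n30, 3), "\n")
```

The sample means for n = 5 should remain clearly right-skewed, while those for n = 30 are closer to symmetric, so treating the n = 5 case as normal would misstate tail probabilities.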