Normal Distribution

  1. Generate a vector of 1000 random variables from a normal distribution with a mean of 100, and a standard deviation of 15
norm_data <- rnorm(1000, mean = 100, sd = 15)
  2. Plot a histogram of the simulated data. Overlay the theoretical normal distribution curve.
hist_1 <- hist(norm_data, prob = TRUE, main = "Histogram of Simulated Normal Data", xlab = "Value", ylab = "Density")
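data_mean <- mean(norm_data)
data_std <- sd(norm_data)
# Evaluate the normal density on a fine grid spanning the observed data.
x_values_curve <- seq(min(norm_data), max(norm_data), length.out = 1000)
# The curve uses the sample estimates; mean = 100, sd = 15 would give the exact population curve.
y_values_curve <- dnorm(x_values_curve, mean = data_mean, sd = data_std)
lines(x_values_curve, y_values_curve, col = "red", lwd = 2)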

3. Compute and report the sample mean and standard deviation. Compare them to population parameters.

print(data_mean)
## [1] 100.2271
print(data_std)
## [1] 15.10953
  4. Interpret: How close are the sample values to the population values? Why?

The population mean is 100 and the population standard deviation is 15. The sample mean printed above is 100.2271 and the sample standard deviation is 15.10953, so both estimates are within about 1% of the population parameters, though not exactly equal to them. This closeness reflects the law of large numbers: as the sample size grows, the sample mean and standard deviation converge to the population values. With n = 1000, the standard error of the mean is only 15 / sqrt(1000), or about 0.47, so a difference of about 0.23 is well within ordinary sampling variation.
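To illustrate how the estimates tighten as the sample grows, here is a minimal sketch that repeats the simulation at several sample sizes. The seed and the particular sample sizes are assumptions chosen for illustration; they are not part of the original analysis.

```r
set.seed(42)  # arbitrary seed so the run is reproducible (an assumption)

# Hypothetical sample sizes, from small to very large
sample_sizes <- c(10, 100, 1000, 100000)

# For each size, draw from N(100, 15) and record the sample mean and sd
estimates <- t(sapply(sample_sizes, function(n) {
  x <- rnorm(n, mean = 100, sd = 15)
  c(n = n, sample_mean = mean(x), sample_sd = sd(x))
}))

print(estimates)
```

The rows for larger n should generally sit closer to 100 and 15, although any single small sample can land close by chance.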

  5. Find what percent of values are greater than 120.
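# P(X > 120), using hard-coded sample estimates (mean 100.431, sd 15.02863).
# These differ slightly from data_mean and data_std printed above, as expected
# when rnorm() is rerun without a seed.
percent_1 <- 1 - pnorm(120, mean = 100.431, sd = 15.02863)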
print(percent_1)
## [1] 0.09643859
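As a cross-check, the same tail probability can be computed from the population parameters and compared with the proportion of simulated values above 120. This is a sketch under assumptions: the seed and the variable names `sim`, `p_theory`, and `p_empirical` are made up for illustration.

```r
set.seed(123)  # arbitrary seed, not from the original analysis

# Fresh simulated sample with the same design as norm_data
sim <- rnorm(1000, mean = 100, sd = 15)

# Upper-tail probability under N(100, 15); lower.tail = FALSE avoids computing 1 - pnorm(...)
p_theory <- pnorm(120, mean = 100, sd = 15, lower.tail = FALSE)

# Share of simulated values that actually exceed 120
p_empirical <- mean(sim > 120)

print(c(theoretical = p_theory, empirical = p_empirical))
```

The two numbers should be close but not identical, since the empirical proportion carries sampling error.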
  6. Find what percent of values are between 90 and 110.
# P(90 < X < 110)
percent_2 <- pnorm(110, mean = 100, sd = 15) - pnorm(90, mean = 100, sd = 15)
print(percent_2)
## [1] 0.4950149
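The same probability can be reached by standardizing the bounds to z-scores first, which makes it clear that the interval covers two thirds of a standard deviation on either side of the mean. A short sketch, with illustrative variable names:

```r
# Convert the bounds to z-scores: z = (x - mean) / sd
z_upper <- (110 - 100) / 15
z_lower <- (90 - 100) / 15

# Probability between the bounds on the standard normal scale
p_z <- pnorm(z_upper) - pnorm(z_lower)

# Same probability computed directly on the original scale
p_direct <- pnorm(110, mean = 100, sd = 15) - pnorm(90, mean = 100, sd = 15)

print(c(z_scale = p_z, original_scale = p_direct))
```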
  7. Normality Assessment with QQ Plot
qqnorm(norm_data)
qqline(norm_data)

  8. Interpret: Does the data appear normally distributed? Explain.

Yes, the data appear to be normally distributed. In the QQ plot, the sample quantiles fall close to the reference line added by qqline(), which is the pattern expected when the data come from a normal distribution.
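A formal test can complement the visual check. The sketch below applies the Shapiro-Wilk test from base R to a freshly simulated sample; the seed and the name `sim` are assumptions, and the test was not part of the original analysis.

```r
set.seed(2024)  # arbitrary seed (an assumption)

# Simulated sample with the same design as norm_data
sim <- rnorm(1000, mean = 100, sd = 15)

# Shapiro-Wilk test: the null hypothesis is that the data are normally distributed.
# shapiro.test() accepts between 3 and 5000 observations.
sw <- shapiro.test(sim)
print(sw)
```

A large p-value means the test found no evidence against normality; it does not prove the data are normal, so it is best read alongside the QQ plot.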

  9. Simulate 1000 values from a uniform distribution.
unif_data <- runif(1000, min = 50, max = 150)
  10. Plot histograms and QQ plots to compare with the normal distribution.
hist(unif_data, main = "Uniform Distribution", col = "lightgreen")

qqnorm(unif_data, main = "QQ Plot - Uniform")
qqline(unif_data, col = "pink")
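A QQ plot of uniform data typically bends away from the reference line at both ends. One way to see why is to compare the quantiles of Uniform(50, 150) with those of a normal distribution that has the same mean and standard deviation. The probability levels below are chosen only for illustration.

```r
# Probability levels to compare (an illustrative choice)
probs <- c(0.01, 0.25, 0.50, 0.75, 0.99)

# Exact quantiles of Uniform(50, 150)
unif_q <- qunif(probs, min = 50, max = 150)

# A Uniform(a, b) distribution has sd (b - a) / sqrt(12)
unif_sd <- (150 - 50) / sqrt(12)

# Quantiles of a normal distribution with matching mean and sd
norm_q <- qnorm(probs, mean = 100, sd = unif_sd)

print(round(rbind(uniform = unif_q, normal = norm_q), 2))
```

The uniform quantiles stop at the bounds 50 and 150 while the normal quantiles keep extending, so the extreme points in the QQ plot sit closer to the center than the line predicts.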

Observing the Sampling Distribution:

  1. How does the shape of the sampling distribution change as the sample size increases?

As the sample size increases, the sampling distribution of the sample mean becomes more symmetric and bell-shaped, approaching a normal distribution regardless of the shape of the population it is drawn from.
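A simulation along the following lines would produce the kind of graphs discussed here. It is a sketch, not the code behind the original graphs: it reuses the Uniform(50, 150) population from earlier, while the seed, the number of replications, and the sample sizes are assumptions.

```r
set.seed(1)     # arbitrary seed (an assumption)
n_reps <- 2000  # number of samples drawn per sample size (an assumption)
sample_sizes <- c(2, 5, 30)

# For each sample size, draw n_reps samples and keep each sample's mean
sample_means <- lapply(sample_sizes, function(n) {
  replicate(n_reps, mean(runif(n, min = 50, max = 150)))
})
names(sample_means) <- paste0("n = ", sample_sizes)

# Plot the three sampling distributions side by side on a common x-axis
old_par <- par(mfrow = c(1, 3))
for (label in names(sample_means)) {
  hist(sample_means[[label]], prob = TRUE, main = label,
       xlab = "Sample mean", xlim = c(50, 150))
}
par(old_par)
```

Using the same x-axis limits in every panel makes both the change in shape and the narrowing of the distribution visible at a glance.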

  2. What happens to the spread (standard deviation) of the sampling distributions as the sample size increases?

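As the sample size increases, the spread of the sampling distribution decreases. The standard deviation of the sample mean (the standard error) equals the population standard deviation divided by the square root of n, so quadrupling the sample size halves the spread. This narrowing comes from averaging more observations, not from the distribution becoming more normal.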
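The relationship between sample size and spread can be checked numerically. The sketch below compares the observed standard deviation of simulated sample means with the theoretical standard error for a Uniform(50, 150) population; the seed, replication count, and sample sizes are assumptions.

```r
set.seed(7)  # arbitrary seed (an assumption)

# Population sd of Uniform(50, 150): (b - a) / sqrt(12)
sigma <- (150 - 50) / sqrt(12)
sample_sizes <- c(5, 30, 100)

# Observed sd of 5000 simulated sample means at each sample size
observed_sd <- sapply(sample_sizes, function(n) {
  sd(replicate(5000, mean(runif(n, min = 50, max = 150))))
})

# Theoretical standard error: sigma / sqrt(n)
theoretical_se <- sigma / sqrt(sample_sizes)

print(data.frame(n = sample_sizes, observed_sd, theoretical_se))
```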

  3. Does the center of the sampling distribution (i.e., mean of sample means) appear to match the true population mean? Why is this important?

Yes. The center of the sampling distribution matches the population mean at every sample size, because the sample mean is an unbiased estimator of the population mean; in a simulation the two differ only by a small random error. This is important because it means sample means are centered on the correct value, so a sample mean can be used to estimate the population mean, and larger samples make that estimate more precise.

Application of CLT:

  1. Based on your graphs, do the results support the Central Limit Theorem? Why or why not?

Based on my graphs, the results support the Central Limit Theorem. Whether the population was normal or uniform, the sampling distribution of the sample mean became approximately normal as the sample size increased, regardless of the population's initial shape.

  2. Suppose you were only allowed to collect a small sample (e.g., size = 5). What could go wrong if you assumed the sampling distribution was normal?

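As stated above, the sampling distribution of the mean is only reliably close to normal when the sample size is large enough (a common rule of thumb is around 30 or more), unless the population itself is normal. With a sample of size 5 from a non-normal population, the sampling distribution can still be skewed, so confidence intervals and p-values computed under a normality assumption would be inaccurate and could lead to an incorrect decision about the question at hand.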
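One way to make this risk concrete is to check how often a nominal 95% t-interval, which relies on normality, actually contains the true mean when n = 5 and the population is skewed. The sketch below assumes an exponential population with rate 1; that choice, the seed, and the replication count are illustrative and not part of the original analysis.

```r
set.seed(99)    # arbitrary seed (an assumption)
n <- 5          # small sample size from the question
true_mean <- 1  # mean of an Exponential(rate = 1) population

# Repeat: draw a small skewed sample, build a 95% t-interval,
# and record whether the interval contains the true mean
covered <- replicate(10000, {
  x <- rexp(n, rate = 1)
  ci <- t.test(x, conf.level = 0.95)$conf.int
  ci[1] <= true_mean && true_mean <= ci[2]
})

# Share of intervals that captured the true mean
coverage <- mean(covered)
print(coverage)
```

If the normality assumption held, the coverage would be close to 0.95. A noticeably lower value means the intervals miss the true mean more often than advertised.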