2025-02-09

Applications

Statistical methods can be used to organize and suggest conclusions based on observed data. For example, standard deviation can be used to measure how much data is spread out. This observation is based around the mean of the sample data.

This allows the evaluation of statistically significant results of data points to determine reliability. For example, standard deviation can help analyze variations in weight of plants on two different treatments with the purpose of testing different conditions.

Plant Growth

##   weight group
## 1   4.17  ctrl
## 2   5.58  ctrl
## 3   5.18  ctrl
## 4   6.11  ctrl
## 5   4.50  ctrl
## 6   4.61  ctrl

\(IQR = Q_3 - Q_1 \\\) lower limit: \(Q_1 - \frac{1}{2} (IQR) \\\) upper limit: \(Q_3 + \frac{1}{2} (IQR) \\\)

Z-score

How many standard deviations are results from the mean?

\(\sigma = \sqrt\frac{\sum{(x_i-\bar{x}^2)}}{n-1} \\\) \(\mu = \frac{1}{n} \ sum_{i=1}^{n}x_{i} \\\) \(z = (x - \mu) /\sigma \\\)

Code from previous slide

PlantGrowth$z = (PlantGrowth$weight - mean(PlantGrowth$weight)) /
sd(PlantGrowth$weight)

ggplot(data = PlantGrowth, aes(x = group, y = z, fill = group)) + 
  geom_bar(stat = 'identity') + 
  geom_hline(yintercept = 0, linetype = "dashed", color = "darkorchid4", 
  linewidth = 0.7) + 
  labs(title = "Z-score",
       caption = "Z-score indicates how far measured weight is from the 
       mean in standard deviations.") +
  coord_flip() + 
  theme_light() +
  theme(plot.title = element_text(color = "midnightblue", face = "bold", 
  hjust = 0.5)) 

Density Output

The impact of outliers