- Standard deviation is a measure of the amount of variation that exists in a set of data
- This variation is calculated relative to the mean (average) of the data
- In data analysis, standard deviation shows how far values typically lie from the mean: a small value means the data cluster tightly around the mean, a large value means they are spread widely
2023-06-08
\(\sigma = \sqrt{\frac{1}{N-1} \sum_{i=1}^N (x_i - \overline{x})^2}\)
Calculating from the mean (\(\mu\); written as \(\overline{x}\) on the previous slide, i.e. the mean of the \(x_i\))
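As a small worked example of the formula, take the eight values 2, 4, 4, 4, 5, 5, 7, 9. These are illustrative numbers chosen for easy arithmetic, not data from the slides. The symbols follow the formula above, with \(N - 1\) in the denominator.

```latex
\[
\overline{x} = \frac{2 + 4 + 4 + 4 + 5 + 5 + 7 + 9}{8} = \frac{40}{8} = 5
\]
\[
\sum_{i=1}^{8} (x_i - \overline{x})^2 = 9 + 1 + 1 + 1 + 0 + 0 + 4 + 16 = 32
\]
\[
\sigma = \sqrt{\frac{1}{8 - 1} \cdot 32} = \sqrt{\frac{32}{7}} \approx 2.14
\]
```

Dividing by \(N\) instead of \(N - 1\) would give \(\sqrt{32/8} = 2\); the two conventions differ noticeably only for small \(N\).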
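library(ggplot2)  # provides ggplot(), geom_col(), geom_point(), labs()

# Low standard deviation: 1000 draws from a normal distribution with sd = 2,
# rounded down to whole numbers so they can be counted.
set.seed(102)
g1data <- floor(rnorm(n = 1000, mean = 0, sd = 2))
table_g1data <- table(g1data)
dfg1 <- as.data.frame(table_g1data)  # columns: g1data (value), Freq (count)
g1plot <- ggplot(dfg1, aes(x = g1data, y = Freq)) +
  geom_col(fill = "green", alpha = 0.5) +
  labs(title = "Example of a Low Standard Deviation",
       x = "Number", y = "Frequency")
g1plot

# High standard deviation: same mean, but sd = 8, so the counts spread
# over a much wider range of values.
set.seed(777)
g2data <- floor(rnorm(n = 1000, mean = 0, sd = 8))
table_g2data <- table(g2data)
dfg2 <- as.data.frame(table_g2data)  # columns: g2data (value), Freq (count)
g2plot <- ggplot(dfg2, aes(x = g2data, y = Freq)) +
  geom_point(color = "blue", alpha = 0.5) +
  labs(title = "Example of a High Standard Deviation",
       x = "Number", y = "Frequency")
g2plot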
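The difference between the two plots can also be checked numerically. The sketch below regenerates both samples with the same seeds so that it runs on its own, then computes the standard deviation directly from the formula. `manual_sd` is a hypothetical helper name introduced here for illustration; base R's `sd()` uses the same \(N - 1\) denominator.

```r
# Regenerate the two samples with the same seeds and parameters as above.
set.seed(102)
g1data <- floor(rnorm(n = 1000, mean = 0, sd = 2))
set.seed(777)
g2data <- floor(rnorm(n = 1000, mean = 0, sd = 8))

# Hypothetical helper: standard deviation written out from the formula,
# with N - 1 in the denominator.
manual_sd <- function(x) {
  sqrt(sum((x - mean(x))^2) / (length(x) - 1))
}

sd_low <- manual_sd(g1data)
sd_high <- manual_sd(g2data)

# The hand-written version should agree with R's built-in sd().
stopifnot(isTRUE(all.equal(sd_low, sd(g1data))))
stopifnot(isTRUE(all.equal(sd_high, sd(g2data))))

print(c(low = sd_low, high = sd_high))
```

The two results should land near the `sd` values passed to `rnorm()` (2 and 8), although rounding down with `floor()` and sampling noise shift them slightly.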