Standard Deviation and Standard Error
Content you should have understood before watching this video:
- Number 1, ‘Variables’
- Number 2, ‘Variation’
- Number 3, ‘Measuring Variation’
Standard deviation
- The variance has one problem: it is measured in units squared
- Squared units are hard to interpret, so we take the square root
- This is the standard deviation (\(s\), sometimes \(sd\)):
\[s = \sqrt{\frac{\sum(x_i-\bar{x})^2}{n-1}} = \sqrt{\frac{5.2}{4}} = 1.14\]
NB: by convention, the sample standard deviation is called \(s\), while the population standard deviation is called \(\sigma\)
In R:
friends <- c(1, 2, 3, 3, 4)
sd(friends)
[1] 1.140175
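As a check, the formula above can be reproduced step by step. This is a minimal sketch; the intermediate names (deviations, sum_sq, s_manual) are illustrative and not part of the original slides.

```r
friends <- c(1, 2, 3, 3, 4)

# Deviation of each observation from the sample mean (2.6)
deviations <- friends - mean(friends)

# Sum of squared deviations: the numerator under the square root (5.2)
sum_sq <- sum(deviations^2)

# Divide by n - 1 and take the square root
s_manual <- sqrt(sum_sq / (length(friends) - 1))

# Should agree with R's built-in sd()
all.equal(s_manual, sd(friends))
```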
Sample standard deviation: why divide by n-1?
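One way to see the reason is by simulation: if we divide by n, the spread is underestimated on average, because the deviations are measured from the sample mean rather than from the true population mean. The sketch below assumes a normally distributed population with a known variance of 4; the seed, the sample size and the number of repetitions are arbitrary choices for illustration, and var_n is a hypothetical helper.

```r
set.seed(1)        # arbitrary seed, only for reproducibility
n <- 5             # size of each sample
true_var <- 4      # assumed population variance (sd = 2)

# Hypothetical helper: variance with divisor n instead of n - 1
var_n <- function(v) sum((v - mean(v))^2) / length(v)

# Draw 10000 samples and estimate the variance both ways
samples <- replicate(10000, rnorm(n, mean = 0, sd = sqrt(true_var)),
                     simplify = FALSE)
est_n_minus_1 <- sapply(samples, var)    # var() divides by n - 1
est_n         <- sapply(samples, var_n)

mean(est_n_minus_1)  # close to the true variance on average
mean(est_n)          # systematically smaller, by the factor (n - 1) / n
```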
Standard deviation and standard error
Consider this example:
x <- c(10, 20)
y <- c(5, 18, 22, 13, 9, 23)
sd(x)
[1] 7.071068
sd(y)
[1] 7.238784
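So the standard deviation does not indicate how well we can estimate the mean: x and y have almost the same standard deviation, although y contains three times as many observations. For this purpose, we use the standard error of the mean (note: sd = s = standard deviation):
\[s.e. = \frac{s}{\sqrt{n}}\]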
sd(x)/sqrt(2)
[1] 5
sd(y)/sqrt(6)
[1] 2.955221
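The standard error can be read as the standard deviation of the sample mean across repeated samples. The simulation below is a sketch under assumptions that are not in the slides: a normal population with mean 15 and standard deviation 7, samples of size 6, and an arbitrary seed.

```r
set.seed(42)       # arbitrary seed, only for reproducibility
sigma <- 7         # assumed population standard deviation
n <- 6             # observations per sample

# Mean of each of 10000 independent samples of size n
sample_means <- replicate(10000, mean(rnorm(n, mean = 15, sd = sigma)))

sd(sample_means)   # spread of the sample means
sigma / sqrt(n)    # theoretical standard error; the two should be close
```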
Important to remember
- The variance and standard deviation represent the same thing:
  - The spread in a variable, how much variability there is
  - The higher the value, the higher the variability
  - With increasing sample size, we achieve a more precise estimate for the variability
- The standard error
  - measures how well we estimate the mean of the population
  - decreases with the number of observations because we gain more confidence in the estimate of the mean
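The two groups of points above can be illustrated with one simulated population. This is a sketch assuming normally distributed data with a standard deviation of 2; the sample sizes and the seed are arbitrary.

```r
set.seed(7)                        # arbitrary seed, only for reproducibility
sizes <- c(10, 100, 1000, 10000)   # increasing numbers of observations

# For each sample size, draw one sample and compute its sd and s.e.
results <- sapply(sizes, function(n) {
  v <- rnorm(n, mean = 0, sd = 2)  # assumed population: sd = 2
  c(n = n, sd = sd(v), se = sd(v) / sqrt(n))
})

# One column per sample size: sd settles near 2, se keeps shrinking
round(results, 3)
```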
Calculating the standard error in R:
friends <- c(1, 2, 3, 3, 4)
sd(friends)/sqrt(5) # or:
[1] 0.509902
sd(friends)/sqrt(length(friends)) # which of the two is better?
[1] 0.509902
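The second form does not hard-code the sample size, so it can be wrapped in a small reusable function. The name standard_error is a hypothetical helper, not a base R function, and the sketch assumes the vector contains no missing values.

```r
# Hypothetical helper: standard error of the mean of a numeric vector
standard_error <- function(v) {
  sd(v) / sqrt(length(v))   # length(v) adapts to any sample size
}

friends <- c(1, 2, 3, 3, 4)
standard_error(friends)     # same calculation as sd(friends)/sqrt(5)
standard_error(c(10, 20))   # the vector x from the earlier example
```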
The most important points in a nutshell
- We use the standard error if we want to show how well we can estimate the mean, so most ‘error bars’ will be s.e., not s.d.
- Standard deviation and variance characterise the spread of a variable
- In any case, always specify what your error bars mean!
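One simple way to follow this advice is to build the label together with the numbers. The sketch below is only an illustration; the label wording is an assumption, and `+/-` stands in for the plus-minus sign.

```r
friends <- c(1, 2, 3, 3, 4)
m  <- mean(friends)
s  <- sd(friends)
se <- s / sqrt(length(friends))

# State explicitly which measure follows the mean
label_se <- sprintf("mean = %.2f +/- %.2f (s.e., n = %d)", m, se, length(friends))
label_sd <- sprintf("mean = %.2f +/- %.2f (s.d., n = %d)", m, s, length(friends))

label_se
label_sd
```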