Variation
Content you should have understood before watching this video:
- Number 1, ‘Variables’
Variability in data
- Almost always occurs
- e.g. if you measure your height 10 times
- e.g. if you measure the height of 10 different people
- e.g. if you measure the height of 10 men and then 10 women
- The reasons why variability occurs differ, however!
- The key goal in statistics is to explain variation
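The three height examples above can be mimicked with a small simulation. This is only a sketch: the true height, the measurement error, the spread between people and the group means are all made-up assumptions, chosen to show that the same kind of data can vary for different reasons.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# All numbers below are illustrative assumptions, not values from the lecture.
MEASUREMENT_SD = 0.5  # cm, error of the measuring procedure
PERSON_SD = 7.0       # cm, natural spread between people within one group


def measure(true_height):
    """Return one noisy measurement of a true height."""
    return random.gauss(true_height, MEASUREMENT_SD)


# Example 1: the same person measured 10 times (only measurement error)
same_person = [measure(175.0) for _ in range(10)]

# Example 2: 10 different people, each measured once (natural variation added)
ten_people = [measure(random.gauss(175.0, PERSON_SD)) for _ in range(10)]

# Example 3: 10 men and 10 women (a group difference added on top)
men = [measure(random.gauss(178.0, PERSON_SD)) for _ in range(10)]
women = [measure(random.gauss(165.0, PERSON_SD)) for _ in range(10)]

sd_same = statistics.stdev(same_person)
sd_people = statistics.stdev(ten_people)
sd_mixed = statistics.stdev(men + women)

print(f"same person, 10 times: SD = {sd_same:.2f} cm")
print(f"10 different people:   SD = {sd_people:.2f} cm")
print(f"10 men + 10 women:     SD = {sd_mixed:.2f} cm")
```

In all three cases the numbers vary, but the source of the variation is different each time: measurement error only, differences between individuals, and a difference between groups.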
Types of variation and their origin
- Systematic Variation
- Variation created by a specific experimental manipulation (e.g. administering a drug)
- Variation introduced by unknown factors (both while applying a treatment and while measuring a variable)
- The ‘direction’ of the variation tends to be the same, e.g. if your experiment happens to include more older subjects, they will all tend to introduce the same bias, and hence the variation is systematic.
- Unsystematic Variation
- Variation introduced by unknown factors (both while applying a treatment and while measuring a variable)
- The ‘direction’ of the variation tends to be random, e.g. in your experiment, if you happen to get a more heterogeneous sample, subjects will inherently differ more, but not in a systematic way
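The difference in ‘direction’ can be illustrated with a short sketch. It assumes a hypothetical measuring device that always reads 1.5 units too high (systematic) and a random reading error (unsystematic); the true value, the bias and the noise level are invented for illustration.

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is reproducible

# Hypothetical values, chosen only for illustration
TRUE_VALUE = 20.0  # the quantity we are trying to measure
BIAS = 1.5         # systematic: a miscalibrated device always reads too high
NOISE_SD = 1.0     # unsystematic: random reading error in either direction
N = 1000           # number of measurements

# Measurements affected by unsystematic variation only
unsystematic_only = [TRUE_VALUE + random.gauss(0, NOISE_SD) for _ in range(N)]

# Measurements affected by the same noise plus a systematic bias
with_systematic = [TRUE_VALUE + BIAS + random.gauss(0, NOISE_SD) for _ in range(N)]

mean_unsys = statistics.mean(unsystematic_only)
mean_sys = statistics.mean(with_systematic)

print(f"true value:                {TRUE_VALUE:.2f}")
print(f"mean, unsystematic only:   {mean_unsys:.2f}")
print(f"mean, with systematic bias: {mean_sys:.2f}")
```

Averaging many measurements cancels the random errors, so the first mean ends up close to the true value. The bias pushes every measurement the same way, so it survives averaging.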
Systematic and unsystematic variation
Signal versus noise: Systematic variation adds to the signal (but only if we know its origin!), unsystematic variation adds to the noise!
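One common way to put numbers on signal and noise is to split the total variation into a part explained by a known factor and a part left unexplained. The sketch below does this with sums of squares for two invented groups (`control` and `treated` are hypothetical names and values); it is one possible formalisation, not necessarily the one used in this lecture.

```python
# Invented measurements for two groups (illustrative assumption)
groups = {
    "control": [10.0, 12.0, 11.0, 13.0],
    "treated": [15.0, 17.0, 16.0, 18.0],
}

all_values = [x for values in groups.values() for x in values]
grand_mean = sum(all_values) / len(all_values)

# Total variation: squared distance of every value from the grand mean
ss_total = sum((x - grand_mean) ** 2 for x in all_values)

# Explained ('signal'): squared distance of each group mean from the grand mean,
# counted once per observation in that group
ss_between = sum(
    len(v) * (sum(v) / len(v) - grand_mean) ** 2 for v in groups.values()
)

# Unexplained ('noise'): squared distance of every value from its own group mean
ss_within = sum((x - sum(v) / len(v)) ** 2 for v in groups.values() for x in v)

explained_fraction = ss_between / ss_total
signal_to_noise = ss_between / ss_within

print(f"total = {ss_total}, explained = {ss_between}, unexplained = {ss_within}")
# total = 60.0, explained = 50.0, unexplained = 10.0
print(f"explained fraction = {explained_fraction:.2f}")  # 0.83
print(f"signal / noise = {signal_to_noise:.1f}")         # 5.0
```

The explained and unexplained parts add up to the total, so any variation whose origin we can identify moves from the noise side to the signal side.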
Systematic and unsystematic variation in a simple experiment
(Unwanted) systematic and unsystematic variation in the mouse experiment example
Type of variation ↓ / Reason for variation → | Application of manipulation | Measurement error | Natural |
---|---|---|---|
Systematic | Syringe has wrong volume | Experimenter is using a faulty measuring device | A batch of mice has higher growth rates because they come from different origins |
Unsystematic | Solution to be injected is not homogeneous | Experimenter is tired, makes random mistakes | Natural variation in growth rates of mice, even though they originate from the same mother |
By the way: what was that again with response and predictor variables…?
Systematic and unsystematic variation
- Disentangling the two is the key task of statistics, e.g. by
- reducing unsystematic variation
- explaining systematic variation
- taking several measurements (replication!)
We thus maximise the signal-to-noise ratio, i.e. we maximise our explanatory power
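Why replication helps can be seen in a small simulation. The sketch repeats a hypothetical experiment many times with 5, 50 and 500 replicate measurements and looks at how much the resulting mean fluctuates from one experiment to the next. The true mean, the noise level and the number of repeated experiments are assumptions made for illustration.

```python
import random
import statistics

random.seed(3)  # fixed seed so the sketch is reproducible

# Hypothetical values, chosen only for illustration
TRUE_MEAN = 100.0    # the value the experiment tries to estimate
NOISE_SD = 2.0       # unsystematic variation of a single measurement
N_EXPERIMENTS = 500  # how often the whole experiment is repeated


def spread_of_means(n_replicates):
    """SD of the experiment mean across many repeated experiments."""
    means = []
    for _ in range(N_EXPERIMENTS):
        sample = [random.gauss(TRUE_MEAN, NOISE_SD) for _ in range(n_replicates)]
        means.append(sum(sample) / len(sample))
    return statistics.stdev(means)


spread = {n: spread_of_means(n) for n in (5, 50, 500)}

for n, s in spread.items():
    print(f"{n:>3} replicates: SD of the mean = {s:.3f}")
```

The noise in a single measurement stays the same, but the mean of many replicates fluctuates less, so the same signal stands out more clearly.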
- Mouse experiment example:
- We are trying to minimise the ‘noise’ by
- applying the injections properly
- measuring the heart rates accurately
- accounting for additional factors such as sex
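How accounting for a factor such as sex reduces the noise can be sketched with simulated heart rates. The baseline rate, the size of the sex difference and the noise level below are invented, and comparing each mouse with the mean of its own sex is just one simple way to account for the factor.

```python
import random
import statistics

random.seed(4)  # fixed seed so the sketch is reproducible

# Hypothetical effects, chosen only for illustration
BASELINE = 500.0                    # heart rate in beats per minute
SEX_EFFECT = {"f": 0.0, "m": 40.0}  # assumed systematic difference between sexes
NOISE_SD = 10.0                     # unsystematic variation between mice

# Simulate 50 female and 50 male mice
mice = []
for sex in ("f", "m"):
    for _ in range(50):
        rate = BASELINE + SEX_EFFECT[sex] + random.gauss(0, NOISE_SD)
        mice.append((sex, rate))

rates = [rate for _, rate in mice]

# Sex ignored: every mouse is compared with the overall mean
overall_mean = statistics.mean(rates)
resid_ignored = [rate - overall_mean for rate in rates]

# Sex accounted for: every mouse is compared with the mean of its own sex
sex_means = {
    s: statistics.mean([rate for sex, rate in mice if sex == s])
    for s in ("f", "m")
}
resid_adjusted = [rate - sex_means[sex] for sex, rate in mice]

sd_ignored = statistics.pstdev(resid_ignored)
sd_adjusted = statistics.pstdev(resid_adjusted)

print(f"unexplained SD, sex ignored:       {sd_ignored:.1f}")
print(f"unexplained SD, sex accounted for: {sd_adjusted:.1f}")
```

Once the sex difference is treated as known systematic variation, it no longer counts as noise, and the unexplained spread shrinks.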
Let’s look at an example!
The most important points in a nutshell
- Variation is everywhere: things differ, and our task is to explain variation!
- There are two types of variation: systematic and unsystematic
- The more variation we can explain, the higher our signal-to-noise ratio
- We can increase the signal-to-noise ratio by increasing replication and reducing the noise (unexplained variation)