08/1/2021

Last year’s digression on populations and samples

  • This made me scratch my head as an undergrad
  • The population is, for example, all the students in the room; the sample is the five people I picked out.
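The room example can be sketched in a few lines of Python. The heights, the room size of 40, and the names `room` and `picked` are made up for illustration; the only point is that the population mean uses everyone, while the sample mean uses only the five people picked.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical heights (cm) of all 40 students in the room: the population.
room = [round(rng.gauss(170, 10), 1) for _ in range(40)]

# The five people picked out: the sample (drawn without replacement).
picked = rng.sample(room, 5)

population_mean = statistics.fmean(room)   # uses every student
sample_mean = statistics.fmean(picked)     # uses only the five picked

print(f"population mean: {population_mean:.1f} cm")
print(f"sample mean:     {sample_mean:.1f} cm")
```

The two means will generally differ, and a different five people would give a different sample mean.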

A wider meaning for population

  • A population is the whole set of data points that could conceivably be obtained by the time the experiment is concluded
  • This definition emphasizes that it is impossible to truly measure the parameters (means, etc.) of a population, because we could never obtain the complete dataset
  • So how can a population be defined?
  • With models linking parameters (let's look at last week's ANOVAs and regressions)

FLOWERS depends on SEX

  • \[ FLOWERS = \begin{Bmatrix} \mu_{male}\\ \mu_{female} \end{Bmatrix} + \epsilon\]

  • \(\mu\) is the mean when calculated from a population

  • \(\epsilon\) is the error, drawn independently for each point from a normal distribution with a mean of zero and a variance \(\sigma^2\)
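As a rough sketch of what this model says, the Python below generates FLOWERS values from assumed population parameters and then estimates them from the sample. The values in `MU` and `SIGMA`, and the function names, are hypothetical and chosen only for illustration; in a real study the population parameters are unknown.

```python
import random
import statistics

# Hypothetical population parameters (assumptions, not real data).
MU = {"male": 10.0, "female": 14.0}
SIGMA = 3.0  # standard deviation of the error, so the variance is SIGMA**2

def simulate_flowers(n_per_sex, rng):
    """Draw FLOWERS = mu_sex + epsilon, with epsilon ~ Normal(0, SIGMA^2)."""
    return {
        sex: [mu + rng.gauss(0.0, SIGMA) for _ in range(n_per_sex)]
        for sex, mu in MU.items()
    }

def estimate(data):
    """Return the sample mean of each sex and the pooled sample variance."""
    means = {sex: statistics.fmean(ys) for sex, ys in data.items()}
    # Sum of squared deviations of each point from its own group mean.
    ss = sum((y - means[sex]) ** 2 for sex, ys in data.items() for y in ys)
    # Degrees of freedom: total points minus number of group means estimated.
    df = sum(len(ys) for ys in data.values()) - len(data)
    return means, ss / df

rng = random.Random(1)
sample = simulate_flowers(20, rng)
means, s2 = estimate(sample)
print("sample means:", means)
print("pooled s^2:  ", s2)
```

With 20 plants per sex the estimates land near, but not exactly on, the assumed \(\mu_{male}\), \(\mu_{female}\) and \(\sigma^2\); larger samples bring them closer.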

GPA depends on IQ

  • \[ GPA = \alpha + \beta \cdot IQ + \epsilon \]

  • \(\alpha\) and \(\beta\) are the intercept and the slope of the line (when talking about a population)
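The same idea can be sketched for the regression model: simulate GPA from an assumed population line, then estimate the intercept and slope from the sample by ordinary least squares. The values of `ALPHA`, `BETA` and `SIGMA`, the IQ distribution, and the function names are assumptions made up for this example.

```python
import random
import statistics

# Hypothetical population parameters (assumptions, not real data).
ALPHA, BETA, SIGMA = 1.0, 0.02, 0.3

def simulate_gpa(iqs, rng):
    """Draw GPA = ALPHA + BETA * IQ + epsilon, epsilon ~ Normal(0, SIGMA^2)."""
    return [ALPHA + BETA * iq + rng.gauss(0.0, SIGMA) for iq in iqs]

def least_squares(xs, ys):
    """Return sample intercept a, slope b, and residual variance s^2."""
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx              # slope estimate
    a = y_bar - b * x_bar      # intercept estimate
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s2 = rss / (len(xs) - 2)   # two parameters (a and b) were estimated
    return a, b, s2

rng = random.Random(2)
iqs = [rng.gauss(100, 15) for _ in range(50)]  # hypothetical IQ scores
gpas = simulate_gpa(iqs, rng)
a, b, s2 = least_squares(iqs, gpas)
print(f"a = {a:.3f}, b = {b:.4f}, s^2 = {s2:.4f}")
```

Here `a` and `b` play the role of the sample estimates of \(\alpha\) and \(\beta\); rerunning with a different seed gives different estimates of the same fixed population line.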

Parameters and estimates

| Population parameters | Usual null hypothesis | Sample estimates |
|---|---|---|
| \(\mu, \sigma^2\) | \(\mu = 0\) | \(\bar{y}, s^2\) |
| \(\mu_{male}, \mu_{female}, \sigma^2\) | \(\mu_{male} = \mu_{female}\) | \(\bar{y}_{male}, \bar{y}_{female}, s^2\) |
| \(\alpha, \beta, \sigma^2\) | \(\beta = 0\) | \(a, b, s^2\) |

  • We phrase the null hypothesis in terms of population parameters
  • We use our sample estimates to test the null hypothesis
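For the simplest case, the null hypothesis \(\mu = 0\), the sketch below shows how the sample estimates \(\bar{y}\) and \(s^2\) combine into a one-sample t statistic, \(t = (\bar{y} - \mu_0) / \sqrt{s^2 / n}\). The data values and the function name `one_sample_t` are made up for illustration, and turning t into a p-value (which needs the t distribution with \(n - 1\) degrees of freedom) is left out.

```python
import math
import statistics

def one_sample_t(ys, mu0=0.0):
    """Return (y_bar, s2, t, df) for testing the null hypothesis mu = mu0."""
    n = len(ys)
    y_bar = statistics.fmean(ys)     # sample estimate of mu
    s2 = statistics.variance(ys)     # sample estimate of sigma^2 (n - 1 denominator)
    t = (y_bar - mu0) / math.sqrt(s2 / n)
    return y_bar, s2, t, n - 1

# Hypothetical measurements, invented for this example.
ys = [0.8, -0.3, 1.9, 0.4, 1.2, -0.1, 0.7, 1.5]
y_bar, s2, t, df = one_sample_t(ys)
print(f"y_bar = {y_bar:.3f}, s^2 = {s2:.3f}, t = {t:.3f} on {df} df")
```

The null hypothesis is a statement about the population parameter \(\mu\); everything the function computes comes from the sample.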

Next lecture

  • ANOVAs and regressions are the same thing!!!