The sampling distribution of a statistic is the probability distribution of that statistic, i.e., the values the statistic can take and how often each value would occur if we took every possible sample of size \(n\) from the population.
Consider taking repeated samples from a population and computing the statistic for each sample. You would get many different values of the statistic and some values would be more common than others.
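The repeated-sampling idea can be approximated by simulation. The following is a minimal sketch, not part of the original notes; the values `p = 0.3`, `n = 50`, and `n_samples = 10000` are assumptions chosen only for illustration.

```r
# Simulate an approximate sampling distribution of the sample proportion.
# Assumed values for illustration: p = 0.3, n = 50, 10000 repeated samples.
set.seed(1)
p <- 0.3
n <- 50
n_samples <- 10000

# Each rbinom() draw is the number of successes in one sample of size n;
# dividing by n turns it into that sample's proportion.
p_hat <- rbinom(n_samples, size = n, prob = p) / n

# Relative frequency of each observed value of the statistic
table(p_hat) / n_samples

mean(p_hat)  # should be close to p
sd(p_hat)    # should be close to sqrt(p * (1 - p) / n)
```

Some values of the simulated proportion appear far more often than others, which is exactly the pattern the sampling distribution describes.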
Let \(\hat p\) denote the sample proportion of a random sample of \(n\) observations from a population with population proportion \(p\). The mean of the sampling distribution of \(\hat p\) is equal to the population proportion \(p\): \[E(\hat p) = p\]
Let \(\hat p\) denote the sample proportion of a random sample of \(n\) observations from a population with population proportion \(p\). The standard deviation of the sampling distribution of \(\hat p\) is equal to \[\sigma_{\hat p} = \sqrt{\frac{p (1 - p)}{n}}\]
Note: The standard error of \(\hat p\) is another name for the standard deviation of the sampling distribution of \(\hat p\).
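As a small sketch of the standard error formula, the helper below (the name `se_p_hat` is hypothetical, and the inputs are assumed values) computes \(\sqrt{p(1-p)/n}\) and shows how it shrinks as \(n\) grows.

```r
# Hypothetical helper: standard error of the sample proportion
se_p_hat <- function(p, n) {
  sqrt(p * (1 - p) / n)
}

p <- 0.5  # assumed population proportion

se_small <- se_p_hat(p, n = 100)
se_large <- se_p_hat(p, n = 400)

se_small  # 0.05
se_large  # 0.025: quadrupling n halves the standard error
```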
Let \(\hat p\) denote the sample proportion of a random sample of \(n\) observations from a population with population proportion \(p\). The sampling distribution of \(\hat p\) will be approximately normal if \(n p \ge 10\) and \(n (1 - p) \ge 10\).
Let \(\hat p\) denote the sample proportion of a random sample of \(n\) observations from a population with population proportion \(p\). Then \[\hat p \sim \text{ approximately }N(p,\sqrt{\frac{ p (1- p)}{n}})\]
as long as \(n p \ge 10\) and \(n (1 - p) \ge 10\).
Note: From now on, we will say that \(\hat p\) has a normal distribution when we more accurately mean that it has an approximately normal distribution.
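The conditions and the quality of the approximation can be checked numerically. This is a sketch under assumed values (`p = 0.3`, `n = 200`, cutoff `0.25`); the helper name `normal_ok` is hypothetical.

```r
# Hypothetical helper: rule-of-thumb check n*p >= 10 and n*(1-p) >= 10
normal_ok <- function(p, n) {
  n * p >= 10 && n * (1 - p) >= 10
}

normal_ok(p = 0.5, n = 900)   # TRUE
normal_ok(p = 0.02, n = 100)  # FALSE, because n * p = 2

# Compare the exact binomial probability with the normal approximation
# for P(p_hat <= 0.25), using assumed values p = 0.3 and n = 200.
p <- 0.3
n <- 200
exact  <- pbinom(50, size = n, prob = p)  # 0.25 * 200 = 50 successes
approx <- pnorm(0.25, mean = p, sd = sqrt(p * (1 - p) / n))

c(exact = exact, approx = approx)
```

When both conditions hold, the two probabilities should be close, though not identical, since the normal curve is only an approximation to the discrete distribution of \(\hat p\).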
Example 1: Suppose we have a population with proportion \(p = 0.50\) and a random sample of size \(n = 900\) drawn from the population.
Note: You can do it all at once in R with the following code
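A minimal sketch of such a computation for Example 1 follows. It assumes the quantities of interest are the mean and standard error of \(\hat p\); the probability \(P(\hat p > 0.52)\) is a purely illustrative question, not one taken from the example.

```r
# Example 1 setup: p = 0.50, n = 900
p <- 0.50
n <- 900

# Conditions for the normal approximation
c(n * p, n * (1 - p))  # both 450, so both are at least 10

mean_p_hat <- p                      # E(p_hat) = p
se_p_hat   <- sqrt(p * (1 - p) / n)  # standard error, about 0.0167

# Illustrative question (an assumption): P(p_hat > 0.52)
prob <- pnorm(0.52, mean = mean_p_hat, sd = se_p_hat, lower.tail = FALSE)
prob  # about 0.115
```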
Example 2: A particular candidate for public office is favored by \(38\%\) of all registered voters in the district. A polling organization takes a random sample of \(100\) voters.
Note: You can do it all at once in R with the following code
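A minimal sketch for Example 2 follows, under the same assumptions as before. The question of how likely the sample is to show majority support, \(P(\hat p > 0.50)\), is hypothetical and chosen only to illustrate the call to `pnorm()`.

```r
# Example 2 setup: p = 0.38, n = 100
p <- 0.38
n <- 100

# Conditions for the normal approximation
c(n * p, n * (1 - p))  # 38 and 62, so both are at least 10

mean_p_hat <- p                      # E(p_hat) = p
se_p_hat   <- sqrt(p * (1 - p) / n)  # standard error, about 0.0485

# Illustrative question (an assumption): P(p_hat > 0.50)
prob <- pnorm(0.50, mean = mean_p_hat, sd = se_p_hat, lower.tail = FALSE)
prob
```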