This lecture note covers the theoretical underpinnings of sampling. It explains not just how to sample, but the mathematical logic of why we can draw conclusions about millions of people by talking to only a few hundred.

Lecture Note: Foundations of Sampling Theory

1. The Core Vocabulary

To understand sampling theory, we must distinguish between the world as it exists and the world we observe.

Population ($N$): The complete set of all items or individuals under investigation (e.g., every adult in a country).

Sample ($n$): A subset of the population used to represent the whole.

Parameter: A numerical characteristic of the population (e.g., the true average height, $\mu$). Parameters are usually unknown and fixed.

Statistic: A numerical characteristic of the sample (e.g., the average height of 100 people, $\bar{x}$). Statistics are known and vary from sample to sample.

Sampling Frame: The actual list or database from which the sample is drawn (e.g., a phone book or voter registration list).
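To make these terms concrete, here is a minimal R sketch; the simulated heights, the population size of one million, and the sample size of 100 are all invented for illustration:

```r
# A known "population" versus a drawn sample (all values hypothetical).
set.seed(42)
N <- 1e6
population <- rnorm(N, mean = 170, sd = 10)  # heights in cm

mu    <- mean(population)                 # parameter: fixed, normally unknown to us
samp  <- sample(population, size = 100)   # a random sample of n = 100
x_bar <- mean(samp)                       # statistic: varies from sample to sample

c(parameter = mu, statistic = x_bar)
```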

2. The Logic of Probability Sampling

The goal of sampling theory is to ensure that every member of a population has a known, non-zero chance of being selected. This is the only way to use mathematical probability to calculate error.

Major Probability Designs

Simple Random Sampling (SRS): Every individual has an equal probability of selection. It is the “purest” form but often hard to implement in large populations.

Stratified Sampling: Divide the population into groups (strata) that share a characteristic (e.g., Gender, Income level). Sample randomly within each group.

Benefit: Ensures representation of minority groups and reduces overall variance.

Cluster Sampling: Divide the population into “clusters” (usually geographic, like city blocks). Randomly select entire clusters and survey everyone within them.

Benefit: Much cheaper and logistically easier for large-scale field studies.

Systematic Sampling: Select a starting point at random, then pick every $k$-th element (e.g., every 10th person in a line).
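As a rough illustration of how these designs differ in code, here is a sketch in base R; the frame of 10,000 people, the `region` strata, and the income values are all hypothetical:

```r
set.seed(1)
# A hypothetical sampling frame with a geographic stratum and an income value.
frame <- data.frame(
  id     = 1:10000,
  region = sample(c("North", "South", "East", "West"), 10000, replace = TRUE),
  income = rlnorm(10000, meanlog = 10, sdlog = 0.5)
)

# Simple random sampling: every row has an equal chance of selection.
srs <- frame[sample(nrow(frame), 200), ]

# Systematic sampling: random start, then every k-th row.
k     <- nrow(frame) %/% 200
start <- sample(k, 1)
sys   <- frame[seq(start, nrow(frame), by = k), ]

# Stratified sampling: 50 rows drawn at random within each region.
strat <- do.call(rbind, lapply(split(frame, frame$region),
                               function(s) s[sample(nrow(s), 50), ]))
```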

3. The Mathematical Pillars

Sampling theory rests on two foundational mathematical theorems.

A. The Law of Large Numbers (LLN)

The LLN states that as your sample size ($n$) increases, the sample mean ($\bar{x}$) gets closer and closer to the population mean ($\mu$).

Insight: Larger samples provide more stable and accurate estimates.
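A quick way to see the LLN is to watch a running mean settle down. This sketch uses die rolls, whose true mean is 3.5:

```r
# The running mean of fair die rolls drifts toward the true mean of 3.5.
set.seed(7)
rolls        <- sample(1:6, 10000, replace = TRUE)
running_mean <- cumsum(rolls) / seq_along(rolls)

running_mean[c(10, 100, 1000, 10000)]  # closer and closer to 3.5
```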

B. The Central Limit Theorem (CLT)

This is the “Magic” of statistics. It states that for a sufficiently large sample size, the distribution of the sample means across repeated samples will follow a Normal Distribution (a Bell Curve), even if the underlying population is not normal.

Significance: It allows us to calculate “Confidence Intervals” and “P-values” because we know the mathematical properties of the Normal Distribution.
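The following sketch makes the CLT visible by drawing repeated samples from a deliberately skewed (exponential) population; the sample size of 30 and the 5,000 repetitions are arbitrary choices for illustration:

```r
# Means of many samples from a skewed population look approximately normal.
set.seed(7)
sample_means <- replicate(5000, mean(rexp(30, rate = 1)))

hist(sample_means, breaks = 50,
     main = "Means of 5,000 samples (n = 30) from a skewed population")

# The spread of the means also matches the CLT prediction sigma / sqrt(n);
# for an exponential with rate 1, sigma = 1.
c(simulated = sd(sample_means), predicted = 1 / sqrt(30))
```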

4. Sampling Distribution and Standard Error

If you take 1,000 different samples of 100 people, you will get 1,000 different means. The distribution of these means is called the Sampling Distribution.

The Standard Error (SE)

The Standard Error measures how much the sample mean is likely to vary from the true population mean.

$$SE = \frac{\sigma}{\sqrt{n}}$$

Where:

$\sigma$ = Population Standard Deviation

$n$ = Sample Size

Crucial Lesson: To cut your error in half, you must quadruple your sample size (because of the square root).
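Both claims, the SE formula itself and the quadrupling rule, can be checked by simulation. In this sketch the population (normal with $\sigma = 15$ and mean 100) is an arbitrary choice:

```r
# Analytic SE versus the simulated spread of sample means.
set.seed(99)
sigma  <- 15
se_sim <- function(n) sd(replicate(5000, mean(rnorm(n, mean = 100, sd = sigma))))

c(analytic = sigma / sqrt(100), simulated = se_sim(100))  # n = 100
c(analytic = sigma / sqrt(400), simulated = se_sim(400))  # n quadrupled: SE halves
```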

5. Properties of a Good Estimator

In theory, we evaluate a sampling method based on two criteria:

Unbiasedness (Accuracy): Does the average of all possible sample means equal the true population mean? If so, the estimator is unbiased.

Efficiency (Precision): How “tight” is the sampling distribution? A more efficient estimator has a smaller Standard Error.

Analogy: Think of a target. Unbiasedness is hitting the bullseye on average. Efficiency is having a tight cluster of arrows.
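One way to see both criteria at once is to compare two estimators of the center of the same normal population; this mean-versus-median comparison is our choice of example, not part of the lecture itself:

```r
# Mean and median are both roughly unbiased for the center of a normal
# population, but the mean has the smaller SE, i.e. it is more efficient.
set.seed(3)
draws_mean   <- replicate(5000, mean(rnorm(100, mean = 50, sd = 10)))
draws_median <- replicate(5000, median(rnorm(100, mean = 50, sd = 10)))

c(bias_mean = mean(draws_mean) - 50, bias_median = mean(draws_median) - 50)
c(se_mean = sd(draws_mean), se_median = sd(draws_median))  # median SE ~25% larger
```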

6. Sources of Error

In sampling, “Error” doesn’t mean a mistake was made; it refers to the gap between the sample and the population.

A. Sampling Error

The natural difference between a sample and the population. It is unavoidable but measurable (using the Standard Error formula).

B. Non-Sampling Error (The “Real” Danger)

These are systematic flaws that math cannot easily fix:

Selection Bias: The sampling frame excludes certain groups (e.g., an online survey excludes people without internet).

Non-Response Bias: People who refuse to answer may have different opinions than those who do.

Measurement Error: Poorly worded questions or dishonest answers.
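Selection bias in particular is easy to simulate. In this hypothetical sketch, 20% of the population is offline and holds systematically different opinions, so an online-only frame stays biased no matter how large the sample gets:

```r
# Selection bias: the frame excludes the offline 20% (all numbers hypothetical).
set.seed(5)
has_internet <- runif(1e5) < 0.8
opinion      <- ifelse(has_internet, rnorm(1e5, 60, 10), rnorm(1e5, 40, 10))

true_mean   <- mean(opinion)
biased_mean <- mean(sample(opinion[has_internet], 1000))  # online-only frame

c(true = true_mean, online_survey = biased_mean)  # the gap does not shrink with n
```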

7. The Finite Population Correction (FPC)

Standard sampling theory assumes the population is “infinite” or that we “sample with replacement.” However, if you sample 50% of a small population (e.g., 500 people out of 1,000), your estimate becomes much more accurate than the standard formula suggests.

Formula:

$$SE_{\text{adj}} = SE \times \sqrt{\frac{N - n}{N - 1}}$$
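Plugging in the 500-out-of-1,000 example from above (with a hypothetical $\sigma = 10$) shows how large the correction can be:

```r
# FPC for sampling half of a small population (sigma is hypothetical).
N <- 1000; n <- 500; sigma <- 10

se_naive <- sigma / sqrt(n)
se_adj   <- se_naive * sqrt((N - n) / (N - 1))

c(naive = se_naive, adjusted = se_adj)  # roughly 29% smaller after correction
```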

8. Summary

We sample because it is faster and cheaper.

We use randomization to eliminate selection bias.

The Central Limit Theorem allows us to use the Bell Curve to estimate how “wrong” our sample might be.

Sample size is the primary tool for controlling the margin of error.

Lab Exercise Preview (R)

In the practical session, we will:

Create a “God-mode” population where we know the true mean.

Take repeated random samples to “discover” the Central Limit Theorem.

Calculate the Standard Error and compare it to the simulated variation.

Experiment with Stratified sampling to see how it reduces error compared to Simple Random Sampling.