Example

Women and Love: A Cultural Revolution in Progress, by Shere Hite (1987)

Sample Surveys

Hite’s survey design

Example - Continued

A good sample should reproduce the characteristics of interest in the population, as closely as possible.

We should get answers as accurately as possible.

Survey Sampling

  • Survey Methodology
    • Field: Psychology, Cognitive Science
    • Studies nonsampling error
    • Questionnaire design
  • Sampling Statistics
    • Field: Statistics
    • Studies sampling error
    • Sampling design, estimation

Sir Francis Galton (1822-1911)

Jerzy Neyman (1894-1981)

Morris Hansen (1910-1990)

Why sampling?

Introduction

1. Study Objectives

2. Target population

3. Data Collection

Survey Process

Survey Process: Define precise objectives

Example

Survey Process: Develop Data Collection Protocols

Survey Process: Represent population in a frame

Survey Process: Sample Design and Selection

Survey Process: Collect and prepare data

Survey Process: Data Analysis

  1. Exploratory analysis
    • Check for missing values, outliers, potential errors
    • Examine relationships between survey responses and auxiliary information from external sources
  2. Estimation
    • Compute a “survey weight” that projects the sample onto the larger population
    • Estimation methods are tied to the survey design
  3. Variance estimation
    • Quantify the uncertainty in the estimator
    • Standard error, confidence interval, coefficient of variation
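
As a minimal sketch of the estimation and variance estimation steps above, assume the sample was drawn by simple random sampling without replacement, so every sampled unit carries the same survey weight N/n. The function name `srs_estimates`, the 1.96 normal multiplier, and the toy data are illustrative assumptions and not part of the course material; other designs call for different weights and variance formulas.

```python
import math
import statistics


def srs_estimates(y, N, z=1.96):
    """Estimates under simple random sampling without replacement (assumed design).

    y : list of sampled responses (hypothetical data)
    N : population size
    z : normal multiplier for an approximate 95% confidence interval
    """
    n = len(y)
    weight = N / n                      # each sampled unit represents N/n population units
    total_hat = weight * sum(y)         # weighted estimate of the population total
    mean_hat = total_hat / N            # estimate of the population mean
    s2 = statistics.variance(y)         # sample variance (divides by n - 1)
    fpc = 1 - n / N                     # finite population correction
    se_mean = math.sqrt(fpc * s2 / n)   # standard error of the estimated mean
    ci = (mean_hat - z * se_mean, mean_hat + z * se_mean)
    cv = se_mean / mean_hat             # coefficient of variation (assumes mean_hat != 0)
    return {
        "weight": weight,
        "total_hat": total_hat,
        "mean_hat": mean_hat,
        "se_mean": se_mean,
        "ci": ci,
        "cv": cv,
    }


result = srs_estimates([2, 4, 6, 8], N=100)
```

Here the weight, the point estimate, and the variance estimate all follow from the assumed design, which is the sense in which estimation methods are tied to the survey design.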

Survey Process and Stat 421 emphasis

Stat 421: Sample Design and Estimation

Probability Sampling Designs

Choosing the probability sampling design

Survey Estimation

Broad Syllabus

Part 2: Foundations of Survey Sampling

Survey Design

  • Survey design involves selecting methods to address all phases of the survey process
    • Objectives
    • Sample Design
    • Data Collection
    • Analysis approach
      • Weights
      • Estimation
      • Variance estimation

Population and Sample

Definition

  • Target population
    • The entire set of units for which the survey data are to be used to make inferences.
    • Thus, the target population defines those units for which the findings of the survey are meant to generalize.
  • Survey Population
    • The population from which the sample can be taken.
  • Sampling frame
    • A realized list of the survey population
  • Observational Units (elements)
    • An object on which a measurement is taken; the members of the population

Finite Population

  • The target population contains a FINITE NUMBER of units
    • \(N\) = Total number of elements in the population
  • Differs from notions of a population in other statistics courses
    • Infinite population defined by all possible realizations from a distribution, such as a normal distribution
    • For analysis in those courses, we act as if the population is infinite
  • In Stat 421, we only consider finite populations
    • Population is a finite collection of \(N\) units.
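
A small sketch may help make the finite population idea concrete. It assumes a toy population of five values and uses simple random sampling without replacement as the selection method; the names `population_parameters` and `draw_srs` and the seed are hypothetical, chosen only for illustration. The population total and mean are fixed numbers, and the only randomness comes from which units are selected.

```python
import random


def population_parameters(values):
    """Return N, the population total, and the population mean (all fixed quantities)."""
    N = len(values)
    total = sum(values)
    return N, total, total / N


def draw_srs(values, n, seed=None):
    """Draw a simple random sample of size n without replacement."""
    rng = random.Random(seed)
    return rng.sample(values, n)


population = [3, 7, 1, 9, 5]            # toy finite population, N = 5
N, total, mean = population_parameters(population)
sample = draw_srs(population, n=2, seed=421)
```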

Example

  • Suppose that we are interested in the readership of the Des Moines Register among Iowa adults
  • We decide to estimate the percent of adults (ages 18 or older) residing in Iowa who read the Des Moines Register during the week of Jan 13th-18th, 2020
    • Target population: All adults ages 18 or older residing in Iowa during the week of Jan 13th-18th, 2020.
    • Element: Adult (individual 18 or older)
    • Population size: N = 3.16 million (Census Bureau 2019 estimate)

Target population: Complexities

  • Target populations are often difficult to define
  • Example: Political poll for an election – What population should we target?
    • Registered voters?
    • Voters in the last election?
    • Those “likely to vote” in the next election?

  • Example: 1994 Democratic gubernatorial primary election in Arizona

    • Target population was defined as registered voters who voted in the last election
    • Poll prediction: Eddie Basha would lose by at least 9 percentage points
    • Election outcome: Basha won 37% of the vote; the other candidates won 35% and 28%, respectively.
    • What happened? Misspecification of the target population!
    • Basha had strong support from demographic groups who had not voted before

Element/Observation Unit: Complexities

  • Some surveys have multiple levels of observation units

  • Example: Survey of Namibian households
    • Some measurements are taken at the household level, while other measurements are taken for individuals living in the household.
    • Household level measurement: Does this household have access to clean drinking water?
    • Individual level measurement: For each individual in the household, what is the highest level of education attained?

Sampling Frame

  • Telephone survey: sampling frame may be a list of telephone numbers
  • Face-to-face interview survey: sampling frame may be a list of addresses
  • Agricultural survey: sampling frame may be a map of areas containing farms

Sampling Frame: Complexities

  • Constructing a sampling frame that accurately reflects the target population can be a challenge.

    • Units in the population may be excluded from the frame (this is called the undercoverage problem)
    • Units in the frame may not be in the target population (this is called overcoverage)
  • If the frame does not match the target population, the resulting error is called coverage error.

  • Example: What is the average payroll among Iowa businesses with more than 5 employees in 2020?
    • Frame = list of businesses with more than 5 employees from 2019 tax records
    • New businesses in 2020: In the population but not the frame
    • Businesses that closed in 2020: In the frame but not the population
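
The payroll example can be sketched with set operations. The business names below are invented placeholders, and `coverage_report` is a hypothetical helper name; the point is only that undercoverage is "in the target population but not the frame" and the reverse case is "in the frame but not the target population".

```python
def coverage_report(target_population, frame):
    """Compare a frame with the target population and report coverage problems."""
    target = set(target_population)
    listed = set(frame)
    return {
        "undercoverage": target - listed,   # in the population, missing from the frame
        "overcoverage": listed - target,    # in the frame, not in the population
        "covered": target & listed,         # correctly covered units
    }


# Hypothetical businesses: the frame comes from 2019 records, the target is 2020.
frame_2019 = {"Acme Feed", "Birch Dental", "Cedar Cafe"}
target_2020 = {"Acme Feed", "Birch Dental", "Dune Software"}

report = coverage_report(target_2020, frame_2019)
```

In this toy case "Dune Software" plays the role of a business that opened in 2020, and "Cedar Cafe" one that closed in 2020.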

Sampling Frame Types: List and Area Frames

  • List frames
    • Examples: telephone numbers, addresses
    • Strength: may contain good auxiliary information about the population
    • Weakness: may exclude members of the population
  • Area frames: geographic representation
    • Examples: Map, area divided into parcels or tracts
    • Strength: may completely cover the population
    • Weakness: may have little auxiliary information; may contain ineligible units

List and Area Frame Examples

  • National Crime Victimization Survey
    • What percent of US households were victimized by crime in 2019?
    • Frame: list of households from US Census information and building permits
  • Census Bureau area frame
    • Divides US area into tracts, block groups, and blocks
    • Blocks are clusters of households
    • Block groups are clusters of blocks
    • Tracts are clusters of block groups (and blocks)

Census Tract and Block Groups

Sample

  • Sampling unit (SU): The unit that we actually sample
  • Observational Unit (OU): An object on which a measurement is taken
  • The sampling unit and the observational unit are not necessarily the same (SU = OU need not hold).

  • Example: survey of students at public schools
    • We have a frame of schools, not students
    • Select a sample of schools and interview students in selected schools

      • Sampling unit: school
      • Observation unit: student
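
The school example can be sketched as follows. The frame contents, the helper name `sample_students_via_schools`, and the choice to interview every student in each selected school are assumptions made for illustration; the source only says that schools are sampled and students in selected schools are interviewed.

```python
import random


def sample_students_via_schools(school_frame, n_schools, seed=None):
    """Sample schools (sampling units), then list their students (observation units)."""
    rng = random.Random(seed)
    # The frame lists schools, so selection happens at the school level.
    sampled_schools = rng.sample(sorted(school_frame), n_schools)
    # Measurements are taken on students within the selected schools
    # (here, all of them, which is an assumption).
    students = [
        (school, student)
        for school in sampled_schools
        for student in school_frame[school]
    ]
    return sampled_schools, students


# Hypothetical frame: school name -> enrolled students
school_frame = {
    "School A": ["a1", "a2", "a3"],
    "School B": ["b1", "b2"],
    "School C": ["c1", "c2", "c3", "c4"],
}

schools, students = sample_students_via_schools(school_frame, n_schools=2, seed=1)
```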

Sample

  • Sample: A subset of the survey population
  • Sampled population: Collection of all possible observation units that might have been chosen in a sample
    • Ideally, the sampled population is equal to the target population
    • Why might the sampled population differ from the target population?

Population