Survey Weights in Social Research

Yuxin Zhang yuxin.zhang@unitn.it

Research Design Lab,
01-02 April 2026

What is a survey weight?

A survey weight is a numerical value assigned to each observation
It tells us how much each observation should contribute to representing population parameters
In other words, a survey weight adjusts the sample to better reflect a population, using prior knowledge about that population

Common types of survey weights

1. Design weight

Used to adjust for known differences in the probability of being selected
\(p_i\) = probability of selection of unit \(i\)
\(d_i = \frac{1}{p_i}\) (sampling or design weight)
Scenario 1: Equal selection probability for all units
- if \(p_i = \frac{1}{10}\), then \(d_i = 10\)
- population size = 50, sample size = 5
- Each sampled individual represents 10 individuals in the population
Scenario 2: Unequal selection probability across groups
- if group A: \(p_A = \frac{1}{2}\), \(d_A = 2\), group B: \(p_B = \frac{1}{8}\), \(d_B = 8\)
- each sampled individual in group A represents 2 individuals in the population
- each sampled individual in group B represents 8 individuals in the population
Example: In a random address-based survey where one person is interviewed per sampled address, individuals living alone are more likely to be sampled than those living with others
- within a sampled household, \(p_i = 1\) for single-person households and \(p_i = \frac{1}{n}\) for households with \(n\) members
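Scenario 2 can be sketched in R. This is a minimal illustration, not part of the original exercises: the data frame `sample_df` and its six sampled units are made up, and only the selection probabilities come from the scenario above.

```r
# Hypothetical sample: 4 units drawn from group A, 2 from group B
sample_df <- data.frame(
  id    = 1:6,
  group = c("A", "A", "A", "A", "B", "B"),
  p     = c(1/2, 1/2, 1/2, 1/2, 1/8, 1/8)  # selection probabilities from Scenario 2
)

# Design weight: inverse of the selection probability
sample_df$d <- 1 / sample_df$p

# Sum of design weights per group = number of population units represented
pop_by_group <- tapply(sample_df$d, sample_df$group, sum)
pop_by_group
```

Summing the design weights within a group estimates how many population units that group's sampled members stand for.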

Then, not everyone sampled will actually take part in a survey (missing contact information/nonresponse/rejection/…)

And this can vary systematically across groups in a population


2. Nonresponse weight

Used to adjust for known differences in the probability of responding
People with different characteristics are not equally likely to respond to the survey
E.g., by gender/age/education/income/ethnicity…
- \(\theta_i\) = response probability of unit \(i\)
- \(r_i = \frac{1}{\theta_i}\) (nonresponse weight)

More than one weight?

In analysis, one weight per observation is applied
If several adjustments are needed, they are multiplied into a single final weight
E.g., combining design and nonresponse weights: \(w_i = d_i \cdot r_i\)
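As a rough sketch, the multiplication can be done column-wise in R. The data frame `resp_df` and the response probabilities in `theta` are hypothetical values chosen for illustration; in practice \(\theta_i\) is usually estimated, for example from response rates within groups.

```r
# Hypothetical respondents with known selection and response probabilities
resp_df <- data.frame(
  id    = 1:4,
  p     = c(1/2, 1/2, 1/8, 1/8),  # selection probability
  theta = c(0.8, 0.5, 0.8, 0.5)   # assumed response probability
)

resp_df$d <- 1 / resp_df$p        # design weight
resp_df$r <- 1 / resp_df$theta    # nonresponse weight
resp_df$w <- resp_df$d * resp_df$r  # single final weight per observation

resp_df
```

Only the final column `w` would be passed to the analysis.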

3. Post-stratification / Calibration weight

In stratified sampling, the sample design requires strata to be defined in advance.

Therefore, strata should be constructed based on known population characteristics using external sources (e.g., information from the census).


Exercise 1

Population: 50% men, 50% women

Sample: 30% men, 70% women

Which group should get higher weights?

Men get higher weight, women get lower weight

Why?

Weighting corrects unequal representation in the sample: men are underrepresented, women are overrepresented
- underrepresented individuals → large weights
- overrepresented individuals → small weights

Weighted estimator

The unweighted mean is defined as:

\[ \bar{y} = \frac{\sum_{i=1}^{n} y_i}{n} \]

The weighted mean is defined as:

\[ \bar{y}_w = \frac{\sum_{i=1}^{n} w_i y_i}{\sum_{i=1}^{n} w_i} \]

\(y_i\): observed value
\(w_i\): weight assigned to observation \(i\)

Exercise 2: Calculating weighted average

Suppose the true population and your sample differ:

Group	Population count (\(N_g\))	Sample count (\(n_g\))	Score (\(y_g\))
Men	50	30	6
Women	50	70	8

Task

Compute the weight for each group: \(w_g = \frac{\%\,\text{population}_g}{\%\,\text{sample}_g}\)
Assign the weight to each observation (as weights live at the observation level!)
Compute the unweighted and weighted mean scores, and compare them

Hint: 
Men are underrepresented → weight should be > 1  
Women are overrepresented → weight should be < 1

Solution 2

Group	Population count (\(N_g\))	Sample count (\(n_g\))	Score (\(y_g\))	Weight (\(w_g\))
Men	50	30	6	1.67
Women	50	70	8	0.71

By steps:

Step 1: Compute weights
- \(w_{men} = \frac{0.5}{0.3} \approx 1.67\)
- \(w_{women} = \frac{0.5}{0.7} \approx 0.71\)
Step 2: Compute means
- Unweighted average score:
- \(\bar{y} = \frac{(30 \cdot 6) + (70 \cdot 8)}{30 + 70} = \frac{180 + 560}{100} = 7.4\)
- Weighted average score:
- \(\bar{y}_w = \frac{(1.67 \cdot 30 \cdot 6) + (0.71 \cdot 70 \cdot 8)}{(1.67 \cdot 30) + (0.71 \cdot 70)} \approx 7.0\)
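The same steps can be reproduced in R. This is a minimal sketch that expands the group table into one row per respondent; the object names (`survey_df`, `pop_share`, `samp_share`, `w_g`) are chosen here for illustration.

```r
# One row per sampled person: 30 men scoring 6, 70 women scoring 8
survey_df <- data.frame(
  group = rep(c("Men", "Women"), times = c(30, 70)),
  score = rep(c(6, 8), times = c(30, 70))
)

# Step 1: group weight = population share / sample share
pop_share  <- c(Men = 0.5, Women = 0.5)
samp_share <- prop.table(table(survey_df$group))
w_g <- pop_share / as.numeric(samp_share[names(pop_share)])

# Assign the group weight to every observation in that group
survey_df$weight <- unname(w_g[as.character(survey_df$group)])

# Step 2: compare unweighted and weighted means
unweighted <- mean(survey_df$score)
weighted   <- weighted.mean(survey_df$score, w = survey_df$weight)
c(unweighted = unweighted, weighted = weighted)
```

Because the weights are not rounded to two decimals here, the weighted mean comes out as exactly 7 rather than approximately 7.0.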

Exercise 3: Extreme weights

Please calculate unweighted and weighted average scores using:

id	score	weight
1	50	1
2	60	1
3	55	1
4	70	1
5	65	1
6	40	1
7	45	1
8	50	1
9	75	1
10	90	50

Solution 3

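# data from the Exercise 3 table
extreme_df <- data.frame(
  id     = 1:10,
  score  = c(50, 60, 55, 70, 65, 40, 45, 50, 75, 90),
  weight = c(rep(1, 9), 50)
)

mean(extreme_df$score) # unweighted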

## [1] 60

weighted.mean(extreme_df$score, w = extreme_df$weight) # weighted

## [1] 84.91525

Extreme weights can heavily influence overall estimates. So, always inspect the distribution of the weights before using them.
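One possible way to inspect a weight variable in R is sketched below, using the weights from Exercise 3. The two diagnostics (`max_to_median`, `top_share`) are illustrative choices, not a standard prescribed by the source, and no cutoff is implied.

```r
# Weights from Exercise 3: nine observations with weight 1, one with weight 50
w <- c(rep(1, 9), 50)

# Five-number summary plus mean: a large gap between median and max is a warning sign
summary(w)

# How many times larger is the biggest weight than the typical weight?
max_to_median <- max(w) / median(w)

# Share of the total weight carried by the single largest weight
top_share <- max(w) / sum(w)

c(max_to_median = max_to_median, top_share = top_share)
```

Here one observation carries about 85% of the total weight, which explains why the weighted mean moves so close to its score of 90.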

Summary: Tradeoffs of using weights?

Pros:

Can adjust the sample to better reflect the target population, improving the accuracy of estimates
Can ensure that underrepresented or hard-to-reach groups are appropriately reflected in the weighted data

Cons:

Can give disproportionate influence to a small number of atypical observations
Can introduce additional bias if the prior information for weighting is incorrect
Can increase the variance of estimates (see Korn & Graubard (1999) Section 4.3 for details and examples)
- large weights → noisy estimates
- small samples + weights → unstable inference
- A weighted estimate is not always “better”: it may improve population representativeness, but often at the cost of precision.

Weighted standard error

Step 1: Weighted mean: \[ \bar{y}_w = \frac{\sum_{i=1}^n w_i y_i}{\sum_{i=1}^n w_i} \]

Step 2: Weighted variance:
\[ s_w^2 = \frac{\sum_{i=1}^n w_i (y_i - \bar{y}_w)^2}{\left(\sum_{i=1}^n w_i\right) - 1} \]

Step 3: Effective sample size: \[ n_{\text{eff}} = \frac{\left(\sum_{i=1}^n w_i\right)^2}{\sum_{i=1}^n w_i^2} \]

Step 4: Standard error: \[ \mathrm{SE}(\bar{y}_w) = \sqrt{\frac{s_w^2}{n_{\rm eff}}} \]

Note: The conventional standard error can be misleading when weights are highly unequal, because most of the information comes from a few heavily weighted observations. To adjust for this, we divide by the effective sample size \(n_{\text{eff}}\) rather than the nominal sample size \(n\) when computing the standard error.

As the weights become more equal, \(n_{\text{eff}}\) approaches the original sample size \(n\); when the weights are dominated by one or a few observations, \(n_{\text{eff}}\) approaches 1, reflecting the fact that the estimate is based on very limited independent information.

When weights are equal: \(n_{\text{eff}} = n\)

When weights are highly unequal:

\(\sum w_i^2 \uparrow\)

\(n_{\text{eff}} \downarrow\)

\(\mathrm{SE}(\bar{y}_w) \uparrow\)
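The behaviour of \(n_{\text{eff}}\) at the two extremes can be checked with a short R sketch. The helper name `n_eff` and the two weight vectors are illustrative; the second vector reuses the weights from Exercise 3.

```r
# Effective sample size: (sum of weights)^2 / sum of squared weights
n_eff <- function(w) sum(w)^2 / sum(w^2)

w_equal   <- rep(1, 10)        # ten equal weights
w_extreme <- c(rep(1, 9), 50)  # one dominant weight

n_eff_equal   <- n_eff(w_equal)    # equals n = 10
n_eff_extreme <- n_eff(w_extreme)  # about 1.39, close to 1

c(equal = n_eff_equal, extreme = n_eff_extreme)
```

Rescaling all weights by a constant leaves \(n_{\text{eff}}\) unchanged, so only the relative spread of the weights matters.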

Exercise 4.1 (Worked Example): Weighted mean and SE

Consider 5 observations with scores:

id	score	weight
1	50	1
2	80	1
3	55	1
4	70	1
5	65	1

Calculate the weighted mean and the SE of the weighted mean.

Solution 4.1

Weighted mean (all weights = 1, reduces to regular mean): \[ \bar{y}_w = \frac{\sum_{i=1}^n w_i y_i}{\sum_{i=1}^n w_i} = \frac{1\cdot50 + 1\cdot80 + 1\cdot55 + 1\cdot70 + 1\cdot65}{1+1+1+1+1} = 64 \]

Weighted variance: \[ s_w^2 = \frac{\sum_{i=1}^n w_i (y_i - \bar{y}_w)^2}{\left(\sum_{i=1}^n w_i\right) - 1} \]

\[ = \frac{(50-64)^2 + (80-64)^2 + (55-64)^2 + (70-64)^2 + (65-64)^2}{5-1} = \frac{570}{4} = 142.5 \]

Effective sample size: \[ n_{\rm eff} = \frac{(\sum_{i=1}^n w_i)^2}{\sum_{i=1}^n w_i^2} = \frac{(1+1+1+1+1)^2}{1^2+1^2+1^2+1^2+1^2} = \frac{25}{5} = 5 \]

This equals the actual sample size \(n\) because all weights are equal.

SE of weighted mean: \[ \mathrm{SE}(\bar{y}_w) = \sqrt{\frac{s_w^2}{n_{\rm eff}}} = \sqrt{\frac{142.5}{5}} = \sqrt{28.5} \approx 5.34 \]

Exercise 4.2 (Worked Example): Weighted mean and SE

Now consider two extreme weights:

id	score	weight
1	50	100
2	80	100
3	55	1
4	70	1
5	65	1

Calculate the weighted mean and the SE of the weighted mean again.

Solution 4.2

Weighted mean: \[ \bar{y}_w = \frac{\sum_i w_i y_i}{\sum_i w_i} = \frac{100\cdot50 + 100\cdot80 + 1\cdot55 + 1\cdot70 + 1\cdot65}{100+100+1+1+1} \approx 65.0 \]

Weighted variance: \[ s_w^2 = \frac{\sum_{i=1}^n w_i (y_i - \bar{y}_w)^2}{\left(\sum_{i=1}^n w_i\right) - 1} \]

\[ = \frac{100\cdot(50-65)^2 + 100\cdot(80-65)^2 + 1\cdot(55-65)^2 + 1\cdot(70-65)^2 + 1\cdot(65-65)^2}{100+100+1+1+1 - 1} \]

\[ = \frac{45125}{202} \approx 223.4 \]

Effective sample size: \[ n_{\rm eff} = \frac{(\sum_{i=1}^n w_i)^2}{\sum_{i=1}^n w_i^2} = \frac{(100+100+1+1+1)^2}{100^2 + 100^2 + 1^2 + 1^2 + 1^2} = \frac{41209}{20003} \approx 2.06 \]

SE of weighted mean: \[ \mathrm{SE}(\bar{y}_w) = \sqrt{\frac{s_w^2}{n_{\rm eff}}} = \sqrt{\frac{223.4}{2.06}} \approx \sqrt{108.4} \approx 10.4 \]

Compare results with equal vs extreme weights from the examples

Scenario	Weighted Mean \(\bar{y}_w\)	Weighted Variance \(s_w^2\)	Effective n \(n_{\rm eff}\)	SE of mean \(\mathrm{SE}(\bar{y}_w)\)
Equal weights	64	142.5	5	5.34
Extreme weights	65	223.4	2.06	10.4

You can also do the calculation using R or Stata:

# Solution in R 

df <- data.frame(
  y = c(50, 80, 55, 70, 65),
  w1 = c(1,   1,   1, 1, 1),   # equal weights
  w2 = c(100, 100, 1, 1, 1)  # large unequal weights
  )

# create a function
compute_stats <- function(y, w) {
  ybar <- sum(w * y) / sum(w)                 # weighted mean
  sw2 <- sum(w * (y - ybar)^2) / (sum(w) - 1) # weighted variance
  neff <- (sum(w)^2) / sum(w^2)               # effective sample size
  se <- sqrt(sw2 / neff)                      # standard error
  
  return(data.frame(
    mean = ybar,
    weighted_variance = sw2,
    n_eff = neff,
    se = se))
  } 

# 4.1 equal weights
res_equal <- compute_stats(df$y, df$w1)

# 4.2 unequal weights
res_unequal <- compute_stats(df$y, df$w2)

# show results
results <- rbind(
  Equal_Weights   = res_equal,
  Unequal_Weights = res_unequal)
round(results, 3) |> 
  gt::gt()

mean	weighted_variance	n_eff	se
64.000	142.50	5.00	5.339
64.975	223.39	2.06	10.413

Extreme weights pull the mean toward the heavily weighted observations
Even though the weighted variance \(s_w^2\) is moderate, the effective sample size drops drastically, which inflates the SE
This is why highly unequal weights increase uncertainty

Summary: When might it go wrong?

Misconception: “Dataset has weights → must use them”
- E.g., design weights may be unnecessary for descriptive statistics when the analysis is restricted to a homogeneous subgroup.
Extreme weights
- E.g., one respondent with a very large weight represents 1,000 people, disproportionately affecting the estimate.
Weights in regression models? Still debated
- Weights are more crucial when estimating population descriptive statistics and less so when examining the relationship between predictors and the outcome.
- Weights may reduce bias but increase variance.

Poorly specified models → weights may help reduce biases
Well-specified models → weights not needed and they inflate SEs

Summary: When should we apply weights?

Using weights without justifying underlying assumptions is bad practice. So, before computing or using them, ask yourself:

    1. What is my target population?
    2. Is my sample representative of this population?
    3. Do I have enough prior knowledge on the population?
    4. Am I doing description or modeling?
    5. Is selection related to my outcome of interest?

Final note:

Weights are essential for descriptive analysis, particularly when estimating population totals, means, and distributions, as they restore representativeness of the sample.
In regression analysis, however, the role of weights is more conditional and their use depends on the estimand and the data-generating process. Weights are not inherently required for unbiased estimation and may, in some cases, reduce efficiency if not theoretically justified.

And in case…

“It is good practice to report both weighted and unweighted estimates[…] because the contrast serves as a useful joint test against model misspecification and/or misunderstanding of the sampling process” (Solon et al., 2015).
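A minimal sketch of such a side-by-side report in R follows. Everything here is simulated for illustration: the variables `x`, `y`, and the weights `w` are hypothetical, and the model is well specified by construction. Note that `lm()` treats `weights` as precision weights, so its standard errors are not design-based; dedicated survey software (for example the `survey` package) is the usual route when design-based inference is needed.

```r
set.seed(1)

# Simulated data: a correctly specified linear relationship
n <- 200
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)
w <- runif(n, min = 0.5, max = 3)  # hypothetical final weights, unrelated to y

# Fit the same model without and with weights
fit_unweighted <- lm(y ~ x)
fit_weighted   <- lm(y ~ x, weights = w)

# Report both sets of coefficients side by side
comparison <- cbind(
  unweighted = coef(fit_unweighted),
  weighted   = coef(fit_weighted)
)
round(comparison, 3)
```

In this simulated setting the two columns should be close; a large gap in real data would be a prompt to revisit the model or the understanding of the sampling process.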
