Surveys are frequently used to measure political behavior such as
voter turnout, but some researchers are concerned about the accuracy of
self-reports. In particular, they worry about possible social
desirability bias: in post-election surveys, respondents who did not
vote may falsely report having voted because they feel that they
should have. Is such a bias present in the American
National Election Studies (ANES)? The ANES is a nation-wide survey that
has been conducted for every election since 1948. The ANES conducts
face-to-face interviews with a nationally representative sample of
adults. The table below displays the names and descriptions of variables
in the turnout.csv data file.
| Name | Description |
|---|---|
| year | Election year |
| VEP | Voting Eligible Population (in thousands) |
| VAP | Voting Age Population (in thousands) |
| total | Total ballots cast for highest office (in thousands) |
| ANES | Turnout estimated from the American National Election Studies (in percentages) |
| felons | Total ineligible felons (in thousands) |
| noncit | Total non-citizens (in thousands) |
| overseas | Total eligible overseas voters (in thousands) |
| osvoters | Total ballots counted by overseas voters (in thousands) |
turnout <- read.csv("C:/Users/Mr Laptop/Desktop/QM/turnout.csv")
turnout
## year VEP VAP total ANES felons noncit overseas osvoters
## 1 1980 159635 164445 86515 71 802 5756 1803 NA
## 2 1982 160467 166028 67616 60 960 6641 1982 NA
## 3 1984 167702 173995 92653 74 1165 7482 2361 NA
## 4 1986 170396 177922 64991 53 1367 8362 2216 NA
## 5 1988 173579 181955 91595 70 1594 9280 2257 NA
## 6 1990 176629 186159 67859 47 1901 10239 2659 NA
## 7 1992 179656 190778 104405 75 2183 11447 2418 NA
## 8 1994 182623 195258 75106 56 2441 12497 2229 NA
## 9 1996 186347 200016 96263 73 2586 13601 2499 NA
## 10 1998 190420 205313 72537 52 2920 14988 2937 NA
## 11 2000 194331 210623 105375 73 3083 16218 2937 NA
## 12 2002 198382 215462 78382 62 3168 17237 3308 NA
## 13 2004 203483 220336 122295 77 3158 18068 3862 NA
## 14 2008 213314 230872 131304 78 3145 19392 4972 263
library(qss)
data("turnout")
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.1 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(qsslearnr)
Load the data into R and check the dimensions of the data. Also, obtain a summary of the data. How many observations are there? What is the range of years covered in this data set?
dim(turnout)
## [1] 14 9
summary(turnout)
## year VEP VAP total
## Min. :1980 Min. :159635 Min. :164445 Min. : 64991
## 1st Qu.:1986 1st Qu.:171192 1st Qu.:178930 1st Qu.: 73179
## Median :1993 Median :181140 Median :193018 Median : 89055
## Mean :1993 Mean :182640 Mean :194226 Mean : 89778
## 3rd Qu.:2000 3rd Qu.:193353 3rd Qu.:209296 3rd Qu.:102370
## Max. :2008 Max. :213314 Max. :230872 Max. :131304
##
## ANES felons noncit overseas osvoters
## Min. :47.00 Min. : 802 Min. : 5756 Min. :1803 Min. :263
## 1st Qu.:57.00 1st Qu.:1424 1st Qu.: 8592 1st Qu.:2236 1st Qu.:263
## Median :70.50 Median :2312 Median :11972 Median :2458 Median :263
## Mean :65.79 Mean :2177 Mean :12229 Mean :2746 Mean :263
## 3rd Qu.:73.75 3rd Qu.:3042 3rd Qu.:15910 3rd Qu.:2937 3rd Qu.:263
## Max. :78.00 Max. :3168 Max. :19392 Max. :4972 Max. :263
## NA's :13
Answer: There are 14 observations (one per election) and 9 variables. The data cover elections from 1980 to 2008.
Calculate the turnout rate based on the voting age population or VAP. Note that for this data set, we must add the total number of eligible overseas voters since the VAP variable does not include these individuals in the count. Next, calculate the turnout rate using the voting eligible population or VEP. What difference do you observe?
TurnoutRate1 <- turnout$total/(turnout$VAP+turnout$overseas)*100
TurnoutRate1
## [1] 52.03972 40.24522 52.53748 36.07845 49.72260 35.93884 54.04097 38.03086
## [9] 47.53376 34.83169 49.34211 35.82850 54.54777 55.67409
TurnoutRate2 <- turnout$total/turnout$VEP*100
TurnoutRate2
## [1] 54.19551 42.13701 55.24860 38.14115 52.76848 38.41895 58.11384 41.12625
## [9] 51.65793 38.09316 54.22449 39.51064 60.10084 61.55433
data.frame (TurnoutRate1,TurnoutRate2)
## TurnoutRate1 TurnoutRate2
## 1 52.03972 54.19551
## 2 40.24522 42.13701
## 3 52.53748 55.24860
## 4 36.07845 38.14115
## 5 49.72260 52.76848
## 6 35.93884 38.41895
## 7 54.04097 58.11384
## 8 38.03086 41.12625
## 9 47.53376 51.65793
## 10 34.83169 38.09316
## 11 49.34211 54.22449
## 12 35.82850 39.51064
## 13 54.54777 60.10084
## 14 55.67409 61.55433
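Answer: The turnout rates calculated using the VAP are lower than those calculated using the VEP in every election, because the VAP denominator includes people who are not eligible to vote, such as non-citizens and ineligible felons. The gap grows from about 2 percentage points in the early 1980s to almost 6 points in 2008.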
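The direction and size of the gap can be checked directly. The sketch below re-enters the relevant columns by hand from the printed data, so it does not depend on the qss package, and subtracts the VAP-based rate from the VEP-based rate. The names `d`, `vap_rate`, `vep_rate`, and `gap` are illustrative and not part of the original exercise.

```r
# Columns copied from the printed turnout data (counts in thousands)
d <- data.frame(
  year     = c(1980, 1982, 1984, 1986, 1988, 1990, 1992,
               1994, 1996, 1998, 2000, 2002, 2004, 2008),
  VEP      = c(159635, 160467, 167702, 170396, 173579, 176629, 179656,
               182623, 186347, 190420, 194331, 198382, 203483, 213314),
  VAP      = c(164445, 166028, 173995, 177922, 181955, 186159, 190778,
               195258, 200016, 205313, 210623, 215462, 220336, 230872),
  total    = c(86515, 67616, 92653, 64991, 91595, 67859, 104405,
               75106, 96263, 72537, 105375, 78382, 122295, 131304),
  overseas = c(1803, 1982, 2361, 2216, 2257, 2659, 2418,
               2229, 2499, 2937, 2937, 3308, 3862, 4972)
)

vap_rate <- d$total / (d$VAP + d$overseas) * 100  # same formula as TurnoutRate1
vep_rate <- d$total / d$VEP * 100                 # same formula as TurnoutRate2
gap <- vep_rate - vap_rate                        # positive when the VEP rate is higher

all(gap > 0)          # is the VEP rate higher in every election?
round(range(gap), 2)  # smallest and largest gap, in percentage points
```

A positive `gap` in every row means the VEP-based rate is always the higher of the two.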
Compute the difference between VAP and ANES estimates of turnout rate. How big is the difference on average? What is the range of the difference? Conduct the same comparison for the VEP and ANES estimates of voter turnout. Briefly comment on the results.
differenceVAP <- (turnout$ANES-TurnoutRate1)
differenceVAP
## [1] 18.96028 19.75478 21.46252 16.92155 20.27740 11.06116 20.95903 17.96914
## [9] 25.46624 17.16831 23.65789 26.17150 22.45223 22.32591
mean(differenceVAP)
## [1] 20.32914
range(differenceVAP)
## [1] 11.06116 26.17150
differenceVEP <- (turnout$ANES-TurnoutRate2)
differenceVEP
## [1] 16.804491 17.862987 18.751404 14.858846 17.231520 8.581054 16.886160
## [8] 14.873745 21.342072 13.906838 18.775507 22.489359 16.899156 16.445672
mean(differenceVEP)
## [1] 16.83634
range(differenceVEP)
## [1] 8.581054 22.489359
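Answer: The ANES estimate exceeds the official turnout rate in every election. On average, the gap is larger against the VAP rate (20.3 percentage points) than against the VEP rate (16.8 points). The gap also spans a slightly wider range for the VAP (11.1 to 26.2 points) than for the VEP (8.6 to 22.5 points).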
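The two comparisons can also be placed side by side. This is a minimal base R sketch that re-enters the needed columns from the printed data; `summarise_diff` and `bias_summary` are hypothetical helper names introduced only for this illustration.

```r
# Columns copied from the printed turnout data (counts in thousands, ANES in percent)
d <- data.frame(
  VEP      = c(159635, 160467, 167702, 170396, 173579, 176629, 179656,
               182623, 186347, 190420, 194331, 198382, 203483, 213314),
  VAP      = c(164445, 166028, 173995, 177922, 181955, 186159, 190778,
               195258, 200016, 205313, 210623, 215462, 220336, 230872),
  total    = c(86515, 67616, 92653, 64991, 91595, 67859, 104405,
               75106, 96263, 72537, 105375, 78382, 122295, 131304),
  overseas = c(1803, 1982, 2361, 2216, 2257, 2659, 2418,
               2229, 2499, 2937, 2937, 3308, 3862, 4972),
  ANES     = c(71, 60, 74, 53, 70, 47, 75, 56, 73, 52, 73, 62, 77, 78)
)

# ANES estimate minus the official rate, in percentage points
diff_vap <- d$ANES - d$total / (d$VAP + d$overseas) * 100
diff_vep <- d$ANES - d$total / d$VEP * 100

# Mean, minimum, maximum, and width of the range for one vector of differences
summarise_diff <- function(x) {
  c(mean = mean(x), min = min(x), max = max(x), width = diff(range(x)))
}

# One row per benchmark
bias_summary <- rbind(VAP = summarise_diff(diff_vap),
                      VEP = summarise_diff(diff_vep))
round(bias_summary, 2)
```

The `width` column makes the comparison of ranges explicit instead of leaving it to be read off the two `range()` outputs.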
Compare the VEP turnout rate with the ANES turnout rate separately for presidential elections and midterm elections. Note that the data set excludes the year 2006. Does the bias of the ANES vary across election types?
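turnout <- turnout %>%
  mutate(type = if_else(year %in% c(1980, 1984, 1988, 1992, 1996, 2000, 2004, 2008),
                        'pres',
                        'midterm'),
         # store the differences as columns so that group_by() can split them by type
         differenceVAP = ANES - total / (VAP + overseas) * 100,
         differenceVEP = ANES - total / VEP * 100)
turnout %>% group_by(type) %>% summarise(differenceVAP = mean(differenceVAP), differenceVEP = mean(differenceVEP))
## # A tibble: 2 × 3
## type differenceVAP differenceVEP
## <chr> <dbl> <dbl>
## 1 midterm 18.2 15.4
## 2 pres 21.9 17.9
Answer: The bias of the ANES is greater in presidential elections. Relative to the VEP turnout rate, the ANES overstates turnout by 17.9 percentage points on average in presidential elections, compared with 15.4 points in midterm elections.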
Divide the data into half by election years such that you subset the data into two periods. Calculate the difference between the VEP turnout rate and the ANES turnout rate separately for each year within each period. Has the bias of the ANES increased over time?
FirstPeriod <- turnout %>% slice(1:7)
SecondPeriod <- turnout %>% slice(8:14)
(FirstPeriod$total/FirstPeriod$VEP*100) - FirstPeriod$ANES
## [1] -16.804491 -17.862987 -18.751404 -14.858846 -17.231520 -8.581054 -16.886160
(SecondPeriod$total/SecondPeriod$VEP*100) - SecondPeriod$ANES
## [1] -14.87375 -21.34207 -13.90684 -18.77551 -22.48936 -16.89916 -16.44567
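Answer: The bias of the ANES appears to have increased modestly over time. In absolute value, the average gap between the VEP and ANES turnout rates is about 15.9 percentage points in the first period (1980 to 1992) and about 17.8 points in the second (1994 to 2008). The gap does not range more widely in the second period; the first period has the wider range because of the unusually small gap in 1990.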
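The per-period averages and ranges can be computed instead of read off the printed vectors. The sketch below uses base R `tapply()` on values re-entered from the printed data. It defines the bias as ANES minus the VEP rate, so the values are positive, which is the opposite sign to the output above. The period labels and the names `bias`, `period`, `period_mean`, and `period_width` are illustrative.

```r
# Values copied from the printed turnout data (counts in thousands, ANES in percent)
year  <- c(1980, 1982, 1984, 1986, 1988, 1990, 1992,
           1994, 1996, 1998, 2000, 2002, 2004, 2008)
VEP   <- c(159635, 160467, 167702, 170396, 173579, 176629, 179656,
           182623, 186347, 190420, 194331, 198382, 203483, 213314)
total <- c(86515, 67616, 92653, 64991, 91595, 67859, 104405,
           75106, 96263, 72537, 105375, 78382, 122295, 131304)
ANES  <- c(71, 60, 74, 53, 70, 47, 75, 56, 73, 52, 73, 62, 77, 78)

# Positive values mean the ANES overstates turnout relative to the VEP rate
bias <- ANES - total / VEP * 100

# Same split as slice(1:7) and slice(8:14): first seven elections, last seven
period <- ifelse(seq_along(year) <= 7, "1980-1992", "1994-2008")

period_mean  <- tapply(bias, period, mean)                        # average bias
period_width <- tapply(bias, period, function(x) diff(range(x)))  # max minus min

round(period_mean, 2)
round(period_width, 2)
```

Comparing `period_mean` across the two labels addresses whether the bias grew, while `period_width` addresses whether it became more variable.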
The ANES does not interview overseas voters and prisoners. Calculate an adjustment to the 2008 VAP turnout rate. Begin by subtracting the total number of ineligible felons and non-citizens from the VAP to calculate an adjusted VAP. Next, calculate an adjusted VAP turnout rate, taking care to subtract the number of overseas ballots counted from the total ballots in 2008. Compare the adjusted VAP turnout with the unadjusted VAP, VEP, and the ANES turnout rate. Briefly discuss the results.
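# Adjusted VAP: remove ineligible felons and non-citizens
AdjustedVAP <- turnout$VAP - turnout$felons - turnout$noncit
# Overseas ballots (osvoters) are not subtracted from the total here
AdjustedTurnout <- turnout$total / AdjustedVAP * 100
turnout$osvoters[14] <- 263  # overseas ballots counted in 2008 (same as the value in the data)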
AdjustedTurnout
## [1] 54.79552 42.67959 56.03515 38.64073 53.53897 38.99517 58.93660 41.65151
## [9] 52.36551 38.70601 55.07730 40.18415 61.42082 63.02542
UnadjustedTurnout <- (turnout$total/turnout$VAP)*100
UnadjustedTurnout
## [1] 52.61030 40.72566 53.25038 36.52780 50.33937 36.45217 54.72591 38.46501
## [9] 48.12765 35.32996 50.03015 36.37857 55.50387 56.87307
VEPTurnout <- (turnout$total/turnout$VEP)*100
VEPTurnout
## [1] 54.19551 42.13701 55.24860 38.14115 52.76848 38.41895 58.11384 41.12625
## [9] 51.65793 38.09316 54.22449 39.51064 60.10084 61.55433
# The ANES variable is already a turnout rate in percent, so it is used as is
ANESTurnout <- turnout$ANES
ANESTurnout
## [1] 71 60 74 53 70 47 75 56 73 52 73 62 77 78
data.frame (AdjustedTurnout, UnadjustedTurnout, VEPTurnout, turnout$ANES)
## AdjustedTurnout UnadjustedTurnout VEPTurnout turnout.ANES
## 1 54.79552 52.61030 54.19551 71
## 2 42.67959 40.72566 42.13701 60
## 3 56.03515 53.25038 55.24860 74
## 4 38.64073 36.52780 38.14115 53
## 5 53.53897 50.33937 52.76848 70
## 6 38.99517 36.45217 38.41895 47
## 7 58.93660 54.72591 58.11384 75
## 8 41.65151 38.46501 41.12625 56
## 9 52.36551 48.12765 51.65793 73
## 10 38.70601 35.32996 38.09316 52
## 11 55.07730 50.03015 54.22449 73
## 12 40.18415 36.37857 39.51064 62
## 13 61.42082 55.50387 60.10084 77
## 14 63.02542 56.87307 61.55433 78
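Answer: The adjusted VAP turnout rate is close to the VEP turnout rate (63.0% versus 61.6% in 2008) and closer to the ANES estimate (78%) than the unadjusted VAP rate (56.9%). Even after the adjustment, the ANES estimate remains about 15 percentage points higher. I was not able to subtract the overseas ballots cast in 2008, so the 2008 adjusted rate shown here is slightly too high.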
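For reference, one way the 2008 overseas ballots could be subtracted is sketched here. It works on the 2008 row only, because `osvoters` is missing for every other year, and it re-enters that row by hand from the printed data. The names `adj_vap`, `adj_rate`, and `rates_2008` are illustrative.

```r
# 2008 row of the printed data (all counts in thousands)
VAP      <- 230872
VEP      <- 213314
total    <- 131304
felons   <- 3145
noncit   <- 19392
osvoters <- 263
ANES     <- 78  # ANES turnout estimate, already a percentage

# Denominator: drop ineligible felons and non-citizens from the VAP
adj_vap <- VAP - felons - noncit

# Numerator: drop ballots counted from overseas voters
adj_rate <- (total - osvoters) / adj_vap * 100

# Side-by-side comparison for 2008, in percent
rates_2008 <- c(adjusted_VAP   = adj_rate,
                unadjusted_VAP = total / VAP * 100,
                VEP            = total / VEP * 100,
                ANES           = ANES)
round(rates_2008, 2)
```

Because only 263 thousand overseas ballots are removed from a total of more than 131 million, the adjustment to the numerator lowers the rate by a little over a tenth of a percentage point.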