Hi all–
Great lab! A wide variety of approaches were observed. I hope your groups’ semi-futile efforts banging away at dplyr syntax were at least somewhat pedagogically constructive.
I promise, we’ll spend more time in a future lab on the relevant verbs of dplyr.
The first question asks us about this weird ANES survey device, the “feeling thermometer”, which measures a respondent’s affection for different social groups. We want the estimates of warmth by respondents’ ideology (i.e., how did liberals evaluate big business? how did moderates? etc…)
Ok–we’ll start by loading the data
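# load plyr before the tidyverse so plyr does not mask dplyr's verbs
library(plyr)
library(tidyverse)
library(magrittr)

# read the compressed .rds file straight from the course repository
d1 <- "https://github.com/thomasjwood/ps4160/raw/master/anes_timeseries_2020.rds" %>%
  url %>%
  gzcon %>%
  readRDS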
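The gzcon step is there because .rds files are gzip-compressed by default, and readRDS only takes care of decompression itself when it is handed a file name rather than a connection. A minimal sketch of the same pattern, using a local temporary file in place of the URL (the temp file and the toy data frame are stand-ins, not part of the lab data):

```r
# Write a small data frame to a temporary .rds file (gzip-compressed by default)
tmp <- tempfile(fileext = ".rds")
saveRDS(data.frame(x = 1:3), tmp)

# Open a binary connection, wrap it in gzcon() so the stream is decompressed,
# then hand the connection to readRDS()
con <- gzcon(file(tmp, "rb"))
d <- readRDS(con)
close(con)

print(d)
```

With a URL, url() takes the place of file(), which is what the pipe above does.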
We can check the feeling thermometers, whose range is given by the question (V202159:V202187).
d1 %>%
  select(V202159:V202187)
An initial complication is that our feeling thermometers are factors, not numeric (i.e., they’re categorical, and we can’t do math with them). They also have a set of non-substantive answers: indicators that a respondent refused to answer a question, didn’t know, or wasn’t interviewed.
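To see why the factor problem matters, here is a small sketch on a made-up thermometer (the values and the single "-9. Refused" label are chosen for illustration). Calling as.numeric directly on a factor returns the internal level codes, not the numbers printed in the labels, so the conversion has to go through character first:

```r
# Toy thermometer stored as a factor; levels are set explicitly so the
# level codes do not depend on the locale's sort order
therm <- factor(
  c("85", "15", "50", "-9. Refused"),
  levels = c("-9. Refused", "15", "50", "85")
)

# as.numeric() on a factor gives the position of each value among the levels
codes <- as.numeric(therm)

# Converting to character first recovers the numbers; labels that cannot be
# parsed as numbers become NA (the coercion warning is silenced here)
vals <- suppressWarnings(as.numeric(as.character(therm)))

print(codes)
print(vals)
```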
We could do something quite laborious, recoding each variable one at a time
d1$V202159 <- d1$V202159 %>%
  mapvalues(
    c("-9. Refused",
      "-7. No post-election data, deleted due to incomplete interview",
      "-6. No post-election interview",
      "-5. Interview breakoff (sufficient partial IW)",
      "-4. Technical error",
      "998. Don't know",
      "999. Don't recognize"),
    c(NA) %>%
      rep(7)
  ) %>%
  as.character %>%
  as.numeric
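This is saying: take the non-substantive answers and recode them as NA, then convert the factor to character and then to numeric, which gives us a variable that we can do math on. Now, all I do below is take that process and wrap it in a map, which applies the same function to each variable.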
d1[
  ,
  (d1 %>%
     names %>%
     equals("V202159") %>%
     which):
    (d1 %>%
       names %>%
       equals("V202187") %>%
       which)
] %<>%
  map(
    function(i)
      i %>%
      mapvalues(
        c("-9. Refused",
          "-7. No post-election data, deleted due to incomplete interview",
          "-6. No post-election interview",
          "-5. Interview breakoff (sufficient partial IW)",
          "-4. Technical error",
          "998. Don't know",
          "999. Don't recognize"),
        c(NA) %>%
          rep(7)
      ) %>%
      as.character %>%
      as.numeric
  )
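For comparison, the same column-wise recode can be sketched with dplyr's across() on a toy data frame. This is an alternative to the map call above, not the approach used in the lab; the column names ft_a and ft_b, the helper to_numeric_therm, and the shortened list of missing labels are all illustrative assumptions.

```r
library(dplyr)

# Toy data: two thermometer-style factors plus an id column left untouched
toy <- tibble(
  id   = 1:3,
  ft_a = factor(c("70", "-9. Refused", "40")),
  ft_b = factor(c("998. Don't know", "15", "100"))
)

# Hypothetical helper: blank out non-substantive labels, then convert
to_numeric_therm <- function(x) {
  missing_labels <- c("-9. Refused", "998. Don't know")
  x <- as.character(x)
  x[x %in% missing_labels] <- NA
  as.numeric(x)
}

# Apply the helper to every column in the range ft_a:ft_b
toy_num <- toy %>%
  mutate(across(ft_a:ft_b, to_numeric_therm))

print(toy_num)
```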
Almost ready for our table. But we need a nice three-category ideology measure:
d1$ideo_3 <- d1$V201200 %>%
  plyr::mapvalues(
    d1$V201200 %>%
      levels,
    c(NA,
      "liberal",
      "moderate",
      "conservative",
      NA) %>%
      rep(c(2, 2, 3, 2, 1))
  ) %>%
  factor(
    c("liberal",
      "moderate",
      "conservative")
  )
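The rep call is doing the real work in that recode: it stretches five new labels into a vector with one entry per original level, in level order. A short sketch of just that step, with placeholder names (level_1 to level_10) standing in for the actual level labels of the ideology item:

```r
# Ten placeholder names standing in for the original factor levels
old_levels <- paste0("level_", 1:10)

# Repeat each new label the given number of times: 2 + 2 + 3 + 2 + 1 = 10
new_labels <- rep(
  c(NA, "liberal", "moderate", "conservative", NA),
  times = c(2, 2, 3, 2, 1)
)

# Line the two vectors up to audit the mapping before applying it
lookup <- data.frame(old = old_levels, new = new_labels)
print(lookup)
```

Printing the lookup table with the real levels in place of the placeholders is a cheap check that the counts in rep match the order of the levels.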
Now we can do the calculation with dplyr
d1 %>%
  select(
    V200010a, ideo_3, V202159:V202187
  ) %>%
  gather(
    group, eval, -c(V200010a, ideo_3)
  ) %>%
  na.omit %>%
  group_by(ideo_3, group) %>%
  summarize(
    mu = eval %>% weighted.mean(V200010a)
  ) %>%
  slice(
    c(
      mu %>%
        which.max,
      mu %>%
        which.min
    )
  )
Would return
  ideo_3       group         mu
  <fct>        <chr>      <dbl>
1 liberal      Scientists  90.0
2 liberal      NRA         15.8
3 moderate     Scientists  80.8
4 moderate     Socialists  39.7
5 conservative Christians  85.7
6 conservative Socialists  12.7
(with some modest adjustments for labelling the groups…).
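The reshape, weighted mean, and slice steps are easier to follow on data small enough to check by hand. The sketch below uses a made-up four-row data frame (the columns wt, grp_a, and grp_b are hypothetical) and tidyr's pivot_longer in place of gather; the logic is otherwise the same.

```r
library(dplyr)
library(tidyr)

# Toy data: a weight, an ideology label, and two thermometer columns
toy <- tibble(
  wt     = c(1, 3, 2, 2),
  ideo_3 = c("liberal", "liberal", "conservative", "conservative"),
  grp_a  = c(80, 60, 20, 40),
  grp_b  = c(10, 30, 90, 70)
)

res <- toy %>%
  # one row per respondent-by-group evaluation
  pivot_longer(c(grp_a, grp_b), names_to = "group", values_to = "eval") %>%
  group_by(ideo_3, group) %>%
  # weighted mean per ideology-by-group cell; stay grouped by ideo_3
  summarize(mu = weighted.mean(eval, wt), .groups = "drop_last") %>%
  # within each ideology, keep the warmest and the coldest group
  slice(c(which.max(mu), which.min(mu)))

print(res)
```

For liberals, grp_a works out to (80 * 1 + 60 * 3) / 4 = 65, which is easy to verify against the printed result.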
Now we want to test the level of retrospective economic impressions among mid to high education Democrats, during 2020 (that is, when their partisan impulse was in conflict with their receptivity to objective evidence).
First, some demographics–party and education
d1$partyid_3 <- d1$V201231x %>%
  mapvalues(
    d1$V201231x %>%
      levels,
    c(NA,
      "democrat",
      "independent",
      "republican"
    ) %>%
      rep(
        c(2, 2, 3, 2)
      )
  )

d1$educ_3 <- d1$V201511x %>%
  mapvalues(
    d1$V201511x %>%
      levels,
    c(NA,
      "hsd or less",
      "some college",
      "ba or more"
    ) %>%
      rep(
        c(3, 2, 1, 2)
      )
  )
Now, let’s clean up the economic impression variables
d1$econ_natl <- d1$V201327x %>%
  mapvalues(
    d1$V201327x %>%
      levels,
    c(NA,
      "better",
      "same",
      "worse") %>%
      rep(
        c(1, 2, 1, 2)
      )
  ) %>%
  factor(
    c("better",
      "same",
      "worse")
  )

d1$econ <- d1$V201330x %>%
  mapvalues(
    d1$V201330x %>%
      levels,
    c(NA,
      "better",
      "same",
      "worse") %>%
      rep(
        c(1, 2, 1, 2)
      )
  ) %>%
  factor(
    c("better",
      "same",
      "worse")
  )

d1$econ_unemp <- d1$V201333x %>%
  mapvalues(
    d1$V201333x %>%
      levels,
    c(NA,
      "better",
      "same",
      "worse") %>%
      rep(
        c(1, 2, 1, 2)
      )
  )
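Since the three economic recodes share one mapping, they could also be wrapped in a small helper. The sketch below is one way to do that, using base R's levels<- instead of plyr::mapvalues; the function name collapse_econ and the toy level labels are hypothetical, and the helper assumes six levels in the order missing, two "better", one "same", two "worse".

```r
# Hypothetical helper: collapse a six-level item to better / same / worse
collapse_econ <- function(x) {
  stopifnot(is.factor(x), nlevels(x) == 6)
  # assigning repeated labels merges levels; an NA label turns that level into NA
  levels(x) <- rep(c(NA, "better", "same", "worse"), times = c(1, 2, 1, 2))
  factor(x, levels = c("better", "same", "worse"))
}

# Toy factor with made-up labels in the assumed level order
toy <- factor(
  c("much better", "about the same", "much worse", "missing"),
  levels = c("missing", "much better", "somewhat better",
             "about the same", "somewhat worse", "much worse")
)

out <- collapse_econ(toy)
print(out)
```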
Then it’s all the dplyr verbs
d1 %>%
  select(
    V200010b,
    partyid_3,
    educ_3,
    starts_with("econ")
  ) %>%
  filter(
    partyid_3 %>%
      equals("democrat") &
      educ_3 %>%
      equals("hsd or less") %>%
      not
  ) %>%
  gather(
    econ, ans, starts_with("econ")
  ) %>%
  na.omit %>%
  group_by(
    partyid_3, econ, ans
  ) %>%
  tally(
    V200010b
  ) %>%
  mutate(
    perc = n %>%
      divide_by(
        n %>% sum
      ) %>%
      multiply_by(100)
  ) %>%
  select(-n) %>%
  spread(ans, perc) %>%
  arrange(desc(better))
should return
  partyid_3 econ       better  same worse
  <fct>     <chr>       <dbl> <dbl> <dbl>
1 democrat  econ        30.1  36.9   33.0
2 democrat  econ_natl    5.00 14.5   80.5
3 democrat  econ_unemp   3.12  9.10  87.8
Hey, good job, moderate to high education Democrats: you balanced your economic expectations for the coming year against the expectation that your copartisan president would be elected!
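The weighted percentages in that table come from two steps: tally with a weight sums the weights in each question-by-answer cell, and because tally drops the last grouping variable, the mutate that follows divides by the total within each question. A minimal sketch on made-up data (the weight column w and its values are hypothetical):

```r
library(dplyr)

# Toy long-format data: a weight, the question name, and the answer
toy <- tibble(
  w    = c(1, 1, 2, 4, 2),
  econ = c("econ_natl", "econ_natl", "econ_natl", "econ", "econ"),
  ans  = c("better", "worse", "worse", "better", "same")
)

res <- toy %>%
  group_by(econ, ans) %>%
  # n is the sum of weights per cell; the result stays grouped by econ
  tally(wt = w) %>%
  # share of each answer within its question, in percent
  mutate(perc = 100 * n / sum(n))

print(res)
```

Each question's percentages sum to 100, which is a quick sanity check to run on the real output too.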