Provide code and answer.
Prompt: in the tutorial, we calculated the average trust in others for France and visualized it. Using instead the variable ‘Trust in Parliament’ (trstplt) and the country of Spain (country file provided on course website), visualize the average trust by survey year. You can truncate the y-axis if you wish. Provide appropriate titles and labels given the changes. What are your main takeaways based on the visual (e.g., signs of increase, decrease, or stall)?
packages <- c("tidyverse", "fst", "modelsummary", "viridis")
new_packages <- packages[!(packages %in% installed.packages()[,"Package"])]
if(length(new_packages)) install.packages(new_packages)
lapply(packages, library, character.only = TRUE)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.4.4 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.0
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
## Loading required package: viridisLite
## [[1]]
## [1] "lubridate" "forcats" "stringr" "dplyr" "purrr" "readr"
## [7] "tidyr" "tibble" "ggplot2" "tidyverse" "stats" "graphics"
## [13] "grDevices" "utils" "datasets" "methods" "base"
##
## [[2]]
## [1] "fst" "lubridate" "forcats" "stringr" "dplyr" "purrr"
## [7] "readr" "tidyr" "tibble" "ggplot2" "tidyverse" "stats"
## [13] "graphics" "grDevices" "utils" "datasets" "methods" "base"
##
## [[3]]
## [1] "modelsummary" "fst" "lubridate" "forcats" "stringr"
## [6] "dplyr" "purrr" "readr" "tidyr" "tibble"
## [11] "ggplot2" "tidyverse" "stats" "graphics" "grDevices"
## [16] "utils" "datasets" "methods" "base"
##
## [[4]]
## [1] "viridis" "viridisLite" "modelsummary" "fst" "lubridate"
## [6] "forcats" "stringr" "dplyr" "purrr" "readr"
## [11] "tidyr" "tibble" "ggplot2" "tidyverse" "stats"
## [16] "graphics" "grDevices" "utils" "datasets" "methods"
## [21] "base"
setwd("/Users/kaylapatricia/Desktop/soc222/homework 2")
spain_data <- read.fst("spain_data.fst")
spain_trstplt <- spain_data %>%
  filter(cntry == "ES") %>%
  select(trstplt)
spain_trstplt$y <- spain_trstplt$trstplt
table(spain_trstplt$y)
##
## 0 1 2 3 4 5 6 7 8 9 10 77 88 99
## 5165 1830 2329 2441 2085 2890 1154 639 355 80 71 46 336 31
spain_trstplt$y[spain_trstplt$y %in% 77:99] <- NA
spain_data <- spain_data %>%
  mutate(
    trstplt = ifelse(trstplt %in% c(77, 88, 99), NA, trstplt)
  )
table(spain_data$trstplt)
##
## 0 1 2 3 4 5 6 7 8 9 10
## 5165 1830 2329 2441 2085 2890 1154 639 355 80 71
spain_data$year <- NA
replacements <- c(2002, 2004, 2006, 2008, 2010, 2012, 2014, 2016, 2018, 2020)
for (i in 1:10) {
  spain_data$year[spain_data$essround == i] <- replacements[i]
}
table(spain_data$year)
##
## 2002 2004 2006 2008 2010 2012 2014 2016 2018 2020
## 1729 1663 1876 2576 1885 1889 1925 1958 1668 2283
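The loop above can also be written as a single vectorized lookup, because `essround` takes the values 1 through 10 and can index the vector of years directly. This is only a sketch on a toy data frame: the `toy` object and its values are hypothetical and stand in for `spain_data`.

```r
# Hypothetical toy data standing in for spain_data; only essround matters here.
toy <- data.frame(essround = c(1, 3, 10, 6, NA))

# Survey year for ESS rounds 1 to 10, same vector as in the loop above.
replacements <- c(2002, 2004, 2006, 2008, 2010, 2012, 2014, 2016, 2018, 2020)

# Indexing by round number maps every row at once; an NA round stays NA.
toy$year <- replacements[toy$essround]
toy$year
```

The result should match the loop for any round between 1 and 10; a round outside that range would come back as NA rather than raising an error, so it is worth checking `table(toy$year, useNA = "ifany")` afterwards.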
trstplt_by_year <- spain_data %>%
  group_by(year) %>%
  summarize(mean_trstplt = mean(trstplt, na.rm = TRUE))
trstplt_by_year
## # A tibble: 10 × 2
## year mean_trstplt
## <dbl> <dbl>
## 1 2002 3.41
## 2 2004 3.66
## 3 2006 3.49
## 4 2008 3.32
## 5 2010 2.72
## 6 2012 1.91
## 7 2014 2.23
## 8 2016 2.40
## 9 2018 2.55
## 10 2020 1.94
ggplot(trstplt_by_year, aes(x = year, y = mean_trstplt)) +
  geom_line(color = "blue", linewidth = 1) + # linewidth replaces size for lines (ggplot2 >= 3.4.0)
  geom_point(color = "red", size = 3) +
  labs(title = "Trust in Parliament in Spain (2002-2020)",
       x = "Survey Year",
       y = "Average Trust (0-10 scale)") +
  ylim(0, 10) + # Setting the y-axis limits from 0 to 10
  theme_minimal()
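The prompt allows truncating the y-axis. A minimal sketch of that variant is below, rebuilding the yearly means from the `trstplt_by_year` output above so the block runs on its own. The limits `c(1.5, 4)`, the object name `truncated_plot`, and the note in the title are illustrative choices, not part of the original answer.

```r
library(ggplot2)

# Yearly means copied from the trstplt_by_year output above.
trstplt_by_year <- data.frame(
  year = seq(2002, 2020, by = 2),
  mean_trstplt = c(3.41, 3.66, 3.49, 3.32, 2.72, 1.91, 2.23, 2.40, 2.55, 1.94)
)

truncated_plot <- ggplot(trstplt_by_year, aes(x = year, y = mean_trstplt)) +
  geom_line(color = "blue", linewidth = 1) +
  geom_point(color = "red", size = 3) +
  # Label every survey year instead of ggplot2's default breaks.
  scale_x_continuous(breaks = trstplt_by_year$year) +
  # coord_cartesian() zooms the view without dropping any data points.
  coord_cartesian(ylim = c(1.5, 4)) +
  labs(title = "Trust in Parliament in Spain (2002-2020)",
       subtitle = "Y-axis truncated; the full scale runs from 0 to 10",
       x = "Survey Year",
       y = "Average Trust (0-10 scale)") +
  theme_minimal()

truncated_plot
```

Zooming in makes the year-to-year movement easier to see, at the cost of making the changes look larger than they are on the full 0-10 scale, which is why the subtitle flags the truncation.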
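My main takeaways based on this visual are that average trust in
parliament in Spain has fluctuated but declined overall. It rose from
3.41 in 2002 to a peak of 3.66 in 2004, fell steadily to its lowest
level of 1.91 in 2012, recovered partially to 2.55 by 2018, and dropped
again to 1.94 in 2020. Every survey year from 2008 onward is below the
2002 level, indicating that trust in parliament in Spain is lower than
it was at the start of the series.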
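One way to check a reading of the trend against the numbers is to compare each survey year with the 2002 baseline. The sketch below uses the means printed in the `trstplt_by_year` output above; the column name `change_from_2002` and the helper objects are introduced here for illustration only.

```r
# Yearly means copied from the trstplt_by_year output above.
trstplt_by_year <- data.frame(
  year = seq(2002, 2020, by = 2),
  mean_trstplt = c(3.41, 3.66, 3.49, 3.32, 2.72, 1.91, 2.23, 2.40, 2.55, 1.94)
)

# Difference between each year's mean and the first survey year (2002).
trstplt_by_year$change_from_2002 <-
  round(trstplt_by_year$mean_trstplt - trstplt_by_year$mean_trstplt[1], 2)

# Years with the highest and lowest average trust.
peak_year <- trstplt_by_year$year[which.max(trstplt_by_year$mean_trstplt)]
low_year  <- trstplt_by_year$year[which.min(trstplt_by_year$mean_trstplt)]

# Survey years in which trust was above the 2002 baseline.
years_above <- trstplt_by_year$year[trstplt_by_year$change_from_2002 > 0]

trstplt_by_year
peak_year
low_year
years_above
```

Because these are rounded means without standard errors, small differences such as those between 2002, 2004, and 2006 should be read cautiously.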
Provide answer only.
Prompt and question: Based on the figure we produced above called task2_plot, tell us: what are your main takeaways regarding France relative to Italy and Norway? Make sure to be concrete and highlight at least two important comparative trends visualized in the graph.
Two comparative trends stand out in the graph. First, France follows a steadier, more consistent downward trend across cohorts in the proportion of respondents saying ‘yes’ to feeling close to a party, while Norway and Italy fluctuate more. Second, Norway has the greatest proportion of respondents indicating they feel close to a party across cohorts, and its slope is the shallowest, in contrast to the steep downward slopes of France and Italy.
Provide code and answer.
Question: What is the marginal percentage of Italian men who feel close to a particular political party?
italy_data <- read.fst("italy_data.fst")
italy_data <- italy_data %>%
  mutate(
    gndr = case_when(
      gndr == 1 ~ "Male",
      gndr == 2 ~ "Female",
      TRUE ~ NA_character_
    ),
    clsprty = case_when(
      clsprty %in% 1 ~ "Yes",
      clsprty %in% 2 ~ "No",
      TRUE ~ NA_character_
    )
  )
clsprty_percentages <- italy_data %>%
  filter(!is.na(clsprty), !is.na(gndr)) %>%
  group_by(gndr, clsprty) %>%
  summarise(count = n(), .groups = 'drop') %>%
  mutate(percentage = count / sum(count) * 100)
clsprty_percentages
## # A tibble: 4 × 4
## gndr clsprty count percentage
## <chr> <chr> <int> <dbl>
## 1 Female No 3228 34.2
## 2 Female Yes 1686 17.9
## 3 Male No 2593 27.5
## 4 Male Yes 1936 20.5
The marginal percentage of Italian men who feel close to a particular political party is 20.5%: of the 9,443 respondents with valid answers on both variables, 1,936 are men who answered "Yes".
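The percentage in the table is taken over all respondents, which is a different quantity from the share of men who feel close to a party. The sketch below computes both from the counts in the `clsprty_percentages` output above; the column names `pct_of_all` and `pct_within_gender` are introduced here for illustration.

```r
# Counts copied from the clsprty_percentages output above.
counts <- data.frame(
  gndr    = c("Female", "Female", "Male", "Male"),
  clsprty = c("No", "Yes", "No", "Yes"),
  count   = c(3228, 1686, 2593, 1936)
)

# Share of ALL respondents in each gender-by-answer cell (as in the table above).
counts$pct_of_all <- counts$count / sum(counts$count) * 100

# Share WITHIN each gender: ave() returns the gender total for every row.
gender_total <- ave(counts$count, counts$gndr, FUN = sum)
counts$pct_within_gender <- counts$count / gender_total * 100

male_yes <- counts[counts$gndr == "Male" & counts$clsprty == "Yes", ]
round(male_yes$pct_of_all, 1)        # men who feel close, out of all respondents
round(male_yes$pct_within_gender, 1) # men who feel close, out of all men
```

Which figure answers the question depends on the definition used in the tutorial; the first matches the table above, while the second conditions on gender.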
Provide code and output only.
Prompt: In the tutorial, we calculated then visualized the percentage distribution for left vs. right by gender for France. Your task is to replicate the second version of the visualization but for the country of Sweden instead.
sweden_data <- read.fst("sweden_data.fst")
sweden_data <- sweden_data %>%
  mutate(
    gndr = case_when(
      gndr == 1 ~ "Male",
      gndr == 2 ~ "Female",
      TRUE ~ NA_character_
    ),
    lrscale = case_when(
      lrscale %in% 0:3 ~ "Left",
      lrscale %in% 7:10 ~ "Right",
      TRUE ~ NA_character_
    )
  )
lrscale_percentages <- sweden_data %>%
  filter(!is.na(lrscale), !is.na(gndr)) %>%
  group_by(gndr, lrscale) %>%
  summarise(count = n(), .groups = 'drop') %>%
  mutate(percentage = count / sum(count) * 100)
lrscale_percentages
## # A tibble: 4 × 4
## gndr lrscale count percentage
## <chr> <chr> <int> <dbl>
## 1 Female Left 2296 23.0
## 2 Female Right 2530 25.3
## 3 Male Left 2062 20.6
## 4 Male Right 3107 31.1
lrscale_plot_v2 <- ggplot(lrscale_percentages,
                          aes(x = percentage,
                              y = reorder(gndr, -percentage),
                              fill = gndr)) +
  geom_col() +
  coord_flip() +
  guides(fill = "none") +
  facet_wrap(~ lrscale, nrow = 1) +
  labs(x = "Percentage of Respondents",
       y = NULL,
       title = "Political Orientation by Gender",
       subtitle = "Comparing the percentage distribution of left vs. right for Sweden (2002-2020)") +
  theme(plot.title = element_text(size = 16, face = "bold"),
        plot.subtitle = element_text(size = 12),
        axis.title.y = element_blank(),
        legend.position = "bottom")
lrscale_plot_v2
Provide code and answer: In Hungary, what is the conditional probability of NOT feeling close to any particular party given that the person lives in a rural area?
hungary_data <- read.fst("hungary_data.fst")
hungary_data <- hungary_data %>%
  mutate(
    geo = recode(as.character(domicil),
                 '1' = "Urban",
                 '2' = "Urban",
                 '3' = "Rural",
                 '4' = "Rural",
                 '5' = "Rural",
                 '7' = NA_character_,
                 '8' = NA_character_,
                 '9' = NA_character_)
  ) %>%
  filter(!is.na(clsprty), !is.na(geo))
cond <- hungary_data %>%
  count(clsprty, geo) %>%
  group_by(geo) %>%
  mutate(prob = n / sum(n))
cond
## # A tibble: 10 × 4
## # Groups: geo [2]
## clsprty geo n prob
## <dbl> <chr> <int> <dbl>
## 1 1 Rural 5055 0.429
## 2 1 Urban 2283 0.472
## 3 2 Rural 6275 0.532
## 4 2 Urban 2395 0.495
## 5 7 Rural 234 0.0199
## 6 7 Urban 88 0.0182
## 7 8 Rural 219 0.0186
## 8 8 Urban 70 0.0145
## 9 9 Rural 4 0.000339
## 10 9 Urban 4 0.000826
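Given that someone lives in a rural area, the conditional probability of them not feeling close to any particular party is about 53% (clsprty = 2, Rural: 0.532 in the table above). Note that this denominator still includes the clsprty response codes 7, 8, and 9, which were not recoded as missing.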
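The conditional probability depends on whether the clsprty codes 7, 8, and 9 stay in the denominator. The sketch below recomputes it both ways from the rural counts in the `cond` output above. It assumes those three codes are non-substantive responses (refusal, don't know, no answer), in line with how the same codes are treated for `domicil`; the object names are introduced here for illustration.

```r
# Rural counts copied from the cond output above (clsprty codes 1, 2, 7, 8, 9).
rural <- c(yes = 5055, no = 6275, code7 = 234, code8 = 219, code9 = 4)

# P(not close | rural) with every code in the denominator, as in the table above.
p_no_all <- unname(rural["no"] / sum(rural))

# Same probability after dropping codes 7, 8, and 9 (assumed to be missing codes).
p_no_valid <- unname(rural["no"] / sum(rural[c("yes", "no")]))

round(p_no_all, 3)
round(p_no_valid, 3)
```

In the pipeline itself, the equivalent change would be to set clsprty values 7, 8, and 9 to NA inside `mutate()` before the `filter(!is.na(clsprty), !is.na(geo))` step, so that those rows are dropped before counting.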