1. Introduction

In this assignment, I examine how the two major political parties in the US differ in their views on immigrants. I use the following variables:

Independent variable: partyreg_baseline, restricted to two groups: Republicans and Democrats.
Categorical dependent variables: immi_naturalize_2019, immi_makedifficult_2019 and immi_muslim_2019.
Continuous dependent variable: ft_immig_2017.

2. Description of Variables & Data Preparation

I am interested in how Republicans and Democrats differ in their views on immigrants. I compare them on:

immi_naturalize_2019: Do they favor or oppose a legal way for illegal immigrants already in the US to become U.S. citizens?
immi_makedifficult_2019: Do they think it should be easier or harder for foreigners to immigrate to the US legally than it is currently?
immi_muslim_2019: Do they favor or oppose temporarily banning Muslims from other countries from entering the United States?
ft_immig_2017: How do they feel towards immigrants, on a scale from 0 to 100?

The following steps compare the two parties' views on immigrants.

(a) Import Data

Load the necessary packages, then import the data into R and name it Voter_Data.

library(readr)
library(ggplot2)
library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

Voter_Data = read_csv("/Users/sakif/Desktop/Data 333/Voter Data 2019.csv")

## 
## ── Column specification ─────────────────────────────────────────────────────────────────────────────────────────────────────────────
## cols(
##   .default = col_double(),
##   weight_18_24_2018 = col_logical(),
##   izip_2019 = col_character(),
##   housevote_other_2019 = col_character(),
##   senatevote_other_2019 = col_character(),
##   senatevote2_other_2019 = col_character(),
##   SenCand1Name_2019 = col_character(),
##   SenCand1Party_2019 = col_character(),
##   SenCand2Name_2019 = col_character(),
##   SenCand2Party_2019 = col_character(),
##   SenCand3Name_2019 = col_character(),
##   SenCand3Party_2019 = col_character(),
##   SenCand1Name2_2019 = col_character(),
##   SenCand1Party2_2019 = col_character(),
##   SenCand2Name2_2019 = col_character(),
##   SenCand2Party2_2019 = col_character(),
##   SenCand3Name2_2019 = col_character(),
##   SenCand3Party2_2019 = col_character(),
##   governorvote_other_2019 = col_character(),
##   GovCand1Name_2019 = col_character(),
##   GovCand1Party_2019 = col_character()
##   # ... with 108 more columns
## )
## ℹ Use `spec()` for the full column specifications.

## Warning: 800 parsing failures.
##  row               col           expected           actual                                                file
## 2033 weight_18_24_2018 1/0/T/F/TRUE/FALSE .917710168467982 '/Users/sakif/Desktop/Data 333/Voter Data 2019.csv'
## 2828 weight_18_24_2018 1/0/T/F/TRUE/FALSE 1.41022291345592 '/Users/sakif/Desktop/Data 333/Voter Data 2019.csv'
## 4511 weight_18_24_2018 1/0/T/F/TRUE/FALSE 1.77501243840922 '/Users/sakif/Desktop/Data 333/Voter Data 2019.csv'
## 7264 weight_18_24_2018 1/0/T/F/TRUE/FALSE 1.29486870319614 '/Users/sakif/Desktop/Data 333/Voter Data 2019.csv'
## 7277 weight_18_24_2018 1/0/T/F/TRUE/FALSE 1.44972719707603 '/Users/sakif/Desktop/Data 333/Voter Data 2019.csv'
## .... ................. .................. ................ ...................................................
## See problems(...) for more details.
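
The parsing failures above most likely arise because readr guessed weight_18_24_2018 as logical from its empty leading rows and then met decimal weights further down. A minimal sketch of one way to avoid this, by declaring the column type up front with col_types; the temporary file, the id column and its values are hypothetical stand-ins for the real survey file.

```r
library(readr)

# Hypothetical stand-in for the survey file: a weight column that starts empty.
demo_path <- tempfile(fileext = ".csv")
writeLines(c("weight_18_24_2018,id",
             ",1",
             "0.917710168467982,2"),
           demo_path)

# Declare the weight column as double; all other columns are still guessed.
demo <- read_csv(demo_path,
                 col_types = cols(.default = col_guess(),
                                  weight_18_24_2018 = col_double()))
```

The same col_types argument could be passed to the read_csv() call on the real file. None of the variables analysed in this report depend on weight_18_24_2018, so the results below are unaffected either way.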

Voter_Data

## # A tibble: 9,548 x 1,282
##    weight_2016 weight_2017 weight_panel_20… weight_latino_2… weight_18_24_20…
##          <dbl>       <dbl>            <dbl>            <dbl> <lgl>           
##  1       0.358       0.438            0.503               NA NA              
##  2       0.563       0.366            0.389               NA NA              
##  3       0.552       0.550            0.684               NA NA              
##  4       0.208      NA               NA                   NA NA              
##  5       0.334       0.346            0.322               NA NA              
##  6       0.207       0.148            0.594               NA NA              
##  7       0.456       0.378           NA                   NA NA              
##  8       1.05        0.993            0.965               NA NA              
##  9       0.478       1.03            NA                   NA NA              
## 10       0.417       0.377            0.516               NA NA              
## # … with 9,538 more rows, and 1,277 more variables: weight_overall_2018 <dbl>,
## #   weight_2019 <dbl>, weight1_2018 <dbl>, weight1_2019 <dbl>,
## #   weight2_2019 <dbl>, weight3_2019 <dbl>, cassfullcd <dbl>,
## #   vote2020_2019 <dbl>, trumpapp_2019 <dbl>, fav_trump_2019 <dbl>,
## #   fav_obama_2019 <dbl>, fav_hrc_2019 <dbl>, fav_sanders_2019 <dbl>,
## #   fav_putin_2019 <dbl>, fav_schumer_2019 <dbl>, fav_pelosi_2019 <dbl>,
## #   fav_comey_2019 <dbl>, fav_mueller_2019 <dbl>, fav_mcconnell_2019 <dbl>,
## #   fav_kavanaugh_2019 <dbl>, fav_biden_2019 <dbl>, fav_warren_2019 <dbl>,
## #   fav_harris_2019 <dbl>, fav_gillibrand_2019 <dbl>, fav_patrick_2019 <dbl>,
## #   fav_booker_2019 <dbl>, fav_garcetti_2019 <dbl>, fav_klobuchar_2019 <dbl>,
## #   fav_gorsuch_2019 <dbl>, fav_kasich_2019 <dbl>, fav_haley_2019 <dbl>,
## #   fav_bloomberg_2019 <dbl>, fav_holder_2019 <dbl>, fav_avenatti_2019 <dbl>,
## #   fav_castro_2019 <dbl>, fav_landrieu_2019 <dbl>, fav_orourke_2019 <dbl>,
## #   fav_hickenlooper_2019 <dbl>, fav_pence_2019 <dbl>, add_confirm_2019 <dbl>,
## #   izip_2019 <chr>, votereg_2019 <dbl>, votereg_f_2019 <dbl>,
## #   regzip_2019 <dbl>, region_2019 <dbl>, turnout18post_2019 <dbl>,
## #   tsmart_G2018_2019 <dbl>, tsmart_G2018_vote_type_2019 <dbl>,
## #   tsmart_P2018_2019 <dbl>, tsmart_P2018_party_2019 <dbl>,
## #   tsmart_P2018_vote_type_2019 <dbl>, housevote_2019 <dbl>,
## #   housevote_other_2019 <chr>, senatevote_2019 <dbl>,
## #   senatevote_other_2019 <chr>, senatevote2_2019 <dbl>,
## #   senatevote2_other_2019 <chr>, SenCand1Name_2019 <chr>,
## #   SenCand1Party_2019 <chr>, SenCand2Name_2019 <chr>,
## #   SenCand2Party_2019 <chr>, SenCand3Name_2019 <chr>,
## #   SenCand3Party_2019 <chr>, SenCand1Name2_2019 <chr>,
## #   SenCand1Party2_2019 <chr>, SenCand2Name2_2019 <chr>,
## #   SenCand2Party2_2019 <chr>, SenCand3Name2_2019 <chr>,
## #   SenCand3Party2_2019 <chr>, governorvote_2019 <dbl>,
## #   governorvote_other_2019 <chr>, GovCand1Name_2019 <chr>,
## #   GovCand1Party_2019 <chr>, GovCand2Name_2019 <chr>,
## #   GovCand2Party_2019 <chr>, GovCand3Name_2019 <chr>,
## #   GovCand3Party_2019 <chr>, inst_court_2019 <dbl>, inst_media_2019 <dbl>,
## #   inst_congress_2019 <dbl>, inst_justice_2019 <dbl>, inst_FBI_2019 <dbl>,
## #   inst_military_2019 <dbl>, inst_church_2019 <dbl>, inst_business_2019 <dbl>,
## #   Democrats_2019 <dbl>, Republicans_2019 <dbl>, Men_2019 <dbl>,
## #   Women_2019 <dbl>, wm_2019 <dbl>, ww_2019 <dbl>, bm_2019 <dbl>,
## #   bw_2019 <dbl>, hm_2019 <dbl>, hw_2019 <dbl>, rwm_2019 <dbl>,
## #   rww_2019 <dbl>, rbm_2019 <dbl>, rbw_2019 <dbl>, pwm_2019 <dbl>, …

(b) Recoding Data

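Recode the variables of interest in Voter_Data from their numeric form to their labeled form, treating any other code as missing. Store the prepared data in a new object named Immigrants_Data, keeping only respondents with no missing values on any of the five variables.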

Immigrants_Data = Voter_Data %>%
  mutate(Political_Party = ifelse(partyreg_baseline == 1, "Democrat",
                           ifelse(partyreg_baseline == 2, "Republican", NA)),
         Immigrants_Naturalize = ifelse(immi_naturalize_2019 == 1, "Favor",
                                 ifelse(immi_naturalize_2019 == 2, "Oppose",
                                 ifelse(immi_naturalize_2019 == 8, "Don't know", NA))),
         Immigrants_Hardness = ifelse(immi_makedifficult_2019 == 1, "Much easier",
                               ifelse(immi_makedifficult_2019 == 2, "Slightly easier",
                               ifelse(immi_makedifficult_2019 == 3, "No change",
                               ifelse(immi_makedifficult_2019 == 4, "Slightly harder",
                               ifelse(immi_makedifficult_2019 == 5, "Much harder",
                               ifelse(immi_makedifficult_2019 == 8, "Don't know", NA)))))),
         Immigrants_Hardness = factor(Immigrants_Hardness, levels = c("Much easier",
                                                                      "Slightly easier",
                                                                      "No change",
                                                                      "Slightly harder",
                                                                      "Much harder",
                                                                      "Don't know")),
         Banning_Muslims = ifelse(immi_muslim_2019 == 1, "Strongly favor",
                           ifelse(immi_muslim_2019 == 2, "Somewhat favor",
                           ifelse(immi_muslim_2019 == 3, "Somewhat oppose",
                           ifelse(immi_muslim_2019 == 4, "Strongly oppose",
                           ifelse(immi_muslim_2019 == 8, "Don't know", NA))))),
         Banning_Muslims = factor(Banning_Muslims, levels = c("Strongly favor",
                                                              "Somewhat favor",
                                                              "Somewhat oppose",
                                                              "Strongly oppose",
                                                              "Don't know")),
         Feeling_About_Immigrants = ifelse(ft_immig_2017 > 100, NA, ft_immig_2017)) %>%
  select(Political_Party, Immigrants_Naturalize, Immigrants_Hardness, Banning_Muslims, Feeling_About_Immigrants) %>%
  filter(!is.na(Political_Party), !is.na(Immigrants_Naturalize), !is.na(Immigrants_Hardness), !is.na(Banning_Muslims), !is.na(Feeling_About_Immigrants))

Immigrants_Data

## # A tibble: 2,017 x 5
##    Political_Party Immigrants_Natu… Immigrants_Hard… Banning_Muslims
##    <chr>           <chr>            <fct>            <fct>          
##  1 Democrat        Favor            No change        Strongly oppose
##  2 Democrat        Favor            Much easier      Strongly oppose
##  3 Democrat        Favor            Much easier      Strongly oppose
##  4 Republican      Oppose           Slightly harder  Strongly favor 
##  5 Democrat        Favor            No change        Strongly oppose
##  6 Republican      Oppose           Much harder      Somewhat oppose
##  7 Republican      Oppose           Much harder      Somewhat oppose
##  8 Republican      Oppose           Slightly harder  Somewhat favor 
##  9 Republican      Oppose           Much harder      Strongly favor 
## 10 Democrat        Favor            Slightly easier  Strongly oppose
## # … with 2,007 more rows, and 1 more variable: Feeling_About_Immigrants <dbl>
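
The nested ifelse() calls used for recoding can also be written with dplyr's case_when(), which lists each code and its label on one line. This is only a sketch of an equivalent style, not the code that produced the output above; the raw data frame and its codes (including the unmapped code 9) are hypothetical.

```r
library(dplyr)

# Hypothetical raw codes, including one code (9) that is not in the mapping.
raw <- data.frame(immi_naturalize_2019 = c(1, 2, 8, 9, NA))

recoded <- raw %>%
  mutate(Immigrants_Naturalize = case_when(
    immi_naturalize_2019 == 1 ~ "Favor",
    immi_naturalize_2019 == 2 ~ "Oppose",
    immi_naturalize_2019 == 8 ~ "Don't know",
    TRUE ~ NA_character_   # any other code, or a missing code, becomes NA
  ))
```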

3. Analysis: Political_Party (IV) x Immigrants_Naturalize (DV)

This section examines the relationship between Political_Party and Immigrants_Naturalize using a crosstab, a visualization and a statistical test.

(a) Crosstab

table(Immigrants_Data$Immigrants_Naturalize, Immigrants_Data$Political_Party) %>%
  prop.table(2)

##             
##                Democrat Republican
##   Don't know 0.07383774 0.13260870
##   Favor      0.79398359 0.30760870
##   Oppose     0.13217867 0.55978261

Interpretation:

The crosstab shows that 79% of Democrats favor a legal way for illegal immigrants to become citizens, compared with only 31% of Republicans. Conversely, 13% of Democrats oppose it, compared with almost 56% of Republicans.

(b) Visualization: Stacked barchart

Immigrants_Data %>%
  group_by(Political_Party, Immigrants_Naturalize) %>%
  summarize(n=n()) %>%
  mutate(Percent = n/sum(n)) %>%
  ggplot()+
  geom_col(aes(x = Political_Party, y = Percent, fill = Immigrants_Naturalize))

## `summarise()` regrouping output by 'Political_Party' (override with `.groups` argument)

Interpretation:

The visualization shows that most Democrats favor a legal way to citizenship for illegal immigrants, whereas most Republicans oppose it.
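
The `summarise()` regrouping message printed with the chart is informational: after summarizing, the data stay grouped by Political_Party, which is what lets the percentages sum to 1 within each party. A minimal sketch that makes this explicit with the `.groups` argument named in the message, using a small hypothetical data frame rather than the survey data:

```r
library(dplyr)

# Hypothetical responses, for illustration only.
toy <- data.frame(
  Political_Party = c("Democrat", "Democrat", "Democrat",
                      "Republican", "Republican"),
  Immigrants_Naturalize = c("Favor", "Favor", "Oppose", "Favor", "Oppose")
)

shares <- toy %>%
  group_by(Political_Party, Immigrants_Naturalize) %>%
  summarize(n = n(), .groups = "drop_last") %>%  # stay grouped by party only
  mutate(Percent = n / sum(n)) %>%               # share within each party
  ungroup()
```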

(c) Statistical Test: Chi-square test

options(scipen = 999)
chisq.test(Immigrants_Data$Immigrants_Naturalize, Immigrants_Data$Political_Party)

## 
##  Pearson's Chi-squared test
## 
## data:  Immigrants_Data$Immigrants_Naturalize and Immigrants_Data$Political_Party
## X-squared = 503.66, df = 2, p-value < 0.00000000000000022

Interpretation:

The p-value is far below 0.05, so there is a statistically significant relationship between Political_Party and views on providing a legal way to citizenship for illegal immigrants already in the US.
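
A significant chi-square test says that a relationship exists but not how strong it is. One common measure of strength, which is not part of the original analysis, is Cramér's V. The sketch below computes it from the numbers reported above (the test statistic and the 2,017 rows of Immigrants_Data); the variable names are my own.

```r
# Values copied from the output above.
x_squared <- 503.66   # Pearson chi-squared statistic
n_obs     <- 2017     # rows in Immigrants_Data
n_rows    <- 3        # Favor, Oppose, Don't know
n_cols    <- 2        # Democrat, Republican

# Cramér's V = sqrt(X^2 / (n * min(rows - 1, cols - 1)))
cramers_v <- sqrt(x_squared / (n_obs * min(n_rows - 1, n_cols - 1)))
round(cramers_v, 2)
```

The result is roughly 0.5 on a scale from 0 (no association) to 1 (perfect association).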

4. Analysis: Political_Party (IV) x Immigrants_Hardness (DV)

This section examines the relationship between Political_Party and Immigrants_Hardness using a crosstab, a visualization and a statistical test.

(a) Crosstab

table(Immigrants_Data$Immigrants_Hardness, Immigrants_Data$Political_Party) %>%
  prop.table(2)

##                  
##                     Democrat Republican
##   Much easier     0.19781222 0.05543478
##   Slightly easier 0.24521422 0.15434783
##   No change       0.25068368 0.24456522
##   Slightly harder 0.11394713 0.20217391
##   Much harder     0.13947129 0.31847826
##   Don't know      0.05287147 0.02500000

Interpretation:

On whether the immigration process should be easier or harder, the crosstab shows that about 44% of Democrats want it to be easier (much or slightly), compared with only 21% of Republicans. About a quarter of each party wants no change. Only 25% of Democrats want it to be harder, compared with 52% of Republicans.
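
The combined percentages in this interpretation come from adding the "much" and "slightly" rows of the crosstab. A short sketch of that arithmetic, with the proportions typed in from the output above (the object names are my own):

```r
# Column proportions copied from the crosstab above.
hardness <- matrix(
  c(0.19781222, 0.05543478,
    0.24521422, 0.15434783,
    0.25068368, 0.24456522,
    0.11394713, 0.20217391,
    0.13947129, 0.31847826,
    0.05287147, 0.02500000),
  ncol = 2, byrow = TRUE,
  dimnames = list(
    c("Much easier", "Slightly easier", "No change",
      "Slightly harder", "Much harder", "Don't know"),
    c("Democrat", "Republican")
  )
)

# Collapse the two "easier" rows and the two "harder" rows.
collapsed <- rbind(
  Easier      = colSums(hardness[c("Much easier", "Slightly easier"), ]),
  "No change" = hardness["No change", ],
  Harder      = colSums(hardness[c("Slightly harder", "Much harder"), ])
)
round(100 * collapsed)
```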

(b) Visualization: Stacked barchart

Immigrants_Data %>%
  group_by(Political_Party, Immigrants_Hardness) %>%
  summarize(n=n()) %>%
  mutate(Percent = n/sum(n)) %>%
  ggplot()+
  geom_col(aes(x = Political_Party, y = Percent, fill = Immigrants_Hardness))

## `summarise()` regrouping output by 'Political_Party' (override with `.groups` argument)

Interpretation:

The visualization shows that Democrats lean towards making the process easier, whereas most Republicans want it to be harder.

(c) Statistical Test: Chi-square test

options(scipen = 999)
chisq.test(Immigrants_Data$Immigrants_Hardness, Immigrants_Data$Political_Party)

## 
##  Pearson's Chi-squared test
## 
## data:  Immigrants_Data$Immigrants_Hardness and Immigrants_Data$Political_Party
## X-squared = 204.14, df = 5, p-value < 0.00000000000000022

Interpretation:

The p-value is far below 0.05, so there is a statistically significant relationship between Political_Party and views on how easy legal immigration should be.

5. Analysis: Political_Party (IV) x Banning_Muslims (DV)

This section examines the relationship between Political_Party and Banning_Muslims using a crosstab, a visualization and a statistical test.

(a) Crosstab

table(Immigrants_Data$Banning_Muslims, Immigrants_Data$Political_Party) %>%
  prop.table(2)

##                  
##                     Democrat Republican
##   Strongly favor  0.10300820 0.43260870
##   Somewhat favor  0.10300820 0.27065217
##   Somewhat oppose 0.20510483 0.11413043
##   Strongly oppose 0.50501367 0.09347826
##   Don't know      0.08386509 0.08913043

Interpretation:

On temporarily banning Muslim immigrants, the crosstab shows that only 21% of Democrats favor a ban (strongly or somewhat), whereas 70% of Republicans do. Conversely, 71% of Democrats oppose it, compared with only 21% of Republicans.

(b) Visualization: Stacked barchart

Immigrants_Data %>%
  group_by(Political_Party, Banning_Muslims) %>%
  summarize(n=n()) %>%
  mutate(Percent = n/sum(n)) %>%
  ggplot()+
  geom_col(aes(x = Political_Party, y = Percent, fill = Banning_Muslims))

## `summarise()` regrouping output by 'Political_Party' (override with `.groups` argument)

Interpretation:

The visualization shows that most Democrats oppose banning Muslim immigrants, whereas most Republicans favor it.

(c) Statistical Test: Chi-square test

options(scipen = 999)
chisq.test(Immigrants_Data$Banning_Muslims, Immigrants_Data$Political_Party)

## 
##  Pearson's Chi-squared test
## 
## data:  Immigrants_Data$Banning_Muslims and Immigrants_Data$Political_Party
## X-squared = 585.46, df = 4, p-value < 0.00000000000000022

Interpretation:

The p-value is far below 0.05, so there is a statistically significant relationship between Political_Party and views on banning Muslim immigrants.

6. Analysis: Political_Party (IV) x Feeling_About_Immigrants (DV)

This section examines the relationship between Political_Party and Feeling_About_Immigrants using a table of means, three visualizations and a statistical test.

(a) Table comparing means

Immigrants_Data %>%
  group_by(Political_Party) %>%
  summarise(Average_Feeling_About_Immigrants = mean(Feeling_About_Immigrants))

## `summarise()` ungrouping output (override with `.groups` argument)

## # A tibble: 2 x 2
##   Political_Party Average_Feeling_About_Immigrants
##   <chr>                                      <dbl>
## 1 Democrat                                    70.0
## 2 Republican                                  52.7

Interpretation:

The table shows that Democrats' average feeling towards immigrants is about 17 points higher than Republicans' (70.0 vs. 52.7 on the 0-100 scale).

(b) Visualization: Bar chart comparing means.

Immigrants_Data %>%
  group_by(Political_Party) %>%
  summarise(Average_Feeling_About_Immigrants = mean(Feeling_About_Immigrants)) %>%
  ggplot()+
  geom_col(aes(x = Political_Party, y = Average_Feeling_About_Immigrants, fill = Political_Party))

## `summarise()` ungrouping output (override with `.groups` argument)

Interpretation:

The visualization shows that Democrats feel more favorable towards immigrants than Republicans do.

(c) Visualization: Histogram comparing population distributions

Immigrants_Data %>%
  ggplot()+
  geom_histogram(aes(x = Feeling_About_Immigrants)) +
  facet_wrap(~Political_Party)

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Interpretation:

Democrats have more density in the 50-100 range, whereas Republicans have more density in the 0-50 range, when asked about their feeling towards immigrants. Moreover, more Republicans than Democrats gave a score of 50, while more Democrats than Republicans gave scores of 80, 90 and 100.
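
The `stat_bin()` message above notes that the default of 30 bins was used. On a 0-100 rating scale, bins that line up with round scores are easier to read. A minimal sketch with an explicit binwidth, assuming bins of width 10 are acceptable; the data are simulated for illustration and are not the survey responses.

```r
library(ggplot2)

set.seed(1)
# Simulated 0-100 ratings, for illustration only (not the survey data).
toy <- data.frame(
  Political_Party = rep(c("Democrat", "Republican"), each = 200),
  Feeling_About_Immigrants = round(runif(400, min = 0, max = 100))
)

p <- ggplot(toy) +
  geom_histogram(aes(x = Feeling_About_Immigrants),
                 binwidth = 10, boundary = 0) +  # bin edges at 0, 10, 20, ...
  facet_wrap(~Political_Party)
```

Printing p draws the chart. Setting binwidth also silences the `stat_bin()` message.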

(d) Visualization: Histogram comparing sampling distributions

Democrat_Feeling_About_Immigrants = Immigrants_Data %>%
  filter(Political_Party == "Democrat")

Republican_Feeling_About_Immigrants = Immigrants_Data %>%
  filter(Political_Party == "Republican")

Democrat_Sample_Dist = replicate(10000, sample(Democrat_Feeling_About_Immigrants$Feeling_About_Immigrants, 40) %>%
  mean(na.rm = TRUE)) %>%
  data.frame() %>%
  rename("mean" = 1)

Republican_Sample_Dist = replicate(10000, sample(Republican_Feeling_About_Immigrants$Feeling_About_Immigrants, 40) %>%
  mean(na.rm = TRUE)) %>%
  data.frame() %>%
  rename("mean" = 1)

ggplot()+
  geom_histogram(data = Democrat_Sample_Dist, aes(x = mean), fill = "red") +
  geom_histogram(data = Republican_Sample_Dist, aes(x = mean), fill = "blue")

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Interpretation:

The sampling distributions (means of samples of 40) show that Republican sample means (blue) fall roughly in the 40-65 range, whereas Democrat sample means (red) fall roughly in the 65-80 range.
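
The sampling distributions are much narrower than the raw ratings because the spread of a sample mean shrinks with the square root of the sample size. A minimal sketch of that relationship, using the same replicate() and sample() approach as above on a simulated population (not the survey data; the object names are my own):

```r
set.seed(1)
# Simulated population of 0-100 ratings, for illustration only.
population <- round(runif(1000, min = 0, max = 100))
sample_size <- 40

# Means of 5,000 random samples of 40, drawn without replacement.
sample_means <- replicate(5000, mean(sample(population, sample_size)))

observed_se    <- sd(sample_means)                    # spread of the sample means
theoretical_se <- sd(population) / sqrt(sample_size)  # sd / sqrt(n)
c(observed = observed_se, theoretical = theoretical_se)
```

The two values should be close. The observed one tends to be slightly smaller because sample() draws without replacement from a finite population.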

(e) Statistical Test: T-test

options(scipen = 999)
t.test(Feeling_About_Immigrants~Political_Party, data = Immigrants_Data )

## 
##  Welch Two Sample t-test
## 
## data:  Feeling_About_Immigrants by Political_Party
## t = 15.007, df = 1867.8, p-value < 0.00000000000000022
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  15.02043 19.53651
## sample estimates:
##   mean in group Democrat mean in group Republican 
##                 70.00456                 52.72609

Interpretation:

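The p-value is far below 0.05, and the 95% confidence interval for the difference in means (15.0 to 19.5 points) excludes zero. There is therefore a statistically significant difference between Republicans and Democrats in their rating (0-100) of feeling towards immigrants.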
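
The confidence interval in the output can be checked by hand from the other reported numbers: the difference in means, the t statistic and the degrees of freedom. The sketch below does that arithmetic with values copied from the output above; the variable names are my own.

```r
# Values copied from the Welch t-test output above.
mean_democrat   <- 70.00456
mean_republican <- 52.72609
t_stat          <- 15.007
deg_freedom     <- 1867.8

diff_means <- mean_democrat - mean_republican   # about 17.28 points
std_error  <- diff_means / t_stat               # since t = difference / SE
margin     <- qt(0.975, df = deg_freedom) * std_error

ci <- c(lower = diff_means - margin, upper = diff_means + margin)
round(ci, 2)
```

The result should agree with the reported interval of 15.02 to 19.54 up to rounding of the t statistic.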

7. Conclusions

After analysing both political parties on four different variables, I conclude that Democrats' attitudes towards immigrants are more positive and favorable than Republicans'. Democrats are more likely to favor a legal way to citizenship for illegal immigrants, more likely to want the immigration process to be easier, and more likely to oppose banning Muslim immigrants. Moreover, Democrats rate their feeling towards immigrants about 17 points higher on average than Republicans do.

Recoding Variables (Final Report)

Sakif Shadman
