This analysis examines views on immigration through the lens of education. It uses data from the 2018 voter survey conducted by the survey firm YouGov. The survey collected responses from 6,005 adults aged eighteen and older on topics ranging from political candidate preference to the importance respondents assign to various issues. YouGov reports a margin of error of plus or minus two percent. The issue studied here is immigration: the analysis asks whether a respondent's educational level is associated with their view of immigrants and of immigration as an issue. The variables used to look for a relationship between education level and views on immigration are as follows:

“educ_2018”: an ordinal variable recording each respondent's educational level, from those who never went to high school to those who completed a post-graduate program. This study focuses on only two categories, “No HS” (those who did not go to high school) and “Post-grad” (those who went to a post-graduate program). Recoded to “edulevel”.

“imiss_c_2017”: an ordinal variable recording how important the issue of immigration is to the respondent. Recoded to “immigstance”.

“immi_contribution_a_2017”: an ordinal variable recording how respondents view immigrants' contribution to the US economy. Recoded to “immigcon”.

“immi_naturalize_2017”: records whether respondents favor or oppose creating a legal way for illegal immigrants already residing in the United States to become United States citizens. Recoded to “immignatural”.

“immi_makedifficult_2017”: asks respondents whether they would like to make it easier or harder for foreigners to immigrate to the United States. Recoded to “immigharder”.

Finally, “ft_immig_2017”: a continuous variable measuring respondents' feelings towards immigrants on a scale from zero to one hundred, with zero being a negative view and one hundred a positive view. Recoded to “immi_feel”.
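The recoding described in the list above might look roughly like the following. This is a minimal sketch in base R, not the code actually used in the study: the data frame `raw`, its values, and the category labels other than “No HS” and “Post-grad” are invented for illustration. Only the variable names `educ_2018`, `ft_immig_2017`, `edulevel`, and `immi_feel` come from the text.

```r
# Hypothetical stand-in for the survey data; all values are invented.
raw <- data.frame(
  educ_2018     = c("No HS", "High school graduate", "Post-grad",
                    "Post-grad", "No HS", "4-year"),
  ft_immig_2017 = c(40, 55, 70, 85, NA, 60),
  stringsAsFactors = FALSE
)

# Keep only the two education categories compared in this study.
votecensus_demo <- raw[raw$educ_2018 %in% c("No HS", "Post-grad"), ]

# Recode: copy the original columns to the shorter names used in the analysis.
votecensus_demo$edulevel  <- factor(votecensus_demo$educ_2018,
                                    levels = c("No HS", "Post-grad"))
votecensus_demo$immi_feel <- votecensus_demo$ft_immig_2017

# Number of respondents retained in each group.
table(votecensus_demo$edulevel)
```

Restricting the factor levels to the two retained categories keeps empty categories out of later tables and tests.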

To begin, a crosstabulation was conducted to examine respondents' feelings towards immigration by educational level. For comparison, the distribution expected under the null hypothesis was also computed for each variable. After the sampling distributions were produced for those who never went to high school and those who went to a post-graduate program, the actual frequency distribution was tabulated and the null hypothesis was tested for each of the variables selected for the study. The results are as follows:

For each variable, the null-hypothesis table reported with the results gives the percentages that would be expected if there were no relationship between education level and the immigration variable, that is, if the two were completely independent. For example, under the null hypothesis for immigrant contribution, the share answering “Mostly a drain” would be thirty-seven percent in both education groups. If the actual distribution matches these values, we fail to reject the null hypothesis; if the numbers differ significantly, we reject it.
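To make this comparison concrete, the proportions expected under independence can be computed from a table's margins. The sketch below uses invented counts, not the survey counts; `chisq.test()` and its `expected` component are standard base R.

```r
# Hypothetical counts (rows: view, columns: education level); not survey data.
tab <- matrix(c(30, 10, 10,      # No HS
                70, 110, 20),    # Post-grad
              nrow = 3,
              dimnames = list(
                view     = c("Mostly a drain", "Mostly make a contribution",
                             "Neither"),
                edulevel = c("No HS", "Post-grad")))

# Observed column proportions: share of each education group giving each answer.
observed <- prop.table(tab, margin = 2)

# Under independence, every education group shares the overall distribution.
marginal <- prop.table(rowSums(tab))

# Expected counts under the null hypothesis, as used by the chi-squared test.
expected <- chisq.test(tab)$expected

round(observed, 2)
round(marginal, 2)
round(prop.table(expected, margin = 2), 2)  # each column equals the marginal
```

The further the observed column proportions sit from the marginal proportions, the larger the chi-squared statistic becomes.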

After recoding, all of these variables were analyzed to determine whether there was a statistically significant relationship between the independent variable, education level, and the dependent variable, views on immigration. The summarize command was used to compute the average feeling towards immigrants, on the zero-to-one-hundred scale, for each educational level. A t-test was also conducted to test for statistical significance. See the table and output below:

Average Feelings Towards Immigrants and t-Test for Significance

# Mean feeling-thermometer score for each education level
votecensus %>%
  group_by(edulevel) %>%
  summarize(avg_ft_immig = mean(immi_feel, na.rm = TRUE)) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover"))

| edulevel | avg_ft_immig |
|---|---|
| No HS | 46.81176 |
| Post-grad | 68.44562 |

t.test(immi_feel~edulevel, data = votecensus)
## 
##  Welch Two Sample t-test
## 
## data:  immi_feel by edulevel
## t = -6.6384, df = 98.701, p-value = 0.000000001731
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -28.10046 -15.16726
## sample estimates:
##     mean in group No HS mean in group Post-grad 
##                46.81176                68.44562
table(votecensus$edulevel)%>%
  prop.table()%>%
  round(2)%>%
  kable()%>%
  kable_styling(bootstrap_options = c("striped", "hover"))

| Var1 | Freq |
|---|---|
| No HS | 0.2 |
| Post-grad | 0.8 |
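
The Welch statistic, degrees of freedom, and p-value in the t-test output above can be reproduced by hand. The sketch below does so on synthetic scores and checks the result against `t.test()`. The data are randomly generated and the group sizes are assumptions chosen for illustration, so the numbers will not match the survey output.

```r
set.seed(1)
# Synthetic feeling-thermometer scores clipped to 0-100; NOT the survey responses.
no_hs     <- pmin(pmax(rnorm(85,  mean = 47, sd = 28), 0), 100)
post_grad <- pmin(pmax(rnorm(340, mean = 68, sd = 25), 0), 100)

# Welch's t: difference in means over the unpooled standard error.
v1 <- var(no_hs) / length(no_hs)
v2 <- var(post_grad) / length(post_grad)
t_stat <- (mean(no_hs) - mean(post_grad)) / sqrt(v1 + v2)

# Welch-Satterthwaite approximation to the degrees of freedom.
df <- (v1 + v2)^2 /
  (v1^2 / (length(no_hs) - 1) + v2^2 / (length(post_grad) - 1))

# Two-sided p-value.
p_value <- 2 * pt(-abs(t_stat), df)

# Cross-check against the built-in test (Welch is the default: var.equal = FALSE).
builtin <- t.test(no_hs, post_grad)
c(by_hand = t_stat, builtin = unname(builtin$statistic))
```

Because the two groups differ in both size and variance, the unpooled Welch form is a reasonable default here, which is also why the reported degrees of freedom are not a whole number.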

Taking fifty, the midpoint of the scale, as neutral, the averages suggest a relationship between education and feelings towards immigrants. Respondents who did not attend high school averaged about 47, slightly on the negative side of the scale, while those who went to a post-graduate program averaged about 68, on the positive side. The p-value from the t-test is less than .05, which means the difference in average feeling between the two education levels is statistically significant. Note also that the two groups are unequal in size: of the respondents retained for the analysis, eighty percent are post-graduates and twenty percent have no high school.

Next, a sampling distribution was simulated for each group by drawing a random sample of forty respondents, taking its average, and repeating this ten thousand times. Plotting the resulting averages shows the approximately normal distribution of the sample mean. This was done both for those who did not attend high school and for those who attended a post-graduate program. See the histograms below:

Sample Distributions for No High School and Post Grads

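# Respondents with no high school
voternoHS <- votecensus %>%
  filter(edulevel == "No HS")

# Sampling distribution: means of 10,000 random samples of 40 scores
replicate(10000, sample(voternoHS$immi_feel, 40) %>% mean(na.rm = TRUE)) %>%
  data.frame() %>%
  rename("mean" = 1) %>%
  ggplot() +
  geom_histogram(aes(x = mean), fill = "red")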
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

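# Respondents who attended a post-graduate program
voterPostGrad <- votecensus %>%
  filter(edulevel == "Post-grad")

# Sampling distribution: means of 10,000 random samples of 40 scores
replicate(10000, sample(voterPostGrad$immi_feel, 40) %>% mean(na.rm = TRUE)) %>%
  data.frame() %>%
  rename("mean" = 1) %>%
  ggplot() +
  geom_histogram(aes(x = mean), fill = "blue")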
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
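
The resampling behind the two histograms above can be checked numerically on synthetic data. In this sketch the vector `scores` is randomly generated and stands in for one education group; it is not the survey data. The simulated sample means should centre on the group mean, and their spread should be close to the usual standard error.

```r
set.seed(42)
# Synthetic scores standing in for one education group; not survey data.
scores <- pmin(pmax(rnorm(500, mean = 47, sd = 28), 0), 100)

# Draw 10,000 samples of 40 (without replacement) and keep each sample's mean.
sample_means <- replicate(10000, mean(sample(scores, 40)))

# The sample means centre on the group mean.
c(mean_of_means = mean(sample_means), group_mean = mean(scores))

# Their spread is close to sd / sqrt(n), shrunk slightly by the
# finite-population correction because sampling is without replacement.
n <- 40
N <- length(scores)
theoretical_se <- sd(scores) / sqrt(n) * sqrt(1 - n / N)
c(simulated_se = sd(sample_means), theoretical_se = theoretical_se)
```

One detail of the original code worth keeping in mind: because `mean(na.rm = TRUE)` is applied after `sample()`, any missing values among the forty drawn scores are dropped, so some of the averages may be based on fewer than forty responses.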

The sampling distribution for those who did not attend high school is shown in red. The sample means for this group fall mostly between thirty and sixty-five, and most of the distribution lies below the neutral mark of fifty, which indicates that those who did not attend high school tend to have negative feelings towards immigrants. The sampling distribution for those who attended a post-graduate program is shown in blue. Here the sample means fall between sixty and eighty, with most of the distribution at sixty-five or above, which indicates that those who attended post-graduate programs tend to have positive feelings towards immigrants.

Next, a series of crosstabs was conducted between the categorical variable education level and the other immigration-related categorical variables to determine whether the actual frequency distribution differs from the distribution expected under the null hypothesis. Additionally, chi-square tests were conducted to determine in each instance whether the two categorical variables were associated. The first crosstab was between educational level and the recoded variable “immigstance”, which records how important the issue of immigration is to respondents. See below:

Immigration Importance Issue

table(votecensus$immigstance, votecensus$edulevel)%>%
  prop.table(2)%>%
  round(2)%>%
  kable()%>%
  kable_styling(bootstrap_options = c("striped", "hover"))

| | No HS | Post-grad |
|---|---|---|
| Not Very Important | 0.16 | 0.14 |
| Somewhat Important | 0.24 | 0.43 |
| Unimportant | 0.06 | 0.02 |
| Very Important | 0.54 | 0.41 |

 chisq.test(votecensus$edulevel, votecensus$immigstance)
## Warning in chisq.test(votecensus$edulevel, votecensus$immigstance): Chi-squared
## approximation may be incorrect
## 
##  Pearson's Chi-squared test
## 
## data:  votecensus$edulevel and votecensus$immigstance
## X-squared = 13.891, df = 3, p-value = 0.003058
 table(votecensus$immigstance)%>%
  prop.table()%>%
  round(2)%>%
  kable()%>%
  kable_styling(bootstrap_options = c("striped", "hover"))

| Var1 | Freq |
|---|---|
| Not Very Important | 0.14 |
| Somewhat Important | 0.41 |
| Unimportant | 0.03 |
| Very Important | 0.42 |
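
The warning printed with the chi-squared test above ("Chi-squared approximation may be incorrect") is issued by R when some expected cell counts fall below five. One common cross-check, sketched here with invented counts rather than the survey data, is to request a Monte Carlo p-value through the `simulate.p.value` argument of `chisq.test()`.

```r
set.seed(2018)
# Invented counts with one sparse row; NOT the survey data.
tab <- matrix(c(10, 20, 3, 40,       # No HS
                40, 130, 9, 120),    # Post-grad
              nrow = 4,
              dimnames = list(
                importance = c("Not Very Important", "Somewhat Important",
                               "Unimportant", "Very Important"),
                edulevel   = c("No HS", "Post-grad")))

# Asymptotic test; it warns here because an expected count is below 5.
asymptotic <- suppressWarnings(chisq.test(tab))
min(asymptotic$expected)

# Monte Carlo p-value: does not rely on the large-sample approximation.
simulated <- chisq.test(tab, simulate.p.value = TRUE, B = 10000)

c(asymptotic = asymptotic$p.value, simulated = simulated$p.value)
```

If the two p-values lead to the same decision, the warning is unlikely to change the conclusion; if they disagree, the simulated value is usually the safer one to report.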

The crosstabulation shows that fifty-four percent of those who did not attend high school considered the issue of immigration “very important,” compared with forty-one percent of those who attended a post-graduate program. Post-graduates also had about forty-three percent in the “somewhat important” category. If the “somewhat important” and “very important” categories are added together, the total is seventy-eight percent for those who did not attend high school and eighty-four percent for post-graduates. The disparity is small, with post-graduates slightly more likely to rate the issue as at least somewhat important. The chi-square test indicates a statistically significant association between respondents' educational level and the importance they assign to immigration, as the p-value of 0.003058 is less than .05. The fifty-four percent “very important” share among those who did not attend high school is also above the forty-two percent expected under the null hypothesis.

The next crosstabulation examines whether respondents' views on immigrants' contribution to the United States economy are related to their educational level. See the table below:

Immigration Contribution

table(votecensus$immigcon, votecensus$edulevel)%>%
  prop.table(2)%>%
  round(2)%>%
  kable()%>%
  kable_styling(bootstrap_options = c("striped", "hover"))

| | No HS | Post-grad |
|---|---|---|
| Don’t know | 0.12 | 0.04 |
| Mostly a drain | 0.59 | 0.35 |
| Mostly make a contribution | 0.16 | 0.53 |
| Neither | 0.12 | 0.08 |

chisq.test(votecensus$edulevel, votecensus$immigcon)
## Warning in chisq.test(votecensus$edulevel, votecensus$immigcon): Chi-squared
## approximation may be incorrect
## 
##  Pearson's Chi-squared test
## 
## data:  votecensus$edulevel and votecensus$immigcon
## X-squared = 26.118, df = 3, p-value = 0.000009011
table(votecensus$immigcon)%>%
  prop.table()%>%
  round(2)%>%
  kable()%>%
  kable_styling(bootstrap_options = c("striped", "hover"))

| Var1 | Freq |
|---|---|
| Don’t know | 0.05 |
| Mostly a drain | 0.37 |
| Mostly make a contribution | 0.49 |
| Neither | 0.08 |

As shown in the table, thirty-five percent of those who went to a post-graduate program believe that immigrants are mostly a drain on the American economy, while fifty-three percent believe that immigrants mostly make a contribution. Among those who did not attend high school, only sixteen percent believe that immigrants mostly make a contribution, while fifty-nine percent believe that immigrants are mostly a drain. There is clearly an association between the independent variable, education, and the dependent variable, views on immigration. For both “mostly a drain” and “mostly make a contribution,” the observed values differ from those expected under the null hypothesis (thirty-seven and forty-nine percent), most sharply for those who did not attend high school.

The next crosstabulation examines respondents' views on creating a legal way for illegal immigrants to become naturalized citizens. See the table below:

Illegal Immigrant Naturalization

table(votecensus$immignatural, votecensus$edulevel)%>%
  prop.table(2)%>%
  round(2)%>%
  kable()%>%
  kable_styling(bootstrap_options = c("striped", "hover"))

| | No HS | Post-grad |
|---|---|---|
| Don’t know | 0.01 | 0.01 |
| Favor | 0.44 | 0.68 |
| Oppose | 0.55 | 0.31 |

chisq.test(votecensus$edulevel, votecensus$immignatural)
## Warning in chisq.test(votecensus$edulevel, votecensus$immignatural): Chi-squared
## approximation may be incorrect
## 
##  Pearson's Chi-squared test
## 
## data:  votecensus$edulevel and votecensus$immignatural
## X-squared = 20.055, df = 2, p-value = 0.00004417
table(votecensus$immignatural)%>%
  prop.table()%>%
  round(2)%>%
  kable()%>%
  kable_styling(bootstrap_options = c("striped", "hover"))

| Var1 | Freq |
|---|---|
| Don’t know | 0.01 |
| Favor | 0.66 |
| Oppose | 0.33 |

The crosstabulation between education level and the recoded variable “immignatural” indicates that fifty-five percent of those who did not attend high school oppose creating a way for illegal immigrants to become United States citizens, while sixty-eight percent of those who attended a post-graduate program favor it. Here again the actual distribution differs from that expected under the null hypothesis, indicating an association.

The final crosstabulation examines whether there is any connection between a respondent's education level and their views on making immigration to the United States harder. The results were as follows:

Making Immigration Harder

table(votecensus$immigharder, votecensus$edulevel)%>%
  prop.table(2)%>%
  round(2)%>%
  kable()%>%
  kable_styling(bootstrap_options = c("striped", "hover"))

| | No HS | Post-grad |
|---|---|---|
| Don’t know | 0.04 | 0.04 |
| Much easier | 0.04 | 0.13 |
| Much harder | 0.43 | 0.14 |
| No change | 0.22 | 0.29 |
| Slightly easier | 0.09 | 0.23 |
| Slightly harder | 0.18 | 0.16 |

chisq.test(votecensus$edulevel, votecensus$immigharder)
## Warning in chisq.test(votecensus$edulevel, votecensus$immigharder): Chi-squared
## approximation may be incorrect
## 
##  Pearson's Chi-squared test
## 
## data:  votecensus$edulevel and votecensus$immigharder
## X-squared = 54.542, df = 5, p-value = 0.0000000001622
table(votecensus$immigharder)%>%
  prop.table()%>%
  round(2)%>%
  kable()%>%
  kable_styling(bootstrap_options = c("striped", "hover"))

| Var1 | Freq |
|---|---|
| Don’t know | 0.04 |
| Much easier | 0.12 |
| Much harder | 0.17 |
| No change | 0.28 |
| Slightly easier | 0.22 |
| Slightly harder | 0.17 |

Of those who did not attend high school, forty-three percent wanted it to be much harder for immigrants to immigrate to the United States, compared with fourteen percent of post-graduates. Thirteen percent of those who attended post-graduate programs wanted immigration to be much easier, compared with four percent of those with no high school. The gap was smaller in the “no change” category, where both groups fall between twenty and thirty percent.

Chi-square tests were conducted for all of the crosstabulations above, and in each instance the p-value was less than .05, which indicates a statistically significant association between each categorical variable and the independent variable, education. Once again the actual frequency distribution differs from that expected under the null hypothesis, so these variables are not independent of each other. In each instance, comparing the null-hypothesis table with the actual distribution indicates an association between education level and the immigration variables selected.

Taken together, the crosstabs, tables, and averages indicate that a respondent's educational level is significantly related to how they view immigration. Among the two groups compared, respondents with more schooling are more likely to favor immigrants and immigration. This could be due to access to knowledge: the more informed an individual is, the more rational and open-minded their reasoning might be.
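
Since the same chi-square test is repeated for each crosstabulation, the p-values can also be collected in one pass. The sketch below assumes a data frame with the recoded column names from the text, but the rows are randomly generated and the answers are independent of education by construction, so the p-values it prints say nothing about the survey.

```r
set.seed(7)
n <- 400
# Synthetic data frame; column names follow the text, values are random.
demo <- data.frame(
  edulevel     = sample(c("No HS", "Post-grad"), n, replace = TRUE,
                        prob = c(0.2, 0.8)),
  immigcon     = sample(c("Mostly a drain", "Mostly make a contribution",
                          "Neither"), n, replace = TRUE),
  immignatural = sample(c("Favor", "Oppose"), n, replace = TRUE)
)

outcomes <- c("immigcon", "immignatural")

# One chi-squared test per outcome, collecting the p-values.
p_values <- sapply(outcomes, function(v) {
  chisq.test(table(demo[[v]], demo$edulevel))$p.value
})
p_values
```

Collecting the results this way makes it easy to compare all p-values against the .05 threshold in a single table.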

Conclusion

In closing, this study finds a statistically significant association between respondents' education level and their views and attitudes towards immigration. Based on the data, respondents who spent longer completing their education were more tolerant of immigrants and immigration. Their access to information may make their thinking more rational and, as a result, more open-minded. This was determined by comparing respondents at the lowest educational level in the study, those without a high school diploma, with those at the highest educational level, those who went to a post-graduate program. For every variable selected, the chi-square tests and the t-test produced a p-value less than .05, so the association is deemed statistically significant. The independent variable, X = education level, is associated with the dependent variable, Y = views on immigration. Therefore, the null hypothesis is rejected.