Title: Homework 10
Author: Brandon Flores
Date: Nov. 17th, 2021
For this analysis, the dependent variable is a "feeling scale" from 0 to 100 on which respondents rate Planned Parenthood. The key predictor variables are age, gender, and cohabitation. Each predictor is entered in a separate OLS regression model in order to observe its association with feelings toward Planned Parenthood.

The first regression model is between age and feelings toward Planned Parenthood. When the x variable is 0, the intercept rests at a rating of about 76.6. Because the sample only includes respondents aged 18 and over, this value is an extrapolation, but it points to a relatively high baseline rating among younger respondents. The association between the two variables is statistically significant at the .001 (***) level, with a slope coefficient of -0.21599: with every one-year increase in age, the rating of Planned Parenthood decreases by about 0.216 points. This negative association shows that feelings toward Planned Parenthood decrease as age increases. The model explains little of the variation in the dependent variable, with an adjusted R-squared of 0.01538 (multiple R-squared of 0.01568). In other words, age explains about 1.5% of the variance in Planned Parenthood ratings. Although the findings are statistically significant, age is of little substantive influence given the extremely small R-squared. Still, the model indicates that older respondents tend to favor Planned Parenthood less than younger age groups do. This is consistent with the idea that political beliefs become more conservative with age; older age groups are usually more conservative in ideology than their younger counterparts.
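
To make the age slope concrete, the sketch below plugs a few ages into the fitted line using the intercept and slope reported in the model summary. The helper name `predict_rating` and the three example ages are assumptions chosen for illustration; they are not part of the original analysis.

```r
# Coefficients copied from the age model summary in this report
intercept <- 76.63399
slope_age <- -0.21599

# Hypothetical helper: predicted rating at a given age
predict_rating <- function(age) intercept + slope_age * age

ages <- c(18, 40, 80)               # illustrative ages within the adult sample
predicted <- predict_rating(ages)
print(round(predicted, 1))          # 72.7 68.0 59.4
```

The predicted rating at age 18 (about 72.7) is a more meaningful baseline than the intercept, since no respondent in the filtered sample is younger than 18.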

The second regression model covers the association between feelings toward Planned Parenthood and gender. Gender was dummy coded with men as 0 and women as 1 and then placed in the model, which showed a statistically significant relationship at the .001 (***) level. The intercept is about 60.6, meaning that men, the baseline group, rated Planned Parenthood at about 60.6 on average. The slope coefficient is 9.0234. This shows a positive association: moving from men to women, the rating of Planned Parenthood increases by about 9 points. The adjusted R-squared is 0.01882 (multiple R-squared of 0.01912), meaning that about 1.9% of the variance in Planned Parenthood ratings is explained by gender. Again a small R-squared is observed, yet the association is statistically significant: women tend to give Planned Parenthood higher, more favorable ratings than their male counterparts. This is consistent with the literature, in that Planned Parenthood is associated with expanded women's rights and is therefore viewed more favorably from women's perspective. The statistically significant relationship also signals that men do not rate Planned Parenthood as highly as women do, possibly because of more conservative viewpoints.
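
With a single dummy-coded predictor, the intercept is the mean of the baseline group and the slope is the difference between the two group means. In the gender output, 60.5701 + 9.0234 = 69.5935, which matches the group-2 mean returned by `tapply`. The sketch below shows the same identity on a small made-up data frame; the `toy` values are invented for illustration and are not ANES data.

```r
# Toy data (made up for illustration; NOT taken from the ANES file)
toy <- data.frame(
  rating = c(50, 60, 70, 65, 75, 85),
  group  = factor(c(1, 1, 1, 2, 2, 2))   # level 1 is the baseline group
)

# Default treatment coding: group 1 -> 0, group 2 -> 1
fit <- lm(rating ~ group, data = toy)

# Group means, computed the same way as the tapply() call in this report
group_means <- tapply(toy$rating, toy$group, mean)

print(coef(fit))      # intercept = mean of group 1; slope = mean difference
print(group_means)
```

Because the slope is only a difference in means, the same comparison could also be run as a two-sample t-test; the regression form is kept here to match the report.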

For the third regression model, an OLS regression was conducted between ratings of Planned Parenthood and cohabitation status. Cohabitation was dummy coded with 0 for living with a partner and 1 for not living with a partner. This relationship is statistically significant at the .001 (***) level and shows a negative association. When x is zero, the intercept is about 73, which means that those who cohabitate have an average Planned Parenthood rating of about 73. The slope coefficient of -9.213 shows that moving from those who cohabitate to those who do not, the rating of Planned Parenthood drops by about 9 points, to about 64. The adjusted R-squared of this model is 0.01258 (multiple R-squared of 0.01288), so only about 1.3% of the variance in Y is explained by the variable in the model. This again is a small R-squared. The model shows that those who do not cohabitate tend to rate Planned Parenthood less favorably than their cohabitating counterparts, who show high favor toward Planned Parenthood. This could be because those in cohabitating relationships may be more pro-choice and may need contraceptives more than those who are not living with a partner. Also, those who cohabitate may hold more liberal views toward Planned Parenthood, since more conservatively religious persons may not cohabitate because of their norms and may rate Planned Parenthood lower because of those same religious norms.
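
The R-squared values quoted in the three paragraphs above (0.01538, 0.01882, 0.01258) match the "Adjusted R-squared" lines of the model summaries, not the "Multiple R-squared" lines. The sketch below shows how the two are related, using the standard adjustment formula and the values printed in the summaries. The function name `adj_r2` is an assumption for illustration, and the sample size is inferred from the 3242 residual degrees of freedom plus the two estimated coefficients.

```r
# Hypothetical helper: adjusted R-squared from multiple R-squared,
# sample size n, and number of predictors p
adj_r2 <- function(r2, n, p = 1) 1 - (1 - r2) * (n - 1) / (n - p - 1)

n <- 3242 + 2   # residual df + intercept + slope (inferred, not stated directly)

# Multiple R-squared values copied from the three model summaries
multiple_r2 <- c(age = 0.01568, gender = 0.01912, cohab = 0.01288)

adjusted <- adj_r2(multiple_r2, n)
print(round(adjusted, 5))   # 0.01538 0.01882 0.01258
```

With more than 3,000 observations and one predictor, the adjustment is tiny, so either figure supports the same conclusion that each predictor alone explains less than 2% of the variance.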
library(haven)
library(janitor)
## 
## Attaching package: 'janitor'
## The following objects are masked from 'package:stats':
## 
##     chisq.test, fisher.test
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)
library(scales)
library(sur)
anes2020<-read_dta("C:\\Users\\BTP\\Downloads\\anes2020.dta")
anes2020 <- filter(anes2020, V202185 >= 0 & V202185 <= 100)
anes2020 <- filter(anes2020, V201600 >= 1)
anes2020 <- filter(anes2020, V201507x >= 18)
anes2020 <- filter(anes2020, V201508 >= 1 & V201508 <= 6)
anes2020 <- filter(anes2020, V201509 >= 1 & V201509 <=2)
anes2020 %>%
  tabyl(V201508)
##  V201508    n    percent
##        3  465 0.14334155
##        4 1048 0.32305795
##        5  124 0.03822441
##        6 1607 0.49537608
anes2020 %>%
ggplot(mapping = aes(V202185))+
geom_histogram()+
ggtitle(label="Planned Parenthood Rating Distribution")+
xlab(label="Planned Parenthood Feelings")
## Don't know how to automatically pick scale for object of type haven_labelled/vctrs_vctr/double. Defaulting to continuous.
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

anes2020 %>%
  ggplot(mapping = aes(V202185)) +
  geom_density() +
  ggtitle(label="Planned Parenthood Distribution") +
  xlab(label="Planned Parenthood Feelings")
## Don't know how to automatically pick scale for object of type haven_labelled/vctrs_vctr/double. Defaulting to continuous.

qqnorm(anes2020$V202185)

ggplot(anes2020) + geom_point(mapping = aes(x=V201507x, y=V202185))
## Don't know how to automatically pick scale for object of type haven_labelled/vctrs_vctr/double. Defaulting to continuous.
## Don't know how to automatically pick scale for object of type haven_labelled/vctrs_vctr/double. Defaulting to continuous.

scatter.smooth(anes2020$V201507x,anes2020$V202185)

lmAgePParent = lm(V202185~V201507x, data = anes2020)
summary(lmAgePParent)
## 
## Call:
## lm(formula = V202185 ~ V201507x, data = anes2020)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -72.746 -18.696   6.973  28.076  40.645 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 76.63399    1.58330  48.401  < 2e-16 ***
## V201507x    -0.21599    0.03005  -7.187  8.2e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 31.73 on 3242 degrees of freedom
## Multiple R-squared:  0.01568,    Adjusted R-squared:  0.01538 
## F-statistic: 51.65 on 1 and 3242 DF,  p-value: 8.205e-13
anes2020 %>% 
  ggplot(mapping=aes(y=V202185, x=factor(V201600)))+
  geom_boxplot()+ 
  ggtitle(label="Distribution of Planned Parenthood Feelings by Gender") +
  xlab(label="Gender (1 = men, 2 = women)") +
  ylab(label="Planned Parenthood Feelings")
## Don't know how to automatically pick scale for object of type haven_labelled/vctrs_vctr/double. Defaulting to continuous.

anes2020$gender.f <- factor(anes2020$V201600)
tapply(anes2020$V202185, anes2020$gender.f, mean)
##        1        2 
## 60.57011 69.59353
contr.treatment(2)
##   2
## 1 0
## 2 1
contrasts(anes2020$gender.f) = contr.treatment(2)
summary(lm(V202185~gender.f, anes2020))
## 
## Call:
## lm(formula = V202185 ~ gender.f, data = anes2020)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -69.59 -19.59   9.43  30.41  39.43 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  60.5701     0.8792  68.894  < 2e-16 ***
## gender.f2     9.0234     1.1351   7.949 2.57e-15 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 31.67 on 3242 degrees of freedom
## Multiple R-squared:  0.01912,    Adjusted R-squared:  0.01882 
## F-statistic: 63.19 on 1 and 3242 DF,  p-value: 2.567e-15
anes2020$cohab.f <- factor(anes2020$V201509)

contr.treatment(2)
##   2
## 1 0
## 2 1
contrasts(anes2020$cohab.f) = contr.treatment(2)
summary(lm(V202185~cohab.f, anes2020))
## 
## Call:
## lm(formula = V202185 ~ cohab.f, data = anes2020)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -73.427 -14.214   5.786  26.573  35.786 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   73.427      1.273  57.678  < 2e-16 ***
## cohab.f2      -9.213      1.416  -6.505 8.95e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 31.78 on 3242 degrees of freedom
## Multiple R-squared:  0.01288,    Adjusted R-squared:  0.01258 
## F-statistic: 42.32 on 1 and 3242 DF,  p-value: 8.952e-11
anes2020 %>% 
  ggplot(mapping=aes(y=V202185, x=factor(cohab.f)))+ 
  geom_boxplot()+
  ggtitle(label="Distribution of Planned Parenthood Feelings by Cohab Status")+
  xlab(label="Cohabitation Status (1 = living with a partner, 2 = not)") +
  ylab(label="Planned Parenthood Feelings")
## Don't know how to automatically pick scale for object of type haven_labelled/vctrs_vctr/double. Defaulting to continuous.