Research Question

Is there a statistically significant difference between females and males in their mean feeling toward feminism?

Variables

Independent Variable

gender

This variable splits respondents into two groups: “Female” and “Male”.

Dependent Variable

ft_fem_2017

Respondents rated their feeling toward feminism on a 0-100 scale: 100 indicates an absolutely positive feeling, 50 indicates a neutral feeling, and 0 indicates an absolutely negative feeling.

Data Prep

library(readr)
library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

library(ggplot2)
Voter_Dataset<- read_csv("Abbreviated Voter Dataset Labeled.csv")

## 
## ── Column specification ─────────────────────────────────────────────────────────
## cols(
##   .default = col_character(),
##   NumChildren = col_double(),
##   Immigr_Economy_GiveTake = col_double(),
##   ft_fem_2017 = col_double(),
##   ft_immig_2017 = col_double(),
##   ft_police_2017 = col_double(),
##   ft_dem_2017 = col_double(),
##   ft_rep_2017 = col_double(),
##   ft_evang_2017 = col_double(),
##   ft_muslim_2017 = col_double(),
##   ft_jew_2017 = col_double(),
##   ft_christ_2017 = col_double(),
##   ft_gays_2017 = col_double(),
##   ft_unions_2017 = col_double(),
##   ft_altright_2017 = col_double(),
##   ft_black_2017 = col_double(),
##   ft_white_2017 = col_double(),
##   ft_hisp_2017 = col_double()
## )
## ℹ Use `spec()` for the full column specifications.

head(Voter_Dataset)

## # A tibble: 6 x 53
##   gender race  education familyincome children region urbancity Vote2012
##   <chr>  <chr> <chr>     <chr>        <chr>    <chr>  <chr>     <chr>   
## 1 Female White 4-year    Prefer not … No       West   Suburb    Barack …
## 2 Female White Some Col… $60K-$69,999 No       West   Rural Ar… Mitt Ro…
## 3 Male   White High Sch… $50K-$59,999 No       Midwe… City      Mitt Ro…
## 4 Male   White Some Col… $70K-$79,999 No       South  City      Barack …
## 5 Male   White 4-year    $40K-$49,999 No       South  Suburb    Mitt Ro…
## 6 Female White 2-year    $30K-$39,999 No       West   Suburb    Barack …
## # … with 45 more variables: Vote2016 <chr>, TrumpSanders <chr>,
## #   PartyRegistration <chr>, PartyIdentification <chr>,
## #   PartyIdentification2 <chr>, PartyIdentification3 <chr>,
## #   NewsPublicAffairs <chr>, DemPrimary <chr>, RepPrimary <chr>,
## #   ImmigrantContributions <chr>, ImmigrantNaturalization <chr>,
## #   ImmigrationShouldBe <chr>, Abortion <chr>, GayMarriage <chr>,
## #   DeathPenalty <chr>, DeathPenaltyFreq <chr>, TaxWealthy <chr>,
## #   Healthcare <chr>, GlobWarmExist <chr>, GlobWarmingSerious <chr>,
## #   AffirmativeAction <chr>, Religion <chr>, ReligiousImportance <chr>,
## #   ChurchAttendance <chr>, PrayerFrequency <chr>, NumChildren <dbl>,
## #   areatype <chr>, GunOwnership <chr>, EconomyBetterWorse <chr>,
## #   Immigr_Economy_GiveTake <dbl>, ft_fem_2017 <dbl>, ft_immig_2017 <dbl>,
## #   ft_police_2017 <dbl>, ft_dem_2017 <dbl>, ft_rep_2017 <dbl>,
## #   ft_evang_2017 <dbl>, ft_muslim_2017 <dbl>, ft_jew_2017 <dbl>,
## #   ft_christ_2017 <dbl>, ft_gays_2017 <dbl>, ft_unions_2017 <dbl>,
## #   ft_altright_2017 <dbl>, ft_black_2017 <dbl>, ft_white_2017 <dbl>,
## #   ft_hisp_2017 <dbl>

gen_fem<-Voter_Dataset%>%
  select(gender,ft_fem_2017)%>%
  filter(!is.na(ft_fem_2017))

Comparison of Means

Table

Null Hypothesis

gen_fem%>%
  summarise(avg_ft_fem=mean(ft_fem_2017))

## # A tibble: 1 x 1
##   avg_ft_fem
##        <dbl>
## 1       52.1

If gender makes no difference in a person’s feeling toward feminism, the group-wise averages for both “Female” and “Male” respondents are expected to be near the overall average of 52.1.
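To illustrate what "near the overall average" looks like when gender truly has no effect, here is a minimal sketch on synthetic data. The data frame `toy`, its `rating` column, and the normal-shaped ratings are assumptions made for illustration; they are not the survey data.

```r
# Synthetic stand-in for gen_fem (NOT the survey data): ratings on a 0-100
# scale, with gender labels assigned independently of the rating.
set.seed(1)
n <- 2000
toy <- data.frame(
  gender = sample(c("Female", "Male"), n, replace = TRUE),
  rating = pmin(pmax(round(rnorm(n, mean = 52, sd = 30)), 0), 100)
)

overall_mean <- mean(toy$rating)
group_means <- tapply(toy$rating, toy$gender, mean)

print(overall_mean)
print(group_means)

# Gap between each group mean and the overall mean. With no gender effect,
# these gaps reflect only sampling noise and should be small.
gaps <- group_means - overall_mean
print(gaps)
```

Because the labels here are unrelated to the ratings, both group means land close to the overall mean. Gaps much larger than this kind of noise are what would cast doubt on the null hypothesis.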

Actual Observations

gen_fem%>%
  group_by(gender)%>%
  summarise(avg_ft_fem=mean(ft_fem_2017))

## `summarise()` ungrouping output (override with `.groups` argument)

## # A tibble: 2 x 2
##   gender avg_ft_fem
##   <chr>       <dbl>
## 1 Female       58.3
## 2 Male         45.3

Visualization

gen_fem%>%
  group_by(gender)%>%
  summarise(avg_ft_fem=mean(ft_fem_2017))%>%
  ggplot()+geom_col(aes(x=gender,y=avg_ft_fem,fill=gender))

## `summarise()` ungrouping output (override with `.groups` argument)

Interpretation

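As the actual observations table and the chart above show, the group-wise means sit on opposite sides of the overall mean of 52.1, at similar distances from it: females average 58.3 (about 6.2 points above) and males average 45.3 (about 6.8 points below). The observed means therefore do not support the conclusion that there is no difference between the groups. Whether this gap of about 13 points is statistically significant still requires a formal test.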
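The distances from the overall mean can be worked out directly from the reported figures. The sketch below uses the unrounded group means printed later in the report and the rounded overall mean of 52.1, so the implied group share is only approximate.

```r
# Figures copied from the report's output (the overall mean is rounded to 1 dp)
overall_mean <- 52.1
female_mean <- 58.34626
male_mean <- 45.34614

# Signed distance of each group mean from the overall mean
dev_female <- female_mean - overall_mean   # positive: above the overall mean
dev_male <- male_mean - overall_mean       # negative: below the overall mean
print(c(female = dev_female, male = dev_male))

# The overall mean is a weighted average of the two group means, so the
# weights can be backed out. Roughly 0.52 here, i.e. slightly more
# female than male respondents, which is why the distances are not equal.
share_female <- (overall_mean - male_mean) / (female_mean - male_mean)
print(share_female)
```

The two distances differ slightly because the groups are not exactly the same size: the overall mean is pulled a little toward the larger group.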

Comparison of Distributions

Visualization

gen_fem%>%
  ggplot()+ geom_histogram(aes(x=ft_fem_2017,fill=gender))+facet_wrap(~gender)

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Interpretation

The distributions of individual ratings for females and males in this sample differ in their skew. Female respondents’ ratings are concentrated toward the right (high, positive) end of the scale, whereas male respondents’ ratings are concentrated more toward the left (low, negative) end.
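A reading of skew from a histogram can be checked numerically. The sketch below defines a simple moment-based skewness function; `skewness()` here is a hypothetical helper written for this example, not a base R function, and the toy vectors are made up. Note that the formal convention names skew after the long tail: values piled up at the high end with a tail toward low values give negative (left) skew.

```r
# Moment-based sample skewness.
# Negative: longer tail toward low values (left-skewed).
# Positive: longer tail toward high values (right-skewed).
skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / (mean((x - m)^2))^1.5
}

# Toy ratings (hypothetical, not the survey data)
mostly_high <- c(5, 40, 70, 85, 90, 95, 100, 100)  # mass at the high end
mostly_low <- 100 - mostly_high                    # mirror image

print(skewness(mostly_high))  # negative
print(skewness(mostly_low))   # positive
```

The same function could be applied to each gender's `ft_fem_2017` values to confirm the direction seen in the histograms.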

Sampling Distribution

Females

F_gen_fem<-gen_fem%>%
  filter(gender=="Female")

mean(F_gen_fem$ft_fem_2017)

## [1] 58.34626

# Draw 10,000 random samples of 40 female respondents and plot the sample means
replicate(10000,
          sample(F_gen_fem$ft_fem_2017, 40) %>%
            mean(na.rm = TRUE)) %>%
  data.frame() %>%
  rename("mean" = 1) %>%
  ggplot() + geom_histogram(aes(x = mean), fill = "salmon")

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Males

M_gen_fem<-gen_fem%>%
  filter(gender=="Male")

mean(M_gen_fem$ft_fem_2017)

## [1] 45.34614

# Draw 10,000 random samples of 40 male respondents and plot the sample means
replicate(10000,
          sample(M_gen_fem$ft_fem_2017, 40) %>%
            mean(na.rm = TRUE)) %>%
  data.frame() %>%
  rename("mean" = 1) %>%
  ggplot() + geom_histogram(aes(x = mean), fill = "turquoise3")

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
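The simulations above build sampling distributions of the mean for samples of 40. As a rough check on what such a simulation should produce, the sketch below compares the spread of simulated sample means with the usual standard-error formula, sd / sqrt(n). The `population` vector is a hypothetical stand-in for one gender's ratings, not the survey data.

```r
set.seed(42)

# Hypothetical population of 0-100 ratings (stand-in for one gender's ft_fem_2017)
population <- sample(0:100, 3000, replace = TRUE)
n <- 40

# Same recipe as the report: 10,000 samples of size 40, keep each sample mean
sample_means <- replicate(10000, mean(sample(population, n)))

simulated_se <- sd(sample_means)             # spread of the simulated means
theoretical_se <- sd(population) / sqrt(n)   # standard-error formula

print(c(simulated = simulated_se, theoretical = theoretical_se))
print(c(mean_of_means = mean(sample_means), population_mean = mean(population)))
```

The simulated means center on the population mean, and their spread is close to sd / sqrt(n). Even when individual ratings are far from bell-shaped, the histogram of sample means tends toward a roughly normal shape, which is what justifies using a t-test on the group means.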

T-test

options(scipen = 999)
t.test(ft_fem_2017~gender,data=gen_fem)

## 
##  Welch Two Sample t-test
## 
## data:  ft_fem_2017 by gender
## t = 13.787, df = 4682.9, p-value < 0.00000000000000022
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  11.15155 14.84871
## sample estimates:
## mean in group Female   mean in group Male 
##             58.34626             45.34614

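The Welch two-sample t-test produced t = 13.787 on about 4,683 degrees of freedom, with a p-value below 2.2e-16, far less than 0.05. The estimated difference in means (female minus male) is about 13.0 points, and its 95 percent confidence interval, 11.15 to 14.85, excludes 0. Therefore, there is a statistically significant difference between the mean feeling toward feminism of females and males.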
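To show where the numbers in a Welch t-test come from, here is a minimal sketch that computes the statistic, degrees of freedom, p-value, and confidence interval by hand and compares them with `t.test()`. The vectors `a` and `b` are simulated for illustration and are not the survey data.

```r
set.seed(7)

# Hypothetical ratings for two groups (not the survey data)
a <- rnorm(500, mean = 58, sd = 30)
b <- rnorm(450, mean = 45, sd = 33)

# Welch statistic: difference in means over the unpooled standard error
va <- var(a) / length(a)
vb <- var(b) / length(b)
se_diff <- sqrt(va + vb)
t_manual <- (mean(a) - mean(b)) / se_diff

# Welch-Satterthwaite approximation to the degrees of freedom
df_manual <- (va + vb)^2 / (va^2 / (length(a) - 1) + vb^2 / (length(b) - 1))

# Two-sided p-value and 95% confidence interval for the difference
p_manual <- 2 * pt(-abs(t_manual), df_manual)
ci_manual <- (mean(a) - mean(b)) + c(-1, 1) * qt(0.975, df_manual) * se_diff

# t.test() uses the Welch version by default (var.equal = FALSE)
fit <- t.test(a, b)

print(c(t = t_manual, df = df_manual, p = p_manual))
print(ci_manual)
print(fit)
```

The hand-computed values match the `t.test()` output. Because the Welch version does not assume equal variances in the two groups, the degrees of freedom are generally not a whole number, as in the report's df of 4682.9.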

Analysis of Continuous Data

Victoria Sparandera
