The General Social Survey (GSS) is a continuing study of American public opinion and values that has been running since 1972. Its purpose is to track trends in the attitudes and behaviors of American society based on the information gathered.
The dataset contains 57,061 rows and 114 variables:
## Rows: 57,061
## Columns: 114
## $ caseid <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18~
## $ year <int> 1972, 1972, 1972, 1972, 1972, 1972, 1972, 1972, 1972, 1972, 1~
## $ age <int> 23, 70, 48, 27, 61, 26, 28, 27, 21, 30, 30, 56, 54, 49, 41, 5~
## $ sex <fct> Female, Male, Female, Female, Female, Male, Male, Male, Femal~
## $ race <fct> White, White, White, White, White, White, White, White, Black~
## $ hispanic <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ uscitzn <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ educ <int> 16, 10, 12, 17, 12, 14, 13, 16, 12, 12, 13, 6, 9, 8, 9, 14, 1~
## $ paeduc <int> 10, 8, 8, 16, 8, 18, 16, 16, 12, 10, 12, NA, 5, NA, NA, NA, 8~
## $ maeduc <int> NA, 8, 8, 12, 8, 19, 12, 14, 12, 7, NA, 8, 5, 10, 3, 0, 8, 8,~
## $ speduc <int> NA, 12, 11, 20, 12, NA, NA, NA, NA, 11, 12, 9, 8, NA, 8, 14, ~
## $ degree <fct> Bachelor, Lt High School, High School, Bachelor, High School,~
## $ vetyears <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ sei <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ wrkstat <fct> "Working Fulltime", "Retired", "Working Parttime", "Working F~
## $ wrkslf <fct> Someone Else, Someone Else, Someone Else, Someone Else, Someo~
## $ marital <fct> Never Married, Married, Married, Married, Married, Never Marr~
## $ spwrksta <fct> NA, "Keeping House", "Working Fulltime", "Working Fulltime", ~
## $ sibs <int> 3, 4, 5, 5, 2, 1, 7, 1, 2, 7, 7, 6, 2, 2, 0, 7, 0, 2, 2, 7, 2~
## $ childs <int> 0, 5, 4, 0, 2, 0, 2, 0, 2, 4, 1, 5, 1, 2, 5, 2, 2, 3, 3, 0, 2~
## $ agekdbrn <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ incom16 <fct> Average, Above Average, Average, Average, Below Average, Aver~
## $ born <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ parborn <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ granborn <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ income06 <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ coninc <int> 25926, 33333, 33333, 41667, 69444, 60185, 50926, 18519, 3704,~
## $ region <fct> E. Nor. Central, E. Nor. Central, E. Nor. Central, E. Nor. Ce~
## $ partyid <fct> "Ind,Near Dem", "Not Str Democrat", "Independent", "Not Str D~
## $ polviews <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ relig <fct> Jewish, Catholic, Protestant, Other, Protestant, Protestant, ~
## $ attend <fct> Once A Year, Every Week, Once A Month, NA, NA, Once A Year, E~
## $ natspac <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ natenvir <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ natheal <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ natcity <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ natcrime <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ natdrug <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ nateduc <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ natrace <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ natarms <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ nataid <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ natfare <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ natroad <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ natsoc <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ natmass <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ natpark <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ confinan <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ conbus <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ conclerg <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ coneduc <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ confed <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ conlabor <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ conpress <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ conmedic <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ contv <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ conjudge <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ consci <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ conlegis <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ conarmy <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ joblose <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ jobfind <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ satjob <fct> A Little Dissat, NA, Mod. Satisfied, Very Satisfied, NA, Mod.~
## $ richwork <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ jobinc <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ jobsec <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ jobhour <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ jobpromo <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ jobmeans <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ class <fct> Middle Class, Middle Class, Working Class, Middle Class, Work~
## $ rank <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ satfin <fct> Not At All Sat, More Or Less, Satisfied, Not At All Sat, Sati~
## $ finalter <fct> Better, Stayed Same, Better, Stayed Same, Better, Better, Bet~
## $ finrela <fct> Average, Above Average, Average, Average, Above Average, Abov~
## $ unemp <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ govaid <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ getaid <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ union <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ getahead <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ parsol <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ kidssol <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ abdefect <fct> Yes, Yes, Yes, No, Yes, Yes, Yes, Yes, Yes, Yes, No, No, Yes,~
## $ abnomore <fct> Yes, No, Yes, No, Yes, Yes, No, Yes, No, No, No, No, No, Yes,~
## $ abhlth <fct> Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, NA, No, No,~
## $ abpoor <fct> Yes, No, Yes, Yes, Yes, Yes, No, Yes, No, Yes, NA, Yes, No, Y~
## $ abrape <fct> Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, NA, Yes, NA, No, NA, ~
## $ absingle <fct> Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, No, No, NA, Yes, No, ~
## $ abany <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ pillok <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ sexeduc <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ divlaw <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ premarsx <fct> Not Wrong At All, Always Wrong, Always Wrong, Always Wrong, S~
## $ teensex <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ xmarsex <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ homosex <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ suicide1 <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ suicide2 <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ suicide3 <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ suicide4 <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ fear <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ owngun <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ pistol <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ shotgun <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ rifle <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ news <fct> Everyday, Everyday, Everyday, Once A Week, Everyday, Everyday~
## $ tvhours <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ racdif1 <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ racdif2 <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ racdif3 <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ racdif4 <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ helppoor <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ helpnot <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ helpsick <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ helpblk <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## caseid year age sex race hispanic uscitzn educ paeduc maeduc speduc
## 1 1 1972 23 Female White <NA> <NA> 16 10 NA NA
## 2 2 1972 70 Male White <NA> <NA> 10 8 8 12
## 3 3 1972 48 Female White <NA> <NA> 12 8 8 11
## 4 4 1972 27 Female White <NA> <NA> 17 16 12 20
## 5 5 1972 61 Female White <NA> <NA> 12 8 8 12
## 6 6 1972 26 Male White <NA> <NA> 14 18 19 NA
## degree vetyears sei wrkstat wrkslf marital
## 1 Bachelor <NA> NA Working Fulltime Someone Else Never Married
## 2 Lt High School <NA> NA Retired Someone Else Married
## 3 High School <NA> NA Working Parttime Someone Else Married
## 4 Bachelor <NA> NA Working Fulltime Someone Else Married
## 5 High School <NA> NA Keeping House Someone Else Married
## 6 High School <NA> NA Working Fulltime Someone Else Never Married
## spwrksta sibs childs agekdbrn incom16 born parborn granborn
## 1 <NA> 3 0 NA Average <NA> <NA> NA
## 2 Keeping House 4 5 NA Above Average <NA> <NA> NA
## 3 Working Fulltime 5 4 NA Average <NA> <NA> NA
## 4 Working Fulltime 5 0 NA Average <NA> <NA> NA
## 5 Temp Not Working 2 2 NA Below Average <NA> <NA> NA
## 6 <NA> 1 0 NA Average <NA> <NA> NA
## income06 coninc region partyid polviews relig
## 1 <NA> 25926 E. Nor. Central Ind,Near Dem <NA> Jewish
## 2 <NA> 33333 E. Nor. Central Not Str Democrat <NA> Catholic
## 3 <NA> 33333 E. Nor. Central Independent <NA> Protestant
## 4 <NA> 41667 E. Nor. Central Not Str Democrat <NA> Other
## 5 <NA> 69444 E. Nor. Central Strong Democrat <NA> Protestant
## 6 <NA> 60185 E. Nor. Central Ind,Near Dem <NA> Protestant
## attend natspac natenvir natheal natcity natcrime natdrug nateduc
## 1 Once A Year <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 2 Every Week <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 3 Once A Month <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 4 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 5 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 6 Once A Year <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## natrace natarms nataid natfare natroad natsoc natmass natpark confinan conbus
## 1 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 2 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 3 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 4 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 5 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 6 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## conclerg coneduc confed conlabor conpress conmedic contv conjudge consci
## 1 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 2 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 3 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 4 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 5 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 6 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## conlegis conarmy joblose jobfind satjob richwork jobinc jobsec
## 1 <NA> <NA> <NA> <NA> A Little Dissat <NA> <NA> <NA>
## 2 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 3 <NA> <NA> <NA> <NA> Mod. Satisfied <NA> <NA> <NA>
## 4 <NA> <NA> <NA> <NA> Very Satisfied <NA> <NA> <NA>
## 5 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 6 <NA> <NA> <NA> <NA> Mod. Satisfied <NA> <NA> <NA>
## jobhour jobpromo jobmeans class rank satfin finalter
## 1 <NA> <NA> <NA> Middle Class NA Not At All Sat Better
## 2 <NA> <NA> <NA> Middle Class NA More Or Less Stayed Same
## 3 <NA> <NA> <NA> Working Class NA Satisfied Better
## 4 <NA> <NA> <NA> Middle Class NA Not At All Sat Stayed Same
## 5 <NA> <NA> <NA> Working Class NA Satisfied Better
## 6 <NA> <NA> <NA> Middle Class NA More Or Less Better
## finrela unemp govaid getaid union getahead parsol kidssol abdefect
## 1 Average <NA> <NA> <NA> <NA> <NA> <NA> <NA> Yes
## 2 Above Average <NA> <NA> <NA> <NA> <NA> <NA> <NA> Yes
## 3 Average <NA> <NA> <NA> <NA> <NA> <NA> <NA> Yes
## 4 Average <NA> <NA> <NA> <NA> <NA> <NA> <NA> No
## 5 Above Average <NA> <NA> <NA> <NA> <NA> <NA> <NA> Yes
## 6 Above Average <NA> <NA> <NA> <NA> <NA> <NA> <NA> Yes
## abnomore abhlth abpoor abrape absingle abany pillok sexeduc divlaw
## 1 Yes Yes Yes Yes Yes <NA> <NA> <NA> <NA>
## 2 No Yes No Yes Yes <NA> <NA> <NA> <NA>
## 3 Yes Yes Yes Yes Yes <NA> <NA> <NA> <NA>
## 4 No Yes Yes Yes Yes <NA> <NA> <NA> <NA>
## 5 Yes Yes Yes Yes Yes <NA> <NA> <NA> <NA>
## 6 Yes Yes Yes Yes Yes <NA> <NA> <NA> <NA>
## premarsx teensex xmarsex homosex suicide1 suicide2 suicide3 suicide4
## 1 Not Wrong At All <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 2 Always Wrong <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 3 Always Wrong <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 4 Always Wrong <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 5 Sometimes Wrong <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 6 Sometimes Wrong <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## fear owngun pistol shotgun rifle news tvhours racdif1 racdif2 racdif3
## 1 <NA> <NA> <NA> <NA> <NA> Everyday NA <NA> <NA> <NA>
## 2 <NA> <NA> <NA> <NA> <NA> Everyday NA <NA> <NA> <NA>
## 3 <NA> <NA> <NA> <NA> <NA> Everyday NA <NA> <NA> <NA>
## 4 <NA> <NA> <NA> <NA> <NA> Once A Week NA <NA> <NA> <NA>
## 5 <NA> <NA> <NA> <NA> <NA> Everyday NA <NA> <NA> <NA>
## 6 <NA> <NA> <NA> <NA> <NA> Everyday NA <NA> <NA> <NA>
## racdif4 helppoor helpnot helpsick helpblk
## 1 <NA> <NA> <NA> <NA> <NA>
## 2 <NA> <NA> <NA> <NA> <NA>
## 3 <NA> <NA> <NA> <NA> <NA>
## 4 <NA> <NA> <NA> <NA> <NA>
## 5 <NA> <NA> <NA> <NA> <NA>
## 6 <NA> <NA> <NA> <NA> <NA>
Participants were randomly selected from addresses across the United States, and within each selected household one adult member was randomly chosen to complete the interview. Because random sampling was used, we can assume that everyone in the population had an equal chance of being selected, so the findings can be generalized to the U.S. population. However, this is an observational study, not a randomized experiment: random assignment was not used, so no causal relationship between the variables can be inferred from these findings.
In some countries, women and men do not have an equal chance to attend school. This is especially true of higher education, where fewer women reach the higher degree levels because they were expected to put the family first rather than pursue their own success. But what is the situation in a developed country like the United States? It is interesting to explore whether there is an association between sex and degree level in the United States between 1972 and 2012.
library(dplyr)
# keep only the two variables of interest
degree_df <- select(gss, sex, degree)
# count the total number of NA values across both columns
total_navalues <- sum(is.na(degree_df))
total_navalues
## [1] 1010
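Note that `sum(is.na(...))` counts missing cells, not missing rows. The sketch below shows the difference on a small hypothetical data frame (`toy_df` is made up for illustration and is not the GSS data); the same calls could be applied to `degree_df` to see which column the missing values come from.

```r
# Hypothetical toy data, NOT the GSS: two factor columns with some missing values.
toy_df <- data.frame(
  sex    = factor(c("Male", "Female", NA, "Female", "Male")),
  degree = factor(c("Bachelor", NA, NA, NA, "Graduate"))
)

# Total number of missing cells across all columns.
total_na_cells <- sum(is.na(toy_df))

# Number of missing values in each column.
na_per_column <- colSums(is.na(toy_df))

# Number of rows that contain at least one missing value.
rows_with_na <- sum(!complete.cases(toy_df))

print(total_na_cells)
print(na_per_column)
print(rows_with_na)
```

In the toy data, the third row is missing both values, so the cell count is larger than the row count.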
Let’s look at the counts in a contingency table:
## Cell Contents
## |-------------------------|
## | Count |
## |-------------------------|
##
## ================================================================================
## degree_df$degree
## degr_df$sx Lt Hgh Sch High Schol Junir Cllg Bachelor Graduate Total
## --------------------------------------------------------------------------------
## Male 5153 12340 1272 3822 2091 24678
## --------------------------------------------------------------------------------
## Female 6669 16947 1798 4180 1779 31373
## --------------------------------------------------------------------------------
## Total 11822 29287 3070 8002 3870 56051
## ================================================================================
As shown above, High School is the most common highest degree for both sexes, followed by Lt High School (less than high school). Junior College and Graduate are the least common categories: Junior College is the smallest for men, while for women Graduate (1,779) falls just below Junior College (1,798).
## sex degree
## Male :24678 Lt High School:11822
## Female:31373 High School :29287
## Junior College: 3070
## Bachelor : 8002
## Graduate : 3870
##
## Lt High School High School Junior College Bachelor Graduate
## Male 0.09193413 0.22015664 0.02269362 0.06818790 0.03730531
## Female 0.11898093 0.30234965 0.03207793 0.07457494 0.03173895
More women (31,373) than men (24,678) took part in the survey. Because the proportions above are shares of the whole sample, the female share is larger in most categories: Lt High School, High School, Junior College and Bachelor. Even so, men account for a larger share at the Graduate level (0.037 versus 0.032), which suggests that men in this sample were more likely than women to hold a graduate degree.
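Since the two groups differ in size, proportions computed within each sex are easier to compare than shares of the whole sample. A minimal sketch, assuming the counts are re-entered by hand from the contingency table above (the names `degree_counts` and `row_props` are introduced here for illustration):

```r
# Counts copied from the contingency table above (rows: sex, columns: degree).
degree_counts <- matrix(
  c(5153, 12340, 1272, 3822, 2091,
    6669, 16947, 1798, 4180, 1779),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    sex = c("Male", "Female"),
    degree = c("Lt High School", "High School", "Junior College",
               "Bachelor", "Graduate")
  )
)

# margin = 1 divides each row by its own total, so every row sums to 1.
row_props <- prop.table(degree_counts, margin = 1)
print(round(row_props, 4))
```

Read this way, roughly 8.5% of men and 5.7% of women in the sample report a graduate degree.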
H0: Sex and degree level are independent; there is no association between them.
HA: Sex and degree level are dependent; there is an association between them.
Since both sex and degree level are categorical variables, and degree level has more than 2 levels, a chi-square test of independence will be used to test the hypotheses.
Check on conditions:
The participants were selected by random sampling without replacement. The 57,061 respondents in this dataset are far fewer than 10% of the population, and each case contributes to only one cell in the table. Thus, the independence condition is met.
Every cell has at least 5 expected cases, so the sample size condition is also met.
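The sample size condition can be checked directly: under independence, the expected count of a cell is its row total times its column total, divided by the grand total. A minimal sketch, assuming the counts are re-entered by hand from the contingency table above (`degree_counts` and `expected_counts` are names introduced here):

```r
# Counts copied from the contingency table above (rows: sex, columns: degree).
degree_counts <- matrix(
  c(5153, 12340, 1272, 3822, 2091,
    6669, 16947, 1798, 4180, 1779),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    sex = c("Male", "Female"),
    degree = c("Lt High School", "High School", "Junior College",
               "Bachelor", "Graduate")
  )
)

# Expected count per cell under independence:
# (row total * column total) / grand total.
expected_counts <- outer(rowSums(degree_counts), colSums(degree_counts)) /
  sum(degree_counts)
print(round(expected_counts, 1))

# TRUE if every cell satisfies the "at least 5 expected cases" condition.
all_cells_ok <- all(expected_counts >= 5)
print(all_cells_ok)
```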
Therefore, a chi-square test of independence will be performed.
##
## Pearson's Chi-squared test
##
## data: degree_df$sex and degree_df$degree
## X-squared = 254.35, df = 4, p-value < 2.2e-16
Based on the chi-square test, the X-squared statistic is 254.35 with 4 degrees of freedom, and the p-value (< 2.2e-16) is far below 0.05. Thus, we reject the null hypothesis and conclude that the data provide strong evidence of an association between sex and degree level. In other words, sex and degree level are dependent.
However, the data contain NA values, and the numbers of male and female participants are not equal. The age of the participants also affects the level of education they had attained at the time of the survey (some participants were below 20 years old). The conclusion could therefore be different, because younger participants might still go on to earn higher degrees later in life. For this specific question, it might be helpful to analyze only participants in older age categories.
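The test can also be run on the table of counts rather than on the raw columns. A minimal sketch, assuming the counts are re-entered by hand from the contingency table above (`degree_counts` and `result` are names introduced here); it should give the same statistic and degrees of freedom as the output reported above:

```r
# Counts copied from the contingency table above (rows: sex, columns: degree).
degree_counts <- matrix(
  c(5153, 12340, 1272, 3822, 2091,
    6669, 16947, 1798, 4180, 1779),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    sex = c("Male", "Female"),
    degree = c("Lt High School", "High School", "Junior College",
               "Bachelor", "Graduate")
  )
)

# Pearson's chi-squared test of independence on the 2 x 5 table.
# (R applies the continuity correction only to 2 x 2 tables.)
result <- chisq.test(degree_counts)
print(result)

# Degrees of freedom are (rows - 1) * (columns - 1).
df_by_hand <- (nrow(degree_counts) - 1) * (ncol(degree_counts) - 1)
print(df_by_hand)
```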