I am going to use General Social Survey (GSS) data.
General Social Survey (GSS): A sociological survey used to collect data on demographic characteristics and attitudes of residents of the United States. The codebook below lists all variables, the values they take, and the survey questions associated with them. There are a total of 57,061 cases and 114 variables in this dataset. Note that this is a cumulative data file for surveys conducted between 1972 and 2012, and that not all respondents answered all questions in all years.
Here are the names of the variables in the dataset.
## [1] "caseid" "year" "age" "sex" "race" "hispanic"
## [7] "uscitzn" "educ" "paeduc" "maeduc" "speduc" "degree"
## [13] "vetyears" "sei" "wrkstat" "wrkslf" "marital" "spwrksta"
## [19] "sibs" "childs" "agekdbrn" "incom16" "born" "parborn"
## [25] "granborn" "income06" "coninc" "region" "partyid" "polviews"
## [31] "relig" "attend" "natspac" "natenvir" "natheal" "natcity"
## [37] "natcrime" "natdrug" "nateduc" "natrace" "natarms" "nataid"
## [43] "natfare" "natroad" "natsoc" "natmass" "natpark" "confinan"
## [49] "conbus" "conclerg" "coneduc" "confed" "conlabor" "conpress"
## [55] "conmedic" "contv" "conjudge" "consci" "conlegis" "conarmy"
## [61] "joblose" "jobfind" "satjob" "richwork" "jobinc" "jobsec"
## [67] "jobhour" "jobpromo" "jobmeans" "class" "rank" "satfin"
## [73] "finalter" "finrela" "unemp" "govaid" "getaid" "union"
## [79] "getahead" "parsol" "kidssol" "abdefect" "abnomore" "abhlth"
## [85] "abpoor" "abrape" "absingle" "abany" "pillok" "sexeduc"
## [91] "divlaw" "premarsx" "teensex" "xmarsex" "homosex" "suicide1"
## [97] "suicide2" "suicide3" "suicide4" "fear" "owngun" "pistol"
## [103] "shotgun" "rifle" "news" "tvhours" "racdif1" "racdif2"
## [109] "racdif3" "racdif4" "helppoor" "helpnot" "helpsick" "helpblk"
My research question is whether more educated people are more satisfied with their financial situation.
I chose the dataset provided by the course instructor, so I use the citation supplied with it, given below.
A case in this dataset is the unit of observation: one person who took the survey. Each case has a unique caseid.
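The claim that each case has one unique caseid can be checked directly. The following is a minimal sketch, assuming the data are loaded as a data frame with a `caseid` column; the small `toy` data frame and the helper `has_unique_ids` are hypothetical stand-ins, not part of the GSS file.

```r
# Hypothetical helper: TRUE when no identifier is missing or repeated.
has_unique_ids <- function(df, id_col = "caseid") {
  ids <- df[[id_col]]
  !anyNA(ids) && anyDuplicated(ids) == 0
}

# Toy data standing in for the real GSS data frame (values are made up).
toy <- data.frame(caseid = 1:5, year = c(1972, 1972, 1973, 1973, 1974))

has_unique_ids(toy)   # checks that no caseid is missing or repeated
dim(toy)              # number of cases (rows) and variables (columns)
```

With the real data, `has_unique_ids(gss)` and `dim(gss)` would be the corresponding calls, assuming the data frame is named `gss` as in the plotting code in this document.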
Smith, Tom W., Michael Hout, and Peter V. Marsden. General Social Survey, 1972-2012 [Cumulative File]. ICPSR34802-v1. Storrs, CT: Roper Center for Public Opinion Research, University of Connecticut /Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributors], 2013-09-11. doi:10.3886/ICPSR34802.v1
Persistent URL: http://doi.org/10.3886/ICPSR34802.v1
[Excerpted from the GSS project description]
Since 1972, the General Social Survey (GSS) has been monitoring societal change and studying the growing complexity of American society. The GSS aims to gather data on contemporary American society in order to monitor and explain trends and constants in attitudes, behaviors, and attributes; to examine the structure and functioning of society in general as well as the role played by relevant subgroups; to compare the United States to other societies in order to place American society in comparative perspective and develop cross-national models of human society; and to make high-quality data easily accessible to scholars, students, policy makers, and others, with minimal cost and waiting.
GSS questions cover a diverse range of issues including national spending priorities, marijuana use, crime and punishment, race relations, quality of life, confidence in institutions, and sexual behavior.
I am going to look at two categorical variables.
DEGREE
RS HIGHEST DEGREE
If finished 9th-12th grade: Did you ever get a high school diploma or a GED certificate?
VALUE  LABEL
0      LT HIGH SCHOOL
1      HIGH SCHOOL
2      JUNIOR COLLEGE
3      BACHELOR
4      GRADUATE
NA     IAP
NA     DK
NA     NA
Data type: numeric
Missing-data codes: 7,8,9
Record/column: 1/36
SATFIN
SATISFACTION WITH FINANCIAL SITUATION
We are interested in how people are getting along financially these days. So far as you and your family are concerned, would you say that you are pretty well satisfied with your present financial situation, more or less satisfied, or not satisfied at all?
VALUE  LABEL
NA     IAP
1      SATISFIED
2      MORE OR LESS
3      NOT AT ALL SAT
NA     DK
NA     NA
Data type: numeric
Missing-data codes: 0,8,9
Record/column: 1/169
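Both codebook entries list missing-data categories (IAP, DK, NA) that appear as NA in the value labels above. The sketch below shows one way to count and drop such cases before tabulating; the `toy` data frame is made up for illustration, and dropping incomplete cases is an assumption here, not something the codebook prescribes.

```r
# Toy data standing in for gss[, c("degree", "satfin")] (values are made up).
toy <- data.frame(
  degree = factor(c("High School", "Bachelor", NA, "Graduate", "High School")),
  satfin = factor(c("Satisfied", NA, "More Or Less", "Satisfied", "Not At All Sat"))
)

# Count missing values in each variable.
n_missing <- colSums(is.na(toy))

# Keep only respondents with a recorded answer on both variables.
complete <- toy[complete.cases(toy), ]

n_missing
nrow(complete)
```

Note that `table()` already omits NA values by default, so a cross-tabulation of two variables only counts respondents with both answers recorded.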
library(ggplot2)
library(knitr)
# Both variables are categorical, so use bar charts; a histogram needs a continuous x.
ggplot(gss, aes(x = degree)) + geom_bar() + ggtitle("Distribution of Education")
ggplot(gss, aes(x = satfin)) + geom_bar() + ggtitle("Distribution of Satisfaction with Financial Situation")
kable(with(gss, table(degree, satfin)))
##
##
## | | Satisfied| More Or Less| Not At All Sat|
## |:--------------|---------:|------------:|--------------:|
## |Lt High School | 3065| 4710| 3388|
## |High School | 7162| 12068| 7670|
## |Junior College | 669| 1332| 727|
## |Bachelor | 2669| 3171| 1393|
## |Graduate | 1504| 1473| 497|
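Because the degree groups differ greatly in size, the raw counts are hard to compare across rows. A minimal sketch of converting the table above to row proportions follows; the counts are re-entered by hand from the output above (they sum to 51,498 of the 57,061 cases, since `table()` drops missing values by default), and the object names `counts` and `row_props` are illustrative.

```r
# Counts copied from the degree-by-satfin table above.
counts <- matrix(
  c(3065,  4710, 3388,
    7162, 12068, 7670,
     669,  1332,  727,
    2669,  3171, 1393,
    1504,  1473,  497),
  nrow = 5, byrow = TRUE,
  dimnames = list(
    degree = c("Lt High School", "High School", "Junior College",
               "Bachelor", "Graduate"),
    satfin = c("Satisfied", "More Or Less", "Not At All Sat")
  )
)

# Share of each satisfaction level within each degree group (rows sum to 1).
row_props <- prop.table(counts, margin = 1)
round(row_props, 3)
```

With the full data, `prop.table(with(gss, table(degree, satfin)), margin = 1)` should give the same proportions without retyping the counts.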
mosaicplot(with(gss, table(degree, satfin)), main = "Degree and Financial Satisfaction", color = 3:5, las = 2)
Insert inference section here…
Insert conclusion here…