The data I chose for exploring ggplot comes from the Ecdat package in R. The data set is called Fair. It comes from surveys conducted by two magazines in 1969 and 1974. It contains the following variables:
sex: a factor with levels (male, female)
age: in years [17.5 = under 20; 22.0 = 20 - 24; 27.0 = 25 - 29; 32.0 = 30 - 34; 37.0 = 35 - 39; 42.0 = 40 - 44; 47.0 = 45 - 49; 52.0 = 50 - 54; 57.0 = 55 or over]
ym: number of years married [0.125 = 3 months or less; 0.417 = 4 - 6 months; 0.750 = 6 months - 1 year; 1.500 = 1 - 2 years; 4.000 = 3 - 5 years; 7.000 = 6 - 8 years; 10.00 = 9 - 11 years; 15.00 = 12 years or more]
child: 0 = none, 1 = one or more
religious: how religious, from 1 (anti) to 5 (very)
education
occupation: occupation from 1 to 7 according to the Hollingshead classification (reverse numbering). "Hollingshead’s classification criteria may be briefly summarized: Class I.-Leisure; not labor, earn more than they can spend; class position by inheritance; rigid social code; education not highly regarded. Class II.-Members of large independent professions, family-owned businesses; salaried executives for Class I enterprises; hyperactive civic leaders; most highly educated group; social position secured through own efforts. Class III.-Primarily work for wages and salaries; own small businesses and farms; members of small, independent professions; all income spent, little savings; most nearly fit 'American family' stereotype; use educational ladder to further social aspirations. Class IV.-Poor, but honest, hard workers, pay their taxes; never get ahead financially; backbone of community; children aspire to high school, but parents not entirely convinced of its value. Class V.-Looked down upon by all social classes; little respect for law; hold menial jobs; fatalistic about position; much poverty; resigned to lowly status in community; education limited usually to elementary school."
rate: self rating of marriage, from 1 (very unhappy) to 5 (very happy) [1 = very unhappy; 2 = somewhat unhappy; 3 = average; 4 = happier than average; 5 = very happy]
nbaffairs: how often the respondent engaged in extramarital sexual intercourse during the past year [0 = none; 1 = once; 2 = twice; 3 = three times; 7 = 4 - 10 times; 12 = monthly or more]
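Because several of these variables are numeric codes rather than true measurements, it can help to attach the labels from the list above before plotting. The following is only a sketch of one way to do that: the names `fair_labeled`, `nbaffairs_f`, and `rate_f` are made up for illustration, and it assumes the Ecdat package is installed.

```r
library(Ecdat)  # provides the Fair data set
data(Fair)

# Work on a copy (hypothetical name) so code that uses the numeric columns of Fair is unaffected
fair_labeled <- Fair

# Attach the category labels from the variable list to the numeric codes
fair_labeled$nbaffairs_f <- factor(
  Fair$nbaffairs,
  levels = c(0, 1, 2, 3, 7, 12),
  labels = c("none", "once", "twice", "three times", "4 - 10 times", "monthly or more")
)
fair_labeled$rate_f <- factor(
  Fair$rate,
  levels = 1:5,
  labels = c("very unhappy", "somewhat unhappy", "average",
             "happier than average", "very happy")
)

# Inspect the structure of the data and the distribution of the recoded variable
str(Fair)
table(fair_labeled$nbaffairs_f, useNA = "ifany")
```

The labeled columns could then be used on a discrete axis or as a fill variable, while the original numeric columns stay available for the plots that treat them as numbers.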
After exploring the data with ggplot, I expect age and how religious a person is to turn out to be contributing variables in whether or not a married person has an affair. I don’t expect to see a big difference between males and females, and I don’t think the number of years married will matter either. One issue that may hinder some of our results is that the data comes from two different surveys of very different sizes: one survey had 601 observations and the other had 6,366. The survey was distributed across the entire US.
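Before plotting, these expectations can be given a rough numeric check. The sketch below is not part of the original analysis: it assumes the Ecdat package is installed, introduces a hypothetical indicator `had_affair`, and simply tabulates the share of respondents reporting at least one affair within each group. It also prints the number of rows, which shows how many observations the loaded data set actually contains.

```r
library(Ecdat)  # provides the Fair data set
data(Fair)

# Hypothetical indicator: TRUE if the respondent reported at least one affair
had_affair <- Fair$nbaffairs > 0

# Share of respondents with at least one affair, within each group
prop_by_religious <- tapply(had_affair, Fair$religious, mean)
prop_by_age       <- tapply(had_affair, Fair$age, mean)
prop_by_sex       <- tapply(had_affair, Fair$sex, mean)
prop_by_ym        <- tapply(had_affair, Fair$ym, mean)

# Number of observations in the loaded data set
nrow(Fair)

round(prop_by_religious, 2)
round(prop_by_age, 2)
round(prop_by_sex, 2)
round(prop_by_ym, 2)
```

These are raw group proportions only; they do not control for the other variables, so they are a starting point for the plots rather than a test of the hypotheses.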
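library(Ecdat)     # provides the Fair data set
library(ggplot2)
library(ggthemes)  # provides theme_tufte()
# Line plot of number of affairs against years married, one line per sex
p <- ggplot(Fair, aes(x = ym, y = nbaffairs)) +
  geom_line(aes(color = sex), linewidth = 2) +  # linewidth needs ggplot2 >= 3.4; use size on older versions
  ggtitle("Number of Affairs from Married Persons in Past Year")
p
p + theme_tufte()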
# Bar chart of the number of affairs, stacked by sex
p2 <- ggplot(Fair, aes(x = nbaffairs)) +
  geom_bar(aes(fill = sex)) +
  ggtitle("Number of Affairs from Married Persons in Past Year")
p2
# Bar chart of years married; fill (rather than color) shades the bars by sex
p3 <- ggplot(Fair, aes(x = ym)) +
  geom_bar(aes(fill = sex)) +
  ggtitle("Number of Years Married")
p3
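# Scatter plot of number of affairs against years married, colored by sex
p4 <- ggplot(Fair, aes(x = ym, y = nbaffairs)) +
  geom_point(aes(color = sex), size = 3) +
  ggtitle("Number of Affairs from Married Persons in Past Year")
p4
# Add quantile regression lines (stat_quantile() requires the quantreg package)
p41 <- p4 + stat_quantile()
p41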
# Bar chart of the self rating of marriage, stacked by sex
p5 <- ggplot(Fair, aes(x = rate)) +
  geom_bar(aes(fill = sex)) +
  ggtitle("Self Rating of Marriage")
p5
# Bar chart of the self rating of religiousness, stacked by sex
p6 <- ggplot(Fair, aes(x = religious)) +
  geom_bar(aes(fill = sex)) +
  ggtitle("Self Rating of How Religious")
p6
library(ggmosaic)  # provides geom_mosaic() and product()
ggplot(data = Fair) +
  geom_mosaic(aes(weight = ym, x = product(nbaffairs), fill = factor(nbaffairs)), na.rm = TRUE) +
  labs(x = "Number of Affairs from Married Persons", title = "f(nbaffairs)") +
  guides(fill = guide_legend(title = "nbaffairs", reverse = TRUE))
ggplot(data = Fair) +
  geom_mosaic(aes(weight = ym, x = product(nbaffairs, age), fill = factor(nbaffairs)), na.rm = TRUE) +
  theme(axis.text.x = element_text(angle = -25, hjust = .1)) + labs(x = "age", title = "f(nbaffairs | age) f(age)") +
  guides(fill = guide_legend(title = "nbaffairs", reverse = TRUE))
ggplot(data = Fair) +
  geom_mosaic(aes(x = product(nbaffairs, age), fill = factor(nbaffairs), conds = product(sex)), na.rm = TRUE, divider = mosaic("v")) +
  theme(axis.text.x = element_text(angle = -25, hjust = .1)) + labs(x = "age", title = "f(nbaffairs, age | sex)") +
  facet_grid(sex ~ .) +
  guides(fill = guide_legend(title = "nbaffairs", reverse = TRUE))
library(ggforce)  # provides facet_zoom()
ggplot(Fair, aes(nbaffairs, ym, colour = sex)) +
  geom_point() +
  facet_zoom(x = sex == "female")  # zoom the x-axis to the range covered by female respondents
library(ggplot2)
library(gridExtra)
library(ggalt)
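# Kernel density estimates of years married and of the marriage self rating (stat_bkde() is from ggalt)
ggplot(Fair, aes(x = ym)) +
  stat_bkde(alpha = 1/2)
ggplot(Fair, aes(x = rate)) +
  stat_bkde(alpha = 1/2)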