# CODE TO LOAD PACKAGES
library(tidyverse)
library(AER)
I have chosen to work with the Affairs data set from the AER package.
It contains infidelity data, known as Fair’s Affairs: cross-section data from a survey conducted by Psychology Today in 1969.
The first 10 rows of the dataset look like the following.
# CODE TO DISPLAY FIRST 10 ROWS
data("Affairs")
head(Affairs,10)
## affairs gender age yearsmarried children religiousness education occupation
## 4 0 male 37 10.00 no 3 18 7
## 5 0 female 27 4.00 no 4 14 6
## 11 0 female 32 15.00 yes 1 12 1
## 16 0 male 57 15.00 yes 5 18 6
## 23 0 male 22 0.75 no 2 17 6
## 29 0 female 32 1.50 no 2 17 5
## 44 0 female 22 0.75 no 2 12 1
## 45 0 male 57 15.00 yes 2 14 4
## 47 0 female 32 15.00 yes 4 16 1
## 49 0 male 22 1.50 no 4 14 4
## rating
## 4 4
## 5 4
## 11 4
## 16 5
## 23 3
## 29 5
## 44 3
## 45 4
## 47 2
## 49 5
# CODE TO DISPLAY NUMBER OF OBSERVATIONS
nrow(Affairs)
## [1] 601
# CODE TO DISPLAY NUMBER OF VARIABLES
ncol(Affairs)
## [1] 9
Below is a visual of the types of data in my dataset.
# CODE TO GENERATE VIEW OF DATA TYPES
library(visdat)
vis_dat(Affairs)
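As a text-only complement to the visual, the column classes can also be listed directly. This is a minimal sketch using base R's `sapply()` and `class()`; the name `col_types` is only an illustrative choice.

```r
library(AER)  # provides the Affairs data set

data("Affairs")

# Class of every column, returned as a named character vector
col_types <- sapply(Affairs, class)
print(col_types)

# How many columns fall into each class
print(table(col_types))
```

This gives the same information as the plot in a form that can be checked or reused in later code.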
Below are the basic descriptive statistics of the age variable.
# CODE TO GENERATE BASIC DESCRIPTIVE STATISTICS
mean(Affairs$age)
## [1] 32.48752
sd(Affairs$age)
## [1] 9.288762
min(Affairs$age)
## [1] 17.5
max(Affairs$age)
## [1] 57
summary(Affairs$age)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 17.50 27.00 32.00 32.49 37.00 57.00
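The same statistics can also be collected in a single call. The sketch below assumes `dplyr::summarise()` from the tidyverse; the object name `age_stats` and the column names are illustrative choices, not part of the data set.

```r
library(tidyverse)
library(AER)  # provides the Affairs data set

data("Affairs")

# One-row data frame holding the descriptive statistics for age
age_stats <- Affairs %>%
  dplyr::summarise(
    mean   = mean(age),
    sd     = sd(age),
    min    = min(age),
    median = median(age),
    max    = max(age)
  )
print(age_stats)
```

Keeping the results in one data frame makes it easier to reuse them, for example in a table in the report.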
# CODE TO GENERATE PLOT OF ONE VARIABLE WITH APPROPRIATE TITLES
ggplot(data = Affairs, mapping = aes(x = age)) +
  geom_bar() +
  labs(title = "Distribution of respondent age",
       subtitle = "Affairs data set",
       x = "Age (years)",
       y = "Number of respondents")
The distribution of age is right skewed rather than normal. The graph
shows the number of respondents at each age, and the count is highest
around ages 20 to 30.
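The shape described above can be checked numerically. This is a rough sketch: comparing the mean with the median is only an informal indication of right skew, and the names `age_counts`, `mean_age`, and `median_age` are illustrative.

```r
library(AER)  # provides the Affairs data set

data("Affairs")

# Number of respondents at each recorded age
age_counts <- table(Affairs$age)
print(age_counts)

# A mean above the median is a quick, informal sign of right skew
mean_age <- mean(Affairs$age)
median_age <- median(Affairs$age)
print(mean_age > median_age)

# Age value with the largest number of respondents
print(names(age_counts)[which.max(age_counts)])
```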
Below is an analysis of years married with respect to age, colored by whether the respondent has children.
# CODE TO GENERATE PLOT FROM TWO VARIABLES (in color)
ggplot(data = Affairs, mapping = aes(x = age, y = yearsmarried,
                                     color = children)) +
  geom_boxplot() +
  labs(title = "Years married by age and children",
       subtitle = "Data in context with age and years of marriage",
       x = "Age", y = "Years married",
       caption = "Source: Psychology Today, 1969.")
We can see from the graph that respondents around ages 40 to 50 tend to have children, whereas those around ages 20 to 30 tend not to.
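The relationship between age and having children can also be checked with a cross-tabulation. The sketch below uses base R's `table()` and `prop.table()`; the names `age_by_children` and `children_share` are illustrative.

```r
library(AER)  # provides the Affairs data set

data("Affairs")

# Cross-tabulate age against children (rows: age, columns: no / yes)
age_by_children <- table(Affairs$age, Affairs$children)

# Row-wise proportions: share of respondents at each age with and without children
children_share <- prop.table(age_by_children, margin = 1)
print(round(children_share, 2))
```

Each row sums to 1, so the "yes" column can be read as the share of respondents at that age who have children.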