GUAN LE S3655443
Last updated: 28 October, 2018
Shakespeare said: "Journeys end in lovers meeting."
So are you ready for a romantic date? (image[1])
Do you think men and women differ in how often they go on dates?
This demonstration covers 6 sections:
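# load the packages used below (assumed; the setup chunk is not shown in this report)
library(readr)
library(dplyr)
library(Hmisc)
# import the data sample from the working directory
speeddate <- read_csv("Speed Dating Data.csv")
speeddate %>% head()
# convert the gender variable to a factor
# values are based on the given Word document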
speeddate$gender <- factor(speeddate$gender, levels = c(0, 1),
labels = c("female", "male"))
speeddate$gender %>% head()
## [1] female female female female female female
## Levels: female male
# convert the date variable to an ordered factor
# values are based on the given Word document
speeddate$date <- factor(speeddate$date, levels = c(1, 2, 3, 4, 5, 6, 7),
labels = c("Several times a week", "Twice a week",
"Once a week", "Twice a month",
"Once a month", "Several times a year",
"Almost never"), ordered = TRUE)
speeddate$date %>% head()
## [1] Almost never Almost never Almost never Almost never Almost never
## [6] Almost never
## 7 Levels: Several times a week < Twice a week < ... < Almost never
levels(speeddate$date)
## [1] "Several times a week" "Twice a week" "Once a week"
## [4] "Twice a month" "Once a month" "Several times a year"
## [7] "Almost never"
# select gender and date variables for this investigation
speeddate1 <- speeddate %>% dplyr::select(gender, date)
speeddate1 %>% head()
# identify missing values
colSums(is.na(speeddate1))
## gender date
## 0 97
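# handle missing values: fill the NAs in date with its most frequent category
# (for a factor, Hmisc::impute() fills with the modal level; base R's mode() is not a statistical mode)
speeddate1$date <- impute(speeddate1$date, fun = mode)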
# verify if missing values have been successfully imputed
colSums(is.na(speeddate1))
## gender date
## 0 0
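The same idea, replacing missing values with the most frequent category, can be sketched in base R without Hmisc. The vector `freq` below is hypothetical toy data, not the speed dating sample, and the variable names are only for illustration.

```r
# Toy ordered factor with two missing values (hypothetical data)
freq <- factor(
  c("Once a week", "Almost never", NA, "Almost never", NA, "Once a month"),
  levels = c("Once a week", "Once a month", "Almost never"),
  ordered = TRUE
)

# Most frequent non-missing level; table() drops NA by default and
# which.max() returns the first level when there is a tie
modal_level <- names(which.max(table(freq)))

# Replace every NA with the modal level
freq_imputed <- freq
freq_imputed[is.na(freq_imputed)] <- modal_level

modal_level
sum(is.na(freq_imputed))
```

Mode imputation keeps the variable categorical, but it inflates the share of the modal category, so it is worth checking how many values were filled in.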
# assign table to tb1 and get ready for barplot in the next slide
tb1 <- table(speeddate1$date, speeddate1$gender) %>% prop.table(margin = 2) * 100
barplot(tb1, ylab = "Percentage within gender (%)", ylim = c(0, 70),
xlab = "Gender",
legend = rownames(tb1), beside = TRUE,
# args.legend should be a list; c() would coerce horiz = FALSE to the string "FALSE"
args.legend = list(x = "topright", horiz = FALSE, title = "Date Frequency Category"),
col = c("black", "grey", "darkgoldenrod2", "pink", "purple", "light blue", "light green"))
knitr::kable(tb1)
| | female | male |
|---|---|---|
| Several times a week | 1.099426 | 1.144492 |
| Twice a week | 3.130975 | 4.220315 |
| Once a week | 7.552581 | 11.134955 |
| Twice a month | 23.900574 | 24.797330 |
| Once a month | 15.439771 | 21.030043 |
| Several times a year | 28.441683 | 23.867430 |
| Almost never | 20.434990 | 13.805436 |
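The percentages above are column percentages: each cell is divided by the total for its gender, which is what `margin = 2` does in `prop.table()`. As a small check, the sketch below rebuilds the table from the observed counts printed by `chi$observed` in this report; typing the counts in by hand is only for illustration.

```r
# Observed counts copied from the chi$observed output in this report
# (rows: date frequency, columns: gender)
counts <- cbind(
  female = c(46, 131, 316, 1000, 646, 1190, 855),
  male   = c(48, 177, 467, 1040, 882, 1001, 579)
)
rownames(counts) <- c("Several times a week", "Twice a week", "Once a week",
                      "Twice a month", "Once a month",
                      "Several times a year", "Almost never")

# margin = 2 divides each cell by its column (gender) total
pct <- prop.table(counts, margin = 2) * 100

colSums(pct)   # each gender column adds up to 100
round(pct, 2)
```

Because the female and male group sizes are almost equal (4184 and 4194), comparing the two columns row by row is a fair comparison of the two distributions.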
Hypotheses for the Chi-square test of association:
Ho: there is no association between date frequency and gender of the attendees in the population
Ha: there is an association between date frequency and gender of the attendees in the population
Assumptions:
# use Chi-sq test method
chi <- chisq.test(table(speeddate1$gender, speeddate1$date))
chi
##
## Pearson's Chi-squared test
##
## data: table(speeddate1$gender, speeddate1$date)
## X-squared = 142.68, df = 6, p-value < 2.2e-16
# The observed values
chi$observed
##
##          Several times a week Twice a week Once a week Twice a month
##   female                   46          131         316          1000
##   male                     48          177         467          1040
##
##          Once a month Several times a year Almost never
##   female          646                 1190          855
##   male            882                 1001          579
# the expected values
chi$expected
##
##          Several times a week Twice a week Once a week Twice a month
##   female              46.9439     153.8162    391.0327      1018.783
##   male                47.0561     154.1838    391.9673      1021.217
##
##          Once a month Several times a year Almost never
##   female     763.0881             1094.192     716.1442
##   male       764.9119             1096.808     717.8558
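To make the link between the observed and expected values explicit, the statistic can be reproduced by hand. The sketch below assumes the observed counts shown above and uses the standard formulas: expected = row total × column total / n, and X-squared = sum of (observed - expected)^2 / expected. The variable names are only for illustration.

```r
# Observed counts copied from the chi$observed output above
observed <- rbind(
  female = c(46, 131, 316, 1000, 646, 1190, 855),
  male   = c(48, 177, 467, 1040, 882, 1001, 579)
)
colnames(observed) <- c("Several times a week", "Twice a week", "Once a week",
                        "Twice a month", "Once a month",
                        "Several times a year", "Almost never")

n <- sum(observed)

# Expected count under independence: row total * column total / n
expected <- outer(rowSums(observed), colSums(observed)) / n

# Pearson chi-squared statistic, degrees of freedom and p-value
x2 <- sum((observed - expected)^2 / expected)
df <- (nrow(observed) - 1) * (ncol(observed) - 1)
p_value <- pchisq(x2, df = df, lower.tail = FALSE)

round(x2, 2)
df
p_value

# Share of expected cell counts below 5, in percent
mean(expected < 5) * 100
```

The hand calculation should agree with the `chisq.test()` output, since no continuity correction is applied to tables larger than 2 x 2.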
Chi-square test of association summary:
0% of expected cell counts were less than 5
χ2(df = 6) = 142.68
p < .001
Decision: Reject Ho
According to the Chi-square test, there is a statistically significant association between gender and date frequency.
Based on the barplot, men generally go on dates more often than women: a larger share of men fall in every category from 'Several times a week' to 'Once a month', while a larger share of women date only 'Several times a year' (28.4% versus 23.9%).
Furthermore, women are more likely than men to fall in the 'Almost never' category (20.4% versus 13.8%).
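One way to see which categories drive the association is to look at the Pearson residuals, (observed - expected) / sqrt(expected), which `chisq.test()` returns in its `residuals` component. This is a sketch based on the observed counts reported earlier, not part of the original analysis; positive values mark cells with more people than independence would predict.

```r
# Observed counts copied from the chi$observed output in this report
observed <- rbind(
  female = c(46, 131, 316, 1000, 646, 1190, 855),
  male   = c(48, 177, 467, 1040, 882, 1001, 579)
)
colnames(observed) <- c("Several times a week", "Twice a week", "Once a week",
                        "Twice a month", "Once a month",
                        "Several times a year", "Almost never")

# Pearson residuals: (observed - expected) / sqrt(expected)
res <- chisq.test(observed)$residuals
round(res, 2)
```

Cells with large absolute residuals, such as 'Almost never' and 'Once a month', contribute most to the test statistic, while 'Several times a week' contributes very little.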
Ladies, it’s time to stand up and get ready for your next romantic encounter!
[1] image: https://mysevendevils.wordpress.com/2014/06/04/5-types-of-dates-people-go-on/
[2] insert image: https://stackoverflow.com/questions/25166624/insert-picture-table-in-r-markdown
[3] Data Preprocessing Lecture Notes
[4] dataset from kaggle: https://www.kaggle.com/annavictoria/speed-dating-experiment