First, remove the unusable records in which age is coded as 99. There are
length(filter(data, AGE > 98)$AGE) such records.
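The expression above only counts the affected records. A minimal sketch of the removal step itself follows, assuming that 99 is a placeholder code for a missing age. The data frame `toy` and its values are invented for illustration, and base R is used so the snippet runs without dplyr; `filter(data, AGE < 99)` would presumably be the dplyr equivalent on the real data.

```r
# Toy stand-in for the real data set (values are invented for illustration).
toy <- data.frame(
  AGE    = c(19, 22, 99, 31, 99, 24),
  GENDER = c(2, 2, 1, 3, 2, 1)
)

# Count the records carrying the placeholder age before dropping them.
n_unusable <- sum(toy$AGE > 98)

# Keep only the rows with a usable age.
cleaned <- subset(toy, AGE < 99)

print(n_unusable)
print(nrow(cleaned))
```

Assigning the result to a new name rather than overwriting `toy` keeps the original rows available for checking how many were dropped.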
Complain that the data conflate declining to reveal one's gender with identifying as non-binary.
Complain about the gender imbalance in the data.
Age is an independent variable, but few people in the data are older
than 25: length(filter(data, AGE > 25)$AGE).
There are now length(filter(data, GENDER == 1)$AGE) men
and length(filter(data, GENDER == 2)$AGE) women. The third
gender category has length(filter(data, GENDER == 3)$AGE) members. See
below for values.
length(filter(data, GENDER == 1)$AGE)
## [1] 17
length(filter(data, GENDER == 2)$AGE)
## [1] 75
length(filter(data, GENDER == 3)$AGE)
## [1] 3
length(filter(data, AGE>25)$AGE)
## [1] 8
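The three per-category counts above could also be obtained in one call. The sketch below uses `table()` and `prop.table()` on an invented vector `gender`, with the codes 1, 2, 3 taken from the text. On the real `data$GENDER` column, the counts should match the 17, 75 and 3 reported above.

```r
# Toy stand-in for the GENDER column (codes as in the text; values invented).
gender <- c(2, 2, 1, 2, 3, 2, 1, 2)

# One count per category, instead of three separate length(filter(...)) calls.
gender_counts <- table(gender)

# Share of each category, one way to quantify the imbalance.
gender_share <- prop.table(gender_counts)

print(gender_counts)
print(round(gender_share, 2))
```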
ggplot(data, aes(AGE, CONGRUENTrt)) +
  geom_point(aes(color = GENDER)) +
  scale_colour_brewer(palette = "Set2")
## [1] 40 2
## [1] 81 2
## [1] 1 3
## [1] 93 46
shapiro.test(data$CONGRUENTrt)
##
## Shapiro-Wilk normality test
##
## data: data$CONGRUENTrt
## W = 0.88135, p-value = 3.643e-07
shapiro.test(data$CONGRUENTacc)
##
## Shapiro-Wilk normality test
##
## data: data$CONGRUENTacc
## W = 0.83023, p-value = 4.567e-09
shapiro.test(data$INCONGRUENTrt)
##
## Shapiro-Wilk normality test
##
## data: data$INCONGRUENTrt
## W = 0.88477, p-value = 5.072e-07
shapiro.test(data$INCONGRUENTacc)
##
## Shapiro-Wilk normality test
##
## data: data$INCONGRUENTacc
## W = 0.87488, p-value = 1.978e-07
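All four tests above report p-values far below 0.05, which would usually be read as evidence against normality for both reaction times and accuracies. As a sketch of how the four calls could be collected into one table, the snippet below applies `shapiro.test()` to each column of a data frame. The data frame `toy` is simulated (skewed reaction times, bounded accuracies) and only borrows the column names from the text; its numbers are not the study's.

```r
set.seed(1)  # reproducible toy data

# Simulated stand-in: right-skewed reaction times, accuracies bounded in (0, 1).
toy <- data.frame(
  CONGRUENTrt    = rlnorm(90, meanlog = 6.5, sdlog = 0.4),
  INCONGRUENTrt  = rlnorm(90, meanlog = 6.7, sdlog = 0.4),
  CONGRUENTacc   = rbeta(90, 20, 1),
  INCONGRUENTacc = rbeta(90, 15, 2)
)

# Run shapiro.test() on each column and keep the W statistic and the p-value.
normality <- t(sapply(toy, function(x) {
  res <- shapiro.test(x)
  c(W = unname(res$statistic), p_value = res$p.value)
}))

print(normality)
```

The result is a matrix with one row per variable, which is easier to scan or report than four separate test printouts.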