Looking at data

First, remove the unusable records in which age is coded as 99. There are length(filter(data, AGE>98)$AGE) such points.
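A minimal sketch of this filtering step, assuming dplyr is loaded and that 99 is a placeholder code for an unusable age. The data frame `toy` and its values are hypothetical and only stand in for the real data.

```r
library(dplyr)

# Hypothetical toy data standing in for the real data set;
# AGE == 99 is assumed to mark an unusable age.
toy <- data.frame(
  AGE = c(19, 22, 99, 31, 99),
  CONGRUENTrt = c(610, 585, 700, 640, 655)
)

# Count the unusable rows before dropping them.
n_unusable <- nrow(filter(toy, AGE > 98))

# Keep only the rows with a usable age.
toy_clean <- filter(toy, AGE <= 98)
```

Counting before filtering keeps a record of how many points were discarded.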

The gender coding conflates not wanting to reveal one's gender with declaring oneself non-binary.

The data are also strongly imbalanced by gender.

Age is an independent variable, but few participants are older than 25: length(filter(data, AGE>25)$AGE).

After filtering, there are length(filter(data, GENDER == 1)$AGE) men and length(filter(data, GENDER == 2)$AGE) women; the third gender category has length(filter(data, GENDER == 3)$AGE) members. See below for the values.

length(filter(data, GENDER == 1)$AGE)
## [1] 17
length(filter(data, GENDER == 2)$AGE)
## [1] 75
length(filter(data, GENDER == 3)$AGE)
## [1] 3
length(filter(data, AGE>25)$AGE)
## [1] 8
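ggplot(data, aes(AGE, CONGRUENTrt)) +
  # factor() so the discrete Brewer palette applies even if GENDER is stored as a number
  geom_point(aes(color = factor(GENDER))) +
  scale_colour_brewer(name = "GENDER", palette = "Set2")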

Density Plots

The distributions do not look normal.

## [1] 40  2

## [1] 81  2

## [1] 1 3

## [1] 93 46

The Shapiro-Wilk test agrees: every p-value is far below 0.05, so normality is rejected for all four variables.

shapiro.test(data$CONGRUENTrt)
## 
##  Shapiro-Wilk normality test
## 
## data:  data$CONGRUENTrt
## W = 0.88135, p-value = 3.643e-07
shapiro.test(data$CONGRUENTacc)
## 
##  Shapiro-Wilk normality test
## 
## data:  data$CONGRUENTacc
## W = 0.83023, p-value = 4.567e-09
shapiro.test(data$INCONGRUENTrt)
## 
##  Shapiro-Wilk normality test
## 
## data:  data$INCONGRUENTrt
## W = 0.88477, p-value = 5.072e-07
shapiro.test(data$INCONGRUENTacc)
## 
##  Shapiro-Wilk normality test
## 
## data:  data$INCONGRUENTacc
## W = 0.87488, p-value = 1.978e-07
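The four tests above could also be run in one pass. The sketch below is only an illustration: `toy` is a hypothetical data frame with simulated, skewed values under the same column names, not the real measurements.

```r
# Hypothetical toy data with the same column names as in the report;
# the values are simulated, not the real measurements.
set.seed(1)
toy <- data.frame(
  CONGRUENTrt    = rexp(50, rate = 1 / 600),
  CONGRUENTacc   = rbeta(50, 9, 1),
  INCONGRUENTrt  = rexp(50, rate = 1 / 700),
  INCONGRUENTacc = rbeta(50, 8, 2)
)

vars <- c("CONGRUENTrt", "CONGRUENTacc", "INCONGRUENTrt", "INCONGRUENTacc")

# Run the Shapiro-Wilk test on each column and keep only the p-value.
p_values <- sapply(vars, function(v) shapiro.test(toy[[v]])$p.value)

# Flag the variables for which normality is rejected at the 5% level.
rejected <- p_values < 0.05
print(p_values)
```

Collecting the p-values in a named vector makes it easy to compare them against the 5% threshold at a glance.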

The best one can do is look at non-parametric smoothers. The data do suggest some relationship with age; however, the reliability of this result is highly questionable because there are too few data points among older adults.
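A rough sketch of such a smoother, assuming ggplot2 and using LOESS as one possible non-parametric choice; the notes do not say which smoother was actually used. The data frame `toy` is hypothetical and simulated to mimic a sample with few older participants.

```r
library(ggplot2)

# Hypothetical toy data: most participants are young and only a handful
# are older than 25. Values are simulated, not the real measurements.
set.seed(1)
toy <- data.frame(AGE = c(sample(18:25, 85, replace = TRUE),
                          sample(26:60, 8, replace = TRUE)))
toy$CONGRUENTrt <- 500 + 4 * toy$AGE + rnorm(nrow(toy), sd = 60)

# LOESS smoother with a confidence band (se = TRUE); the band reflects
# the uncertainty of the fit, which matters where observations are sparse.
# Adding geom_point() to the chain would overlay the raw data.
p <- ggplot(toy, aes(AGE, CONGRUENTrt)) +
  geom_smooth(method = "loess", formula = y ~ x, se = TRUE)
print(p)
```

Showing the confidence band alongside the curve helps avoid over-reading the fitted trend at higher ages.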

The same, but with the data points shown.