You mentioned that we may use Qual_Visit as a control variable. How is quality of visit defined?
Only 2 observations have Qual_Visit equal to 3. Can we collapse them into level 1 or 2? Two observations are too few to support any statistically meaningful conclusion about that level.
## table of Qual_Visit:
##
## 1 2 3
## 178 47 2
## NA number of T1 & T2 = 9
The missing rate is low (9 of 236 observations, about 3.8%), so we may ignore it.
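To make the arithmetic and the proposed collapsing concrete, here is a minimal sketch using only the counts printed in the Qual_Visit table. The recoding of level 3 into level 2 is an assumption for illustration only; whether 3 belongs with 2 or with 1 depends on how the levels are defined, which is still an open question.

```python
from collections import Counter

# Counts copied from the Qual_Visit table; 9 values are missing.
qual_visit_counts = {1: 178, 2: 47, 3: 2}
n_missing = 9

# Total observations = non-missing + missing.
n_total = sum(qual_visit_counts.values()) + n_missing
missing_rate = n_missing / n_total

# Hypothetical recoding (an assumption, not a decision): merge level 3 into level 2.
recode = {1: 1, 2: 2, 3: 2}

collapsed = Counter()
for level, count in qual_visit_counts.items():
    collapsed[recode[level]] += count

print(f"total = {n_total}, missing rate = {missing_rate:.1%}")
print(dict(collapsed))
```

With this recoding, level 2 would hold 49 observations instead of 47, and level 1 is unchanged at 178.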
We usually include sex and race as control variables. Are there any other variables you are interested in and would like to include? Please list as many as you can, because I will later use hypothesis testing and model selection to pick out the useful and meaningful ones.
## table of Sex:
##
## 1 2
## 148 88
## NA number of T1 & T2 = 0
## table of Race:
##
## 3 4 6 7 8
## 146 16 30 7 23
## NA number of T1 & T2 = 14
Sex has no missing values; Race is missing for 14 of 236 observations (about 5.9%).
Race codes: 1 - Asian, 2 - American Indian or Alaska Native, 3 - Black or African American, 4 - Hispanic or Latino, 5 - Native Hawaiian or Other Pacific Islander, 6 - White, 7 - other, 8 - mixed
The “other” group is very small (n = 7), and it does not make much sense to study “other” as a group on its own. Should we drop these observations, or collapse them into another race category?
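The two options can be compared side by side with a short sketch based on the counts in the Race table. The helper `collapse` and the choice of merging code 7 into code 8 are hypothetical and only illustrate the idea; which category "other" should join, if any, is a question for the study team.

```python
# Counts copied from the Race table (codes 1, 2 and 5 do not appear in the data).
race_counts = {3: 146, 4: 16, 6: 30, 7: 7, 8: 23}

def collapse(counts, mapping):
    """Recode categories: map a code to a new code, or to None to drop it."""
    out = {}
    for code, n in counts.items():
        target = mapping.get(code, code)  # unmapped codes stay as they are
        if target is None:
            continue  # this category is dropped
        out[target] = out.get(target, 0) + n
    return out

# Option A: drop the "other" group (code 7).
dropped = collapse(race_counts, {7: None})

# Option B (assumed pairing): merge "other" (7) into the mixed group (8).
merged = collapse(race_counts, {7: 8})

print("drop 'other': ", dropped, "n =", sum(dropped.values()))
print("merge 7 -> 8: ", merged, "n =", sum(merged.values()))
```

Dropping the group leaves 215 non-missing observations, while merging keeps all 222 and raises the combined category to 30.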