introduction
The data are from a survey of primary care patients in Cantabrian, Spain, to estimate the prevalence of psychiatric illness. The prevalence for each of the sexes is estimated from data provided by the general practitioner and from completing the general health questionnaire.
Source: Everitt, B.S., & Dunn, G. (2001). Applied Multivariate Data Analysis, 2nd Ed. p. 237
Column 1: Subject ID Column 2: Illness = 1, Otherweise = 0 Column 3: Gender ID Column 4: General practitioner = GP, General health questionnaire = GPQ
Data Management
#load pckages
pacman::p_load(magrittr, dplyr, knitr, rmdformats, geepack, repolr, geesmv)
#input data
dta <- read.table("C:/Users/Ching-Fang Wu/Documents/data/cantabrian.txt", h=T)##   sbj resp gender type
## 1   1    0      M   GP
## 2   1    0      M  GHQ
## 3   2    0      M   GP
## 4   2    0      M  GHQ
## 5   3    0      M   GP
## 6   3    0      M  GHQ## 'data.frame':    1646 obs. of  4 variables:
##  $ ID    : int  1 1 2 2 3 3 4 4 5 5 ...
##  $ Health: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Gender: chr  "M" "M" "M" "M" ...
##  $ Report: chr  "GP" "GHQ" "GP" "GHQ" ...dta <- dta %>% 
  mutate(Health = factor(Health, 
                         levels=0:1, 
                         labels=c("Illness","Otherwise")),
         Gender = factor(Gender, levels = c("M", "F")),
         Type = factor(Report, 
                       levels=c("GP","GHQ")))##                  Type  GP GHQ
## Gender Health                
## M      Illness        287 244
##        Otherwise       37  80
## F      Illness        420 306
##        Otherwise       79 193##                  Type         GP        GHQ
## Gender Health                              
## M      Illness        0.34872418 0.29647631
##        Otherwise      0.04495747 0.09720535
## F      Illness        0.51032807 0.37181045
##        Otherwise      0.09599028 0.23450790par(mfrow=c(1,2))
barplot(prop.table(with(subset(dta, Type=="GP"), 
                           ftable(Gender, Health)), m=2), 
        xlab="Health \n (Illness, Otherwise)",
        ylab="Proportion", 
        ylim=c(0,1),
        main="GP", 
        beside=T)
legend('topleft', c("Male","Female"), 
       col=c("black","gray"), 
       pch=15, bty='n', cex=.5)
barplot(prop.table(with(subset(dta, Type=="GHQ"), 
                           ftable(Gender, Health)), m=2), 
        xlab="Health \n (Illness, Otherwise)",
        ylab="Proportion", 
        ylim=c(0,1),
        main="GHQ", 
        beside=T)不論是醫師提供的數據或是問卷調查的數據,女性精神病比率都高於男性。
GEE
#summary(m0 <- geeglm(Health ~ Gender + Type + Gender*Type, data=dta, id = ID, corstr="exchangeable", family = binomial))Error in h(simpleError(msg, call)) : error in evaluating the argument ‘object’ in selecting a method for function ‘summary’: NA/NaN/Inf in ‘y’