introduction

The data are from a survey of primary care patients in Cantabrian, Spain, to estimate the prevalence of psychiatric illness. The prevalence for each of the sexes is estimated from data provided by the general practitioner and from completing the general health questionnaire.

Source: Everitt, B.S., & Dunn, G. (2001). Applied Multivariate Data Analysis, 2nd Ed. p. 237

Column 1: Subject ID Column 2: Illness = 1, Otherweise = 0 Column 3: Gender ID Column 4: General practitioner = GP, General health questionnaire = GPQ

Data Management

#load pckages
pacman::p_load(magrittr, dplyr, knitr, rmdformats, geepack, repolr, geesmv)
#input data
dta <- read.table("C:/Users/Ching-Fang Wu/Documents/data/cantabrian.txt", h=T)
head(dta)
##   sbj resp gender type
## 1   1    0      M   GP
## 2   1    0      M  GHQ
## 3   2    0      M   GP
## 4   2    0      M  GHQ
## 5   3    0      M   GP
## 6   3    0      M  GHQ
names(dta) <- c("ID","Health","Gender","Report")
str(dta)
## 'data.frame':    1646 obs. of  4 variables:
##  $ ID    : int  1 1 2 2 3 3 4 4 5 5 ...
##  $ Health: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Gender: chr  "M" "M" "M" "M" ...
##  $ Report: chr  "GP" "GHQ" "GP" "GHQ" ...
dta <- dta %>% 
  mutate(Health = factor(Health, 
                         levels=0:1, 
                         labels=c("Illness","Otherwise")),
         Gender = factor(Gender, levels = c("M", "F")),
         Type = factor(Report, 
                       levels=c("GP","GHQ")))
ftable(dta, row.vars = c('Gender','Health'), col.vars = c('Type'))
##                  Type  GP GHQ
## Gender Health                
## M      Illness        287 244
##        Otherwise       37  80
## F      Illness        420 306
##        Otherwise       79 193
prop.table(dtap <- (ftable(dta, row.vars = c('Gender','Health'), col.vars = c('Type'))), 2)
##                  Type         GP        GHQ
## Gender Health                              
## M      Illness        0.34872418 0.29647631
##        Otherwise      0.04495747 0.09720535
## F      Illness        0.51032807 0.37181045
##        Otherwise      0.09599028 0.23450790
par(mfrow=c(1,2))

barplot(prop.table(with(subset(dta, Type=="GP"), 
                           ftable(Gender, Health)), m=2), 
        xlab="Health \n (Illness, Otherwise)",
        ylab="Proportion", 
        ylim=c(0,1),
        main="GP", 
        beside=T)

legend('topleft', c("Male","Female"), 
       col=c("black","gray"), 
       pch=15, bty='n', cex=.5)

barplot(prop.table(with(subset(dta, Type=="GHQ"), 
                           ftable(Gender, Health)), m=2), 
        xlab="Health \n (Illness, Otherwise)",
        ylab="Proportion", 
        ylim=c(0,1),
        main="GHQ", 
        beside=T)

不論是醫師提供的數據或是問卷調查的數據,女性精神病比率都高於男性。

GEE

#summary(m0 <- geeglm(Health ~ Gender + Type + Gender*Type, data=dta, id = ID, corstr="exchangeable", family = binomial))

Error in h(simpleError(msg, call)) : error in evaluating the argument ‘object’ in selecting a method for function ‘summary’: NA/NaN/Inf in ‘y’

#sjPlot::tab_model(m0, show.obs=F, show.ngroups = F)

The end