1 Data Management

  1. 載入程式及資料
pacman::p_load(magrittr, dplyr, knitr, rmdformats, geepack, repolr, geesmv)

## Global options
options(max.print="75")
opts_chunk$set(echo=FALSE,
                 cache=TRUE,
               prompt=FALSE,
               tidy=TRUE,
               comment=NA,
               message=FALSE,
               warning=FALSE)
opts_knit$set(width=75)

fL <- "https://content.sph.harvard.edu/fitzmaur/ala2e/muscatine-data.txt"
dta <- read.table(fL, h=F, na.strings = '.')
str(dta)
## 'data.frame':    14568 obs. of  6 variables:
##  $ V1: int  1 1 1 2 2 2 3 3 3 4 ...
##  $ V2: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ V3: int  6 6 6 6 6 6 6 6 6 6 ...
##  $ V4: int  6 8 10 6 8 10 6 8 10 6 ...
##  $ V5: int  1 2 3 1 2 3 1 2 3 1 ...
##  $ V6: int  1 1 1 1 1 1 1 1 1 1 ...
  1. 重新命名,改Gender成名義變項
  ID Gender Age0 Age Occasion Obese
1  1   Boys    6   6     1977     1
2  1   Boys    6   8     1979     1
3  1   Boys    6  10     1981     1
4  2   Boys    6   6     1977     1
5  2   Boys    6   8     1979     1
6  2   Boys    6  10     1981     1

3.以列表檢視變更資料

           Occasion   1       2       3    
           Obese      0   1   0   1   0   1
Gender Age                                 
F      6            141  23   0   0   0   0
       8            294  58 265  55   0   0
       10           270  92 276  87 268  90
       12           291  91 279  99 278  92
       14           300  89 226  64 256  73
       16             0   0 250  87 226  56
       18             0   0   0   0 140  37
M      6            174  15   0   0   0   0
       8            289  67 296  54   0   0
       10           312  84 298  77 308  83
       12           281  90 299  88 290  90
       14           307  73 233  65 269  78
       16             0   0 251  67 224  54
       18             0   0   0   0 153  34

2 Analysis

2.1 m0

m0 : Obese ~ Gender + Age + Age2 + Gender:Age + Gender:Age2 (correlation matrix for change over time: AR1) 結果顯示年齡與年齡^2為判定為obesity的重要變項

  ID Gender Age0 Age Occasion Obese Age2
1  1      M    6  -6        1     1   36
2  1      M    6  -4        2     1   16
3  1      M    6  -2        3     1    4
4  2      M    6  -6        1     1   36
5  2      M    6  -4        2     1   16
6  2      M    6  -2        3     1    4

Call:
geeglm(formula = Obese ~ Gender + Age + Age2 + Gender:Age + Gender:Age2, 
    family = binomial, data = dta1, id = ID, corstr = "ar1")

 Coefficients:
              Estimate   Std.err    Wald Pr(>|W|)    
(Intercept)  -1.100392  0.050179 480.889  < 2e-16 ***
GenderM      -0.105306  0.071357   2.178 0.140012    
Age           0.045164  0.012811  12.428 0.000423 ***
Age2         -0.014597  0.003249  20.184 7.03e-06 ***
GenderM:Age  -0.002900  0.018563   0.024 0.875848    
GenderM:Age2 -0.003495  0.004715   0.549 0.458569    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation structure = ar1 
Estimated Scale Parameters:

            Estimate Std.err
(Intercept)   0.9943 0.02807
  Link = identity 

Estimated Correlation Parameters:
      Estimate Std.err
alpha   0.6106 0.02036
Number of clusters:   4856  Maximum cluster size: 3 

2.2 m1

交互作用移除後,性別與年齡及年齡^2皆為判定為obesity的重要變項,資料顯示應該是女生(=1)較會被判定為obese


Call:
geeglm(formula = Obese ~ Gender + Age + Age2, family = binomial, 
    data = dta1, id = ID, corstr = "ar1")

 Coefficients:
            Estimate  Std.err   Wald Pr(>|W|)    
(Intercept) -1.08751  0.04724 530.08  < 2e-16 ***
GenderM     -0.13145  0.06288   4.37    0.037 *  
Age          0.04387  0.00926  22.43  2.2e-06 ***
Age2        -0.01630  0.00235  48.05  4.2e-12 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation structure = ar1 
Estimated Scale Parameters:

            Estimate Std.err
(Intercept)    0.994   0.028
  Link = identity 

Estimated Correlation Parameters:
      Estimate Std.err
alpha     0.61  0.0203
Number of clusters:   4856  Maximum cluster size: 3 

2.3 m0,m1 (in table)

女生較易被判定為肥胖,年齡與肥旁判定為二次方關係
  Obese Obese
Predictors Odds Ratios CI p Odds Ratios CI p
(Intercept) 0.33 0.30 – 0.37 <0.001 0.34 0.31 – 0.37 <0.001
Gender [M] 0.90 0.78 – 1.03 0.136 0.88 0.78 – 0.99 0.036
Age 1.04 1.02 – 1.07 0.001 1.04 1.02 – 1.06 <0.001
Age2 0.99 0.98 – 0.99 <0.001 0.98 0.98 – 0.99 <0.001
Gender [M] * Age 0.99 0.96 – 1.03 0.765
Gender [M] * Age2 1.00 0.99 – 1.01 0.483

3 Graph

在1977, 1979, 1981;
三次施測的到的結果類似,女生較肥,12~14歲左右為肥胖高峰期

The end