Module 3

Ong S.M.
April 2017

Module Flow

alt text

Load data

Load data pain.xlsx in module 3

library(xlsx)
?read.xlsx
pain <- read.xlsx("F:/Dropbox/Conference.Seminar.Talks/NIH Monash R Course/05 Module 3/pain.xlsx",1)

Familiar yourself with the dataset

  • str()
  • summary()
  • dims()

Now, we want to compare some outcomes between the two Rx

Rx (interv): Group A vs. Group B
Outcome 1: psrcrest(pain score @ rest)

Run some descriptive stat.

summary(pain$interv)
 A  B 
55 47 
summary(pain$psrcrest)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  0.000   2.000   2.000   2.363   3.000   6.000 

How about the mean/median score by Rx group?

The aggregate function

?aggregate
aggregate(psrcrest~interv,data=pain, FUN = mean)
  interv psrcrest
1      A 2.327273
2      B 2.404255

Can you do the median score by Rx group?

  interv psrcrest
1      A 2.327273
2      B 2.404255

Visualize the data first.

plot of chunk unnamed-chunk-7 Can you plot this?

cont...

Can you plot this? plot of chunk unnamed-chunk-8

cont... (2)

Can you plot this? plot of chunk unnamed-chunk-9

Now Compare the mean

  • Let's start with the parametric test

  • 2 samples t-test

?t.test

Try run a t-test:

Compare pain score @ rest (psrcrest) between group A vs. B (interv)

t.test(psrcrest~interv,data=pain)

    Welch Two Sample t-test

data:  psrcrest by interv
t = -0.34504, df = 92.002, p-value = 0.7308
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.5200966  0.3661314
sample estimates:
mean in group A mean in group B 
       2.327273        2.404255 

What if you actually want a paired t-test?

Can you find out?

?t.test

You want this.


    Paired t-test

data:  pain$psrcrest and pain$ps12hrest
t = 7.003, df = 101, p-value = 2.836e-10
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 0.7940272 1.4216591
sample estimates:
mean of the differences 
               1.107843 

Find out how to perform a Wilcoxon rank-sum test using the same data

~ Hint: use google or ?? ~

You want this.


    Wilcoxon rank sum test with continuity correction

data:  psrcrest by interv
W = 1254, p-value = 0.7868
alternative hypothesis: true location shift is not equal to 0

A quick one on ANOVA

aov(psrcrest~interv, data=pain)
Call:
   aov(formula = psrcrest ~ interv, data = pain)

Terms:
                   interv Residuals
Sum of Squares    0.15019 123.42824
Deg. of Freedom         1       100

Residual standard error: 1.110983
Estimated effects may be unbalanced

Quick diagnostics for ANOVA

layout(matrix(c(1,2,3,4),2,2)) # optional layout 
plot(aov(psrcrest~interv, data=pain))

plot of chunk unnamed-chunk-16

Comparing proportion (RxC table)

Review

  • Tabulate categorical data - univariate, bivariate

Univariate


 A  B 
55 47 

Bivariate


     F  M
  A 17 38
  B 16 31

Review

  • Derive the proportion of a table - by row, col & overall

Difference of Sex Distribution between intervention group


            A         B
  F 0.3090909 0.3404255
  M 0.6909091 0.6595745

Make it better


       A    B
  F 30.9 34.0
  M 69.1 66.0

Are they really different?

Let us test it.

Test the difference of proportion

  • Approximate tests (Chi-sqr)
chisq.test(table(pain$sex,pain$interv))

    Pearson's Chi-squared test with Yates' continuity correction

data:  table(pain$sex, pain$interv)
X-squared = 0.015596, df = 1, p-value = 0.9006

Try find out yourself...

  • Exact tests (Fisher)

Exact tests (Fisher)

fisher.test(table(pain$interv,pain$sex))

    Fisher's Exact Test for Count Data

data:  table(pain$interv, pain$sex)
p-value = 0.8326
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
 0.3479735 2.1673343
sample estimates:
odds ratio 
 0.8679976 

Remember the `tableone` package?

You can build a Table 1 with bivariate stats quickly

  • load tableone library
  • read the help file
  • find out createtableone function

Can you try to do this?

library(tableone)
?CreateTableOne

We learned this yesterday

library(tableone)
CreateTableOne(data=pain)

                           Overall       
  n                           102        
  interv = B (%)               47 (46.1) 
  age (mean (sd))           27.82 (7.37) 
  sex = M (%)                  69 (67.6) 
  bmi (%)                                
     -                          7 ( 6.9) 
     15.6                       1 ( 1.0) 
     15.79                      1 ( 1.0) 
     16                         2 ( 2.0) 
     16.5                       1 ( 1.0) 
     16.67                      1 ( 1.0) 
     17                         1 ( 1.0) 
     17.5                       1 ( 1.0) 
     17.85                      1 ( 1.0) 
     17.91                      1 ( 1.0) 
     18                         3 ( 2.9) 
     18.4                       1 ( 1.0) 
     18.5                       1 ( 1.0) 
     18.9                       1 ( 1.0) 
     19                         1 ( 1.0) 
     19.4                       1 ( 1.0) 
     19.5                       3 ( 2.9) 
     19.9                       1 ( 1.0) 
     20                         4 ( 3.9) 
     20.1                       1 ( 1.0) 
     20.2                       1 ( 1.0) 
     20.4                       1 ( 1.0) 
     20.5                       1 ( 1.0) 
     20.7                       2 ( 2.0) 
     20.8                       2 ( 2.0) 
     21                         4 ( 3.9) 
     21.1                       1 ( 1.0) 
     21.2                       1 ( 1.0) 
     21.48                      1 ( 1.0) 
     21.5                       1 ( 1.0) 
     22                         6 ( 5.9) 
     22.5                       1 ( 1.0) 
     22.8                       1 ( 1.0) 
     23                        10 ( 9.8) 
     23.5                       1 ( 1.0) 
     23.73                      1 ( 1.0) 
     24                         5 ( 4.9) 
     24.1                       1 ( 1.0) 
     25                         2 ( 2.0) 
     25.7                       1 ( 1.0) 
     25.9                       1 ( 1.0) 
     26                         5 ( 4.9) 
     26.6                       1 ( 1.0) 
     27                         3 ( 2.9) 
     27.5                       1 ( 1.0) 
     28                         2 ( 2.0) 
     29                         1 ( 1.0) 
     29.2                       1 ( 1.0) 
     31                         1 ( 1.0) 
     31.14                      1 ( 1.0) 
     31.3                       1 ( 1.0) 
     32                         1 ( 1.0) 
     33                         1 ( 1.0) 
     33.62                      1 ( 1.0) 
     34                         2 ( 2.0) 
     35                         1 ( 1.0) 
  psrcrest (mean (sd))       2.36 (1.11) 
  ps12hrest (mean (sd))      1.25 (1.37) 
  totalPCA12hr (mean (sd)) 136.97 (87.18)

Now, try to build this:

CreateTableOne(data=pain, strata = "interv")
                          Stratified by interv
                           A              B               p      test
  n                            55             47                     
  interv = B (%)                0 ( 0.0)      47 (100.0)  <0.001     
  age (mean (sd))           27.53 (6.91)   28.16 (7.94)    0.672     
  sex = M (%)                  38 (69.1)      31 ( 66.0)   0.901     
  bmi (%)                                                  0.483     
     -                          3 ( 5.5)       4 (  8.5)             
     15.6                       1 ( 1.8)       0 (  0.0)             
     15.79                      1 ( 1.8)       0 (  0.0)             
     16                         1 ( 1.8)       1 (  2.1)             
     16.5                       0 ( 0.0)       1 (  2.1)             
     16.67                      1 ( 1.8)       0 (  0.0)             
     17                         1 ( 1.8)       0 (  0.0)             
     17.5                       1 ( 1.8)       0 (  0.0)             
     17.85                      1 ( 1.8)       0 (  0.0)             
     17.91                      0 ( 0.0)       1 (  2.1)             
     18                         2 ( 3.6)       1 (  2.1)             
     18.4                       1 ( 1.8)       0 (  0.0)             
     18.5                       1 ( 1.8)       0 (  0.0)             
     18.9                       1 ( 1.8)       0 (  0.0)             
     19                         0 ( 0.0)       1 (  2.1)             
     19.4                       0 ( 0.0)       1 (  2.1)             
     19.5                       0 ( 0.0)       3 (  6.4)             
     19.9                       0 ( 0.0)       1 (  2.1)             
     20                         2 ( 3.6)       2 (  4.3)             
     20.1                       0 ( 0.0)       1 (  2.1)             
     20.2                       0 ( 0.0)       1 (  2.1)             
     20.4                       1 ( 1.8)       0 (  0.0)             
     20.5                       0 ( 0.0)       1 (  2.1)             
     20.7                       2 ( 3.6)       0 (  0.0)             
     20.8                       1 ( 1.8)       1 (  2.1)             
     21                         1 ( 1.8)       3 (  6.4)             
     21.1                       1 ( 1.8)       0 (  0.0)             
     21.2                       1 ( 1.8)       0 (  0.0)             
     21.48                      0 ( 0.0)       1 (  2.1)             
     21.5                       1 ( 1.8)       0 (  0.0)             
     22                         4 ( 7.3)       2 (  4.3)             
     22.5                       0 ( 0.0)       1 (  2.1)             
     22.8                       1 ( 1.8)       0 (  0.0)             
     23                         6 (10.9)       4 (  8.5)             
     23.5                       0 ( 0.0)       1 (  2.1)             
     23.73                      0 ( 0.0)       1 (  2.1)             
     24                         5 ( 9.1)       0 (  0.0)             
     24.1                       0 ( 0.0)       1 (  2.1)             
     25                         1 ( 1.8)       1 (  2.1)             
     25.7                       0 ( 0.0)       1 (  2.1)             
     25.9                       0 ( 0.0)       1 (  2.1)             
     26                         2 ( 3.6)       3 (  6.4)             
     26.6                       1 ( 1.8)       0 (  0.0)             
     27                         1 ( 1.8)       2 (  4.3)             
     27.5                       1 ( 1.8)       0 (  0.0)             
     28                         1 ( 1.8)       1 (  2.1)             
     29                         1 ( 1.8)       0 (  0.0)             
     29.2                       1 ( 1.8)       0 (  0.0)             
     31                         1 ( 1.8)       0 (  0.0)             
     31.14                      0 ( 0.0)       1 (  2.1)             
     31.3                       0 ( 0.0)       1 (  2.1)             
     32                         0 ( 0.0)       1 (  2.1)             
     33                         0 ( 0.0)       1 (  2.1)             
     33.62                      1 ( 1.8)       0 (  0.0)             
     34                         2 ( 3.6)       0 (  0.0)             
     35                         1 ( 1.8)       0 (  0.0)             
  psrcrest (mean (sd))       2.33 (1.04)    2.40 (1.19)    0.728     
  ps12hrest (mean (sd))      1.20 (1.52)    1.32 (1.18)    0.664     
  totalPCA12hr (mean (sd)) 136.25 (84.66) 137.81 (90.96)   0.929     

Note that you got a weird table

bmi in %

Why?

Changing data type

as.factor()
as.numeric()
as.character()

Check data type for bmi

str(pain$bmi)
 Factor w/ 56 levels "-","15.6","15.79",..: 31 35 10 28 21 34 34 31 37 8 ...

Now change this to numeric

pain$bmi2 <- as.numeric(pain$bmi)

Some checking

str(pain$bmi2)
 num [1:102] 31 35 10 28 21 34 34 31 37 8 ...
summary(pain$bmi2)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   1.00   17.00   31.00   27.95   39.00   56.00 

something's wrong…

R is tricky (sometime)...

pain$bmi3 <- as.numeric(as.character(pain$bmi))
summary(pain$bmi3)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
  15.60   20.00   22.00   22.96   25.80   35.00       7 

Now re-do Table 1

library(tableone)
  • One new function: concatenate
Var = c("age", "sex", "bmi3", "psrcrest", "ps12hrest", "totalPCA12hr")

Insert the `Var` into the CreateTableOne function

CreateTableOne(vars=Var, data=pain, strata = "interv")
                          Stratified by interv
                           A              B              p      test
  n                            55             47                    
  age (mean (sd))           27.53 (6.91)   28.16 (7.94)   0.672     
  sex = M (%)                  38 (69.1)      31 (66.0)   0.901     
  bmi3 (mean (sd))          23.01 (4.84)   22.91 (4.09)   0.918     
  psrcrest (mean (sd))       2.33 (1.04)    2.40 (1.19)   0.728     
  ps12hrest (mean (sd))      1.20 (1.52)    1.32 (1.18)   0.664     
  totalPCA12hr (mean (sd)) 136.25 (84.66) 137.81 (90.96)  0.929     

Objectives check

alt text