5.8.1

Use the results from Table 5.3 create and test a function.

Firstly, I remake table 5.3.

tab <- matrix(c(273, 77, 350, 289, 61, 350, 562, 138, 700), nrow = 3, ncol = 3)
dimnames(tab) <- list("Recovered" = c("Yes", "No", "Total"),
                        "Treatment" = c("Yes", "No", "Total"))

tab
##          Treatment
## Recovered Yes  No Total
##     Yes   273 289   562
##     No     77  61   138
##     Total 350 350   700

Then I created functions and applied them to table 5.3 to calculate the risks R1 and R0, and the risk ratio R1/R0.

Firstly, I prepared the inputs. Then, I did the calculations. Finally, I collected the data.

#Making the function

myOR <- function(x){
  #Prepare input
  # x = 3x3 table
  a = x[1,1]
  N1 = x[3,1]
  b = x[1,2]
  N0 = x[3,2]
  
  #doing the calculations
  R1 = (a/N1)
  R0 = (b/N0)
  RR = R1/R0
  
  #collecting results
  list(data = x, risk1 = R1, risk0 = R0, odds.ratio = RR)
  
  }


#Applying/testing the function

myOR(tab)
## $data
##          Treatment
## Recovered Yes  No Total
##     Yes   273 289   562
##     No     77  61   138
##     Total 350 350   700
## 
## $risk1
## [1] 0.78
## 
## $risk0
## [1] 0.8257143
## 
## $odds.ratio
## [1] 0.9446367

Now I make a function to calculate the 95% confidence interval.

myOR2 <- function(x, conf.level){
  #### Prepare input
  #### x = 3x3 table amenable to cross-product
  if(missing(conf.level)) stop("Must specify confidence level")
  a = x[1,1]
  N1 = x[3,1]
  b = x[1,2]
  N0 = x[3,2]


  Z <- qnorm((1 + conf.level)/2)
  
  #### Do calculations
  logOR <- log((a*N0)/(N1*b))
  SE.logOR <- sqrt(1/a + 1/N1 + 1/b + 1/N0)
  OR <- exp(logOR)
  CI <- exp(logOR + c(-1, 1)*Z*SE.logOR)

  #### Collect results
  list(data = x, odds.ratio = OR, conf.int = CI)
}


#Applying/testing the function to get 95% confidence interval
myOR2(tab, 0.95)
## $data
##          Treatment
## Recovered Yes  No Total
##     Yes   273 289   562
##     No     77  61   138
##     Total 350 350   700
## 
## $odds.ratio
## [1] 0.9446367
## 
## $conf.int
## [1] 0.7565217 1.1795279

Using the fisher.test function (Fisher’s Exact Test), I calculate the two-sided p-value of the null hypothesis of no association.

fisher.test(tab, y = NULL, workspace = 200000000, hybrid = FALSE, control = list(), or = 1, alternative = "two.sided",  conf.int = TRUE, conf.level = 0.95,
simulate.p.value = FALSE, B = 2000)
## 
##  Fisher's Exact Test for Count Data
## 
## data:  tab
## p-value = 0.6799
## alternative hypothesis: two.sided

5.8.2

Using the read.csv function import the data frame for Table 5.3. Use the code below.

drugrx.df <- read.csv('~/Desktop/FALL 2017/CLASSES/PH 251D/Course Data/exported/jp-drugrx-p2.txt')

Analyze the jp-drugrx-p2.df data frame using the xtab function, your function from the previous problem, and any R functions you wish. Is the drug treatment effective? Is there a difference in drug treatment effectiveness comparing men and women?

xtabs(drugrx.df)
## , , Gender = Men
## 
##          Drug
## Recovered    No   Yes
##       No  23670  3393
##       Yes 91377  3321
## 
## , , Gender = Women
## 
##          Drug
## Recovered    No   Yes
##       No  17200 42884
##       Yes 29425 34080

You don’t know if the drug treatment is effective from using these analyses. This is because we don’t have enough information on how the study was conducted. We need to take into account causality and common sense.

5.8.3

Study this clean version of the Oswego data set and import it as a data frame: ~/git/phds/data/oswego/oswego2.txt

data.frame('~/Desktop/FALL 2017/CLASSES/PH 251D/Course Data/exported/oswego/oswego2.txt')
##   X...Desktop.FALL.2017.CLASSES.PH.251D.Course.Data.exported.oswego.oswego2.txt.
## 1    ~/Desktop/FALL 2017/CLASSES/PH 251D/Course Data/exported/oswego/oswego2.txt

Import this clean version.

oswego2 <- read.csv('~/Desktop/FALL 2017/CLASSES/PH 251D/Course Data/oswego/oswego2.txt', strip.white = TRUE)
food.item <- c("Baked.ham", "Spinach", "Mashed.potato", "Cabbage.salad", "Jello", 
    "Rolls", "Brown.bread", "Milk", "Coffee", "Water", "Cakes", "Vanilla.ice.cream", 
    "Chocolate.ice.cream", "Fruit.salad")
names(oswego2)[9:22] <- food.item
str(oswego2)
## 'data.frame':    75 obs. of  22 variables:
##  $ ID                 : int  2 3 4 6 7 8 9 10 14 16 ...
##  $ Age                : int  52 65 59 63 70 40 15 33 10 32 ...
##  $ Sex                : Factor w/ 2 levels "F","M": 1 2 1 1 2 1 1 1 2 1 ...
##  $ MealDate           : Factor w/ 1 level "04/18/1940": 1 1 1 1 1 1 1 1 1 1 ...
##  $ MealTime           : Factor w/ 6 levels "11:00:00","18:30:00",..: 5 2 2 4 4 4 6 3 4 NA ...
##  $ Ill                : Factor w/ 2 levels "N","Y": 2 2 2 2 2 2 2 2 2 2 ...
##  $ OnsetDate          : Factor w/ 2 levels "4/18/1940","4/19/1940": 2 2 2 1 1 2 2 1 2 2 ...
##  $ OnsetTime          : Factor w/ 17 levels "00:00:00","00:30:00",..: 2 2 2 15 15 4 3 16 4 7 ...
##  $ Baked.ham          : Factor w/ 2 levels "N","Y": 2 2 2 2 2 1 1 2 1 2 ...
##  $ Spinach            : Factor w/ 2 levels "N","Y": 2 2 2 2 2 1 1 2 1 2 ...
##  $ Mashed.potato      : Factor w/ 2 levels "N","Y": 2 2 1 1 2 1 1 2 1 1 ...
##  $ Cabbage.salad      : Factor w/ 2 levels "N","Y": 1 2 1 2 1 1 1 1 1 1 ...
##  $ Jello              : Factor w/ 2 levels "N","Y": 1 1 1 2 2 1 1 1 1 1 ...
##  $ Rolls              : Factor w/ 2 levels "N","Y": 2 1 1 1 2 1 1 2 1 2 ...
##  $ Brown.bread        : Factor w/ 2 levels "N","Y": 1 1 1 1 2 1 1 2 1 1 ...
##  $ Milk               : Factor w/ 2 levels "N","Y": 1 1 1 1 1 1 1 1 1 1 ...
##  $ Coffee             : Factor w/ 2 levels "N","Y": 2 2 2 1 2 1 1 1 1 2 ...
##  $ Water              : Factor w/ 2 levels "N","Y": 1 1 1 2 2 1 1 2 1 1 ...
##  $ Cakes              : Factor w/ 2 levels "N","Y": 1 1 2 1 1 1 2 1 1 2 ...
##  $ Vanilla.ice.cream  : Factor w/ 2 levels "N","Y": 2 2 2 2 2 2 1 2 2 2 ...
##  $ Chocolate.ice.cream: Factor w/ 2 levels "N","Y": 1 2 2 1 1 2 2 2 2 2 ...
##  $ Fruit.salad        : Factor w/ 2 levels "N","Y": 1 1 1 1 1 1 1 1 1 1 ...

Study the following epitable function.

epitable <- function(exposure, outcome, ...){
  #### Updated 2016-10-16
  #### 2x2 table from 'xtabs' function
  #### row names are exposure status
  #### col names are illness outcome
  #### ... = options to fisher.test functions
  x <- xtabs(~ exposure + outcome, na.action=na.pass)
  rowtot <- apply(x, 1, sum)
  rowdist <- sweep(x, 1, rowtot, '/')
  risk0 <- rowdist['N','Y']
  risk1 <- rowdist['Y','Y']
  rr <- risk1/risk0
  or <- (risk1/(1-risk1)) / (risk0/(1-risk0))
  ft <- fisher.test(x, ...)
  pv <- ft$p.value
  results <- unname(round(c(risk0, risk1, rr, or, pv), 2))
  names(results) <- c('R0', 'R1', 'RR', 'OR', 'p.value')
  list(data = x, results = results)
}

Here is an example of testing one food item.

epitable(oswego2$Baked.ham, oswego2$Ill)
## $data
##         outcome
## exposure  N  Y
##        N 12 17
##        Y 17 29
## 
## $results
##      R0      R1      RR      OR p.value 
##    0.59    0.63    1.08    1.20    0.81

Use the epitable function to test each food item to see if it is associated with developing illness. Construct a table with the following column headings: Food item, R0, R1, RR, OR, Fisher p value.

epitable(oswego2$Spinach, oswego2$Ill)
## $data
##         outcome
## exposure  N  Y
##        N 12 20
##        Y 17 26
## 
## $results
##      R0      R1      RR      OR p.value 
##    0.62    0.60    0.97    0.92    1.00
epitable(oswego2$Mashed.potato, oswego2$Ill)
## $data
##         outcome
## exposure  N  Y
##        N 14 23
##        Y 14 23
## 
## $results
##      R0      R1      RR      OR p.value 
##    0.62    0.62    1.00    1.00    1.00
epitable(oswego2$Cabbage.salad, oswego2$Ill)
## $data
##         outcome
## exposure  N  Y
##        N 19 28
##        Y 10 18
## 
## $results
##      R0      R1      RR      OR p.value 
##    0.60    0.64    1.08    1.22    0.81
epitable(oswego2$Jello, oswego2$Ill)
## $data
##         outcome
## exposure  N  Y
##        N 22 30
##        Y  7 16
## 
## $results
##      R0      R1      RR      OR p.value 
##    0.58    0.70    1.21    1.68    0.44
epitable(oswego2$Rolls, oswego2$Ill)
## $data
##         outcome
## exposure  N  Y
##        N 13 25
##        Y 16 21
## 
## $results
##      R0      R1      RR      OR p.value 
##    0.66    0.57    0.86    0.68    0.48
epitable(oswego2$Brown.bread, oswego2$Ill)
## $data
##         outcome
## exposure  N  Y
##        N 20 28
##        Y  9 18
## 
## $results
##      R0      R1      RR      OR p.value 
##    0.58    0.67    1.14    1.43    0.62
epitable(oswego2$Milk, oswego2$Ill)
## $data
##         outcome
## exposure  N  Y
##        N 27 44
##        Y  2  2
## 
## $results
##      R0      R1      RR      OR p.value 
##    0.62    0.50    0.81    0.61    0.64
epitable(oswego2$Coffee, oswego2$Ill)
## $data
##         outcome
## exposure  N  Y
##        N 17 27
##        Y 12 19
## 
## $results
##      R0      R1      RR      OR p.value 
##    0.61    0.61    1.00    1.00    1.00
epitable(oswego2$Water, oswego2$Ill)
## $data
##         outcome
## exposure  N  Y
##        N 18 33
##        Y 11 13
## 
## $results
##      R0      R1      RR      OR p.value 
##    0.65    0.54    0.84    0.64    0.45
epitable(oswego2$Cakes, oswego2$Ill)
## $data
##         outcome
## exposure  N  Y
##        N 16 19
##        Y 13 27
## 
## $results
##      R0      R1      RR      OR p.value 
##    0.54    0.68    1.24    1.75    0.34
epitable(oswego2$Vanilla.ice.cream, oswego2$Ill)
## $data
##         outcome
## exposure  N  Y
##        N 18  3
##        Y 11 43
## 
## $results
##      R0      R1      RR      OR p.value 
##    0.14    0.80    5.57   23.45    0.00
epitable(oswego2$Chocolate.ice.cream, oswego2$Ill)
## $data
##         outcome
## exposure  N  Y
##        N  7 20
##        Y 22 25
## 
## $results
##      R0      R1      RR      OR p.value 
##    0.74    0.53    0.72    0.40    0.09
epitable(oswego2$Fruit.salad, oswego2$Ill)
## $data
##         outcome
## exposure  N  Y
##        N 27 42
##        Y  2  4
## 
## $results
##      R0      R1      RR      OR p.value 
##    0.61    0.67    1.10    1.29    1.00