Example

Out of 350 patients that adhered to a new drug treatment, 273 recovered.The response proportion was 273/350 or 0.78 (78%). We can use the rbinom function to simulate this binomial process. Let’s simulate this 10 times (i.e., 10 replicates).

replicates <- 10
num_of_events <- 273
num_at_risk <- 350
risk <- num_of_events/num_at_risk
sim_events <- rbinom(n = replicates, size = num_at_risk, prob = risk)
sim_events

##  [1] 288 271 266 272 262 262 262 277 268 270

Exercise 5.1 (Bootstrap a confidence interval). In the same study above, 350 patients did not take the new drug treatment, and 289 recovered. Calculate the bootstrap confidence interval for this proportion (289/350).

replicates <- 5000
num_of_events <- 289
num_at_risk <- 350
risk <- num_of_events/num_at_risk
sim_events <- rbinom(n = replicates, size = num_at_risk, prob = risk)
conf.level <- 0.95
alpha <- 1 - conf.level
## 0.95 CI for binomial counts
quantile(sim_events, c(alpha/2, 1-alpha/2))

##  2.5% 97.5% 
##   275   303

## 0.95 CI for binomial proportions
quantile(sim_events, c(alpha/2, 1-alpha/2))/num_at_risk

##      2.5%     97.5% 
## 0.7857143 0.8657143

Exercise 5.2 (Function for bootstrapping a confidence interval). Adapt your code from the previous problem to write and test a function for calculating the bootstrap confidence interval for a binomial proportion.

boots <- function(x1, x0, conf.level = 0.95, replicates = 5000){
#### x1 = c(number cases among exposed, number of exposed)
#### x0 = c(number cases among nonexposed, number of nonexposed)
n1 <- x1[2] 
n0 <- x0[2]
p1 <- x1/n1 # risk among exposed
p0 <- x0/n0 # risk among unexposed
risk <- num_of_events/num_at_risk
sim_events <- rbinom(n = replicates, size = num_at_risk, prob = risk)
alpha <- 1 - conf.level
## 0.95 CI for binomial counts
quantile(sim_events, c(alpha/2, 1-alpha/2))
ci<-quantile(sim_events, c(alpha/2, 1-alpha/2))/num_at_risk
list(
conf.int = unname(ci),
conf.level = conf.level,
replicates = replicates)
}
## test
x1<-289
x0<-350
boots(x1, x0, conf.level = 0.95, replicates = 5000)

## $conf.int
## [1] 0.7857143 0.8657143
## 
## $conf.level
## [1] 0.95
## 
## $replicates
## [1] 5000

Exercise 5.3 (Calculating risks and risk ratio). Consider an observational study where 700 patients were given access to a new drug for an ailment. A total of 350 patients chose to take the drug and 350 patients did not. Table 5.4 displays the results of this study.

Using the results from Table 5.4 create and test a function that:

Calculates the risks 𝑅1 and 𝑅0

Calculates the risk ratio (𝑅𝑅)

Calculates 95% confidence intervals

Uses the fisher.test function (Fisher’s Exact Test) to calculate the twosided 𝑝-value of the null hypothesis of no association. (Hint: see epitable function in Exercise 5.5)

Collects and returns all the results as a ‘list’.

Tab5.4 <- matrix(c(273,289,77, 61), 2, 2)
rownames(Tab5.4) <- c('Yes', 'No')
colnames(Tab5.4) <- c('Yes', 'No')
dimnames(Tab5.4)<- list( Treatment= c('Yes','No'), Recovery= c('Yes','No') )
epitable <- function(a, b, c, d, digits = 3, ...){
N1=a+c
N0=b+d
R0 <- b/N0
R1 <- a/N1
RR <- R1/R0
OR <- (R1/(1-R1)) / (R0/(1-R0))
ft <- fisher.test(Tab5.4,simulate.p.value=TRUE,B=2000,...)
pv <- ft$p.value
SE.RR<-sqrt((1/a)-(1/N1)+(1/b)-(1/N0))
conf.int<-c(RR-1.96*SE.RR, RR+1.96*SE.RR)
results <- signif(c(R0, R1, RR, OR, pv, conf.int), digits = digits)
names(results) <- c('R_0', 'R_1', 'RR', 'OR', 'p.value', '95 lower ci', '95 upper ci')
list(data = Tab5.4, results = results)
}
epitable(273,289, 77, 61)

## $data
##          Recovery
## Treatment Yes No
##       Yes 273 77
##       No  289 61
## 
## $results
##         R_0         R_1          RR          OR     p.value 95 lower ci 
##       0.826       0.780       0.945       0.748       0.154       0.871 
## 95 upper ci 
##       1.020

fisher.test(Tab5.4)

## 
##  Fisher's Exact Test for Count Data
## 
## data:  Tab5.4
## p-value = 0.154
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
##  0.504886 1.106634
## sample estimates:
## odds ratio 
##  0.7486641

Exercise 5.4 (Evaluating drug treatment). Consider an observational study where 700 patients were given access to a new drug for an ailment. A total of 350 patients chose to take the drug and 350 patients did not. Table 5.4 displays the results of this study.Using the read.csv function import the data frame for Table 5.4. Use the code below.

Analyze the drugrx.df data frame using the xtab function, your function from the previous problem, and any R functions you wish. Is the drug treatment effective? Is there a difference in drug treatment effectiveness comparing men and women?

drugrx.df<- read.csv("/Users/dr.auwal/Desktop/Personals/UC Berkeley/Classes/PB HLTH 251D Applied Epidemiology Using R/Practice/data/drugrx-pearl2.csv")
xTab5.4<- xtabs(~ Drug + Recovered, data = drugrx.df)
fisher.test(xTab5.4)

## 
##  Fisher's Exact Test for Count Data
## 
## data:  xTab5.4
## p-value = 0.154
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
##  0.504886 1.106634
## sample estimates:
## odds ratio 
##  0.7486641

Statistically, it seems the drug is not effective. Since the drug is not effective, its effectiveness comparing both men and women will not be appreciated

Exercise 5.5 (Oswego outbreak investigation). Study this clean version of the Oswego data set and import it as a data frame: ~/data/oswego/oswego2.txt. Remember this data set is available from https://github.com/taragonmd/data. Locate the data set and select the “Raw” tab. Copy the URL and use with the read.csv function. Import this clean version.

oswego2 <- read.csv("https://raw.githubusercontent.com/taragonmd/data/master/oswego/oswego2.txt", strip.white = TRUE)
food.item <- c("Baked.ham", "Spinach", "Mashed.potato", "Cabbage.salad",
"Jello", "Rolls", "Brown.bread", "Milk", "Coffee", "Water", "Cakes",
"Vanilla.ice.cream", "Chocolate.ice.cream", "Fruit.salad")
names(oswego2)[9:22] <- food.item
str(oswego2)

## 'data.frame':    75 obs. of  22 variables:
##  $ ID                 : int  2 3 4 6 7 8 9 10 14 16 ...
##  $ Age                : int  52 65 59 63 70 40 15 33 10 32 ...
##  $ Sex                : Factor w/ 2 levels "F","M": 1 2 1 1 2 1 1 1 2 1 ...
##  $ MealDate           : Factor w/ 1 level "04/18/1940": 1 1 1 1 1 1 1 1 1 1 ...
##  $ MealTime           : Factor w/ 6 levels "11:00:00","18:30:00",..: 5 2 2 4 4 4 6 3 4 NA ...
##  $ Ill                : Factor w/ 2 levels "N","Y": 2 2 2 2 2 2 2 2 2 2 ...
##  $ OnsetDate          : Factor w/ 2 levels "4/18/1940","4/19/1940": 2 2 2 1 1 2 2 1 2 2 ...
##  $ OnsetTime          : Factor w/ 17 levels "00:00:00","00:30:00",..: 2 2 2 15 15 4 3 16 4 7 ...
##  $ Baked.ham          : Factor w/ 2 levels "N","Y": 2 2 2 2 2 1 1 2 1 2 ...
##  $ Spinach            : Factor w/ 2 levels "N","Y": 2 2 2 2 2 1 1 2 1 2 ...
##  $ Mashed.potato      : Factor w/ 2 levels "N","Y": 2 2 1 1 2 1 1 2 1 1 ...
##  $ Cabbage.salad      : Factor w/ 2 levels "N","Y": 1 2 1 2 1 1 1 1 1 1 ...
##  $ Jello              : Factor w/ 2 levels "N","Y": 1 1 1 2 2 1 1 1 1 1 ...
##  $ Rolls              : Factor w/ 2 levels "N","Y": 2 1 1 1 2 1 1 2 1 2 ...
##  $ Brown.bread        : Factor w/ 2 levels "N","Y": 1 1 1 1 2 1 1 2 1 1 ...
##  $ Milk               : Factor w/ 2 levels "N","Y": 1 1 1 1 1 1 1 1 1 1 ...
##  $ Coffee             : Factor w/ 2 levels "N","Y": 2 2 2 1 2 1 1 1 1 2 ...
##  $ Water              : Factor w/ 2 levels "N","Y": 1 1 1 2 2 1 1 2 1 1 ...
##  $ Cakes              : Factor w/ 2 levels "N","Y": 1 1 2 1 1 1 2 1 1 2 ...
##  $ Vanilla.ice.cream  : Factor w/ 2 levels "N","Y": 2 2 2 2 2 2 1 2 2 2 ...
##  $ Chocolate.ice.cream: Factor w/ 2 levels "N","Y": 1 2 2 1 1 2 2 2 2 2 ...
##  $ Fruit.salad        : Factor w/ 2 levels "N","Y": 1 1 1 1 1 1 1 1 1 1 ...

Study the following epitable function, especially the use of the … argument to past options to the fisher.test function.

epitable <- function(exposure, outcome, digits = 3, ...){
# 2x2 table from 'xtabs' function
# row names are exposure status
# col names are illness outcome
# digits = significant digits (default = 3)
# ... = options to fisher.test function
x <- xtabs(~ exposure + outcome, na.action=na.pass)
rowtot <- apply(x, 1, sum)
rowdist <- sweep(x, 1, rowtot, '/')
risk0 <- rowdist['N','Y']
risk1 <- rowdist['Y','Y']
rr <- risk1/risk0
or <- (risk1/(1-risk1)) / (risk0/(1-risk0))
ft <- fisher.test(x, ...)
pv <- ft$p.value
results <- signif(c(risk0, risk1, rr, or, pv), digits = digits)
names(results) <- c('R_0', 'R_1', 'RR', 'OR', 'p.value')
list(data = x, results = results)
}

Use the epitable function to test each food item to see if it is associated with developing illness. Construct a table with the following column headings

epitable(oswego2$Baked.ham, oswego2$Ill)

## $data
##         outcome
## exposure  N  Y
##        N 12 17
##        Y 17 29
## 
## $results
##     R_0     R_1      RR      OR p.value 
##   0.586   0.630   1.080   1.200   0.809

epitable(oswego2$Spinach, oswego2$Ill)

## $data
##         outcome
## exposure  N  Y
##        N 12 20
##        Y 17 26
## 
## $results
##     R_0     R_1      RR      OR p.value 
##   0.625   0.605   0.967   0.918   1.000

epitable(oswego2$Mashed.potato, oswego2$Ill)

## $data
##         outcome
## exposure  N  Y
##        N 14 23
##        Y 14 23
## 
## $results
##     R_0     R_1      RR      OR p.value 
##   0.622   0.622   1.000   1.000   1.000

epitable(oswego2$Cabbage.salad, oswego2$Ill)

## $data
##         outcome
## exposure  N  Y
##        N 19 28
##        Y 10 18
## 
## $results
##     R_0     R_1      RR      OR p.value 
##   0.596   0.643   1.080   1.220   0.808

epitable(oswego2$Jello, oswego2$Ill)

## $data
##         outcome
## exposure  N  Y
##        N 22 30
##        Y  7 16
## 
## $results
##     R_0     R_1      RR      OR p.value 
##   0.577   0.696   1.210   1.680   0.442

epitable(oswego2$Rolls, oswego2$Ill)

## $data
##         outcome
## exposure  N  Y
##        N 13 25
##        Y 16 21
## 
## $results
##     R_0     R_1      RR      OR p.value 
##   0.658   0.568   0.863   0.682   0.482

epitable(oswego2$Brown.bread, oswego2$Ill)

## $data
##         outcome
## exposure  N  Y
##        N 20 28
##        Y  9 18
## 
## $results
##     R_0     R_1      RR      OR p.value 
##   0.583   0.667   1.140   1.430   0.622

epitable(oswego2$Milk, oswego2$Ill)

## $data
##         outcome
## exposure  N  Y
##        N 27 44
##        Y  2  2
## 
## $results
##     R_0     R_1      RR      OR p.value 
##   0.620   0.500   0.807   0.614   0.638

epitable(oswego2$Coffee, oswego2$Ill)

## $data
##         outcome
## exposure  N  Y
##        N 17 27
##        Y 12 19
## 
## $results
##     R_0     R_1      RR      OR p.value 
##   0.614   0.613   0.999   0.997   1.000

epitable(oswego2$Water, oswego2$Ill)

## $data
##         outcome
## exposure  N  Y
##        N 18 33
##        Y 11 13
## 
## $results
##     R_0     R_1      RR      OR p.value 
##   0.647   0.542   0.837   0.645   0.450

epitable(oswego2$Cakes, oswego2$Ill)

## $data
##         outcome
## exposure  N  Y
##        N 16 19
##        Y 13 27
## 
## $results
##     R_0     R_1      RR      OR p.value 
##   0.543   0.675   1.240   1.750   0.342

epitable(oswego2$Vanilla.ice.cream, oswego2$Ill)

## $data
##         outcome
## exposure  N  Y
##        N 18  3
##        Y 11 43
## 
## $results
##      R_0      R_1       RR       OR  p.value 
## 1.43e-01 7.96e-01 5.57e+00 2.35e+01 2.60e-07

epitable(oswego2$Chocolate.ice.cream, oswego2$Ill)

## $data
##         outcome
## exposure  N  Y
##        N  7 20
##        Y 22 25
## 
## $results
##     R_0     R_1      RR      OR p.value 
##  0.7410  0.5320  0.7180  0.3980  0.0891

epitable(oswego2$Fruit.salad, oswego2$Ill)

## $data
##         outcome
## exposure  N  Y
##        N 27 42
##        Y  2  4
## 
## $results
##     R_0     R_1      RR      OR p.value 
##   0.609   0.667   1.100   1.290   1.000

The Table

Table <- function(x) {
epitable(x, oswego2$Ill)$results
}
t(sapply(oswego2[, food.item], Table))

##                       R_0   R_1    RR     OR  p.value
## Baked.ham           0.586 0.630 1.080  1.200 8.09e-01
## Spinach             0.625 0.605 0.967  0.918 1.00e+00
## Mashed.potato       0.622 0.622 1.000  1.000 1.00e+00
## Cabbage.salad       0.596 0.643 1.080  1.220 8.08e-01
## Jello               0.577 0.696 1.210  1.680 4.42e-01
## Rolls               0.658 0.568 0.863  0.682 4.82e-01
## Brown.bread         0.583 0.667 1.140  1.430 6.22e-01
## Milk                0.620 0.500 0.807  0.614 6.38e-01
## Coffee              0.614 0.613 0.999  0.997 1.00e+00
## Water               0.647 0.542 0.837  0.645 4.50e-01
## Cakes               0.543 0.675 1.240  1.750 3.42e-01
## Vanilla.ice.cream   0.143 0.796 5.570 23.500 2.60e-07
## Chocolate.ice.cream 0.741 0.532 0.718  0.398 8.91e-02
## Fruit.salad         0.609 0.667 1.100  1.290 1.00e+00

From the list of results, only Vanilla ice cream’s p-value is significant. Therefore Vanilla ice cream may be associated with illness

Exercise 5.6 (Generate outbreak investigation results table). Write a for loop to automate the creation of the outbreak table from previous exercise. Hint provided:

outbreak <- matrix(NA, length(food.item), 5)
rownames(outbreak) <- food.item
colnames(outbreak) <- c('R_0','R_1','RR','OR','P value')
for(i in 1:length(food.item)) {
outbreak[i,] <- epitable(oswego2[,food.item[i]], oswego2$Ill)$results
}
outbreak

##                       R_0   R_1    RR     OR  P value
## Baked.ham           0.586 0.630 1.080  1.200 8.09e-01
## Spinach             0.625 0.605 0.967  0.918 1.00e+00
## Mashed.potato       0.622 0.622 1.000  1.000 1.00e+00
## Cabbage.salad       0.596 0.643 1.080  1.220 8.08e-01
## Jello               0.577 0.696 1.210  1.680 4.42e-01
## Rolls               0.658 0.568 0.863  0.682 4.82e-01
## Brown.bread         0.583 0.667 1.140  1.430 6.22e-01
## Milk                0.620 0.500 0.807  0.614 6.38e-01
## Coffee              0.614 0.613 0.999  0.997 1.00e+00
## Water               0.647 0.542 0.837  0.645 4.50e-01
## Cakes               0.543 0.675 1.240  1.750 3.42e-01
## Vanilla.ice.cream   0.143 0.796 5.570 23.500 2.60e-07
## Chocolate.ice.cream 0.741 0.532 0.718  0.398 8.91e-02
## Fruit.salad         0.609 0.667 1.100  1.290 1.00e+00

Exercise 5.7 (Comparing lapply and sapply). Study the following use of lapply and sapply. Explain how both functions work, and what are their differences.

myepitab <- function(x) {
epitable(x, oswego2$Ill)$results
}
lapply(oswego2[, food.item], myepitab)

## $Baked.ham
##     R_0     R_1      RR      OR p.value 
##   0.586   0.630   1.080   1.200   0.809 
## 
## $Spinach
##     R_0     R_1      RR      OR p.value 
##   0.625   0.605   0.967   0.918   1.000 
## 
## $Mashed.potato
##     R_0     R_1      RR      OR p.value 
##   0.622   0.622   1.000   1.000   1.000 
## 
## $Cabbage.salad
##     R_0     R_1      RR      OR p.value 
##   0.596   0.643   1.080   1.220   0.808 
## 
## $Jello
##     R_0     R_1      RR      OR p.value 
##   0.577   0.696   1.210   1.680   0.442 
## 
## $Rolls
##     R_0     R_1      RR      OR p.value 
##   0.658   0.568   0.863   0.682   0.482 
## 
## $Brown.bread
##     R_0     R_1      RR      OR p.value 
##   0.583   0.667   1.140   1.430   0.622 
## 
## $Milk
##     R_0     R_1      RR      OR p.value 
##   0.620   0.500   0.807   0.614   0.638 
## 
## $Coffee
##     R_0     R_1      RR      OR p.value 
##   0.614   0.613   0.999   0.997   1.000 
## 
## $Water
##     R_0     R_1      RR      OR p.value 
##   0.647   0.542   0.837   0.645   0.450 
## 
## $Cakes
##     R_0     R_1      RR      OR p.value 
##   0.543   0.675   1.240   1.750   0.342 
## 
## $Vanilla.ice.cream
##      R_0      R_1       RR       OR  p.value 
## 1.43e-01 7.96e-01 5.57e+00 2.35e+01 2.60e-07 
## 
## $Chocolate.ice.cream
##     R_0     R_1      RR      OR p.value 
##  0.7410  0.5320  0.7180  0.3980  0.0891 
## 
## $Fruit.salad
##     R_0     R_1      RR      OR p.value 
##   0.609   0.667   1.100   1.290   1.000

t(sapply(oswego2[, food.item], myepitab)) # transpose output

##                       R_0   R_1    RR     OR  p.value
## Baked.ham           0.586 0.630 1.080  1.200 8.09e-01
## Spinach             0.625 0.605 0.967  0.918 1.00e+00
## Mashed.potato       0.622 0.622 1.000  1.000 1.00e+00
## Cabbage.salad       0.596 0.643 1.080  1.220 8.08e-01
## Jello               0.577 0.696 1.210  1.680 4.42e-01
## Rolls               0.658 0.568 0.863  0.682 4.82e-01
## Brown.bread         0.583 0.667 1.140  1.430 6.22e-01
## Milk                0.620 0.500 0.807  0.614 6.38e-01
## Coffee              0.614 0.613 0.999  0.997 1.00e+00
## Water               0.647 0.542 0.837  0.645 4.50e-01
## Cakes               0.543 0.675 1.240  1.750 3.42e-01
## Vanilla.ice.cream   0.143 0.796 5.570 23.500 2.60e-07
## Chocolate.ice.cream 0.741 0.532 0.718  0.398 8.91e-02
## Fruit.salad         0.609 0.667 1.100  1.290 1.00e+00

Both lapply and sapply are used to apply function to each component (bin) of a list. They are identical except that sapply “simplifies” the final result and lapply returns list

Exercise 5.8 (Lexical scoping). Study the following R code and explain the final answer for g(2):

a <- 1
b <- 2
f <- function(x) {
a * x + b
}
g <- function(x) {
a <- 2
b <- 1
f(x)
}
g(2)

## [1] 4

The values executed by g(x) is predefined by the function of ‘f’ and R executes the function defined before ‘f’. Although the functions are defined before ‘g’, R doesn’t consider the position relative to ‘g’. Since a <- 1, b <- 2 was defined before ‘f’, hence R works with this code. Therefore a * x + b= 1*2+2=4.

Exercise 5.9 (More lexical scoping). Study the following R code and explain the final answer for g(2):

g <- function(x) {
a <- 2
b <- 1
f <- function(x) {
a * x + b
}
f(x)
}
g(2)

## [1] 5

Homework 5

Auwal

11/11/2019

Example

Out of 350 patients that adhered to a new drug treatment, 273 recovered.The response proportion was 273/350 or 0.78 (78%). We can use the rbinom function to simulate this binomial process. Let’s simulate this 10 times (i.e., 10 replicates).

Exercise 5.1 (Bootstrap a confidence interval). In the same study above, 350 patients did not take the new drug treatment, and 289 recovered. Calculate the bootstrap confidence interval for this proportion (289/350).

Exercise 5.2 (Function for bootstrapping a confidence interval). Adapt your code from the previous problem to write and test a function for calculating the bootstrap confidence interval for a binomial proportion.

Exercise 5.3 (Calculating risks and risk ratio). Consider an observational study where 700 patients were given access to a new drug for an ailment. A total of 350 patients chose to take the drug and 350 patients did not. Table 5.4 displays the results of this study.

Using the results from Table 5.4 create and test a function that:

Calculates the risks 𝑅1 and 𝑅0

Calculates the risk ratio (𝑅𝑅)

Calculates 95% confidence intervals

Uses the fisher.test function (Fisher’s Exact Test) to calculate the twosided 𝑝-value of the null hypothesis of no association. (Hint: see epitable function in Exercise 5.5)

Collects and returns all the results as a ‘list’.

Analyze the drugrx.df data frame using the xtab function, your function from the previous problem, and any R functions you wish. Is the drug treatment effective? Is there a difference in drug treatment effectiveness comparing men and women?

Statistically, it seems the drug is not effective. Since the drug is not effective, its effectiveness comparing both men and women will not be appreciated

Study the following epitable function, especially the use of the … argument to past options to the fisher.test function.

Use the epitable function to test each food item to see if it is associated with developing illness. Construct a table with the following column headings

The Table

From the list of results, only Vanilla ice cream’s p-value is significant. Therefore Vanilla ice cream may be associated with illness

Exercise 5.6 (Generate outbreak investigation results table). Write a for loop to automate the creation of the outbreak table from previous exercise. Hint provided:

Exercise 5.7 (Comparing lapply and sapply). Study the following use of lapply and sapply. Explain how both functions work, and what are their differences.

Both lapply and sapply are used to apply function to each component (bin) of a list. They are identical except that sapply “simplifies” the final result and lapply returns list

Exercise 5.8 (Lexical scoping). Study the following R code and explain the final answer for g(2):

Exercise 5.9 (More lexical scoping). Study the following R code and explain the final answer for g(2):

Homework 5

Auwal

11/11/2019

Example

Out of 350 patients that adhered to a new drug treatment, 273 recovered.The response proportion was 273/350 or 0.78 (78%). We can use the rbinom function to simulate this binomial process. Let’s simulate this 10 times (i.e., 10 replicates).

Exercise 5.1 (Bootstrap a confidence interval). In the same study above, 350 patients did not take the new drug treatment, and 289 recovered. Calculate the bootstrap confidence interval for this proportion (289/350).

Exercise 5.2 (Function for bootstrapping a confidence interval). Adapt your code from the previous problem to write and test a function for calculating the bootstrap confidence interval for a binomial proportion.

Exercise 5.3 (Calculating risks and risk ratio). Consider an observational study where 700 patients were given access to a new drug for an ailment. A total of 350 patients chose to take the drug and 350 patients did not. Table 5.4 displays the results of this study.

Using the results from Table 5.4 create and test a function that:

Calculates the risks 𝑅1 and 𝑅0

Calculates the risk ratio (𝑅𝑅)

Calculates 95% confidence intervals

Uses the fisher.test function (Fisher’s Exact Test) to calculate the twosided 𝑝-value of the null hypothesis of no association. (Hint: see epitable function in Exercise 5.5)

Collects and returns all the results as a ‘list’.

Analyze the drugrx.df data frame using the xtab function, your function from the previous problem, and any R functions you wish. Is the drug treatment effective? Is there a difference in drug treatment effectiveness comparing men and women?

Statistically, it seems the drug is not effective. Since the drug is not effective, its effectiveness comparing both men and women will not be appreciated

Study the following epitable function, especially the use of the … argument to past options to the fisher.test function.

Use the epitable function to test each food item to see if it is associated with developing illness. Construct a table with the following column headings

The Table

From the list of results, only Vanilla ice cream’s p-value is significant. Therefore Vanilla ice cream may be associated with illness

Exercise 5.6 (Generate outbreak investigation results table). Write a for loop to automate the creation of the outbreak table from previous exercise. Hint provided:

Exercise 5.7 (Comparing lapply and sapply). Study the following use of lapply and sapply. Explain how both functions work, and what are their differences.

Both lapply and sapply are used to apply function to each component (bin) of a list. They are identical except that sapply “simplifies” the final result and lapply returns list

Exercise 5.8 (Lexical scoping). Study the following R code and explain the final answer for g(2):

Exercise 5.9 (More lexical scoping). Study the following R code and explain the final answer for g(2):

This time, the functions are defined after ‘g’. Since the position related to ‘g’ doesn’t count, R only executes the funtion defined before of ‘f’. therefore, R executes a <- 2, b <- 1. a * x + b= 2*2+1= 5