R tutorial using epitools: Confounding and Effect Modification

Overview

epitools is an R package that allows you to perform basic epidemiologic calculations such as the risk ratio (RR) and odds ratio (OR).

Documentation for epitools is available in the package's reference manual on CRAN.

In this example, we will go over how to use epitools to estimate the risk ratio and odds ratio using a matrix that we create. This is useful for when you have a table (e.g., 2 x 2 contingency table) and want to check the risk ratio or odds ratio.

Here are some examples of using epitools.

Step 1: We create a matrix using some values

##########################################
### Estimate the risk and odds ratios
##########################################
# Step 1: Load epitools, then create a matrix
library(epitools)
Table1 <- matrix(c(11, 36, 518, 517), nrow = 2, ncol = 2)
Table1

##      [,1] [,2]
## [1,]   11  518
## [2,]   36  517

This generates the 2 x 2 contingency table that we will use for this simple example. The arrangement is critical for interpreting the output. By default, epitools expects the unexposed group (exposure = 0) in the first row and the non-outcome (outcome = 0) in the first column. However, you can use the `rev` argument of the epitools functions to reverse the rows, the columns, or both, so that the analysis can use the contingency table on the right, where the exposed group (exposure = 1) is in the first row and the outcome (outcome = 1) is in the first column.

Here is an example of the arrangement:

Throughout this exercise, I will interpret the findings using the arrangement on the right.
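The arrangement can be made explicit by labelling the matrix. The sketch below uses base R only. The dimension labels are my own, chosen to mirror the names that epitools prints in its `$data` output, and the manual reversal of rows and columns is an assumption about what `rev = c("both")` amounts to for a 2 x 2 table.

```r
# Label the 2 x 2 table so the layout is explicit (labels are illustrative)
Table1_labelled <- matrix(
  c(11, 36, 518, 517), nrow = 2, ncol = 2,
  dimnames = list(
    Predictor = c("Exposed1", "Exposed2"),
    Outcome   = c("Disease1", "Disease2")
  )
)

# Reverse the order of both the rows and the columns by hand
Table1_reversed <- Table1_labelled[2:1, 2:1]
Table1_reversed
```

After the reversal, the reference group (Exposed2) sits in the first row and the non-outcome (Disease2) sits in the first column.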

Step 2: We use the `riskratio.wald()` function to estimate the risk ratio for our object (`Table1`)

# Step 2: Estimate the RR
riskratio.wald(Table1, rev = c("both")) # rev = "both" reverses the rows and columns of the contingency table

## $data
##           Outcome
## Predictor  Disease2 Disease1 Total
##   Exposed2      517       36   553
##   Exposed1      518       11   529
##   Total        1035       47  1082
## 
## $measure
##           risk ratio with 95% C.I.
## Predictor   estimate     lower     upper
##   Exposed2 1.0000000        NA        NA
##   Exposed1 0.3194182 0.1643304 0.6208709
## 
## $p.value
##           two-sided
## Predictor    midp.exact fisher.exact   chi.square
##   Exposed2           NA           NA           NA
##   Exposed1 0.0002954494 0.0003001641 0.0003517007
## 
## $correction
## [1] FALSE
## 
## attr(,"method")
## [1] "Unconditional MLE & normal approximation (Wald) CI"

The risk ratio is 0.32 with a 95% confidence interval (CI) of 0.16, 0.62. The p-value is <0.001. Therefore, subjects in the Exposure Group 1 had a 68% lower risk of developing the outcome (Outcome = 1) compared to subjects in the Exposure Group 2.
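To see where the Wald interval comes from, it can be reproduced by hand on the log scale. This is a minimal base R sketch, assuming the usual large-sample standard error for log(RR); the variable names are mine and the counts are read from the `$data` output above.

```r
# Counts taken from the $data output above
events_exposed   <- 11    # Exposed1 with the outcome
total_exposed    <- 529   # Exposed1 total
events_reference <- 36    # Exposed2 (reference) with the outcome
total_reference  <- 553   # Exposed2 total

rr_est <- (events_exposed / total_exposed) / (events_reference / total_reference)

# Standard error of log(RR) under the normal approximation
se_log_rr <- sqrt(1 / events_exposed - 1 / total_exposed +
                  1 / events_reference - 1 / total_reference)
z <- qnorm(0.975)

rr_ci <- c(estimate = rr_est,
           lower    = exp(log(rr_est) - z * se_log_rr),
           upper    = exp(log(rr_est) + z * se_log_rr))
round(rr_ci, 4)
```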

Step 3: We use the `oddsratio.wald()` function to estimate the odds ratio for our object (`Table1`)

# Step 3: Estimate the OR
oddsratio.wald(Table1, rev = c("both"))

## $data
##           Outcome
## Predictor  Disease2 Disease1 Total
##   Exposed2      517       36   553
##   Exposed1      518       11   529
##   Total        1035       47  1082
## 
## $measure
##           odds ratio with 95% C.I.
## Predictor   estimate     lower     upper
##   Exposed2 1.0000000        NA        NA
##   Exposed1 0.3049657 0.1535563 0.6056675
## 
## $p.value
##           two-sided
## Predictor    midp.exact fisher.exact   chi.square
##   Exposed2           NA           NA           NA
##   Exposed1 0.0002954494 0.0003001641 0.0003517007
## 
## $correction
## [1] FALSE
## 
## attr(,"method")
## [1] "Unconditional MLE & normal approximation (Wald) CI"

The odds ratio is 0.30 with a 95% CI of 0.15, 0.61. The p-value is <0.001. Therefore, subjects in Exposure Group 1 had 70% lower odds of developing the outcome (Outcome = 1) compared to subjects in Exposure Group 2.
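The Wald interval for the odds ratio can be checked the same way. The sketch below is base R only and assumes the standard large-sample standard error for log(OR), the square root of the sum of the reciprocal cell counts; the variable names are illustrative.

```r
# Cell counts from the $data output above
exposed_events     <- 11    # Exposed1, outcome
exposed_nonevents  <- 518   # Exposed1, no outcome
reference_events   <- 36    # Exposed2, outcome
reference_nonevents <- 517  # Exposed2, no outcome

or_est <- (exposed_events * reference_nonevents) /
          (exposed_nonevents * reference_events)

# Standard error of log(OR) under the normal approximation
se_log_or <- sqrt(1 / exposed_events + 1 / exposed_nonevents +
                  1 / reference_events + 1 / reference_nonevents)
z <- qnorm(0.975)

or_ci <- c(estimate = or_est,
           lower    = exp(log(or_est) - z * se_log_or),
           upper    = exp(log(or_est) + z * se_log_or))
round(or_ci, 4)
```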

Once you’ve estimated the risk ratio and odds ratio, you can double check your work. Here is how I checked the risk and odds ratio calculation.

### RR check
risk1 <- 11 / (11 + 518)   # Exposed1: 11 events among 529 subjects
risk2 <- 36 / (36 + 517)   # Exposed2: 36 events among 553 subjects
RR <- risk1 / risk2
RR

## [1] 0.3194182

### OR check
num <- 11 * 517
denom <- 518 * 36
OR <- num / denom
OR

## [1] 0.3049657

Unfortunately, epitools does not seem to estimate risk differences, so I show how you can do that using some simple R code.

# Step 4: Estimate the risk difference
risk1 <- 11 / (518 + 11)
risk2 <- 36 / (517 + 36)
RD <- risk1 - risk2
RD

## [1] -0.04430551
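If an interval around the risk difference is also wanted, one option is a simple Wald interval. This is a sketch under the assumption that the normal approximation for a difference of two independent proportions is acceptable here; it is not something epitools produced, and the variable names are mine.

```r
n_exposed   <- 518 + 11   # Exposure Group 1 total
n_reference <- 517 + 36   # Exposure Group 2 total

risk_exposed   <- 11 / n_exposed
risk_reference <- 36 / n_reference
rd_est <- risk_exposed - risk_reference

# Wald standard error for a difference of two independent proportions
se_rd <- sqrt(risk_exposed * (1 - risk_exposed) / n_exposed +
              risk_reference * (1 - risk_reference) / n_reference)
z <- qnorm(0.975)

rd_ci <- c(estimate = rd_est,
           lower    = rd_est - z * se_rd,
           upper    = rd_est + z * se_rd)
round(rd_ci, 4)
```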

Motivating example

Suppose we were interested in figuring out whether confounding or effect modification was present in a hypothetical drug study. In this drug study, patients have either high or low cholesterol. These patients were followed for 12 months and assessed for all-cause mortality (i.e., death).

Here is a directed acyclic graph depicting Cholesterol and Death with Exercise as a potential confounder.

The RR and OR in this example are crude estimates, which means that we did not adjust for confounding. Let’s assume a retrospective cohort study was performed to collect data to investigate the association between cholesterol status (high versus low) and all-cause mortality. The data are aggregated into a contingency table.

We can enter the values from the table into R as a matrix

##################################################################################
# Motivating Example #1 [Does High cholesterol (v. Low cholesterol) cause Death?]
##################################################################################
Table2 <- matrix(c(250, 150, 2000, 1500), nrow = 2, ncol = 2)
Table2

##      [,1] [,2]
## [1,]  250 2000
## [2,]  150 1500

Then we can estimate the risk ratio (RR) and odds ratio (OR) using the riskratio.wald() and oddsratio.wald() functions. We will use the Wald method for estimating the 95% CIs.

Throughout the exercise, I will present the findings in both the RR and OR. Recall that with rare events, the RR and OR yield similar results. However, with more common events, the RR and OR will diverge.
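The point about rare versus common events can be illustrated with a short sketch. The helper `rr_and_or()` and the four risks are hypothetical values chosen only for illustration; they are not taken from the study data.

```r
# Hypothetical helper: RR and OR from the risk in each group
rr_and_or <- function(risk_exposed, risk_unexposed) {
  c(RR = risk_exposed / risk_unexposed,
    OR = (risk_exposed / (1 - risk_exposed)) /
         (risk_unexposed / (1 - risk_unexposed)))
}

rare_outcome   <- rr_and_or(risk_exposed = 0.02, risk_unexposed = 0.01)
common_outcome <- rr_and_or(risk_exposed = 0.60, risk_unexposed = 0.30)

round(rbind(rare_outcome, common_outcome), 2)
```

In both scenarios the risk ratio is 2, but the odds ratio stays close to 2 only when the outcome is rare.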

riskratio.wald(Table2, rev = c("both"))

## $data
##           Outcome
## Predictor  Disease2 Disease1 Total
##   Exposed2     1500      150  1650
##   Exposed1     2000      250  2250
##   Total        3500      400  3900
## 
## $measure
##           risk ratio with 95% C.I.
## Predictor  estimate    lower    upper
##   Exposed2 1.000000       NA       NA
##   Exposed1 1.222222 1.008509 1.481224
## 
## $p.value
##           two-sided
## Predictor  midp.exact fisher.exact chi.square
##   Exposed2         NA           NA         NA
##   Exposed1 0.03942366   0.04233379 0.03993182
## 
## $correction
## [1] FALSE
## 
## attr(,"method")
## [1] "Unconditional MLE & normal approximation (Wald) CI"

oddsratio.wald(Table2, rev = c("both"))

## $data
##           Outcome
## Predictor  Disease2 Disease1 Total
##   Exposed2     1500      150  1650
##   Exposed1     2000      250  2250
##   Total        3500      400  3900
## 
## $measure
##           odds ratio with 95% C.I.
## Predictor  estimate    lower    upper
##   Exposed2     1.00       NA       NA
##   Exposed1     1.25 1.009986 1.547051
## 
## $p.value
##           two-sided
## Predictor  midp.exact fisher.exact chi.square
##   Exposed2         NA           NA         NA
##   Exposed1 0.03942366   0.04233379 0.03993182
## 
## $correction
## [1] FALSE
## 
## attr(,"method")
## [1] "Unconditional MLE & normal approximation (Wald) CI"

Subjects with High Cholesterol had a 22% increase in the risk of Death compared to subjects with Low Cholesterol (RR = 1.22; 95% CI: 1.01, 1.48; P=0.040).

Subjects with High Cholesterol had a 25% increase in odds of Death compared to subjects with Low Cholesterol (OR = 1.25; 95% CI: 1.01, 1.55; P=0.040).

In both these measures, the association was significant.

Confounding

Let’s suppose we are interested in seeing whether Exercise is a confounder on the Cholesterol Status to Death pathway.

DAG diagram with Exercise as a confounder.

To check if there is confounding, we need to determine if the confounder meets three criteria:

1. The confounding variable is associated with the outcome
2. The confounding variable is associated with the exposure
3. The confounding variable is not on the causal pathway between the exposure and outcome

For the first criterion, we look at the association between Exercise and Death.

### Criterion 1: The confounding variable is associated with the outcome (Is exercise associated with death?)
Table3<- matrix(c(200, 200, 2400, 1100), nrow = 2, ncol= 2)
Table3

##      [,1] [,2]
## [1,]  200 2400
## [2,]  200 1100

riskratio.wald(Table3, rev = c("both"))

## $data
##           Outcome
## Predictor  Disease2 Disease1 Total
##   Exposed2     1100      200  1300
##   Exposed1     2400      200  2600
##   Total        3500      400  3900
## 
## $measure
##           risk ratio with 95% C.I.
## Predictor  estimate     lower     upper
##   Exposed2      1.0        NA        NA
##   Exposed1      0.5 0.4158255 0.6012138
## 
## $p.value
##           two-sided
## Predictor    midp.exact fisher.exact   chi.square
##   Exposed2           NA           NA           NA
##   Exposed1 3.715916e-13  4.52674e-13 8.380705e-14
## 
## $correction
## [1] FALSE
## 
## attr(,"method")
## [1] "Unconditional MLE & normal approximation (Wald) CI"

oddsratio.wald(Table3, rev = c("both"))

## $data
##           Outcome
## Predictor  Disease2 Disease1 Total
##   Exposed2     1100      200  1300
##   Exposed1     2400      200  2600
##   Total        3500      400  3900
## 
## $measure
##           odds ratio with 95% C.I.
## Predictor   estimate     lower     upper
##   Exposed2 1.0000000        NA        NA
##   Exposed1 0.4583333 0.3720441 0.5646359
## 
## $p.value
##           two-sided
## Predictor    midp.exact fisher.exact   chi.square
##   Exposed2           NA           NA           NA
##   Exposed1 3.715916e-13  4.52674e-13 8.380705e-14
## 
## $correction
## [1] FALSE
## 
## attr(,"method")
## [1] "Unconditional MLE & normal approximation (Wald) CI"

First criterion: Confounder is associated with the outcome. In this example, Exercise is associated with Death.

Subjects who exercise had a lower risk (RR = 0.50; 95% CI: 0.42, 0.60) and odds (OR = 0.46; 95% CI: 0.37, 0.56) of Death compared to subjects who did not exercise. Hence, this satisfies the first criterion for confounding. The results are summarized into a table below.

For the second criterion, we look at the association between Exercise and Cholesterol Status (High v. Low).

### Criterion 2: The confounding variable (Exercise) is associated with the Cholesterol Status (Is exercise associated with cholesterol?)
Table4<- matrix(c(1750, 500, 850, 800), nrow = 2, ncol= 2)
Table4

##      [,1] [,2]
## [1,] 1750  850
## [2,]  500  800

riskratio.wald(Table4, rev = c("both"))

## $data
##           Outcome
## Predictor  Disease2 Disease1 Total
##   Exposed2      800      500  1300
##   Exposed1      850     1750  2600
##   Total        1650     2250  3900
## 
## $measure
##           risk ratio with 95% C.I.
## Predictor  estimate   lower    upper
##   Exposed2     1.00      NA       NA
##   Exposed1     1.75 1.62551 1.884024
## 
## $p.value
##           two-sided
## Predictor  midp.exact fisher.exact   chi.square
##   Exposed2         NA           NA           NA
##   Exposed1          0 5.385362e-66 3.221793e-66
## 
## $correction
## [1] FALSE
## 
## attr(,"method")
## [1] "Unconditional MLE & normal approximation (Wald) CI"

oddsratio.wald(Table4, rev = c("both"))

## $data
##           Outcome
## Predictor  Disease2 Disease1 Total
##   Exposed2      800      500  1300
##   Exposed1      850     1750  2600
##   Total        1650     2250  3900
## 
## $measure
##           odds ratio with 95% C.I.
## Predictor  estimate    lower   upper
##   Exposed2 1.000000       NA      NA
##   Exposed1 3.294118 2.867891 3.78369
## 
## $p.value
##           two-sided
## Predictor  midp.exact fisher.exact   chi.square
##   Exposed2         NA           NA           NA
##   Exposed1          0 5.385362e-66 3.221793e-66
## 
## $correction
## [1] FALSE
## 
## attr(,"method")
## [1] "Unconditional MLE & normal approximation (Wald) CI"

Second criterion: Confounder is associated with the exposure. In this example, Exercise is associated with Cholesterol Status (High and Low).

Third criterion: Confounder is not on the causal pathway between the exposure and outcome.

Subjects who exercise had a higher risk (RR = 1.75; 95% CI: 1.63, 1.88) and odds (OR = 3.29; 95% CI: 2.87, 3.78) of having High Cholesterol compared to subjects who do not exercise. Hence, this satisfies the second criterion for confounding. The results are summarized into a table below.

Since Exercise is not on the causal pathway between Cholesterol Status and Death, it meets the necessary criteria to be considered a confounder.

Stratifying your cohort is a great way to see how different the results can be. In this tutorial, we focus on stratifying on a couple of variables, but you can stratify on many variables as long as you have a large enough sample.

We can check to see how much of an impact Exercise has on the exposure to outcome relationship using stratification. We do this by stratifying the groups into those who exercise and don’t exercise. Then we evaluate the exposure to outcome relationship.

Here is an illustration of how exercise is stratified into two strata (Exercise = 1 and Exercise = 0).

The association between Cholesterol Status and Death can be estimated for each stratum.

### Association between Cholesterol Status and Death, stratified by Exercise
# Among subjects who exercise (N=2600); note that this overwrites the earlier Table3
Table3 <- matrix(c(150, 50, 1600, 800), nrow = 2, ncol = 2)
Table3

##      [,1] [,2]
## [1,]  150 1600
## [2,]   50  800

riskratio.wald(Table3, rev = c("both"))

## $data
##           Outcome
## Predictor  Disease2 Disease1 Total
##   Exposed2      800       50   850
##   Exposed1     1600      150  1750
##   Total        2400      200  2600
## 
## $measure
##           risk ratio with 95% C.I.
## Predictor  estimate    lower    upper
##   Exposed2 1.000000       NA       NA
##   Exposed1 1.457143 1.069385 1.985501
## 
## $p.value
##           two-sided
## Predictor  midp.exact fisher.exact chi.square
##   Exposed2         NA           NA         NA
##   Exposed1 0.01429813   0.01507732 0.01578802
## 
## $correction
## [1] FALSE
## 
## attr(,"method")
## [1] "Unconditional MLE & normal approximation (Wald) CI"

oddsratio.wald(Table3, rev = c("both"))

## $data
##           Outcome
## Predictor  Disease2 Disease1 Total
##   Exposed2      800       50   850
##   Exposed1     1600      150  1750
##   Total        2400      200  2600
## 
## $measure
##           odds ratio with 95% C.I.
## Predictor  estimate    lower    upper
##   Exposed2      1.0       NA       NA
##   Exposed1      1.5 1.077177 2.088794
## 
## $p.value
##           two-sided
## Predictor  midp.exact fisher.exact chi.square
##   Exposed2         NA           NA         NA
##   Exposed1 0.01429813   0.01507732 0.01578802
## 
## $correction
## [1] FALSE
## 
## attr(,"method")
## [1] "Unconditional MLE & normal approximation (Wald) CI"

# Among subjects who do not exercise (N=1300)
Table4 <- matrix(c(100, 100, 400, 700), nrow = 2, ncol = 2)
Table4

##      [,1] [,2]
## [1,]  100  400
## [2,]  100  700

riskratio.wald(Table4, rev = c("both"))

## $data
##           Outcome
## Predictor  Disease2 Disease1 Total
##   Exposed2      700      100   800
##   Exposed1      400      100   500
##   Total        1100      200  1300
## 
## $measure
##           risk ratio with 95% C.I.
## Predictor  estimate    lower    upper
##   Exposed2      1.0       NA       NA
##   Exposed1      1.6 1.241526 2.061978
## 
## $p.value
##           two-sided
## Predictor    midp.exact fisher.exact   chi.square
##   Exposed2           NA           NA           NA
##   Exposed1 0.0003199078 0.0003604172 0.0002660503
## 
## $correction
## [1] FALSE
## 
## attr(,"method")
## [1] "Unconditional MLE & normal approximation (Wald) CI"

oddsratio.wald(Table4, rev = c("both"))

## $data
##           Outcome
## Predictor  Disease2 Disease1 Total
##   Exposed2      700      100   800
##   Exposed1      400      100   500
##   Total        1100      200  1300
## 
## $measure
##           odds ratio with 95% C.I.
## Predictor  estimate   lower    upper
##   Exposed2     1.00      NA       NA
##   Exposed1     1.75 1.29231 2.369787
## 
## $p.value
##           two-sided
## Predictor    midp.exact fisher.exact   chi.square
##   Exposed2           NA           NA           NA
##   Exposed1 0.0003199078 0.0003604172 0.0002660503
## 
## $correction
## [1] FALSE
## 
## attr(,"method")
## [1] "Unconditional MLE & normal approximation (Wald) CI"

Among subjects who exercise, those with High Cholesterol had a higher risk (RR = 1.46; 95% CI: 1.07, 1.99) and odds (OR = 1.50; 95% CI: 1.08, 2.09) of Death compared to those with Low Cholesterol. Similarly, among subjects who did not exercise, those with High Cholesterol had a higher risk (RR = 1.60; 95% CI: 1.24, 2.06) and odds (OR = 1.75; 95% CI: 1.29, 2.37) of Death compared to those with Low Cholesterol.
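A quick sanity check on the stratification is to confirm that the two strata add back up to the crude table from the motivating example. The sketch below re-enters the counts under new, illustrative object names so that it does not depend on which version of `Table3` or `Table4` is currently in the workspace.

```r
# Stratum-specific tables (rows: High / Low Cholesterol; columns: Death / No Death)
exercise_stratum    <- matrix(c(150, 50, 1600, 800), nrow = 2, ncol = 2)
no_exercise_stratum <- matrix(c(100, 100, 400, 700), nrow = 2, ncol = 2)

# Crude table from the motivating example
crude_table <- matrix(c(250, 150, 2000, 1500), nrow = 2, ncol = 2)

# Cell-by-cell sum of the strata should reproduce the crude table
strata_sum <- exercise_stratum + no_exercise_stratum
all(strata_sum == crude_table)
```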

Adjusting for Confounding

Compared to the crude analysis where the RR = 1.22 and the OR = 1.25, the stratified results are higher. This suggests that Exercise has some confounding effect on the exposure to outcome relationship. When we stratify the groups, we get a stronger measure of association between Cholesterol Status and Death. However, we would like to have a single measure of this association, which means that we need to combine these two stratified results. A common method for combining stratified results into a single measure of association is the Mantel-Haenszel (M-H) method of adjustment.

These equations are for the point estimates (RR or OR), but not the 95% CI. If you need to estimate the 95% CI, it’s recommended that you use the epi.2by2() function, which is part of the epiR package.

The equation for the M-H adjusted risk ratio:

\(\begin{aligned} \LARGE RR_{adjusted} = \frac{\sum{\frac{a_{i}(c_{i} + d_{i})}{n_{i}}}}{\sum{\frac{c_{i}(a_{i} + b_{i})}{n_{i}}}} \end{aligned}\)

The equation for the M-H adjusted odds ratio:

\(\begin{aligned} \LARGE OR_{adjusted} = \frac{\sum{\frac{a_{i}d_{i}}{n_{i}}}}{\sum{\frac{c_{i}b_{i}}{n_{i}}}} \end{aligned}\)
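The two equations above can be evaluated directly in base R. This is a sketch that assumes the cell layout used throughout the tutorial: within each stratum, a and b are the exposed subjects with and without the outcome, and c and d are the unexposed subjects with and without the outcome. The object names are mine.

```r
# Two strata stacked into a 2 x 2 x 2 array (exercise, then no exercise)
strata <- array(c(150, 50, 1600, 800,
                  100, 100, 400, 700), dim = c(2, 2, 2))

cell_a <- strata[1, 1, ]   # exposed, outcome
cell_b <- strata[1, 2, ]   # exposed, no outcome
cell_c <- strata[2, 1, ]   # unexposed, outcome
cell_d <- strata[2, 2, ]   # unexposed, no outcome
n_i    <- cell_a + cell_b + cell_c + cell_d

# Mantel-Haenszel point estimates, term by term as in the equations above
rr_mh <- sum(cell_a * (cell_c + cell_d) / n_i) /
         sum(cell_c * (cell_a + cell_b) / n_i)
or_mh <- sum(cell_a * cell_d / n_i) /
         sum(cell_c * cell_b / n_i)

round(c(RR_MH = rr_mh, OR_MH = or_mh), 3)
```

These hand calculations give point estimates only, with no confidence intervals.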

Fortunately, R has an easier way to estimate the M-H adjusted RR and OR.

You need to install the epiR package, which contains the epi.2by2() function, which will generate the M-H adjusted RR and OR.

## install.packages("epiR")
library("epiR")

The epiR documentation is available in the package's reference manual on CRAN.

We need to create an array with our stratified matrices. We already created two tables (Table3 and Table4), which contain the stratified groups. Table3 contains the stratum where subjects were classified as having exercised (Exercise = 1), and Table4 contains the stratum where subjects were classified as not having exercised (Exercise = 0).

matrix.array <- array(c(Table3, Table4), dim = c(2, 2, 2))
matrix.array

## , , 1
## 
##      [,1] [,2]
## [1,]  150 1600
## [2,]   50  800
## 
## , , 2
## 
##      [,1] [,2]
## [1,]  100  400
## [2,]  100  700

Once we have our array, we can use the epi.2by2() function. This will generate the M-H adjusted RR and OR.

The output contains a lot of information. We are interested in the risk ratio (M-H) and the odds ratio (M-H).

epi.2by2(matrix.array)

##              Outcome +    Outcome -      Total        Inc risk *        Odds
## Exposed +          250         2000       2250             11.11       0.125
## Exposed -          150         1500       1650              9.09       0.100
## Total              400         3500       3900             10.26       0.114
## 
## 
## Point estimates and 95% CIs:
## -------------------------------------------------------------------
## Inc risk ratio (crude)                         1.22 (1.01, 1.48)
## Inc risk ratio (M-H)                           1.53 (1.26, 1.87)
## Inc risk ratio (crude:M-H)                     0.80
## Odds ratio (crude)                             1.25 (1.01, 1.55)
## Odds ratio (M-H)                               1.62 (1.30, 2.03)
## Odds ratio (crude:M-H)                         0.77
## Attrib risk in the exposed (crude) *           2.02 (0.12, 3.92)
## Attrib risk in the exposed (M-H) *             4.37 (2.03, 6.71)
## Attrib risk (crude:M-H)                        0.46
## -------------------------------------------------------------------
##  M-H test of homogeneity of PRs: chi2(1) = 0.210 Pr>chi2 = 0.647
##  M-H test of homogeneity of ORs: chi2(1) = 0.490 Pr>chi2 = 0.484
##  Test that M-H adjusted OR = 1:  chi2(1) = 18.325 Pr>chi2 = <0.001
##  Wald confidence limits
##  M-H: Mantel-Haenszel; CI: confidence interval
##  * Outcomes per 100 population units

There are no hypothesis tests that can be performed to determine whether confounding exists. Hence, it is often necessary to adjust for confounders using the M-H adjustment method or multivariable regression models.

The M-H adjusted RR and OR are higher than the crude RR and OR. Additionally, the stratum-specific RR and OR for the two strata were similar to each other. This suggests that there was confounding by Exercise in the overall cohort.

We can estimate the magnitude of confounding by looking at the relative change between the crude and adjusted measures of associations:

Magnitude of confounding for RR = \(\LARGE \frac{RR_{crude} - RR_{adjusted}}{RR_{adjusted}}\)

The greater than 10% rule of thumb is not a recommended method to discern whether confounding is present. It is best to have a solid framework to identify potential confounders and use a study design to mitigate their impact on the causal relationship between the exposure and outcome.

As a general rule of thumb, if the magnitude of confounding is greater than 10% in absolute value, then we can conclude that the variable is a confounder.

rr_crude <- 1.22
rr_adjusted <- 1.53
rr_change <- (rr_crude - rr_adjusted) / rr_adjusted
rr_change

## [1] -0.2026144

Magnitude of confounding for OR = \(\LARGE \frac{OR_{crude} - OR_{adjusted}}{OR_{adjusted}}\)

or_crude <- 1.25
or_adjusted <- 1.62   # M-H odds ratio as reported by epi.2by2() above
or_change <- (or_crude - or_adjusted) / or_adjusted
or_change

## [1] -0.2283951

Since the magnitude of confounding is greater than 10% for the RR and OR, we can conclude that Exercise was a confounder of the Cholesterol Status and Death relationship. Additionally, we reported that subjects who exercised were more likely to have High Cholesterol, and that subjects who exercised had a lower risk (or odds) of death. The crude RR and OR underestimated the association between Cholesterol Status and Death because a large number of the people who exercised had High Cholesterol.
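The relative change can also be wrapped in a small helper and fed less heavily rounded inputs. In this sketch, `relative_change()` is a hypothetical helper of mine, the crude values are recomputed from the crude table, and the adjusted values (1.5333 and 1.625) are my own evaluation of the M-H equations on the two strata rather than numbers printed by epi.2by2().

```r
# Hypothetical helper: relative change of the crude estimate versus the adjusted one
relative_change <- function(crude, adjusted) {
  (crude - adjusted) / adjusted
}

# Crude estimates recomputed from the crude 2 x 2 table
rr_crude_full <- (250 / 2250) / (150 / 1650)
or_crude_full <- (250 * 1500) / (2000 * 150)

# M-H estimates carried to more decimals (assumed inputs, see lead-in)
rr_change_full <- relative_change(rr_crude_full, adjusted = 1.5333)
or_change_full <- relative_change(or_crude_full, adjusted = 1.625)

round(c(RR = rr_change_full, OR = or_change_full), 3)

# Does the absolute change exceed the 10% rule of thumb?
abs(c(RR = rr_change_full, OR = or_change_full)) > 0.10
```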

Effect Modification

Another type of issue that can impact the causal relationship between the exposure and outcome is an effect modifier. Effect modifiers occur when the third variable (e.g., Stress Level) modifies the exposure to outcome relationship. In other words, when you vary the Stress Level (High versus Low), the effect of the exposure to outcome pathway will change in different ways. See diagram below.

To assess effect modification, we stratify the groups into the different levels of the third variable (e.g., stress). Let’s assume that the stress variable has two levels (“High Stress” and “Low Stress”). We stratify the analysis based on these two levels of stress.

We estimate the stratified RR and OR using the R commands that we have been using thus far.

##########################################
# Interactions or Effect Modification
##########################################
### Among subjects with High Stress ### 
Table6 <- matrix(c(250, 75, 1500, 825), nrow = 2, ncol= 2)
Table6

##      [,1] [,2]
## [1,]  250 1500
## [2,]   75  825

riskratio.wald(Table6, rev = c("both"))

## $data
##           Outcome
## Predictor  Disease2 Disease1 Total
##   Exposed2      825       75   900
##   Exposed1     1500      250  1750
##   Total        2325      325  2650
## 
## $measure
##           risk ratio with 95% C.I.
## Predictor  estimate    lower    upper
##   Exposed2 1.000000       NA       NA
##   Exposed1 1.714286 1.341514 2.190641
## 
## $p.value
##           two-sided
## Predictor    midp.exact fisher.exact  chi.square
##   Exposed2           NA           NA          NA
##   Exposed1 5.766403e-06  6.30191e-06 9.69556e-06
## 
## $correction
## [1] FALSE
## 
## attr(,"method")
## [1] "Unconditional MLE & normal approximation (Wald) CI"

oddsratio.wald(Table6, rev = c("both"))

## $data
##           Outcome
## Predictor  Disease2 Disease1 Total
##   Exposed2      825       75   900
##   Exposed1     1500      250  1750
##   Total        2325      325  2650
## 
## $measure
##           odds ratio with 95% C.I.
## Predictor  estimate    lower    upper
##   Exposed2 1.000000       NA       NA
##   Exposed1 1.833333 1.397199 2.405607
## 
## $p.value
##           two-sided
## Predictor    midp.exact fisher.exact  chi.square
##   Exposed2           NA           NA          NA
##   Exposed1 5.766403e-06  6.30191e-06 9.69556e-06
## 
## $correction
## [1] FALSE
## 
## attr(,"method")
## [1] "Unconditional MLE & normal approximation (Wald) CI"

### Among subjects with Low Stress ### 
Table7<- matrix(c(25, 100, 425, 700), nrow = 2, ncol= 2)
Table7

##      [,1] [,2]
## [1,]   25  425
## [2,]  100  700

riskratio.wald(Table7, rev = c("both"))

## $data
##           Outcome
## Predictor  Disease2 Disease1 Total
##   Exposed2      700      100   800
##   Exposed1      425       25   450
##   Total        1125      125  1250
## 
## $measure
##           risk ratio with 95% C.I.
## Predictor   estimate    lower     upper
##   Exposed2 1.0000000       NA        NA
##   Exposed1 0.4444444 0.291213 0.6783037
## 
## $p.value
##           two-sided
## Predictor    midp.exact fisher.exact  chi.square
##   Exposed2           NA           NA          NA
##   Exposed1 4.861788e-05 5.207064e-05 8.55232e-05
## 
## $correction
## [1] FALSE
## 
## attr(,"method")
## [1] "Unconditional MLE & normal approximation (Wald) CI"

oddsratio.wald(Table7, rev = c("both"))

## $data
##           Outcome
## Predictor  Disease2 Disease1 Total
##   Exposed2      700      100   800
##   Exposed1      425       25   450
##   Total        1125      125  1250
## 
## $measure
##           odds ratio with 95% C.I.
## Predictor   estimate     lower    upper
##   Exposed2 1.0000000        NA       NA
##   Exposed1 0.4117647 0.2613655 0.648709
## 
## $p.value
##           two-sided
## Predictor    midp.exact fisher.exact  chi.square
##   Exposed2           NA           NA          NA
##   Exposed1 4.861788e-05 5.207064e-05 8.55232e-05
## 
## $correction
## [1] FALSE
## 
## attr(,"method")
## [1] "Unconditional MLE & normal approximation (Wald) CI"

After stratifying based on the Stress Level, we can see that the point estimates for the risk ratio and odds ratio differ between the strata. You can see the difference in the risk ratio for the stratum of subjects with High Stress versus the stratum of subjects with Low Stress (RR = 1.71 versus RR = 0.44) and in the odds ratio (OR = 1.83 versus OR = 0.41). Among subjects with High Stress, those with High Cholesterol had a higher risk (and odds) of mortality compared to those with Low Cholesterol (RR = 1.71; 95% CI: 1.34, 2.19 and OR = 1.83; 95% CI: 1.40, 2.41). However, among subjects with Low Stress, those with High Cholesterol had a lower risk (and odds) of mortality compared to those with Low Cholesterol (RR = 0.44; 95% CI: 0.29, 0.68 and OR = 0.41; 95% CI: 0.26, 0.65).

Effect modification is different from confounding because the direction of effect is not aligned across the strata. For example, there is a positive association between Cholesterol Level and Death for subjects who have High Stress, but this becomes a negative association among subjects who have Low Stress.

These stratified results by Stress Level (“High Stress” v. “Low Stress”) generated conflicting measures of associations for the Cholesterol Status and Death relationship. Because the variable Stress Level provides disparate findings at different levels of stress, it is an effect modifier.
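One way to put a number on how far apart the two strata are is to compare the stratum-specific log risk ratios with a large-sample z-test. This is a sketch of one common approach, not a step taken in the tutorial itself; the helper `log_rr()` is hypothetical and assumes each table has the exposed group in row 1 and the outcome in column 1.

```r
# Hypothetical helper: log(RR) and its Wald standard error from a 2 x 2 matrix
log_rr <- function(tab) {
  events_exposed   <- tab[1, 1]
  total_exposed    <- sum(tab[1, ])
  events_reference <- tab[2, 1]
  total_reference  <- sum(tab[2, ])
  c(est = log((events_exposed / total_exposed) /
              (events_reference / total_reference)),
    se  = sqrt(1 / events_exposed - 1 / total_exposed +
               1 / events_reference - 1 / total_reference))
}

high_stress <- log_rr(matrix(c(250, 75, 1500, 825), nrow = 2, ncol = 2))
low_stress  <- log_rr(matrix(c(25, 100, 425, 700), nrow = 2, ncol = 2))

# z-test for the difference between the two log risk ratios
z_stat <- (high_stress[["est"]] - low_stress[["est"]]) /
          sqrt(high_stress[["se"]]^2 + low_stress[["se"]]^2)
p_value <- 2 * pnorm(-abs(z_stat))

c(RR_high = exp(high_stress[["est"]]),
  RR_low  = exp(low_stress[["est"]]),
  z       = z_stat,
  p       = p_value)
```

A large z statistic here indicates that the stratum-specific risk ratios differ by more than sampling variability would explain, which is consistent with Stress Level acting as an effect modifier.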

R tutorial using epitools: Confounding and Effect Modification

Mark Bounthavong

9/26/2021 Updated on 01/28/2022
