Epidemiology 202: Homework 3 Template

Create the data (the numbers are intentionally WRONG! Do not simply copy them!)

hw3.dat <-
    read.table(header = TRUE, text = "
Sex IncomeLevel Distance Cases PersonYears
Men    Low   <200m  35  904
Men    Mid   <200m  26  684
Men    High  <200m  31 1134
Men    Low  >=200m 145 3891
Men    Mid  >=200m 138 4238
Men    High >=200m 280 9965
Women  Low   <200m  28  589
Women  Mid   <200m  20  295
Women  High  <200m  26  399
Women  Low  >=200m  93 1990
Women  Mid  >=200m  98 2096
Women  High >=200m 151 3996
")

hw3.dat$IncomeLevel <- factor(hw3.dat$IncomeLevel, levels = c("Low", "Mid", "High"))
hw3.dat

Crude table using plyr::ddply()

library(plyr)

I1 <- ddply(hw3.dat,
      "Distance",
      summarize,
      "Total Cases"                 = sum(Cases),
      "Total PY"                    = sum(PersonYears),
      "Incidence rate (per 1000PY)" = sum(Cases) / sum(PersonYears) * 1000
)
print(I1, digits = 4)
  Distance Total Cases Total PY Incidence rate (per 1000PY)
1    <200m         166     4005                       41.45
2   >=200m         905    26176                       34.57
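
For readers who do not have plyr installed, the same crude table can be sketched in base R with aggregate(). This is only an illustration: the data frame hw3.crude and the column name RatePer1000PY are hypothetical names introduced here, and the values are copied from the (intentionally wrong) data above.

```r
# Illustrative copy of the distance, case, and person-year columns from the
# data above (hw3.crude is a hypothetical name, not part of the template).
hw3.crude <- data.frame(
    Distance    = rep(c("<200m", ">=200m"), each = 3, times = 2),
    Cases       = c(35, 26, 31, 145, 138, 280, 28, 20, 26, 93, 98, 151),
    PersonYears = c(904, 684, 1134, 3891, 4238, 9965, 589, 295, 399, 1990, 2096, 3996)
)

# Sum cases and person-years within each distance category
crude <- aggregate(cbind(Cases, PersonYears) ~ Distance, data = hw3.crude, FUN = sum)

# Incidence rate per 1000 person-years
crude$RatePer1000PY <- crude$Cases / crude$PersonYears * 1000
print(crude, digits = 4)
```

The totals and rates should agree with the ddply() output above; only the column names differ.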

Measures of association and confidence intervals using epiR::epi.2by2()

library(epiR)
epi.2by2(as.matrix(I1[,2:3]), method = "cohort.time", units = 1000)
             Disease +    Time at risk        Inc rate *
Exposed +          166            4005              41.4
Exposed -          905           26176              34.6
Total             1071           30181              35.5

Point estimates and 95 % CIs:
---------------------------------------------------------
Inc rate ratio                           1.2 (1.01, 1.42)
Attrib rate *                            6.87 (0.18, 13.57)
Attrib rate in population *              0.91 (-2.18, 4.01)
Attrib fraction in exposed (%)           16.59 (0.97, 29.38)
Attrib fraction in population (%)        2.57 (2.06, 3.08)
---------------------------------------------------------
 * Cases per 1000 units of population time at risk 
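
As a cross-check on the output above, the point estimates can be reproduced by hand from the crude totals. The sketch below uses a Wald interval on the log scale for the rate ratio, which is an assumption made here for illustration; epi.2by2() may use a different interval method, so the limits can differ slightly in the second decimal. All variable names are hypothetical.

```r
# Totals taken from the crude table (exposed = <200m, unexposed = >=200m)
cases.exposed   <- 166
py.exposed      <- 4005
cases.unexposed <- 905
py.unexposed    <- 26176

rate.exposed   <- cases.exposed / py.exposed
rate.unexposed <- cases.unexposed / py.unexposed

# Incidence rate ratio and its Wald 95% CI on the log scale
irr        <- rate.exposed / rate.unexposed
se.log.irr <- sqrt(1 / cases.exposed + 1 / cases.unexposed)
ci         <- exp(log(irr) + c(-1, 1) * qnorm(0.975) * se.log.irr)

# Attributable rate (per 1000 PY) and attributable fraction in the exposed (%)
ard <- (rate.exposed - rate.unexposed) * 1000
afe <- (irr - 1) / irr * 100

round(c(IRR = irr, lower = ci[1], upper = ci[2], AR = ard, AFe = afe), 2)
```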

Rates by income level in the unexposed group (>=200m)

II1 <- ddply(hw3.dat,
             c("Distance","IncomeLevel"),
             summarize,
             "Total Cases"                 = sum(Cases),
             "Total PY"                    = sum(PersonYears),
             "Incidence rate (per 1000PY)" = sum(Cases) / sum(PersonYears) * 1000
             )
print(II1[II1$Distance == ">=200m",], digits = 4)
  Distance IncomeLevel Total Cases Total PY Incidence rate (per 1000PY)
4   >=200m         Low         238     5881                       40.47
5   >=200m         Mid         236     6334                       37.26
6   >=200m        High         431    13961                       30.87

Person-years by distance and income level using gmodels::CrossTable()

tab.ii.c <- xtabs(PersonYears ~ Distance + IncomeLevel, data = hw3.dat)
library(gmodels)
CrossTable(tab.ii.c, prop.r = FALSE, prop.t = FALSE, prop.chisq = FALSE, digits = 2)


   Cell Contents
|-------------------------|
|                       N |
|           N / Col Total |
|-------------------------|


Total Observations in Table:  30181 


             | IncomeLevel 
    Distance |       Low |       Mid |      High | Row Total | 
-------------|-----------|-----------|-----------|-----------|
       <200m |      1493 |       979 |      1533 |      4005 | 
             |      0.20 |      0.13 |      0.10 |           | 
-------------|-----------|-----------|-----------|-----------|
      >=200m |      5881 |      6334 |     13961 |     26176 | 
             |      0.80 |      0.87 |      0.90 |           | 
-------------|-----------|-----------|-----------|-----------|
Column Total |      7374 |      7313 |     15494 |     30181 | 
             |      0.24 |      0.24 |      0.51 |           | 
-------------|-----------|-----------|-----------|-----------|
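
The column proportions in the table (N / Col Total) can also be obtained in base R with prop.table(). The sketch below rebuilds the person-year counts from the table above as a plain matrix; the object names py and col.prop are hypothetical.

```r
# Person-years by distance (rows) and income level (columns), copied from
# the CrossTable output above. matrix() fills column by column.
py <- matrix(c(1493, 5881, 979, 6334, 1533, 13961), nrow = 2,
             dimnames = list(Distance    = c("<200m", ">=200m"),
                             IncomeLevel = c("Low", "Mid", "High")))

# Share of each income level's person-years that falls in each distance group
col.prop <- prop.table(py, margin = 2)
round(col.prop, 2)
```

Applied to the xtabs object, prop.table(tab.ii.c, margin = 2) should give the same proportions.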


Rates by sex, income level, and distance

by(hw3.dat, list(hw3.dat$Sex, hw3.dat$IncomeLevel), FUN = function(dat) {
    ddply(dat,
          "Distance",
          summarize,
          "Total Cases"                 = sum(Cases),
          "Total PY"                    = sum(PersonYears),
          "Incidence rate (per 1000PY)" = sum(Cases) / sum(PersonYears) * 1000
          )
})
: Men
: Low
  Distance Total Cases Total PY Incidence rate (per 1000PY)
1    <200m          35      904                      38.717
2   >=200m         145     3891                      37.265
--------------------------------------------------------------------------------------------- 
: Women
: Low
  Distance Total Cases Total PY Incidence rate (per 1000PY)
1    <200m          28      589                      47.538
2   >=200m          93     1990                      46.734
--------------------------------------------------------------------------------------------- 
: Men
: Mid
  Distance Total Cases Total PY Incidence rate (per 1000PY)
1    <200m          26      684                      38.012
2   >=200m         138     4238                      32.563
--------------------------------------------------------------------------------------------- 
: Women
: Mid
  Distance Total Cases Total PY Incidence rate (per 1000PY)
1    <200m          20      295                      67.797
2   >=200m          98     2096                      46.756
--------------------------------------------------------------------------------------------- 
: Men
: High
  Distance Total Cases Total PY Incidence rate (per 1000PY)
1    <200m          31     1134                      27.337
2   >=200m         280     9965                      28.098
--------------------------------------------------------------------------------------------- 
: Women
: High
  Distance Total Cases Total PY Incidence rate (per 1000PY)
1    <200m          26      399                      65.163
2   >=200m         151     3996                      37.788
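
To compare the strata side by side, the stratum-specific rates above can be turned into rate ratios (<200m versus >=200m). The following is a minimal sketch in base R; hw3.strata, near, far, and irr.by.stratum are hypothetical names, and the values are copied from the (intentionally wrong) data above.

```r
# Illustrative copy of the data above, one row per sex/income/distance cell
hw3.strata <- data.frame(
    Sex         = rep(c("Men", "Women"), each = 6),
    IncomeLevel = rep(c("Low", "Mid", "High"), times = 4),
    Distance    = rep(c("<200m", ">=200m"), each = 3, times = 2),
    Cases       = c(35, 26, 31, 145, 138, 280, 28, 20, 26, 93, 98, 151),
    PersonYears = c(904, 684, 1134, 3891, 4238, 9965, 589, 295, 399, 1990, 2096, 3996)
)

# Incidence rate per 1000 person-years in each cell
hw3.strata$Rate <- hw3.strata$Cases / hw3.strata$PersonYears * 1000

# Split by exposure and join the two halves on the stratum variables
near <- subset(hw3.strata, Distance == "<200m")
far  <- subset(hw3.strata, Distance == ">=200m")
irr.by.stratum <- merge(near, far, by = c("Sex", "IncomeLevel"),
                        suffixes = c(".near", ".far"))

# Stratum-specific incidence rate ratio
irr.by.stratum$IRR <- irr.by.stratum$Rate.near / irr.by.stratum$Rate.far
print(irr.by.stratum[, c("Sex", "IncomeLevel", "Rate.near", "Rate.far", "IRR")],
      digits = 4)
```

Each row corresponds to one of the six sex-by-income strata printed by by() above, so the rate columns can be checked directly against that output.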