Epidemiology 202: Homework 3 Template
Create data (Numbers are intentionally WRONG! Do not simply copy!)
# Aggregate cohort data: one row per Sex x IncomeLevel x Distance stratum
hw3.dat <-
    read.table(header = TRUE, text = "
Sex IncomeLevel Distance Cases PersonYears
Men Low <200m 35 904
Men Mid <200m 26 684
Men High <200m 31 1134
Men Low >=200m 145 3891
Men Mid >=200m 138 4238
Men High >=200m 280 9965
Women Low <200m 28 589
Women Mid <200m 20 295
Women High <200m 26 399
Women Low >=200m 93 1990
Women Mid >=200m 98 2096
Women High >=200m 151 3996
")
# Order income levels Low, Mid, High (the default order would be alphabetical)
hw3.dat$IncomeLevel <- factor(hw3.dat$IncomeLevel, levels = c("Low", "Mid", "High"))
hw3.dat
Crude table using plyr::ddply()
library(plyr)
I1 <- ddply(hw3.dat,
            "Distance",
            summarize,
            "Total Cases" = sum(Cases),
            "Total PY" = sum(PersonYears),
            "Incidence rate (per 1000PY)" = sum(Cases) / sum(PersonYears) * 1000
            )
print(I1, digits = 4)
  Distance Total Cases Total PY Incidence rate (per 1000PY)
1    <200m         166     4005                       41.45
2   >=200m         905    26176                       34.57
Measures of association and confidence intervals using epiR::epi.2by2()
library(epiR)
# Rows: <200m (exposed), then >=200m (unexposed); columns: cases, person-years
epi.2by2(as.matrix(I1[, 2:3]), method = "cohort.time", units = 1000)
             Disease +    Time at risk        Inc rate *
Exposed +          166            4005              41.4
Exposed -          905           26176              34.6
Total             1071           30181              35.5

Point estimates and 95 % CIs:
---------------------------------------------------------
Inc rate ratio                         1.2 (1.01, 1.42)
Attrib rate *                          6.87 (0.18, 13.57)
Attrib rate in population *            0.91 (-2.18, 4.01)
Attrib fraction in exposed (%)         16.59 (0.97, 29.38)
Attrib fraction in population (%)      2.57 (2.06, 3.08)
---------------------------------------------------------
 * Cases per 1000 units of population time at risk
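As a cross-check on the epi.2by2() output, the rate ratio and the attributable rate can be recomputed by hand from the crude totals. The following is a minimal sketch assuming large-sample (Wald) intervals. The variable names are illustrative, and epi.2by2() may use a different interval method for the rate ratio, so its limits can differ slightly from the ones computed here.

```r
# Crude totals from the table above: exposed (<200m) first, unexposed (>=200m) second
cases <- c(exposed = 166, unexposed = 905)
py    <- c(exposed = 4005, unexposed = 26176)

rate <- cases / py       # incidence rates per person-year
z    <- qnorm(0.975)     # critical value for a two-sided 95% interval

# Incidence rate ratio with a Wald interval on the log scale (assumed method)
irr    <- unname(rate["exposed"] / rate["unexposed"])
se.log <- sqrt(sum(1 / cases))
irr.ci <- exp(log(irr) + c(-1, 1) * z * se.log)

# Attributable rate (rate difference) per 1000 person-years with a Wald interval
ard    <- unname(rate["exposed"] - rate["unexposed"]) * 1000
se.ard <- sqrt(sum(cases / py^2)) * 1000
ard.ci <- ard + c(-1, 1) * z * se.ard

round(c(IRR = irr, lower = irr.ci[1], upper = irr.ci[2]), 2)
round(c(AR = ard, lower = ard.ci[1], upper = ard.ci[2]), 2)
```

With these template numbers the attributable rate and its interval should agree with the "Attrib rate" line above; the Wald limits for the rate ratio are close to, but not necessarily identical to, the printed ones.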
Rates by income level in the unexposed (>=200m) group
II1 <- ddply(hw3.dat,
             c("Distance", "IncomeLevel"),
             summarize,
             "Total Cases" = sum(Cases),
             "Total PY" = sum(PersonYears),
             "Incidence rate (per 1000PY)" = sum(Cases) / sum(PersonYears) * 1000
             )
print(II1[II1$Distance == ">=200m",], digits = 4)
  Distance IncomeLevel Total Cases Total PY Incidence rate (per 1000PY)
4   >=200m         Low         238     5881                       40.47
5   >=200m         Mid         236     6334                       37.26
6   >=200m        High         431    13961                       30.87
Person-years by distance and income level using gmodels::CrossTable()
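# Sum person-years within each Distance x IncomeLevel cell
tab.ii.c <- xtabs(PersonYears ~ Distance + IncomeLevel, data = hw3.dat)
library(gmodels)
# Show column proportions only (row, table, and chi-square proportions off)
CrossTable(tab.ii.c, prop.r = FALSE, prop.t = FALSE, prop.chisq = FALSE, digits = 2)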
   Cell Contents
|-------------------------|
|                       N |
|           N / Col Total |
|-------------------------|

Total Observations in Table:  30181

             | IncomeLevel
    Distance |       Low |       Mid |      High | Row Total |
-------------|-----------|-----------|-----------|-----------|
       <200m |      1493 |       979 |      1533 |      4005 |
             |      0.20 |      0.13 |      0.10 |           |
-------------|-----------|-----------|-----------|-----------|
      >=200m |      5881 |      6334 |     13961 |     26176 |
             |      0.80 |      0.87 |      0.90 |           |
-------------|-----------|-----------|-----------|-----------|
Column Total |      7374 |      7313 |     15494 |     30181 |
             |      0.24 |      0.24 |      0.51 |           |
-------------|-----------|-----------|-----------|-----------|
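The "N / Col Total" rows in the CrossTable() output are column proportions: the share of each income level's person-years that falls in each distance category. As a minimal base R sketch of the same calculation, assuming the person-year totals printed above are re-entered by hand (the object names py.tab and col.prop are illustrative):

```r
# Person-years by distance (rows) and income level (columns), copied from the table above
py.tab <- matrix(c(1493,  979,  1533,
                   5881, 6334, 13961),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(Distance = c("<200m", ">=200m"),
                                 IncomeLevel = c("Low", "Mid", "High")))

# margin = 2 divides each cell by its column total
col.prop <- prop.table(py.tab, margin = 2)
round(col.prop, 2)
```

Each column sums to 1, so the first row can be read directly as the proportion of person-years within 200m at each income level.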
Rates by sex, income level, and distance
by(hw3.dat, list(hw3.dat$Sex, hw3.dat$IncomeLevel), FUN = function(dat) {
    ddply(dat,
          "Distance",
          summarize,
          "Total Cases" = sum(Cases),
          "Total PY" = sum(PersonYears),
          "Incidence rate (per 1000PY)" = sum(Cases) / sum(PersonYears) * 1000
          )
})
: Men
: Low
  Distance Total Cases Total PY Incidence rate (per 1000PY)
1    <200m          35      904                      38.717
2   >=200m         145     3891                      37.265
---------------------------------------------------------------------------------------------
: Women
: Low
  Distance Total Cases Total PY Incidence rate (per 1000PY)
1    <200m          28      589                      47.538
2   >=200m          93     1990                      46.734
---------------------------------------------------------------------------------------------
: Men
: Mid
  Distance Total Cases Total PY Incidence rate (per 1000PY)
1    <200m          26      684                      38.012
2   >=200m         138     4238                      32.563
---------------------------------------------------------------------------------------------
: Women
: Mid
  Distance Total Cases Total PY Incidence rate (per 1000PY)
1    <200m          20      295                      67.797
2   >=200m          98     2096                      46.756
---------------------------------------------------------------------------------------------
: Men
: High
  Distance Total Cases Total PY Incidence rate (per 1000PY)
1    <200m          31     1134                      27.337
2   >=200m         280     9965                      28.098
---------------------------------------------------------------------------------------------
: Women
: High
  Distance Total Cases Total PY Incidence rate (per 1000PY)
1    <200m          26      399                      65.163
2   >=200m         151     3996                      37.788
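To compare strata at a glance, the by() output can be condensed into one row per sex and income stratum, with the rate ratio for <200m versus >=200m. This is a sketch in base R and is not part of the original template. The data are the same intentionally wrong template numbers, re-entered so the block runs on its own, and the names strata.dat, near, far, strata, and RateRatio are illustrative.

```r
# Same (intentionally wrong) template numbers as hw3.dat, re-entered for a standalone run
strata.dat <- read.table(header = TRUE, text = "
Sex IncomeLevel Distance Cases PersonYears
Men Low <200m 35 904
Men Mid <200m 26 684
Men High <200m 31 1134
Men Low >=200m 145 3891
Men Mid >=200m 138 4238
Men High >=200m 280 9965
Women Low <200m 28 589
Women Mid <200m 20 295
Women High <200m 26 399
Women Low >=200m 93 1990
Women Mid >=200m 98 2096
Women High >=200m 151 3996
")

# Incidence rate per 1000 person-years in every row
strata.dat$Rate <- strata.dat$Cases / strata.dat$PersonYears * 1000

# Split by exposure and put the two rates side by side within each stratum
near <- strata.dat[strata.dat$Distance == "<200m", ]
far  <- strata.dat[strata.dat$Distance == ">=200m", ]
strata <- merge(near, far, by = c("Sex", "IncomeLevel"),
                suffixes = c(".near", ".far"))

# Stratum-specific rate ratio: <200m relative to >=200m
strata$RateRatio <- strata$Rate.near / strata$Rate.far
print(strata[, c("Sex", "IncomeLevel", "Rate.near", "Rate.far", "RateRatio")],
      digits = 4)
```

The Rate.near and Rate.far columns should match the incidence rates printed by by() above, which gives a quick consistency check before interpreting the ratios.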