Epidemiology 201: Homework 1
## Settings for RMarkdown http://yihui.name/knitr/options#chunk_options
opts_chunk$set(comment = "", warning = FALSE, message = FALSE, tidy = FALSE,
echo = TRUE, fig.width = 5, fig.height = 5)
options(width = 116, scipen = 10)
setwd("~/statistics/epi201/")
Question 1
a. What was the prevalence of HIV diagnosis on January 1st?
1300 /11000
[1] 0.1182
b. What was the prevalence of HIV diagnosis on October 1st?
(1300 + 21 + 37 - 300) / (11000 - 175 - 300 - 7)
[1] 0.1006
c. What was the 5-month risk (incidence proportion) of HIV diagnosis from January 1st to May 31st?
Number of new (incident cases) / at-risk population (non disased at Jan 1st)
(21 + 37) / (11000 - 1300)
[1] 0.005979
d. What was the incidence rate of HIV diagnosis during the entire year?
At risk population at the beginning is 11000 - 1300.
Among them, 21 newly infected individuals contributed 1 month each to total person-time before being infected.
Among them, 37 newly infected individuals contributed 4 months each to total person-time before being infected.
Among them, 175 healthy individuals contributed 5 months each to total person-time before moving away.
Among them, 7 healthy individuals contributed 8 months each to total person time before dying in traffic accidents.
(500 - 57) healthy individuals joined in November, and contributed 2 months each.
The other individuals (11000 - 1300 - 21 - 37 - 175 - 7) = 9460 contributed 12 months each.
During this particular year, (21 + 37) individuals were newly infected.
(21 + 37) / (21 * 1 + 37 * 4 + 175 * 5 + 7 * 8 + (500 - 57) * 2 + 9460 * 12)
[1] 0.0005021
per person-month.
Question 2
a. At the start of the match, which player had the higher expected risk of losing?
Federer. Higher odds means less likely to win, more likely to lose, thus higher expected risk of losing.
b. At what approximate times in the match were the expected risks of losing equal for both players?
9:28, 9:44, 10:18, 11:02, 11:30, 11:40
c. At 10:04am, what was the approximate risk of losing for Nadal? And for Federer?
10 to 1 odds at this point with Nadal having a higher probability of losing.
The probability is 1/11 = 0.0909 for Federer, and 10/11 = 0.9091 for Nadal.
d. At 10:29am, what was the approximate risk of losing for Nadal? And for Federer?
5 to 1 odds at this point with Federer having a higher probability of losing.
The probability is 1/6 = 0.1667 for Nadal, and 5/6 = 0.8333 for Federer.
e. Plot the approximate risk of losing over time for both players (Note: this plot will be necessarily approximate because the figure does not allow for great accuracy.)
Omitted.
Question 2
q2 <- read.table(header = TRUE, text = "
'Age range x' 'Survivors at start of x' 'Deaths during x' 'Person-years during x' 'Person years after start of x'
'<1' 100000 590 99469 8044881
'1-4' 99410 96 397408 7945412
'5-9' 99314 64 496408 7548004
'10-14' 99249 70 496072 7051596
'15-19' 99180 180 495448 6555524
'20-24' 98999 233 494415 6060076
'25-29' 98766 262 493177 5565661
'30-34' 98504 355 491633 5072485
'35-39' 98149 526 489430 4580852
'40-44' 97623 838 486020 4091421
'45-49' 96785 1259 480778 3605401
'50-54' 95526 1790 473158 3124623
'55-59' 93737 2541 462332 2651465
'60-64' 91196 3932 446151 2189134
'65-69' 87264 5706 422056 1742983
'70-74' 81558 8280 387092 1320926
'75-79' 73279 11535 337556 933834
'80-84' 61744 15508 269948 596279
'85-89' 46235 17883 186470 326331
'90-94' 28352 15259 95984 139861
'95-99' 13093 8942 34170 43877
'100+' 4152 4152 9707 9707")
names(q2) <- c("age.range.x", "n.at.risk", "deaths.during.x", "PY.during.x", "PY.thereafter")
q2
age.range.x n.at.risk deaths.during.x PY.during.x PY.thereafter
1 <1 100000 590 99469 8044881
2 1-4 99410 96 397408 7945412
3 5-9 99314 64 496408 7548004
4 10-14 99249 70 496072 7051596
5 15-19 99180 180 495448 6555524
6 20-24 98999 233 494415 6060076
7 25-29 98766 262 493177 5565661
8 30-34 98504 355 491633 5072485
9 35-39 98149 526 489430 4580852
10 40-44 97623 838 486020 4091421
11 45-49 96785 1259 480778 3605401
12 50-54 95526 1790 473158 3124623
13 55-59 93737 2541 462332 2651465
14 60-64 91196 3932 446151 2189134
15 65-69 87264 5706 422056 1742983
16 70-74 81558 8280 387092 1320926
17 75-79 73279 11535 337556 933834
18 80-84 61744 15508 269948 596279
19 85-89 46235 17883 186470 326331
20 90-94 28352 15259 95984 139861
21 95-99 13093 8942 34170 43877
22 100+ 4152 4152 9707 9707
a.Incidence proportion during each age interval
q2 <- within(q2, {
risk <- deaths.during.x / n.at.risk
})
b.Incidence rate during each age interval
q2 <- within(q2, {
rate <- deaths.during.x / PY.during.x
})
c.Incidence proportion between age 0 and the end of each age interval
q2 <- within(q2, {
risk.from.0.to.x <- cumsum(deaths.during.x) / 100000
})
d.Life expectancy at the start of each age interval (i.e., average number of years remaining to live)
q2 <- within(q2, {
life.exp <- PY.thereafter / n.at.risk
})
q2
age.range.x n.at.risk deaths.during.x PY.during.x PY.thereafter risk rate risk.from.0.to.x life.exp
1 <1 100000 590 99469 8044881 0.0059000 0.0059315 0.00590 80.449
2 1-4 99410 96 397408 7945412 0.0009657 0.0002416 0.00686 79.926
3 5-9 99314 64 496408 7548004 0.0006444 0.0001289 0.00750 76.001
4 10-14 99249 70 496072 7051596 0.0007053 0.0001411 0.00820 71.050
5 15-19 99180 180 495448 6555524 0.0018149 0.0003633 0.01000 66.097
6 20-24 98999 233 494415 6060076 0.0023536 0.0004713 0.01233 61.214
7 25-29 98766 262 493177 5565661 0.0026527 0.0005312 0.01495 56.352
8 30-34 98504 355 491633 5072485 0.0036039 0.0007221 0.01850 51.495
9 35-39 98149 526 489430 4580852 0.0053592 0.0010747 0.02376 46.672
10 40-44 97623 838 486020 4091421 0.0085840 0.0017242 0.03214 41.910
11 45-49 96785 1259 480778 3605401 0.0130082 0.0026187 0.04473 37.252
12 50-54 95526 1790 473158 3124623 0.0187384 0.0037831 0.06263 32.710
13 55-59 93737 2541 462332 2651465 0.0271078 0.0054961 0.08804 28.286
14 60-64 91196 3932 446151 2189134 0.0431159 0.0088132 0.12736 24.005
15 65-69 87264 5706 422056 1742983 0.0653878 0.0135195 0.18442 19.974
16 70-74 81558 8280 387092 1320926 0.1015228 0.0213903 0.26722 16.196
17 75-79 73279 11535 337556 933834 0.1574121 0.0341721 0.38257 12.744
18 80-84 61744 15508 269948 596279 0.2511661 0.0574481 0.53765 9.657
19 85-89 46235 17883 186470 326331 0.3867849 0.0959028 0.71648 7.058
20 90-94 28352 15259 95984 139861 0.5381984 0.1589744 0.86907 4.933
21 95-99 13093 8942 34170 43877 0.6829604 0.2616915 0.95849 3.351
22 100+ 4152 4152 9707 9707 1.0000000 0.4277326 1.00001 2.338
question 4
Omitted
question 5
Omitted