1.10 Exercises (Get started with R.)

Using R on our computer

(4 + 6)^3 - 2*500/4

## [1] 750

quantity <- c(34, 56, 22)
price <- c(19.95, 14.95, 10.99)
subtotal <- quantity*price 
cbind(quantity, price, subtotal)

##      quantity price subtotal
## [1,]       34 19.95   678.30
## [2,]       56 14.95   837.20
## [3,]       22 10.99   241.78

Useful R concepts

4*4

## [1] 16

x <- 4 * 4
x

## [1] 16

(x <- 4 * 4)

## [1] 16

x <- 4*4; x

## [1] 16

xx <- "hello, what's your name"; xx

## [1] "hello, what's your name"

aa <- bb <- 5
aa; bb

## [1] 5

## [1] 5

cc <- aa * bb; cc

## [1] 25

Useful R functions

ls()[1:4]

## [1] "aa"    "bb"    "cc"    "price"

search()

## [1] ".GlobalEnv"        "package:stats"     "package:graphics" 
## [4] "package:grDevices" "package:utils"     "package:datasets" 
## [7] "package:methods"   "Autoloads"         "package:base"

args(sample)

## function (x, size, replace = FALSE, prob = NULL) 
## NULL

dates <- 1:30
sample(dates, 16)

##  [1] 21 27 25  9  6  7 12 14 28  8 22  1 11 17 20 15

sample(size = 16, x = dates)

##  [1]  4 15 29  8 20 24  9 11 12 16  2  1 30 17 14 25

sample(s = 16, x = dates, r = TRUE)

##  [1]  8  1 11 21 30 24  3  6 26 19 12 10 28 13 12 14

sample(s = sample(1:36, 1), x = sample(1:10, 5), r=T)

##  [1] 10  2  7  7  9  9 10  8  8  9  2  2  8  8  9  2  8  2  8  8  9  2  2
## [24] 10  2  2  9  7
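
The last two calls abbreviate argument names (s for size, r for replace), which works because R matches argument names partially. A minimal sketch, assuming a fixed seed (the value 42 is an arbitrary choice, not part of the exercise), suggests the abbreviated and fully named calls are equivalent:

```r
dates <- 1:30

# Abbreviated argument names rely on R's partial matching.
set.seed(42)  # arbitrary seed, used only to make the draw repeatable
short_call <- sample(s = 16, x = dates, r = TRUE)

# The same call with full argument names and the same seed.
set.seed(42)
full_call <- sample(x = dates, size = 16, replace = TRUE)

# TRUE if both spellings produce the same draw
identical(short_call, full_call)
```

Spelling the names out in full is usually easier to read and avoids ambiguity if a function has several arguments that start with the same letters.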

Working with R objects

x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12); x

##  [1]  1  2  3  4  5  6  7  8  9 10 11 12

y <- matrix(x, nrow = 2); y

##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]    1    3    5    7    9   11
## [2,]    2    4    6    8   10   12

z <- array(x, dim = c(1, 6, 2)); z

## , , 1
## 
##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]    1    2    3    4    5    6
## 
## , , 2
## 
##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]    7    8    9   10   11   12

mylist <- list(x, y, z); mylist

## [[1]]
##  [1]  1  2  3  4  5  6  7  8  9 10 11 12
## 
## [[2]]
##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]    1    3    5    7    9   11
## [2,]    2    4    6    8   10   12
## 
## [[3]]
## , , 1
## 
##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]    1    2    3    4    5    6
## 
## , , 2
## 
##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]    7    8    9   10   11   12

smoker <- c("Y", "Y", "N", "N", "Y", "Y", "Y", "N", "Y") 
cancer <- c("Y", "N", "N", "Y", "Y", "Y", "N", "N", "Y") 
mydf <- data.frame(smoker, cancer);
mydf[1:4,]

##   smoker cancer
## 1      Y      Y
## 2      Y      N
## 3      N      N
## 4      N      Y

Graphical Models

xtabs(~smoker + cancer, data = mydf)

##       cancer
## smoker N Y
##      N 2 1
##      Y 2 4

Precision and number types

typeof(3)

## [1] "double"

typeof(3L)

## [1] "integer"

typeof(as.integer(3))

## [1] "integer"

typeof(4L/2L)

## [1] "double"

typeof(as.integer(4L/2L))

## [1] "integer"
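
The results above show that ordinary division returns a double even when both operands are integers. As a small additional sketch (not part of the original exercise), integer divide and multiplication keep the integer type:

```r
typeof(4L %/% 2L)   # integer divide on two integers stays "integer"
typeof(4L * 2L)     # multiplication of two integers stays "integer"
typeof(4L / 2L)     # ordinary division gives "double", for comparison
is.integer(3)       # FALSE: a bare 3 is stored as a double
```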

getwd()

## [1] "/Users/dr.auwal/Desktop/Personals/UC Berkeley/Classes/PB HLTH 251D Applied Epidemiology Using R/Homework/Homework 1"

Exercise 1.2 (Get to know your workspace.)

(a) What is the R workspace file on your operating system?

.RData

(b) What is the file path to your R workspace file?

getwd()

## [1] "/Users/dr.auwal/Desktop/Personals/UC Berkeley/Classes/PB HLTH 251D Applied Epidemiology Using R/Homework/Homework 1"

(c) What is the name of this workspace file?

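The workspace file is named .RData by default; it is saved in the working directory, which here is "Homework 1".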

(d) When you launched, which R packages loaded?

search()

## [1] ".GlobalEnv"        "package:stats"     "package:graphics" 
## [4] "package:grDevices" "package:utils"     "package:datasets" 
## [7] "package:methods"   "Autoloads"         "package:base"

(e) What are the file paths to the loaded R packages?

searchpaths()

## [1] ".GlobalEnv"                                                              
## [2] "/Library/Frameworks/R.framework/Versions/3.6/Resources/library/stats"    
## [3] "/Library/Frameworks/R.framework/Versions/3.6/Resources/library/graphics" 
## [4] "/Library/Frameworks/R.framework/Versions/3.6/Resources/library/grDevices"
## [5] "/Library/Frameworks/R.framework/Versions/3.6/Resources/library/utils"    
## [6] "/Library/Frameworks/R.framework/Versions/3.6/Resources/library/datasets" 
## [7] "/Library/Frameworks/R.framework/Versions/3.6/Resources/library/methods"  
## [8] "Autoloads"                                                               
## [9] "/Library/Frameworks/R.framework/Resources/library/base"

(f) List all the objects in the current workspace.

ls()

##  [1] "aa"       "bb"       "cancer"   "cc"       "dates"    "mydf"    
##  [7] "mylist"   "price"    "quantity" "smoker"   "subtotal" "x"       
## [13] "xx"       "y"        "z"
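
As a hedged sketch for parts (a) through (c): assuming the workspace is saved under R's default name, .RData, in the current working directory, its full path can be built from getwd(). The variable name workspace_file is introduced here for illustration only.

```r
# Build the path to the default workspace file in the working directory.
# Assumes the default file name ".RData"; a saved workspace may not exist yet.
workspace_file <- file.path(getwd(), ".RData")

basename(workspace_file)     # name of the workspace file
dirname(workspace_file)      # directory that would contain it
file.exists(workspace_file)  # TRUE only if a workspace has been saved there
```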

Exercise 1.3 (Get to know math operators.)

Using Table 1.1, explain in words, and use R to illustrate, the difference between modulus and integer divide.

Modulus (%%) returns the remainder left over after division, whereas integer divide (%/%) returns the whole-number quotient and discards the remainder.

Example of Integer Divide

10%/%3

## [1] 3

Example of Modulus

10%%3

## [1] 1
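
The two operators are linked: for a dividend a and a divisor b, a equals (a %/% b) * b + a %% b. The sketch below checks this identity and also tries a negative dividend, where R rounds the quotient toward negative infinity. The variable names are illustrative.

```r
a <- 10
b <- 3

quotient  <- a %/% b   # 3
remainder <- a %% b    # 1

# Quotient and remainder together reconstruct the dividend.
quotient * b + remainder == a   # TRUE

# With a negative dividend the quotient is rounded down, not toward zero,
# so the remainder takes the sign of the divisor.
(-10) %/% 3   # -4
(-10) %% 3    # 2
```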

Exercise 1.4 (Calculating body mass index.)

BMI is an indicator of total body fat, which is related to the risk of disease and death. The score is valid for both men and women, but it has some limitations: it may overestimate body fat in athletes and others who have a muscular build, and it may underestimate body fat in older persons and others who have lost muscle mass.

Calculate the BMI for a male with a weight of 155 lb and a height of 5 ft 7 in.

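155/2.2/((5 + 7/12)/3.3)^2

## [1] 24.61216

Here 5 ft 7 in is converted to feet as 5 + 7/12 (not 5.7) before dividing by roughly 3.3 ft per metre. The calculated BMI falls in the Normal category.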
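
The calculation can also be wrapped in a small helper. This is a sketch only: the function name bmi_from_imperial and its arguments are hypothetical, and it assumes the exact conversion factors 0.45359237 kg per pound and 0.0254 m per inch rather than rounded ones.

```r
# Hypothetical helper: BMI (kg/m^2) from weight in pounds and height in feet and inches.
bmi_from_imperial <- function(weight_lb, height_ft, height_in = 0) {
  weight_kg <- weight_lb * 0.45359237                   # pounds to kilograms
  height_m  <- (height_ft * 12 + height_in) * 0.0254    # total inches to metres
  weight_kg / height_m^2
}

bmi <- bmi_from_imperial(155, 5, 7)
bmi

# Conventional "Normal" range: 18.5 up to (but not including) 25
bmi >= 18.5 & bmi < 25
```

Results can differ slightly from hand calculations that use rounded factors such as 2.2 lb per kg.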

Exercise 1.5 (Using logarithms)

Implement the following R code and study the graph. What kind of generalizations can you make about the natural logarithm and its base, the number 𝑒?

curve(log(x), 0, 6)
abline(v = c(1, exp(1)), h = c(0, 1), lty = 2)

The natural logarithm is negative for x below 1, equals 0 at x = 1, and equals 1 at x = 𝑒. The curve falls steeply toward negative infinity as x approaches 0 and keeps growing without bound, but ever more slowly, as x increases.
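
The dashed reference lines in the plot mark two identities that can also be checked numerically. A minimal sketch:

```r
log(1)                      # 0: the curve crosses the horizontal axis at x = 1
all.equal(log(exp(1)), 1)   # TRUE: the log of e is 1
all.equal(exp(log(5)), 5)   # TRUE: exp() and log() undo each other
log(0)                      # -Inf: the limit as x approaches 0
```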

Exercise 1.6 (Risk and risk odds)

Use the following code to plot the odds

curve(x/(1-x), 0, 1)

Now, use the following code to plot the log(odds)

curve(log(x/(1-x)), 0, 1)

What kind of generalizations can you make about the log(odds) as a transformation of risk?

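The log(odds) maps risks between 0 and 1 onto the whole real line: it equals 0 when the risk (x) is 0.5, falls toward negative infinity as the risk approaches 0, and rises toward positive infinity as the risk approaches 1. The curve is symmetric about a risk of 0.5 and is steepest near the extremes.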
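
The behaviour of the log(odds) can be checked numerically. The sketch below defines a helper, log_odds (a name introduced here for illustration), and compares it with qlogis() and plogis() from the stats package, which compute the logit and its inverse.

```r
# Illustrative helper: log(odds) for a risk strictly between 0 and 1.
log_odds <- function(risk) log(risk / (1 - risk))

risk <- c(0.01, 0.25, 0.5, 0.75, 0.99)
log_odds(risk)

log_odds(0.5)                                     # 0: even odds
all.equal(log_odds(risk), -log_odds(1 - risk))    # TRUE: symmetric about 0.5
all.equal(log_odds(risk), qlogis(risk))           # TRUE: same as the built-in logit
all.equal(plogis(log_odds(risk)), risk)           # TRUE: plogis() maps back to risk
```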

Exercise 1.7 (HIV transmission probabilities)

TABLE 1.6: Estimated per-act risk (transmission probability) for acquisition of HIV, by exposure route to an infected source. Source: CDC (Centers for Disease Control and Prevention, 2005)

For each non-blood transfusion transmission probability (per act risk) in Table 1.6, calculate the cumulative risk of being infected after one year (365 days) if one carries out the same act once daily for one year with an HIV-infected partner. Do these cumulative risks make intuitive sense? Why or why not?

n <- 365
per.act.risk <- c(67, 50,30,10,6.5,5,1,0.5)/10000 
risks <- 1-(1-per.act.risk)^n
show(risks)

## [1] 0.91402762 0.83951869 0.66601052 0.30593011 0.21126678 0.16685338
## [7] 0.03584367 0.01808493

The cumulative risks make intuitive sense: as the per-act risk for a non-blood transfusion route increases, the cumulative risk of being infected after one year also increases, from about 1.8% for the lowest per-act risk to about 91% for the highest, and it never exceeds 1.
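
The formula 1 - (1 - p)^n assumes that each act carries the same risk p and that acts are independent. A short sketch, reusing the per-act risks from above, contrasts it with the naive sum n * p, which can exceed 1 and so cannot be a probability. The variable names cumulative and naive are introduced here for illustration.

```r
n <- 365
per.act.risk <- c(67, 50, 30, 10, 6.5, 5, 1, 0.5) / 10000

cumulative <- 1 - (1 - per.act.risk)^n   # probability of at least one transmission
naive      <- n * per.act.risk           # simple sum of per-act risks

round(cbind(per.act.risk, cumulative, naive), 4)

all(cumulative < naive)   # TRUE: the naive sum always overstates the risk
any(naive > 1)            # TRUE: for the riskiest routes the sum exceeds 1
```

For small per-act risks the two columns are close, which is why the naive sum is a reasonable approximation only at the low end of the table.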

Exercise 1.8

job01

curve(log(x), 0, 6)
abline(v = c(1, exp(1)), h = c(0, 1), lty = 2)

source("/Users/dr.auwal/Desktop/Personals/UC Berkeley/Classes/PB HLTH 251D Applied Epidemiology Using R/Homework/Homework 1/job01.R")

After running the command, no output was displayed.

source("/Users/dr.auwal/Desktop/Personals/UC Berkeley/Classes/PB HLTH 251D Applied Epidemiology Using R/Homework/Homework 1/job01.R", echo = TRUE)

After running the command with echo = TRUE, output was displayed.

sink("/Users/dr.auwal/Desktop/Personals/UC Berkeley/Classes/PB HLTH 251D Applied Epidemiology Using R/Homework/Homework 1/job01.log1a")

source("/Users/dr.auwal/Desktop/Personals/UC Berkeley/Classes/PB HLTH 251D Applied Epidemiology Using R/Homework/Homework 1/job01.R")

sink()  # closes the connection

sink("/Users/dr.auwal/Desktop/Personals/UC Berkeley/Classes/PB HLTH 251D Applied Epidemiology Using R/Homework/Homework 1/job01.log1b")

source("/Users/dr.auwal/Desktop/Personals/UC Berkeley/Classes/PB HLTH 251D Applied Epidemiology Using R/Homework/Homework 1/job01.R", echo = TRUE)

sink()  # closes the connection

After running the sink() and source() commands together, both job01.log1a and job01.log1b were created, and no output was displayed in the console because it was diverted to the log files.

job02

n <- 365
per.act.risk <- c(0.5, 1, 5, 6.5, 10, 30, 50, 67)/10000 
risks <- 1-(1-per.act.risk)^n
show(risks)

## [1] 0.01808493 0.03584367 0.16685338 0.21126678 0.30593011 0.66601052
## [7] 0.83951869 0.91402762

source("/Users/dr.auwal/Desktop/Personals/UC Berkeley/Classes/PB HLTH 251D Applied Epidemiology Using R/Homework/Homework 1/job02.R")

After running the command, the output was displayed.
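
The difference between job01 and job02 likely comes down to printing: without echo = TRUE, source() does not auto-print the value of a bare expression, but explicit calls such as show() or print() still produce output. The sketch below illustrates this with a throwaway script written to a temporary file. The script contents, the helper run_and_log, and the variable names are hypothetical stand-ins, not the original job files.

```r
# Write a small stand-in script to a temporary file.
script <- tempfile(fileext = ".R")
writeLines(c("x <- 2 + 3", "x", "print(x * 2)"), script)

# Hypothetical helper: run the script with output diverted to a log file,
# then return the lines that were logged.
run_and_log <- function(echo) {
  log_file <- tempfile(fileext = ".log")
  sink(log_file)                                   # divert console output to the log
  source(script, echo = echo, local = new.env())   # run in a scratch environment
  sink()                                           # close the connection
  readLines(log_file)
}

quiet <- run_and_log(echo = FALSE)   # only the explicit print() is logged
loud  <- run_and_log(echo = TRUE)    # commands and auto-printed values are logged

quiet
loud
```

Calling sink() with no arguments after each source() matters: until the connection is closed, later console output keeps going to the log file.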