R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

1.10.1

  1. What is the R workspace file on your operating system?

The R workspace is where user-defined objects accumulate. When it is saved, R writes it by default to a file named .RData in the working directory; for this assignment I am working from an .Rmd file on my operating system.

  2. What is the file path to your R workspace file?

The file path is: C:1.Rmd

  3. What is the name of this workspace file? Assignment1.Rmd

  4. When you launched R, which R packages loaded? stats, graphics, grDevices, utils, datasets, methods, and base

  5. What are the file paths to the loaded R packages?

.libPaths()
## [1] "C:/Users/Hiro/Documents/R/win-library/3.5"
## [2] "C:/Program Files/R/R-3.5.1/library"
  6. List all the objects in the current workspace. If there are none, create some data objects. Using one expression, remove all the objects in the current workspace.
#This code block creates 2 objects and displays the output, then lists all the objects.
x <- matrix(1:25, nrow = 5, byrow = TRUE)
name <- c('Ben', 'Jerry', 'Todd', 'Thomas')
x
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1    2    3    4    5
## [2,]    6    7    8    9   10
## [3,]   11   12   13   14   15
## [4,]   16   17   18   19   20
## [5,]   21   22   23   24   25
name
## [1] "Ben"    "Jerry"  "Todd"   "Thomas"
ls()
## [1] "name" "x"
#These lines of code clear all the objects from the workspace and verify this by attempting to display all objects once more.

rm(list = ls())
ls()
## character(0)
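
The workspace and package questions above can also be checked from the console. The following is a minimal sketch, assuming the default behavior in which R saves the workspace to a file named .RData in the working directory; the names `workspace_file`, `attached`, and `pkg_paths` are illustrative, and the results depend on the machine.

```r
# Default workspace file: ".RData" in the current working directory
workspace_file <- file.path(getwd(), ".RData")
workspace_file
file.exists(workspace_file)   # FALSE if no workspace has been saved here

# Names of the attached packages and the directory each is installed in
attached <- .packages()
pkg_paths <- find.package(attached)
data.frame(package = attached, path = pkg_paths)
```

Each path returned by `find.package()` should sit under one of the library directories reported by `.libPaths()`.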

1.10.2

Using Table 1.1, explain in words, and use R to illustrate, the difference between modulus and integer divide.

Integer division divides a by b and returns the quotient rounded down to the nearest whole number, that is, the number of whole times b goes into a. For example:

a <- 20
b <- 3

a %/% b
## [1] 6

This outputs 6, since 3 goes into 20 six whole times.

The modulus, on the other hand, will display the remainder of that division, i.e. how much ‘leftover’ there was after clean integer division. The modulus of a by b is:

a %% b
## [1] 2

This number is also commonly known as the remainder. Combined, integer division and the modulus give you a picture of how one number can be divided into another.
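
The two operators fit together: the quotient times the divisor, plus the remainder, recovers the original number. The sketch below checks this for the same a and b, and adds a negative operand (an example not in the exercise) to show that `%/%` rounds toward negative infinity rather than toward zero.

```r
a <- 20
b <- 3

quotient  <- a %/% b   # 6
remainder <- a %% b    # 2

# The two pieces reassemble the original number
quotient * b + remainder == a   # TRUE

# With a negative operand, %/% rounds down and %% takes the sign of the divisor
neg_quotient  <- (-20) %/% 3    # -7
neg_remainder <- (-20) %% 3     # 1
neg_quotient * 3 + neg_remainder == -20   # TRUE
```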

1.10.3

BMI is an indicator of total body fat, which is related to the risk of disease and death. The score is valid for both men and women, but it does have some limits. It may overestimate body fat in athletes and other people who have a muscular build, and it may underestimate body fat in older persons and others who have lost muscle mass.

#This block of code calculates BMI when given weight in kg and height in m.

#kg <-
#height <- 
#BMI <- kg / (height^2)
#print(BMI)
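
One way to make the calculation above reusable is to wrap it in a function. This is only a sketch: the function name `bmi` and the input values (70 kg, 1.75 m) are illustrative assumptions, not data from the exercise.

```r
# BMI = weight in kilograms divided by the square of height in meters
bmi <- function(kg, height) {
  stopifnot(is.numeric(kg), is.numeric(height), all(height > 0))
  kg / height^2
}

# Hypothetical example values, chosen only for illustration
example_bmi <- bmi(kg = 70, height = 1.75)
example_bmi   # about 22.86
```

Because the arithmetic is vectorized, the same function also accepts vectors of weights and heights.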

1.10.4

curve(log(x), 0, 6)
abline(v = c(1, exp(1)), h = c(0, 1), lty = 2)

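As x increases, the natural log of x also increases, but at an ever slower rate. The dashed lines mark log(1) = 0 and log(e) = 1. Within the plotted range the curve appears to level off just below 2 (log(6) is about 1.79), but ln has no upper limit: it keeps growing without bound, only more and more slowly.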
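
A quick numeric check helps when reading the plot. The sketch below evaluates log(x) at the two marked points and at a few values beyond the plotted range of 0 to 6; the particular values of x are arbitrary choices for illustration.

```r
x <- c(1, exp(1), 6, 100, 1e6)

# Natural log at each point: 0 at x = 1, 1 at x = e, and still rising afterwards
log_x <- log(x)
log_x

# The slope of log(x) is 1/x, which shrinks toward 0 as x grows
slope <- 1 / x
slope
```

The values keep increasing past 2 for large x, while the slope shrinks, which is why the curve looks almost flat on the right of the plot.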

1.10.5 Risk and Risk Odds

curve(x/(1-x), 0, 1)

curve(log(x/(1-x)), 0, 1)

The risk x is bounded between 0 and 1, while the odds x/(1-x) range from 0 upward without bound. As x approaches its lower and upper bounds, the log odds log(x/(1-x)) move toward the extremes (negative and positive infinity). As the risk approaches the center (0.5), however, the curve flattens out and approximates a line.
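
The behavior of the two curves can be tabulated at a few points. This is a small sketch with arbitrarily chosen risk values; `qlogis()` from the stats package computes the same log odds and is used here only as a cross-check.

```r
risk <- c(0.01, 0.1, 0.5, 0.9, 0.99)

odds     <- risk / (1 - risk)   # grows without bound as risk approaches 1
log_odds <- log(odds)           # symmetric around 0 at risk = 0.5

data.frame(risk, odds, log_odds)

# Cross-check against the built-in logit function
all.equal(log_odds, qlogis(risk))   # TRUE
```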

1.10.6 HIV Transmission Probabilities

Use the data in Table 1.6. Assume that one is HIV-negative… For each non-blood transfusion transmission probability (per-act risk) in Table 1.6, calculate the cumulative risk of being infected after 1 year if one carries out the same act once daily for one year with an HIV-infected partner. Do these risks make intuitive sense? Why or why not?

hiv_exposure_rate <- c("IDU" = 67/10000, "RAI" = 50/10000, "PNS" = 30/10000, "RPVI" = 10/10000, "IAI" = 6.5/10000, "IPVI" = 5/10000, "ROI" = 1/10000, "IOI" = .5/10000)

year_cum_risk <- 1 - ((1 - hiv_exposure_rate)^365)
year_cum_risk
##        IDU        RAI        PNS       RPVI        IAI       IPVI 
## 0.91402762 0.83951869 0.66601052 0.30593011 0.21126678 0.16685338 
##        ROI        IOI 
## 0.03584367 0.01808493

Yes, these numbers make sense from an intuitive perspective. Performing a risky activity every day for a year would no doubt increase your risk over time. The activities with the highest per-act risk have the highest risk over the course of a year, while the activities with the lowest per-act risk have the lowest risk after the year is up. Truthfully, I wouldn’t have imagined that the risk over 1 year would be 91 percent for needle sharing, but I can reason why this may be the case.
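
The calculation rests on one step of reasoning: the chance of escaping infection on a single act is 1 - p, so the chance of escaping on all n acts is (1 - p)^n, assuming the acts are independent and the per-act risk is constant. The sketch below wraps this in a hypothetical helper, `cum_risk`, and contrasts it with simply multiplying the per-act risk by the number of acts.

```r
# Cumulative risk after n independent acts, each with per-act risk p
cum_risk <- function(p, n) 1 - (1 - p)^n

p_idu <- 67 / 10000   # per-act risk used for IDU above

idu_year <- cum_risk(p_idu, 365)
idu_year              # about 0.914, matching the IDU entry above

# Naive multiplication overshoots and is not a valid probability
naive <- p_idu * 365
naive                 # 2.4455
```

This is why the yearly risk for needle sharing is high but still below 1: each additional act can only remove part of the remaining probability of staying uninfected.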

1.10.7 Sourcing Files

From the R command line, source the file.

source("C:/Users/Hiro/Desktop/Programming/RStudio/job01.R")

It appears as though nothing happens when I run this command. Let’s try it with the echo = TRUE option enabled.

source("C:/Users/Hiro/Desktop/Programming/RStudio/job01.R", echo = TRUE)
## 
## > hiv_exposure_rate <- c(IDU = 67/10000, RAI = 50/10000, 
## +     PNS = 30/10000, RPVI = 10/10000, IAI = 6.5/10000, IPVI = 5/10000, 
## +     ROI = 1/10000 .... [TRUNCATED] 
## 
## > year_cum_risk <- 1 - ((1 - hiv_exposure_rate)^365)
## 
## > year_cum_risk
##        IDU        RAI        PNS       RPVI        IAI       IPVI 
## 0.91402762 0.83951869 0.66601052 0.30593011 0.21126678 0.16685338 
##        ROI        IOI 
## 0.03584367 0.01808493

This echoes each expression in the script to the console along with its result, showing what the script did. This gives us a clearer picture of what occurred when I ran the script.

sink("C:/Users/Hiro/Desktop/Programming/RStudio/job01.log1a")
source("C:/Users/Hiro/Desktop/Programming/RStudio/job01.R")
sink()
sink("C:/Users/Hiro/Desktop/Programming/RStudio/job01.log1b")
source("C:/Users/Hiro/Desktop/Programming/RStudio/job01.R", echo = TRUE)
## 
## > hiv_exposure_rate <- c(IDU = 67/10000, RAI = 50/10000, 
## +     PNS = 30/10000, RPVI = 10/10000, IAI = 6.5/10000, IPVI = 5/10000, 
## +     ROI = 1/10000 .... [TRUNCATED] 
## 
## > year_cum_risk <- 1 - ((1 - hiv_exposure_rate)^365)
## 
## > year_cum_risk
##        IDU        RAI        PNS       RPVI        IAI       IPVI 
## 0.91402762 0.83951869 0.66601052 0.30593011 0.21126678 0.16685338 
##        ROI        IOI 
## 0.03584367 0.01808493
sink()

Comparing the two log files, the difference is that job01.log1a is blank while job01.log1b is not. sink() sends R output to a file of your choosing, but without the echo option source() neither echoes the script's expressions nor auto-prints their values, so nothing reached job01.log1a. With echo = TRUE, source() prints each expression and its result, and job01.log1b captured them.
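
The same comparison can be reproduced without the original files. The sketch below is an assumption-laden stand-in: it writes a tiny two-line script to a temporary file (a hypothetical substitute for job01.R) and uses temporary log files, so none of the paths or names come from the exercise.

```r
# Hypothetical stand-in for job01.R: one assignment, then the bare object name
script <- tempfile(fileext = ".R")
writeLines(c("y <- 1 - (1 - 0.005)^365", "y"), script)

log_a <- tempfile()   # plays the role of job01.log1a
log_b <- tempfile()   # plays the role of job01.log1b

sink(log_a)
source(script)                # no echo: the value of y is not auto-printed
sink()

sink(log_b)
source(script, echo = TRUE)   # echo: expressions and results are printed
sink()

file.size(log_a)      # 0, the log is empty
readLines(log_b)      # the echoed commands and the printed value
```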

source("C:/Users/Hiro/Desktop/Programming/RStudio/job02.R")
## [1] 0.01808493 0.03584367 0.16685338 0.21126678 0.30593011 0.66601052
## [7] 0.83951869 0.91402762

Sourcing this file produces output even without echo, probably due to the show() command in the script. This is the HIV data from the previous problem: the numbers are identical to those shown above, but in reverse order.
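
An explicit print call inside a script produces output regardless of the echo setting, which is consistent with what happened here. The sketch below uses a made-up two-line script as a stand-in for job02.R, whose actual contents are not shown in this document; it uses `print()`, and `show()` behaves the same way for this purpose.

```r
# Hypothetical stand-in for job02.R: sort a vector and print it explicitly
script <- tempfile(fileext = ".R")
writeLines(c("y <- c(3, 1, 2)", "print(sort(y))"), script)

# Capture what source() sends to the console with echo left at its default
out <- capture.output(source(script))
out   # "[1] 1 2 3"
```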